GitHub - leenabuomar712/FifaPlayersDataAnalysisProjectUsingSpark

FIFA Players Analysis

This is a Spark application that demonstrates data preprocessing and analysis using the Apache Spark framework. The application processes FIFA player data and performs various analytical tasks on the data.

Getting Started

To run the application, follow these steps:

1. Clone the repository to your local machine: git clone https://github.com/your-username/SparkExample.git
2. Make sure you have Apache Spark installed on your machine.
3. Update the paths to your dataset files in the code:
   - playerData: Update the path to the FIFA player dataset CSV file.
   - countriesData: Update the path to the countries' continent dataset CSV file.
   - newDatasetPath: Update the path to the updated salary dataset CSV file.
   - s3OutputPath: Specify an S3 output path. Writing to S3 requires an AWS account; a free-tier account works.
   - updatedSalaryOutputPath: Specify the directory where the processed data will be saved.
4. Open a terminal and navigate to the project directory.
5. Build the project into a JAR, then run the application with the following command: spark-submit --class application.SparkExample --master local path/to/SparkExample.jar
The application will preprocess the dataset, perform various analytical tasks, and save the results in the specified output directory.

Preprocessing and Analysis

The application performs the following preprocessing steps on the dataset:

- Removes euro signs from value columns.
- Expands the 'M' and 'K' suffixes in value and salary columns into multipliers of 1,000,000 and 1,000.
- Removes special characters from value and salary columns.
- Converts name, nationality, and club columns to lowercase.
- Removes spaces from various columns.
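
The cleaning rules above can be sketched in plain Scala, without any Spark dependency. This is only an illustration under stated assumptions, not the code in this repository: the names ValueCleaning, parseMoney, and normalizeText are hypothetical, and the input format (strings such as "€110.5M" or "€565K") is assumed from the steps listed.

```scala
import scala.util.Try

// Hypothetical helpers illustrating the listed preprocessing rules.
object ValueCleaning {

  // Parses a money string such as "€110.5M" or "€565K" into a plain number.
  // Returns None when no valid number remains after cleaning.
  def parseMoney(raw: String): Option[Double] = {
    // Keep only digits, the decimal point, and the M/K suffix letters;
    // this drops the euro sign, spaces, and other special characters.
    val cleaned = raw.toUpperCase.replaceAll("[^0-9.MK]", "")
    val (digits, multiplier) = cleaned.lastOption match {
      case Some('M') => (cleaned.dropRight(1), 1000000.0)
      case Some('K') => (cleaned.dropRight(1), 1000.0)
      case _         => (cleaned, 1.0)
    }
    Try(digits.toDouble * multiplier).toOption
  }

  // Lowercases a text value and removes all whitespace.
  // Assumption: applied to columns such as name, nationality, and club.
  def normalizeText(raw: String): String =
    raw.toLowerCase.replaceAll("\\s+", "")

  def main(args: Array[String]): Unit = {
    println(parseMoney("€110.5M"))
    println(parseMoney("€565K"))
    println(normalizeText("FC Barcelona"))
  }
}
```

In a Spark job, functions like these could be registered as UDFs, or the same rules could be expressed with built-in column functions such as regexp_replace and lower.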

The application then performs the following analytical tasks:

- Aggregates and ranks countries by player income.
- Determines the most valuable club based on player value.
- Ranks clubs by total salary spending.
- Identifies the continent with the highest average FIFA score.
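
The analytical tasks above are all group-and-aggregate queries. The following sketch shows that logic on ordinary Scala collections rather than Spark DataFrames. Everything named here (PlayerAnalytics, Player, rankByTotal, bestContinent, and the field names) is hypothetical, and it assumes that "income" means salary and that the country-to-continent mapping is available as a lookup table.

```scala
// Hypothetical sketch of the aggregations on in-memory collections.
object PlayerAnalytics {

  // Assumed record shape; the real dataset's column names may differ.
  final case class Player(
      name: String,
      nationality: String,
      club: String,
      value: Double,
      salary: Double,
      score: Int
  )

  // Groups players by a key, sums a numeric field per group,
  // and returns the groups ordered from highest total to lowest.
  def rankByTotal(players: Seq[Player])(
      key: Player => String,
      amount: Player => Double
  ): Seq[(String, Double)] =
    players
      .groupBy(key)
      .map { case (k, group) => k -> group.map(amount).sum }
      .toSeq
      .sortWith(_._2 > _._2)

  // Finds the continent with the highest average score.
  // Players whose nationality is missing from the lookup are skipped.
  def bestContinent(
      players: Seq[Player],
      continentOf: Map[String, String]
  ): Option[(String, Double)] =
    players
      .flatMap(p => continentOf.get(p.nationality).map(c => c -> p.score.toDouble))
      .groupBy(_._1)
      .map { case (continent, rows) => continent -> rows.map(_._2).sum / rows.size }
      .toSeq
      .sortWith(_._2 > _._2)
      .headOption
}
```

With these helpers, rankByTotal(players)(_.nationality, _.salary) would rank countries by income, rankByTotal(players)(_.club, _.value).headOption would give the most valuable club, and rankByTotal(players)(_.club, _.salary) would rank clubs by salary spending. In Spark the equivalent operations would typically be groupBy with sum or avg followed by an ordering, plus a join against the countries dataset.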

Dependencies

Apache Spark
