MuhammadMarufSazed/Data_wrangling_analysis_soccer_data

Data_wrangling_soccer_with_Python

This is a data wrangling and analysis project whose main objective is to parse data from multiple large JSON files into a usable format for further analysis. Several tools were developed and applied to datasets from the Spanish league for demonstration purposes. The resulting datasets can be used for further exploratory analysis, clustering, machine learning, and statistical modelling.

The source datasets contain information about the events in every soccer match played in the 2017-18 season of the top leagues in England, Germany, Spain, France, and Italy, as well as in UEFA Euro 2016 and the 2018 World Cup.

The original datasets are available in JSON and CSV formats at: https://springernature.figshare.com/articles/Metadata_record_for_A_public_data_set_of_spatio-temporal_match_events_in_soccer_competitions/9711164

The accompanying paper is available at: https://www.nature.com/articles/s41597-019-0247-7

The file utils_football.py contains the data wrangling tools. Both utils_football.py and the main files are commented to make them easier to follow.
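To give a feel for the kind of parsing involved, here is a minimal sketch of flattening nested event records into flat rows. It is not the code in utils_football.py. The function names `flatten_event` and `load_events` are hypothetical, and the record layout (a `positions` list of `x`/`y` dicts and a `tags` list of `id` dicts) is an assumption about the event files, so check it against the actual JSON before relying on it.

```python
import json
from typing import Any, Dict, List


def flatten_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one nested event record into a single-level dict (hypothetical helper)."""
    # Copy every top-level field except the two nested ones handled below.
    flat = {k: v for k, v in event.items() if k not in ("positions", "tags")}

    # Assumed layout: "positions" is a list of {"x": ..., "y": ...} dicts,
    # with the start position first and, when present, the end position second.
    positions = event.get("positions") or []
    for name, pos in zip(("start", "end"), positions):
        flat[f"{name}_x"] = pos.get("x")
        flat[f"{name}_y"] = pos.get("y")

    # Assumed layout: "tags" is a list of {"id": ...} dicts.
    flat["tag_ids"] = [tag.get("id") for tag in event.get("tags") or []]
    return flat


def load_events(path: str) -> List[Dict[str, Any]]:
    """Read a JSON file holding a list of events and flatten each record."""
    with open(path, encoding="utf-8") as fh:
        return [flatten_event(event) for event in json.load(fh)]


# Small in-memory sample standing in for one record of an events file.
sample = json.loads(
    '[{"eventName": "Pass", "matchId": 1,'
    ' "positions": [{"x": 49, "y": 50}, {"x": 31, "y": 78}],'
    ' "tags": [{"id": 1801}]}]'
)
rows = [flatten_event(event) for event in sample]
```

The resulting list of flat dicts can be handed to a tabular library (for example, `pandas.DataFrame(rows)`) for the exploratory analysis mentioned above.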

A walkthrough of the data wrangling part of the project can be found at https://marufsazed.medium.com/data-wrangling-project-with-python-eee40b460fed
