Tools:
- pandas
- seaborn
- numpy
- matplotlib
Data source: https://www.kaggle.com/secareanualin/football-events
Introduction
This project explores football event data. The dataset comes from Kaggle and gives a granular view of 9,074 games from the five biggest European soccer leagues (England, Spain, Germany, Italy, and France) for the 2011 to 2016 seasons.
File Descriptions
- events.csv contains event data about each game. Text commentary was scraped from bbc.com, espn.com, and onefootball.com.
- ginf.csv contains metadata and market odds about each game. Odds were collected from oddsportal.com.
- assist_method.csv, bodypart.csv, event_type.xlsx, event_type2.xlsx, location.csv, shot_outcome.csv, shot_place.csv, side.csv, and situation.csv are dictionaries that give the textual description of each integer-coded categorical variable.
- Write A Data Science Blog Project.ipynb contains detailed EDA steps
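As a rough illustration of how these files fit together, the sketch below joins event rows to game metadata and decodes an integer-coded column using a dictionary. It uses small in-memory frames instead of the real CSVs, and the rows are invented. The column names (`id_odsp`, `country`, `bodypart`, `is_goal`) and the code-to-label mapping are assumptions about the Kaggle schema, and `attach_metadata` is a hypothetical helper, not something taken from the notebook.

```python
import pandas as pd

# Toy stand-in for events.csv (assumed columns; rows invented for illustration).
events = pd.DataFrame({
    "id_odsp": ["g1", "g1", "g2", "g2"],  # assumed game identifier
    "bodypart": [1, 3, 2, 1],             # integer-coded categorical
    "is_goal": [1, 0, 1, 1],
})

# Toy stand-in for ginf.csv (assumed columns).
games = pd.DataFrame({
    "id_odsp": ["g1", "g2"],
    "country": ["spain", "england"],
})

# Toy stand-in for bodypart.csv; the mapping is assumed.
bodypart_labels = {1: "right foot", 2: "left foot", 3: "head"}


def attach_metadata(events, games, labels):
    """Join each event to its game and decode the bodypart codes."""
    merged = events.merge(games, on="id_odsp", how="left")
    merged["bodypart_label"] = merged["bodypart"].map(labels)
    return merged


merged = attach_metadata(events, games, bodypart_labels)
print(merged)
```

With the real files, the toy frames would be replaced by `pd.read_csv("events.csv")` and `pd.read_csv("ginf.csv")`, provided the column names match.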
Summary
- Spain has the most goals over the period covered by the data.
- Across all leagues, the distribution of goals by body part shows that the right foot dominates over the left foot and the head.
- A pie chart of assist methods shows that the pass is the most common assist method.
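Findings like these can be computed with a few group counts. The following is a minimal sketch on invented rows, so its numbers say nothing about the real dataset. It assumes the events have already been joined to game metadata and decoded to text labels, and the column names (`country`, `bodypart`, `assist_method`, `is_goal`) are assumptions rather than the notebook's actual code.

```python
import pandas as pd

# Invented rows for illustration only; not taken from the dataset.
shots = pd.DataFrame({
    "country": ["spain", "spain", "england", "italy", "spain"],
    "bodypart": ["right foot", "left foot", "right foot", "head", "right foot"],
    "assist_method": ["pass", "cross", "pass", "none", "pass"],
    "is_goal": [1, 1, 1, 0, 1],
})

# Keep only the events that resulted in a goal.
goals = shots[shots["is_goal"] == 1]

# Goals per country, sorted from most to fewest.
goals_by_country = goals["country"].value_counts()

# Goals per body part.
goals_by_bodypart = goals["bodypart"].value_counts()

# Share of goals per assist method (fractions that sum to 1).
assist_share = goals["assist_method"].value_counts(normalize=True)

print(goals_by_country)
print(goals_by_bodypart)
print(assist_share)
```

A pie chart of the last result could then be drawn with `assist_share.plot.pie()`, which relies on matplotlib.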
Acknowledgements
Kaggle
The corresponding blog post: https://medium.com/@jingxianlin/football-events-c2da9fc4a738