knaggita/Football-Dataset-Analysis

Football Dataset Analysis project

What is this project?

Football Dataset Analysis is a group project to study, analyse and extract information from the Kaggle football dataset.

What does Kaggle mean?

Kaggle, according to Wikipedia, "is an online community of data scientists and machine learners, owned by Google, Inc that allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges."

What is Football Dataset Analysis project about?

The project studies the Kaggle football dataset in order to analyse it, extract information from it and make predictions based on the data.

The main goal is to find each team's weaknesses and strengths and to assess ways of measuring and improving team performance.

We identified the most informative events and used their characteristics to achieve this goal.

Examples of events used to determine the extent of the teams' weaknesses:

  1. Yellow and red cards served
  2. Fouls against the team
  3. Shots off target
  4. Penalties against the team
  5. Offsides
  6. Shot place
  7. Too high
  8. High and wide
  9. Goals against the team

Examples of events used to determine the extent of the teams' strengths:

  1. Goals scored
  2. Penalty
  3. Corners
  4. Shots on target
  5. Shots hitting the bar
  6. Shot_place
  7. Location
  8. Penalty spot
  9. Very close range
  10. Difficult angle and long range
  11. Difficult angle on the left
  12. Difficult angle on the right
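
The two event lists above can be turned into per-team tallies. The following is a minimal sketch in plain Python so that it runs without the dataset; the event labels, team names and the `summarise` helper are hypothetical and do not reflect the dataset's actual columns or values. The same counts could be produced with a pandas `groupby` once the real data is loaded.

```python
from collections import Counter

# Hypothetical (team, event) records; the labels are illustrative only
# and are NOT the dataset's actual column values.
EVENTS = [
    ("Team A", "goal"),
    ("Team A", "corner"),
    ("Team A", "yellow_card"),
    ("Team B", "shot_off_target"),
    ("Team B", "offside"),
    ("Team B", "goal"),
    ("Team A", "shot_on_target"),
]

# Assumed grouping of event labels, loosely following the lists above.
STRENGTH_EVENTS = {"goal", "penalty", "corner", "shot_on_target"}
WEAKNESS_EVENTS = {"yellow_card", "red_card", "foul", "shot_off_target", "offside"}


def summarise(events):
    """Return {team: Counter} with 'strength' and 'weakness' tallies."""
    summary = {}
    for team, event in events:
        counts = summary.setdefault(team, Counter())
        if event in STRENGTH_EVENTS:
            counts["strength"] += 1
        elif event in WEAKNESS_EVENTS:
            counts["weakness"] += 1
        # Events in neither set are ignored.
    return summary


summary = summarise(EVENTS)
for team, counts in sorted(summary.items()):
    print(team, "strength:", counts["strength"], "weakness:", counts["weakness"])
```

Keeping the two label sets as plain constants makes it easy to move an event from one group to the other when its interpretation changes.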

Below are the tasks we have accomplished:

  1. Determined the probability of the team winning in a league
  2. Determined the best combination of players and strategies to increase the probability of winning
  3. Determined the players that make a great team based on features like attempts, strikes, assists, goals scored and so on
  4. Predicted which players are highly paid, based on their performance in their teams and in the league at large
  5. Predicted the yellow and red cards shown to a team
  6. Determined the effect of cards on the team's performance
  7. Found the relationship between receiving cards in the first half and performance in the second half
  8. Determined the teams that are likely to attack from a given flank
  9. Determined the correlation between strategies, ball possession and probability of winning
  10. Determined the qualities that make a player crucial to the team's success
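
As an illustration of the kind of relationship examined in task 7, the sketch below computes a Pearson correlation coefficient in plain Python. It is not the project's actual method, which this README does not describe. The `pearson` helper and both input lists are made up for illustration and say nothing about the real results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy per-match values, invented for this example (NOT taken from the dataset).
first_half_cards = [0, 1, 1, 2, 3]
second_half_goals = [3, 2, 2, 1, 1]

r = pearson(first_half_cards, second_half_goals)
print("correlation:", round(r, 3))
```

A coefficient near -1 or 1 indicates a strong linear relationship and a value near 0 a weak one; it does not by itself show that cards cause the change in performance.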

How can I contribute to the Football Dataset Analysis project?

Steps to follow

  1. Fork the project to your GitHub account
  2. Clone the project repository and run the project. Refer to the setup section below to learn how to do that.
  3. Study the project code and note any changes, corrections or questions you have about the project
  4. To suggest these changes, you can either:
    1. Open an issue on the project in which you describe what you think needs to change and how it should look after the change
    2. Make the changes you would like to see in the project, then push them to a branch on your fork. Thereafter, request a review of your changes by opening a pull request on the main repository. You can review the GitHub article and other resources to learn how to make a pull request.

Tools used

Tools and libraries used for development:

  • Editor: Jupyter Notebook
  • Programming language: Python 3
  • Libraries:
    1. numpy
    2. zipfile
    3. warnings
    4. matplotlib
    5. pandas
    6. seaborn
    7. sklearn
    8. venn
    9. pytorch
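
Before opening the notebook, it can help to check which of these libraries are importable. The sketch below is one way to do that with the standard library. The import names are assumptions: PyTorch is imported as `torch`, the standard-library warnings module is named `warnings`, and `venn` is assumed to be the import name of the Venn diagram package. The `missing_modules` helper is hypothetical.

```python
from importlib.util import find_spec

# Import names assumed for the libraries listed above.
REQUIRED_MODULES = [
    "numpy",
    "zipfile",     # standard library
    "warnings",    # standard library
    "matplotlib",
    "pandas",
    "seaborn",
    "sklearn",     # installed as scikit-learn
    "venn",        # assumed import name
    "torch",       # installed as PyTorch
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are available.")
```

`find_spec` only locates a module without importing it, so the check stays fast even for large packages.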

How to set up the environment?

  • Make sure Python 3 and all the dependencies listed above are installed.
  • Create a directory and change to that directory. The name contains a space, so it must be quoted.
  > mkdir "Data analysis"
  > cd "Data analysis"
  • Clone this repository.
> git clone https://github.com/knaggita/Football-Dataset-Analysis
  • Start Jupyter Notebook from the command line
> jupyter notebook
  • Run the application.
> Open the project notebook in the Jupyter file browser and run its cells