Intro to AI mid-sem project.
- John Anatsui Edem.
- David Saah.
Sports prediction has to account for a large number of factors, including the historical performance of teams, match results, and player data, to help different stakeholders understand the odds of winning or losing.
In this project, we are tasked with building one or more models that predict a player's overall rating from the player's profile.
- Data collection and labelling.
- Acquire data.
- Data cleaning.
- Imputing missing values.
- Data processing.
- Feature selection.
- Feature subsetting.
- Normalising data.
- Scaling data.
- Get training & testing data.
- Train the model with cross-validation.
- Test the accuracy of the model.
- Fine-tune the model (optimisation).
- Use different models.
- Train 3 models.
- Perform ensembling.
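
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: it runs on synthetic data from `make_regression`, uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBoost so that the example needs only scikit-learn, and assumes median imputation, standard scaling, 5-fold cross-validation, and a simple averaging ensemble (`VotingRegressor`). All of those choices are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the player data: 300 rows, 9 numeric features.
X, y = make_regression(n_samples=300, n_features=9, noise=5.0, random_state=0)

# Blank out about 5% of the values so the imputation step has work to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Hold out 20% of the rows for the final test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)


def make_model(regressor):
    """Impute missing values, scale the features, then fit the regressor."""
    return make_pipeline(
        SimpleImputer(strategy="median"), StandardScaler(), regressor
    )


# Three regressors; gradient boosting stands in for XGBoost (assumption).
regressors = {
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "adaboost": AdaBoostRegressor(random_state=0),
}

# Mean 5-fold cross-validated R-squared for each model on the training split.
cv_scores = {}
for name, regressor in regressors.items():
    scores = cross_val_score(
        make_model(regressor), X_train, y_train, cv=5, scoring="r2"
    )
    cv_scores[name] = scores.mean()

# Ensemble: average the three models' predictions, then score on the test split.
ensemble = make_model(VotingRegressor(estimators=list(regressors.items())))
ensemble.fit(X_train, y_train)
test_r2 = ensemble.score(X_test, y_test)

for name, score in cv_scores.items():
    print(f"{name}: mean CV R^2 = {score:.3f}")
print(f"ensemble: test R^2 = {test_r2:.3f}")
```

Keeping the imputer and scaler inside the pipeline means they are re-fitted on each cross-validation fold, so no statistics from the validation fold leak into training.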
- app: Source code for model deployment.
- data: Datasets.
- players_21.csv -> training data.
- players_22.csv -> testing data.
- demo: Demo video.
- models: Saved models.
- src: Source code for model training (.py and .ipynb files).
- potential
- wage_eur
- passing
- dribbling
- attacking_short_passing
- movement_reactions
- power_shot_power
- mentality_vision
- mentality_composure
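
As a sketch of how the features listed above might be subset from the data with pandas: the helper `select_features` and the target column name `overall` are assumptions for illustration, and the tiny frame below is made-up data standing in for the real CSV.

```python
import pandas as pd

# Feature names taken from the list above.
FEATURES = [
    "potential",
    "wage_eur",
    "passing",
    "dribbling",
    "attacking_short_passing",
    "movement_reactions",
    "power_shot_power",
    "mentality_vision",
    "mentality_composure",
]
TARGET = "overall"  # assumed name of the rating column


def select_features(df, features=FEATURES, target=TARGET):
    """Return (X, y) restricted to the chosen feature columns and the target."""
    missing = [col for col in list(features) + [target] if col not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return df[list(features)].copy(), df[target].copy()


# Made-up rows standing in for the real player data.
demo = pd.DataFrame({name: [60, 70, 80] for name in FEATURES})
demo["short_name"] = ["A", "B", "C"]  # extra column that is dropped
demo[TARGET] = [62, 71, 83]

X, y = select_features(demo)
print(X.shape)  # (3, 9)
```

With the real data, the frame would presumably come from something like `pd.read_csv("data/players_21.csv")`, assuming the file sits at that path relative to the working directory.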
- XGBoost Regressor
- Random Forest Regressor
- AdaBoost Regressor
- The Random Forest model is very large compared to XGBoost and AdaBoost.
- XGBoost and AdaBoost have similar performance, but XGBoost performs better.
- The R-squared score for XGBoost is 0.94, while that of AdaBoost is 0.86.
- XGBoost is the best model for this dataset.
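
For reference, the R-squared score quoted above is the standard coefficient of determination, R^2 = 1 - SS_res / SS_tot. A score of 0.94 means the model accounts for about 94% of the variance in the ratings it was evaluated on. The sketch below computes it by hand on made-up numbers; the function name `r_squared` is illustrative, and the values are not project results.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


# Made-up ratings and predictions, for illustration only.
actual = [60, 70, 80, 90]
predicted = [62, 69, 78, 91]
score = r_squared(actual, predicted)
print(round(score, 2))  # 0.98
```

Here SS_res = 4 + 1 + 4 + 1 = 10 and SS_tot = 225 + 25 + 25 + 225 = 500, so R^2 = 1 - 10/500 = 0.98.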
- Website link: https://01-sportsprediction.streamlit.app
- YouTube link: https://youtu.be/mU940v4Ysko