The goal of this project is to build machine learning models that predict the winners of the 2015 Super Bowl and the College Football Championship from historical game data.
We predict the outcomes of football games using only statistics from previous games. We use three different models to do this:
- Baseline model: Point Score Difference Model. In this model, we use the score difference to predict winners of future games.
- Linear Regression Model: In this model, we use linear regression to predict the point difference for each game.
- PageRank Model: Here, we model the game data as a graph with nodes as teams and edges as score differences between the teams. We then use PageRank on this game graph to rank all the teams. This ranking is used to predict winners of future games.
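As a rough illustration of the baseline and PageRank ideas above, here is a minimal sketch. It is not the code from the notebooks: the sample games, the function names, the loser-to-winner edge direction, and the damping factor of 0.85 are all assumptions made for this example. The project lists NetworkX, which provides `networkx.pagerank`; the sketch uses a plain-Python power iteration instead so that it runs without extra dependencies.

```python
from collections import defaultdict

# Hypothetical sample data: (home_team, away_team, home_points, away_points).
GAMES = [
    ("A", "B", 28, 14),
    ("B", "C", 21, 17),
    ("C", "A", 10, 24),
    ("A", "D", 17, 20),
    ("D", "B", 13, 27),
]


def mean_point_difference(games):
    """Baseline sketch: average point margin per team over all of its games."""
    total = defaultdict(float)
    played = defaultdict(int)
    for home, away, home_pts, away_pts in games:
        total[home] += home_pts - away_pts
        total[away] += away_pts - home_pts
        played[home] += 1
        played[away] += 1
    return {team: total[team] / played[team] for team in total}


def pagerank_scores(games, damping=0.85, iterations=100):
    """PageRank by power iteration on a loser -> winner graph.

    Edge weights are point margins (an assumption of this sketch), so rank
    flows from losing teams to the teams that beat them.
    """
    teams = sorted({team for game in games for team in game[:2]})
    out_edges = {team: defaultdict(float) for team in teams}
    for home, away, home_pts, away_pts in games:
        if home_pts == away_pts:
            continue  # ties add no edge in this sketch
        winner, loser = (home, away) if home_pts > away_pts else (away, home)
        out_edges[loser][winner] += abs(home_pts - away_pts)

    n = len(teams)
    rank = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        # Unbeaten teams have no outgoing edges; spread their rank evenly.
        dangling = sum(rank[t] for t in teams if not out_edges[t])
        new_rank = {t: (1.0 - damping) / n + damping * dangling / n for t in teams}
        for loser, edges in out_edges.items():
            weight_sum = sum(edges.values())
            for winner, weight in edges.items():
                new_rank[winner] += damping * rank[loser] * weight / weight_sum
        rank = new_rank
    return rank


def predict_winner(scores, team_a, team_b):
    """Pick the team with the higher score (team_a wins exact ties)."""
    return team_a if scores[team_a] >= scores[team_b] else team_b
```

Either score dictionary can be passed to `predict_winner`, for example `predict_winner(pagerank_scores(GAMES), "A", "C")`. Note that the two approaches can disagree: a team with a modest average margin may still rank highly under PageRank if it beat a strong opponent.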
Predicted winners:
- Super Bowl: Seattle Seahawks
- College Champions: Oregon
Most files are IPython notebooks (.ipynb extension with JSON data).
The following tools and libraries are used in at least one notebook:
- Python 2.7
- NumPy
- Pandas
- SciPy
- scikit-learn
- NetworkX
- NLTK
- seaborn
- Matplotlib
- IPython 0.13+
- cPickle
You can view the notebooks in the IPython notebook viewer (see links below).
- Baseline Models:
- Linear Regression Models:
- PageRank Models:
- Initial Models: Notebooks
- Chetan Naik
- Ankit Arun
- Sindhuri Mamidi
- Beijie Li