Predicting recently drafted NFL quarterbacks' respective chances at success

Assigning Success Likelihoods to Recently Drafted Quarterbacks

This project attempts to predict quarterbacks' NFL success, both for recently drafted quarterbacks and for those whose career outcomes we already know. We first develop an objective definition of "success" and use it to classify quarterbacks as either "successful" or "unsuccessful." Then, using the players' college statistics and physical attributes, we use random forest modeling to predict each quarterback's chance of success. Ultimately, we end up with a distribution of projected success likelihoods for each of a number of quarterbacks drafted within the last two years, and we analyze these results to determine which recent draftees are most likely to succeed.

A full description of the project can be found at saisenberg.com.

Getting started

Prerequisite software

  • Python (suggested install through Anaconda)

  • R

Prerequisite libraries

  • Python:

    • imblearn (!pip install imblearn from a notebook cell; omit the leading ! in a terminal)
    • matplotlib, numpy, pandas, seaborn, sklearn (all installed with Anaconda)
    • os, warnings (part of the Python standard library)
  • R:

lib <- c('data.table', 'dplyr', 'ggplot2', 'gtools', 'htmltab', 'jsonlite', 'rvest', 'stringr')
install.packages(lib)
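
To confirm that the Python libraries listed above are importable before opening the notebook, a check along the following lines may help. This is a minimal sketch: the module names come from the list above, while the helper name missing_modules is hypothetical and not part of this repository.

```python
import importlib.util

# Third-party modules named in the prerequisites list above.
REQUIRED = ["imblearn", "matplotlib", "numpy", "pandas", "seaborn", "sklearn"]


def missing_modules(names):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("All required Python modules are importable.")
```

The check only locates each module without importing it, so it runs quickly and does not trigger any import-time side effects.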

Instructions for use

1. Run /r/cfb_stats.R

This program collects, cleans, and aggregates college football quarterback data from sports-reference.com.

The output of /r/cfb_stats.R can also be found at /data/cfb_stats.csv.

2. Run /r/nfl_stats.R

This program collects, cleans, and aggregates NFL quarterback data from pro-football-reference.com, and classifies quarterbacks as "successes" or "non-successes." Additionally, the program merges these classifications with the previously-scraped college football quarterback data.

The output of /r/nfl_stats.R can also be found at /data/prev_drafted_qb.csv and /data/recent_drafted_qb.csv.
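
Since the outputs of both R scripts are listed under /data, it can be useful to verify that all three files are present before moving on to the notebook. The following is a minimal sketch, assuming it is run from the repository root; the file paths come from the steps above, and the helper name missing_outputs is hypothetical.

```python
from pathlib import Path

# Output files named in steps 1 and 2, relative to the repository root.
EXPECTED_OUTPUTS = [
    "data/cfb_stats.csv",
    "data/prev_drafted_qb.csv",
    "data/recent_drafted_qb.csv",
]


def missing_outputs(repo_root="."):
    """Return the expected output files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]


missing_files = missing_outputs(".")
if missing_files:
    print("Missing data files:", ", ".join(missing_files))
else:
    print("All data files are present.")
```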

3. Run the code contained in /python/quarterbacks.ipynb

This code attempts to predict quarterback success using the previously scraped college football quarterback data. After first iterating through the random forest algorithm twice, the code repeats the process ten thousand times, each time generating different success likelihood predictions for recently drafted quarterbacks. Results for each quarterback are then aggregated and analyzed.
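
The general idea of repeating a random forest fit and aggregating the per-player predictions can be sketched as follows. This is an illustration under stated assumptions, not the notebook's actual code: the data is synthetic, the function name and parameters are hypothetical, the number of runs is kept small for speed, and the sketch leaves out any resampling step the notebook may perform with imblearn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def repeated_forest_probabilities(X_train, y_train, X_new, n_runs=100, n_trees=50):
    """Fit one random forest per run, each with a different seed, and collect
    the predicted probability of class 1 ("successful") for each row of X_new.

    Returns an array of shape (n_runs, number of rows in X_new).
    """
    runs = []
    for seed in range(n_runs):
        forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
        forest.fit(X_train, y_train)
        # Look up the column of predict_proba that corresponds to class 1.
        success_col = list(forest.classes_).index(1)
        runs.append(forest.predict_proba(X_new)[:, success_col])
    return np.vstack(runs)


# Synthetic stand-in data (assumption): 200 past players, 4 features each.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=200) > 0.8).astype(int)
X_new = rng.normal(size=(3, 4))  # three hypothetical recent draftees

probs = repeated_forest_probabilities(X_train, y_train, X_new, n_runs=20, n_trees=25)

# Aggregate each player's distribution of predictions across runs.
summary = {
    "mean": probs.mean(axis=0),
    "p05": np.percentile(probs, 5, axis=0),
    "p95": np.percentile(probs, 95, axis=0),
}
print(summary)
```

Each column of probs holds one player's distribution of predicted success likelihoods across runs, which is the kind of per-quarterback distribution the aggregation step summarizes.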

The output of /python/quarterbacks.ipynb is included within the IPython Notebook file itself.

Author

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Acknowledgments
