# NFL Data Models and Results

This notebook walks through the data collection and cleaning methodology and produces a finished dataset restricted to passing and running plays from the 2018 NFL season. The data comes from the NFL's 2020 and 2021 Big Data Bowls.

*The 2020 Big Data Bowl data contains similar information to the 2021 data, but covers rushing plays from 2017-2019. Combining the two sources makes it possible to produce a notebook similar to the original tendency analysis, with less data but more information in the columns.*

The main focus of this analysis is how offensive and defensive personnel, the offensive formation, and the defenders on the field affect the decision to run or pass the ball. It is also an opportunity to use player tracking / location data (which may be explored in a separate notebook).

Data bowls for reference:
- https://www.kaggle.com/c/nfl-big-data-bowl-2020: Forecast yardage gained on run plays
- https://www.kaggle.com/c/nfl-big-data-bowl-2021/: Evaluate defensive coverage on pass plays
- https://www.kaggle.com/c/nfl-big-data-bowl-2022: Analyze special teams data
- https://github.com/nfl-football-ops/Big-Data-Bowl: Inaugural data bowl from 2019, useful R code on animation of tracking

In [8]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import xgboost as xgb
import os
pd.set_option('display.max_columns', None)
pd.options.mode.chained_assignment = None  # silence SettingWithCopyWarning (default='warn')

In [10]:
# read in cleaned dataset
bdb = pd.read_csv('bdb_2018.csv')
bdb

Unnamed: 0,Quarter,Down,Distance,DefendersInTheBox,Week,Temperature,Humidity,WindSpeed,Type,Time Left in Quarter,Distance for TD,ScoreDifference,PosTeamHome,Redzone,Under2Min,NumQB,NumRB,NumWR,NumTE,NumOL,NumDL,NumLB,NumDB,NumDOther,OffTeam_ARI,OffTeam_ATL,OffTeam_BAL,OffTeam_BUF,OffTeam_CAR,OffTeam_CHI,OffTeam_CIN,OffTeam_CLE,OffTeam_DAL,OffTeam_DEN,OffTeam_DET,OffTeam_GB,OffTeam_HOU,OffTeam_IND,OffTeam_JAX,OffTeam_KC,OffTeam_LA,OffTeam_LAC,OffTeam_MIA,OffTeam_MIN,OffTeam_NE,OffTeam_NO,OffTeam_NYG,OffTeam_NYJ,OffTeam_OAK,OffTeam_PHI,OffTeam_PIT,OffTeam_SEA,OffTeam_SF,OffTeam_TB,OffTeam_TEN,OffTeam_WAS,OffenseFormation_EMPTY,OffenseFormation_I_FORM,OffenseFormation_JUMBO,OffenseFormation_PISTOL,OffenseFormation_SHOTGUN,OffenseFormation_SINGLEBACK,OffenseFormation_WILDCAT,StadiumType_Indoor,StadiumType_Outdoor,Turf_Grass,Turf_Turf,GameWeather_Clear,GameWeather_Cloudy,GameWeather_Indoor,GameWeather_Rain,GameWeather_Snow,DefTeam_ARI,DefTeam_ATL,DefTeam_BAL,DefTeam_BUF,DefTeam_CAR,DefTeam_CHI,DefTeam_CIN,DefTeam_CLE,DefTeam_DAL,DefTeam_DEN,DefTeam_DET,DefTeam_GB,DefTeam_HOU,DefTeam_IND,DefTeam_JAX,DefTeam_KC,DefTeam_LA,DefTeam_LAC,DefTeam_MIA,DefTeam_MIN,DefTeam_NE,DefTeam_NO,DefTeam_NYG,DefTeam_NYJ,DefTeam_OAK,DefTeam_PHI,DefTeam_PIT,DefTeam_SEA,DefTeam_SF,DefTeam_TB,DefTeam_TEN,DefTeam_WAS
0,1,2,5,7.0,1.0,81.0,71.0,8.0,run,14.366667,70,0.0,0,0,0,1,2,2,1,5,4,2,5,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
1,1,1,10,7.0,1.0,81.0,71.0,8.0,run,13.766667,59,0.0,0,0,0,1,1,3,1,5,4,2,5,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
2,1,1,6,7.0,1.0,81.0,71.0,8.0,run,12.250000,6,0.0,0,1,0,1,1,3,1,5,4,2,5,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
3,1,2,1,10.0,1.0,81.0,71.0,8.0,run,11.683333,1,0.0,0,1,0,1,2,0,3,5,6,3,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
4,1,4,1,11.0,1.0,81.0,71.0,8.0,run,10.916667,1,0.0,0,1,0,1,2,0,3,5,6,3,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
29770,4,2,2,6.0,16.0,60.0,89.0,5.0,pass,2.316667,67,-12.0,1,0,1,1,1,3,1,5,1,5,5,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
29771,4,1,10,4.0,16.0,60.0,89.0,5.0,pass,2.000000,60,-12.0,1,0,1,1,1,3,1,5,1,4,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
29772,4,1,10,5.0,16.0,60.0,89.0,5.0,pass,1.683333,43,-12.0,1,0,1,1,1,3,1,5,1,4,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
29773,4,2,10,4.0,16.0,60.0,89.0,5.0,pass,1.616667,43,-12.0,1,0,1,1,1,3,1,5,1,4,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
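
The `Unnamed: 0` column in the output above is usually a sign that the CSV was written with its index, and the formation is stored as one-hot `OffenseFormation_*` columns. The following is a minimal sketch, not the notebook's own code, of how one might read such a file without the stray index column and get a first look at pass rate by formation. The `toy` frame and its rows are made up for illustration and carry only a handful of the real column names.

```python
import io
import pandas as pd

# Toy stand-in for bdb_2018.csv (hypothetical rows, only a few of the real columns).
csv_text = """,Down,Distance,Type,OffenseFormation_I_FORM,OffenseFormation_SHOTGUN,OffenseFormation_SINGLEBACK
0,1,10,run,1,0,0
1,2,7,pass,0,1,0
2,3,4,pass,0,1,0
3,1,10,run,0,0,1
4,2,3,pass,0,0,1
5,1,10,run,0,1,0
"""

# index_col=0 uses the first (unnamed) column as the index,
# so no 'Unnamed: 0' column shows up in the frame.
toy = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Collapse the one-hot formation columns back into a single label per play.
formation_cols = [c for c in toy.columns if c.startswith("OffenseFormation_")]
toy["Formation"] = (
    toy[formation_cols]
    .idxmax(axis=1)
    .str.replace("OffenseFormation_", "", regex=False)
)

# Share of pass plays within each formation.
pass_rate = (toy["Type"] == "pass").groupby(toy["Formation"]).mean()
print(pass_rate)
```

One caveat with this approach: `idxmax` returns the first formation column for a row whose one-hot columns are all zero, so plays with a missing formation would need to be filtered out beforehand.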


# Models
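
Before fitting any classifier, it can help to fix the target encoding and note the accuracy that a trivial majority-class guess would reach, since any run/pass model has to beat that number. The sketch below assumes the target is the `Type` column with values `run` and `pass`, as in the table above. The `toy` frame, `X`, `y`, and `baseline_acc` are illustrative names and made-up data, not results from this notebook.

```python
import pandas as pd

# Hypothetical toy frame that reuses the target column name from bdb ('Type').
toy = pd.DataFrame({
    "Down": [1, 2, 3, 1, 2, 3, 1, 2],
    "Distance": [10, 7, 4, 10, 3, 8, 10, 6],
    "Type": ["run", "pass", "pass", "run", "pass", "pass", "run", "pass"],
})

# Binary target: 1 = pass, 0 = run.
y = (toy["Type"] == "pass").astype(int)

# Features: every column except the target.
X = toy.drop(columns=["Type"])

# Accuracy of always predicting the more common play type.
pass_share = y.mean()
baseline_acc = max(pass_share, 1 - pass_share)
print(f"pass share: {pass_share:.3f}, baseline accuracy: {baseline_acc:.3f}")
```

On the real data, `X` would also need the saved index column (`Unnamed: 0`) removed so that row order does not leak into the features.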

# Results