# NFL Play Prediction in Any Down / Situation

Plenty of papers have addressed when to go for it on 4th down, or which play is optimal on 3rd down. In this notebook, we explore play-calling tendencies across the league and for individual teams.

This notebook walks through the methodology of data collection, analysis, and results through multiple lenses. Once I have explored the play-by-play data and any augmentations to it, I will determine next steps and look for insights on the best plays to call given a game situation (i.e., 1st down, 2nd and short, 2nd and long, 3rd and short, 3rd and long, 4th down). This will be quantified by the target variable of yards gained per play*. We may consider rushing and passing plays; we do not consider fake punts or fake kicks.

*Or perhaps EPA (expected points added)? Does EPA take in post-snap information? This needs further investigation.
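
The situation buckets listed above can be made concrete with a small helper. The following is a minimal sketch, not the notebook's actual method: the function name `game_situation` and the `short_max=3` cutoff separating "short" from "long" are assumptions chosen for illustration. Only the `down` and `ydstogo` inputs correspond to columns in the play-by-play data.

```python
import math


def game_situation(down, ydstogo, short_max=3):
    """Map a (down, yards-to-go) pair to one of the situation labels.

    short_max is a hypothetical cutoff: at most this many yards to go
    counts as "short"; anything more counts as "long".
    """
    # Plays without a down (kickoffs, game-start rows) have no situation.
    if down is None or (isinstance(down, float) and math.isnan(down)):
        return None
    down = int(down)
    if down == 1:
        return "1st down"
    if down == 4:
        return "4th down"
    ordinal = {2: "2nd", 3: "3rd"}[down]
    distance = "short" if ydstogo <= short_max else "long"
    return f"{ordinal} and {distance}"


# Example plays as (down, ydstogo) pairs.
for down, ydstogo in [(1, 10), (2, 2), (3, 8), (4, 1), (float("nan"), 0)]:
    print(down, ydstogo, "->", game_situation(down, ydstogo))
```

Once the data is loaded, the helper could be applied row by row, for example `pbp['situation'] = [game_situation(d, y) for d, y in zip(pbp['down'], pbp['ydstogo'])]`. The cutoff is worth revisiting once the distribution of `ydstogo` has been inspected.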

In [16]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os
pd.set_option('display.max_columns', None)

## Data Scraping via nflfastR Package

We collect data for the 2010-2020 seasons. The range is arbitrary, but it reflects the NFL's shift to a more pass-heavy league. This may hold even more strongly in subsamples of the data, since considerable rule changes can affect the style of play each season. The nflfastR package made the data incredibly fast and easy to obtain and saved a lot of scraping time. I used the following R code to pull the data.

https://www.nflfastr.com/articles/nflfastR.html

```
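# Load the package (it must already be installed)
library(nflfastR)
# Pull play-by-play data for the 2010-2020 seasons
decade <- load_pbp(2010:2020)
# write.csv also writes row names, which pandas later reads as 'Unnamed: 0'
write.csv(decade, file='nfl_pbp_2010-2020.csv')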
```

The resulting file is 983 MB and is imported below. Many of its columns can be dropped to reduce the dimensionality of the data.

In [17]:
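# Load the CSV written by R; low_memory=False infers each column's dtype from the whole file
pbp = pd.read_csv('nfl_pbp_2010-2020.csv', low_memory=False)
# Drop the row-name column that write.csv added
pbp = pbp.drop(columns='Unnamed: 0')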

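
One way to cut the file down at load time is to read only the columns the analysis needs. This is a minimal sketch on a tiny in-memory CSV rather than the real 983 MB file. The column subset in `keep_cols` is a hypothetical choice for illustration, while `usecols` and `dtype` are standard `pd.read_csv` parameters.

```python
import io

import pandas as pd

# Tiny stand-in for the real CSV, including the row-name column that
# write.csv adds (an empty header that pandas reads as 'Unnamed: 0').
csv_text = """,game_id,posteam,down,ydstogo,play_type,yards_gained,desc
0,2010_01_ARI_STL,ARI,1,10,pass,0,text
1,2010_01_ARI_STL,ARI,2,10,run,5,text
2,2010_01_ARI_STL,ARI,3,5,pass,18,text
"""

# Hypothetical subset of columns to keep; adjust to the analysis.
keep_cols = ["game_id", "posteam", "down", "ydstogo", "play_type", "yards_gained"]

small = pd.read_csv(
    io.StringIO(csv_text),
    usecols=keep_cols,  # skip every column not listed, including 'Unnamed: 0'
    dtype={"game_id": "string", "posteam": "string", "play_type": "category"},
)
print(small.dtypes)
print(small.shape)
```

Reading a subset this way avoids holding all 372 columns in memory, and stating dtypes explicitly for text columns sidesteps pandas having to guess them.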


In [18]:
pbp

Unnamed: 0,play_id,game_id,old_game_id,home_team,away_team,season_type,week,posteam,posteam_type,defteam,side_of_field,yardline_100,game_date,quarter_seconds_remaining,half_seconds_remaining,game_seconds_remaining,game_half,quarter_end,drive,sp,qtr,down,goal_to_go,time,yrdln,ydstogo,ydsnet,desc,play_type,yards_gained,shotgun,no_huddle,qb_dropback,qb_kneel,qb_spike,qb_scramble,pass_length,pass_location,air_yards,yards_after_catch,run_location,run_gap,field_goal_result,kick_distance,extra_point_result,two_point_conv_result,home_timeouts_remaining,away_timeouts_remaining,timeout,timeout_team,td_team,td_player_name,td_player_id,posteam_timeouts_remaining,defteam_timeouts_remaining,total_home_score,total_away_score,posteam_score,defteam_score,score_differential,posteam_score_post,defteam_score_post,score_differential_post,no_score_prob,opp_fg_prob,opp_safety_prob,opp_td_prob,fg_prob,safety_prob,td_prob,extra_point_prob,two_point_conversion_prob,ep,epa,total_home_epa,total_away_epa,total_home_rush_epa,total_away_rush_epa,total_home_pass_epa,total_away_pass_epa,air_epa,yac_epa,comp_air_epa,comp_yac_epa,total_home_comp_air_epa,total_away_comp_air_epa,total_home_comp_yac_epa,total_away_comp_yac_epa,total_home_raw_air_epa,total_away_raw_air_epa,total_home_raw_yac_epa,total_away_raw_yac_epa,wp,def_wp,home_wp,away_wp,wpa,vegas_wpa,vegas_home_wpa,home_wp_post,away_wp_post,vegas_wp,vegas_home_wp,total_home_rush_wpa,total_away_rush_wpa,total_home_pass_wpa,total_away_pass_wpa,air_wpa,yac_wpa,comp_air_wpa,comp_yac_wpa,total_home_comp_air_wpa,total_away_comp_air_wpa,total_home_comp_yac_wpa,total_away_comp_yac_wpa,total_home_raw_air_wpa,total_away_raw_air_wpa,total_home_raw_yac_wpa,total_away_raw_yac_wpa,punt_blocked,first_down_rush,first_down_pass,first_down_penalty,third_down_converted,third_down_failed,fourth_down_converted,fourth_down_failed,incomplete_pass,touchback,interception,punt_inside_twenty,punt_in_endzone,punt_out_of_bounds,punt_downed,punt_fair_catch,kickoff_inside_twent
y,kickoff_in_endzone,kickoff_out_of_bounds,kickoff_downed,kickoff_fair_catch,fumble_forced,fumble_not_forced,fumble_out_of_bounds,solo_tackle,safety,penalty,tackled_for_loss,fumble_lost,own_kickoff_recovery,own_kickoff_recovery_td,qb_hit,rush_attempt,pass_attempt,sack,touchdown,pass_touchdown,rush_touchdown,return_touchdown,extra_point_attempt,two_point_attempt,field_goal_attempt,kickoff_attempt,punt_attempt,fumble,complete_pass,assist_tackle,lateral_reception,lateral_rush,lateral_return,lateral_recovery,passer_player_id,passer_player_name,passing_yards,receiver_player_id,receiver_player_name,receiving_yards,rusher_player_id,rusher_player_name,rushing_yards,lateral_receiver_player_id,lateral_receiver_player_name,lateral_receiving_yards,lateral_rusher_player_id,lateral_rusher_player_name,lateral_rushing_yards,lateral_sack_player_id,lateral_sack_player_name,interception_player_id,interception_player_name,lateral_interception_player_id,lateral_interception_player_name,punt_returner_player_id,punt_returner_player_name,lateral_punt_returner_player_id,lateral_punt_returner_player_name,kickoff_returner_player_name,kickoff_returner_player_id,lateral_kickoff_returner_player_id,lateral_kickoff_returner_player_name,punter_player_id,punter_player_name,kicker_player_name,kicker_player_id,own_kickoff_recovery_player_id,own_kickoff_recovery_player_name,blocked_player_id,blocked_player_name,tackle_for_loss_1_player_id,tackle_for_loss_1_player_name,tackle_for_loss_2_player_id,tackle_for_loss_2_player_name,qb_hit_1_player_id,qb_hit_1_player_name,qb_hit_2_player_id,qb_hit_2_player_name,forced_fumble_player_1_team,forced_fumble_player_1_player_id,forced_fumble_player_1_player_name,forced_fumble_player_2_team,forced_fumble_player_2_player_id,forced_fumble_player_2_player_name,solo_tackle_1_team,solo_tackle_2_team,solo_tackle_1_player_id,solo_tackle_2_player_id,solo_tackle_1_player_name,solo_tackle_2_player_name,assist_tackle_1_player_id,assist_tackle_1_player_name,assist_tackle_1_team,a
ssist_tackle_2_player_id,assist_tackle_2_player_name,assist_tackle_2_team,assist_tackle_3_player_id,assist_tackle_3_player_name,assist_tackle_3_team,assist_tackle_4_player_id,assist_tackle_4_player_name,assist_tackle_4_team,tackle_with_assist,tackle_with_assist_1_player_id,tackle_with_assist_1_player_name,tackle_with_assist_1_team,tackle_with_assist_2_player_id,tackle_with_assist_2_player_name,tackle_with_assist_2_team,pass_defense_1_player_id,pass_defense_1_player_name,pass_defense_2_player_id,pass_defense_2_player_name,fumbled_1_team,fumbled_1_player_id,fumbled_1_player_name,fumbled_2_player_id,fumbled_2_player_name,fumbled_2_team,fumble_recovery_1_team,fumble_recovery_1_yards,fumble_recovery_1_player_id,fumble_recovery_1_player_name,fumble_recovery_2_team,fumble_recovery_2_yards,fumble_recovery_2_player_id,fumble_recovery_2_player_name,sack_player_id,sack_player_name,half_sack_1_player_id,half_sack_1_player_name,half_sack_2_player_id,half_sack_2_player_name,return_team,return_yards,penalty_team,penalty_player_id,penalty_player_name,penalty_yards,replay_or_challenge,replay_or_challenge_result,penalty_type,defensive_two_point_attempt,defensive_two_point_conv,defensive_extra_point_attempt,defensive_extra_point_conv,safety_player_name,safety_player_id,season,cp,cpoe,series,series_success,series_result,order_sequence,start_time,time_of_day,stadium,weather,nfl_api_id,play_clock,play_deleted,play_type_nfl,special_teams_play,st_play_type,end_clock_time,end_yard_line,fixed_drive,fixed_drive_result,drive_real_start_time,drive_play_count,drive_time_of_possession,drive_first_downs,drive_inside20,drive_ended_with_score,drive_quarter_start,drive_quarter_end,drive_yards_penalized,drive_start_transition,drive_end_transition,drive_game_clock_start,drive_game_clock_end,drive_start_yard_line,drive_end_yard_line,drive_play_id_started,drive_play_id_ended,away_score,home_score,location,result,total,spread_line,total_line,div_game,roof,surface,temp,wind,home_coach,away_coach,stadium_id
,game_stadium,aborted_play,success,passer,passer_jersey_number,rusher,rusher_jersey_number,receiver,receiver_jersey_number,pass,rush,first_down,special,play,passer_id,rusher_id,receiver_id,name,jersey_number,id,fantasy_player_name,fantasy_player_id,fantasy,fantasy_id,out_of_bounds,home_opening_kickoff,qb_epa,xyac_epa,xyac_mean_yardage,xyac_median_yardage,xyac_success,xyac_fd,xpass,pass_oe
0,1,2010_01_ARI_STL,2010091208,LA,ARI,REG,1,,,,,,2010-09-12,900.0,1800.0,3600.0,Half1,0,,0,1,,0,15:00,LA 30,0,,GAME,,,0,0,,0,0,0,,,,,,,,,,,3,3,,,,,,,,0,0,,,,,,,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,,,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,,,,,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.422024,0.577976,0.577976,0.422024,0.000000,0.000000,0.000000e+00,,,0.604065,0.395935,0.000000,0.000000,0.000000,0.000000,,,,,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,,,,,,,,,,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,,,,,,,,,2010,,,1,1,First down,1.0,16:15:00,,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,GAME_START,0,,,,1,Turnover,,,,,,,,,,,,,,,,,,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,,,,,,,,0,0,,0,0,,,,,,,,,,,0,0,,,,,,,,
1,36,2010_01_ARI_STL,2010091208,LA,ARI,REG,1,ARI,away,LA,LA,30.0,2010-09-12,900.0,1800.0,3600.0,Half1,0,1.0,0,1,,0,15:00,LA 30,0,53.0,3-Josh.Brown kicks 70 yards from STL 30 to ARI...,kickoff,0.0,0,0,0.0,0,0,0,,,,,,,,,,,3,3,0.0,,,,,3.0,3.0,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.004885,0.164107,0.005645,0.300111,0.198696,0.003165,0.323388,0.0,0.0,0.261746,0.043582,-0.043582,0.043582,0.000000,0.000000,0.000000,0.000000,,,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.422024,0.577976,0.577976,0.422024,0.011885,0.010446,-1.044601e-02,0.566091,0.433909,0.604065,0.395935,0.000000,0.000000,0.000000,0.000000,,,0.00000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,L.Stephens-Howling,00-0026956,,,,,Josh.Brown,00-0021940,,,,,,,,,,,,,,,,,,,LA,,00-0025626,,C.Ah You,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,ARI,22.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2010,,,1,1,First down,36.0,16:15:00,20:15:48,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,KICK_OFF,1,,,ARI 22,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,1.0,,,,,,,0,0,0.0,1,0,,,,,,,,,,,0,0,0.043582,,,,,,,
2,58,2010_01_ARI_STL,2010091208,LA,ARI,REG,1,ARI,away,LA,ARI,78.0,2010-09-12,895.0,1795.0,3595.0,Half1,0,1.0,0,1,1.0,0,14:55,ARI 22,10,53.0,(14:55) 3-D.Anderson pass short right to 83-S....,pass,0.0,0,0,1.0,0,0,0,short,right,0.0,0.0,,,,,,,3,3,0.0,,,,,3.0,3.0,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.004996,0.163042,0.005487,0.296447,0.203153,0.003407,0.323469,0.0,0.0,0.305328,-0.564644,0.521062,-0.521062,0.000000,0.000000,0.564644,-0.564644,-0.564644,0.000000,-0.564644,0.000000,0.564644,-0.564644,0.000000,0.000000,0.564644,-0.564644,0.000000,0.000000,0.433909,0.566091,0.566091,0.433909,-0.017920,-0.016377,1.637697e-02,0.584011,0.415989,0.614511,0.385489,0.000000,0.000000,0.017920,-0.017920,-0.01792,0.000000,-0.01792,0.000000,0.017920,-0.017920,0.000000,0.000000,0.017920,-0.017920,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,00-0023645,D.Anderson,0.0,00-0023108,S.Spach,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,LA,,00-0027011,,J.Laurinaitis,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2010,0.708026,29.197353,1,1,First down,58.0,16:15:00,20:16:37,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,ARI 22,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,0.0,D.Anderson,3.0,,,S.Spach,83.0,1,0,0.0,0,1,00-0023645,,00-0023108,D.Anderson,3.0,00-0023645,S.Spach,00-0023108,S.Spach,00-0023108,0,0,-0.564644,0.900138,6.992027,6.0,0.690780,0.224250,0.502033,49.796659
3,82,2010_01_ARI_STL,2010091208,LA,ARI,REG,1,ARI,away,LA,ARI,78.0,2010-09-12,864.0,1764.0,3564.0,Half1,0,1.0,0,1,2.0,0,14:24,ARI 22,10,53.0,(14:24) 34-T.Hightower left end to ARI 27 for ...,run,5.0,0,0,0.0,0,0,0,,,,,left,end,,,,,3,3,0.0,,,,,3.0,3.0,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.004976,0.181179,0.005923,0.328268,0.186679,0.003384,0.289591,0.0,0.0,-0.259316,-0.022353,0.543414,-0.543414,0.022353,-0.022353,0.564644,-0.564644,,,0.000000,0.000000,0.564644,-0.564644,0.000000,0.000000,0.564644,-0.564644,0.000000,0.000000,0.415989,0.584011,0.584011,0.415989,-0.012694,-0.004178,4.177511e-03,0.596705,0.403295,0.598135,0.401865,0.012694,-0.012694,0.017920,-0.017920,,,0.00000,0.000000,0.017920,-0.017920,0.000000,0.000000,0.017920,-0.017920,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,00-0026289,T.Hightower,5.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,LA,,00-0027041,,B.Fletcher,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2010,,,1,1,First down,82.0,16:15:00,20:17:13,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,RUSH,0,,,ARI 27,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,0.0,,,T.Hightower,34.0,,,0,1,0.0,0,1,,00-0026289,,T.Hightower,34.0,00-0026289,T.Hightower,00-0026289,T.Hightower,00-0026289,0,0,-0.022353,,,,,,0.499817,-49.981695
4,103,2010_01_ARI_STL,2010091208,LA,ARI,REG,1,ARI,away,LA,ARI,73.0,2010-09-12,823.0,1723.0,3523.0,Half1,0,1.0,0,1,3.0,0,13:43,ARI 27,5,53.0,(13:43) (Shotgun) 3-D.Anderson pass short righ...,pass,18.0,1,0,1.0,0,0,0,short,right,7.0,11.0,,,,,,,3,3,0.0,,,,,3.0,3.0,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.005572,0.179726,0.005620,0.328602,0.194630,0.003174,0.282675,0.0,0.0,-0.281668,2.207573,-1.664158,1.664158,0.022353,-0.022353,-1.642929,1.642929,1.448659,0.758914,1.448659,0.758914,-0.884015,0.884015,-0.758914,0.758914,-0.884015,0.884015,-0.758914,0.758914,0.403295,0.596705,0.596705,0.403295,0.059132,0.047694,-4.769439e-02,0.537573,0.462427,0.593957,0.406043,0.012694,-0.012694,-0.041212,0.041212,0.00000,0.059132,0.00000,0.059132,0.017920,-0.017920,-0.059132,0.059132,0.017920,-0.017920,-0.059132,0.059132,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,00-0023645,D.Anderson,18.0,00-0022921,L.Fitzgerald,18.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,LA,,00-0023501,,O.Atogwe,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2010,0.639793,36.020684,1,1,First down,103.0,16:15:00,20:17:53,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,ARI 45,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,1.0,D.Anderson,3.0,,,L.Fitzgerald,11.0,1,0,1.0,0,1,00-0023645,,00-0022921,D.Anderson,3.0,00-0023645,L.Fitzgerald,00-0022921,L.Fitzgerald,00-0022921,1,0,2.207573,0.226114,3.390657,1.0,0.998045,0.998045,0.962868,3.713167
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
531510,4280,2020_21_KC_TB,2021020700,TB,KC,POST,21,KC,away,TB,TB,10.0,2021-02-07,100.0,100.0,100.0,Half2,0,21.0,0,4,2.0,0,01:40,TB 10,6,48.0,"(1:40) (No Huddle, Shotgun) 15-P.Mahomes pass ...",pass,0.0,1,1,1.0,0,0,0,short,right,10.0,,,,,,,,3,0,0.0,,,,,0.0,3.0,31,9,9.0,31.0,-22.0,9.0,31.0,-22.0,0.207054,0.003707,0.000240,0.025470,0.220636,0.001242,0.541650,0.0,0.0,4.266047,-4.266047,21.832680,-21.832680,-6.075673,6.075673,22.489200,-22.489200,2.733953,-7.000000,0.000000,0.000000,-2.733228,2.733228,3.253570,-3.253570,-23.116100,23.116100,42.707620,-42.707620,0.001261,0.998739,0.998739,0.001261,-0.001219,-0.000255,2.551418e-04,0.999959,0.000041,0.000275,0.999725,-0.127071,0.127071,0.406613,-0.406613,0.00000,-0.001219,0.00000,0.000000,0.068036,-0.068036,0.214313,-0.214313,0.068036,-0.068036,0.381718,-0.381718,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,00-0033873,P.Mahomes,,00-0030506,T.Kelce,,,,,,,,,,,,,00-0035707,D.White,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,00-0035707,D.White,,,,,,,,,,,,,,,,,,,,,,,TB,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2020,0.351499,-35.149910,65,0,Turnover,4280.0,18:30:00,03:09:08,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,26.0,0,PASS,0,,01:33,TB 0,21,Turnover,,9.0,1:57,2.0,1.0,0.0,4.0,4.0,-10.0,PUNT,INTERCEPTION,03:30,01:33,KC 42,TB 10,4033.0,4280.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,P.Mahomes,15.0,,,T.Kelce,87.0,1,0,0.0,0,1,00-0033873,,00-0030506,P.Mahomes,15.0,00-0033873,T.Kelce,00-0030506,T.Kelce,00-0030506,0,1,-4.266047,,,,,,0.904422,9.557813
531511,4307,2020_21_KC_TB,2021020700,TB,KC,POST,21,TB,home,KC,TB,80.0,2021-02-07,93.0,93.0,93.0,Half2,0,22.0,0,4,1.0,0,01:33,TB 20,10,-2.0,(1:33) 12-T.Brady kneels to TB 19 for -1 yards.,qb_kneel,-1.0,0,0,0.0,1,0,0,,,,,,,,,,,3,0,0.0,,,,,3.0,0.0,31,9,31.0,9.0,22.0,31.0,9.0,22.0,1.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,21.832680,-21.832680,-6.075673,6.075673,22.489200,-22.489200,,,0.000000,0.000000,-2.733228,2.733228,3.253570,-3.253570,-23.116100,23.116100,42.707620,-42.707620,0.999959,0.000041,0.999959,0.000041,,,5.483627e-06,,,0.999980,0.999980,-0.127071,0.127071,0.406613,-0.406613,,,0.00000,0.000000,0.068036,-0.068036,0.214313,-0.214313,0.068036,-0.068036,0.381718,-0.381718,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,00-0019596,T.Brady,-1.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2020,,,66,0,QB kneel,4307.0,18:30:00,03:10:24,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,13.0,0,RUSH,0,,,TB 19,22,End of half,,3.0,1:33,0.0,0.0,0.0,4.0,4.0,0.0,INTERCEPTION,END_GAME,01:33,00:00,TB 20,TB 19,4307.0,4370.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,,,T.Brady,,,,0,0,0.0,0,0,,00-0019596,,T.Brady,,00-0019596,T.Brady,00-0019596,T.Brady,00-0019596,0,1,0.000000,,,,,,,
531512,4328,2020_21_KC_TB,2021020700,TB,KC,POST,21,TB,home,KC,TB,81.0,2021-02-07,50.0,50.0,50.0,Half2,0,22.0,0,4,2.0,0,00:50,TB 19,11,-2.0,(:50) 12-T.Brady kneels to TB 19 for no gain.,qb_kneel,0.0,0,0,0.0,1,0,0,,,,,,,,,,,3,0,0.0,,,,,3.0,0.0,31,9,31.0,9.0,22.0,31.0,9.0,22.0,1.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,21.832680,-21.832680,-6.075673,6.075673,22.489200,-22.489200,,,0.000000,0.000000,-2.733228,2.733228,3.253570,-3.253570,-23.116100,23.116100,42.707620,-42.707620,0.999946,0.000054,0.999946,0.000054,,,-8.344650e-07,,,0.999986,0.999986,-0.127071,0.127071,0.406613,-0.406613,,,0.00000,0.000000,0.068036,-0.068036,0.214313,-0.214313,0.068036,-0.068036,0.381718,-0.381718,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,00-0019596,T.Brady,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2020,,,66,0,QB kneel,4328.0,18:30:00,03:11:07,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,1.0,0,RUSH,0,,,TB 19,22,End of half,,3.0,1:33,0.0,0.0,0.0,4.0,4.0,0.0,INTERCEPTION,END_GAME,01:33,00:00,TB 20,TB 19,4307.0,4370.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,,,T.Brady,,,,0,0,0.0,0,0,,00-0019596,,T.Brady,,00-0019596,T.Brady,00-0019596,T.Brady,00-0019596,0,1,0.000000,,,,,,,
531513,4349,2020_21_KC_TB,2021020700,TB,KC,POST,21,TB,home,KC,TB,81.0,2021-02-07,30.0,30.0,30.0,Half2,0,22.0,0,4,3.0,0,00:30,TB 19,11,-2.0,(:30) 12-T.Brady kneels to TB 18 for -1 yards.,qb_kneel,-1.0,0,0,0.0,1,0,0,,,,,,,,,,,3,0,0.0,,,,,3.0,0.0,31,9,31.0,9.0,22.0,31.0,9.0,22.0,1.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.000000,0.000000,21.832680,-21.832680,-6.075673,6.075673,22.489200,-22.489200,,,0.000000,0.000000,-2.733228,2.733228,3.253570,-3.253570,-23.116100,23.116100,42.707620,-42.707620,0.999933,0.000067,0.999933,0.000067,,,1.513958e-05,,,0.999985,0.999985,-0.127071,0.127071,0.406613,-0.406613,,,0.00000,0.000000,0.068036,-0.068036,0.214313,-0.214313,0.068036,-0.068036,0.381718,-0.381718,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,00-0019596,T.Brady,-1.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.0,,,,,0,,,0.0,0.0,0.0,0.0,,,2020,,,66,0,QB kneel,4349.0,18:30:00,03:11:27,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,23.0,0,RUSH,0,,,TB 18,22,End of half,,3.0,1:33,0.0,0.0,0.0,4.0,4.0,0.0,INTERCEPTION,END_GAME,01:33,00:00,TB 20,TB 19,4307.0,4370.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,,,T.Brady,,,,0,0,0.0,0,0,,00-0019596,,T.Brady,,00-0019596,T.Brady,00-0019596,T.Brady,00-0019596,0,1,0.000000,,,,,,,


We are provided with 372 columns in this data. We can supplement this dataset with offensive and defensive rankings for each team, win and score probability features, and PFF (Pro Football Focus) grades. I do not want to pay for a PFF subscription, so I will hold that thought for now and perhaps return to it later if fitted models do not perform well or could use additional information.

Before we do that, though, we must clean and augment the original data. First, we remove all features relating to individual player names or statistics on the play, since the intent is a team-based analysis. We will also look to remove post-snap features. Any information about the outcome of a play leaks the answer into the model: validation results would look better than they should, and the model would be useless in practice, where that information is not available before the snap.
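
A name-based filter is one way to start separating pre-snap features from post-snap ones. The sketch below is illustrative only: the column names are taken from the play-by-play header, but the list in `POST_SNAP_PATTERNS` is an assumption about which names signal outcome information, and it would need to be reviewed column by column before use.

```python
import re

# Hypothetical patterns for columns that describe what happened on the
# play; this list is an assumption, not an exhaustive classification.
POST_SNAP_PATTERNS = [
    r"epa", r"wpa", r"_post$", r"^yards_", r"touchdown",
    r"interception", r"fumble", r"sack", r"complete_pass",
]
POST_SNAP_RE = re.compile("|".join(POST_SNAP_PATTERNS))


def split_columns(columns):
    """Return (pre_snap, post_snap) lists of column names."""
    pre_snap, post_snap = [], []
    for name in columns:
        # A column is treated as post-snap if any pattern matches its name.
        (post_snap if POST_SNAP_RE.search(name) else pre_snap).append(name)
    return pre_snap, post_snap


columns = [
    "down", "ydstogo", "yardline_100", "score_differential",
    "score_differential_post", "yards_gained", "epa", "wpa",
    "incomplete_pass", "interception", "shotgun",
]
pre_snap, post_snap = split_columns(columns)
print("keep:", pre_snap)
print("drop:", post_snap)
```

Note that `yards_gained` lands in the drop list here because it is an outcome; as the target variable it would be set aside as the label rather than discarded.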

In [45]:
# Drop, all at once, the columns about:
##  specific players
##  timeouts
##  jersey numbers
to_drop_init = list(pbp.filter(regex='player|timeout|jersey').columns)
pbp_drop_init = pbp.drop(columns=to_drop_init)

In [53]:
# After perusing the remaining columns, I judged these unimportant for the analysis
to_drop = ['old_game_id', 'play_id', 'season_type', 'posteam_type',
           'week', 'side_of_field', 'game_date', 'quarter_end', 'time',
           'yrdln']
pbp = pbp_drop_init.drop(columns=to_drop)
# Preview the passing plays
pbp[pbp['play_type'] == 'pass']

Unnamed: 0,game_id,home_team,away_team,posteam,defteam,yardline_100,quarter_seconds_remaining,half_seconds_remaining,game_seconds_remaining,game_half,drive,sp,qtr,down,goal_to_go,ydstogo,ydsnet,desc,play_type,yards_gained,shotgun,no_huddle,qb_dropback,qb_kneel,qb_spike,qb_scramble,pass_length,pass_location,air_yards,yards_after_catch,run_location,run_gap,field_goal_result,kick_distance,extra_point_result,two_point_conv_result,td_team,total_home_score,total_away_score,posteam_score,defteam_score,score_differential,posteam_score_post,defteam_score_post,score_differential_post,no_score_prob,opp_fg_prob,opp_safety_prob,opp_td_prob,fg_prob,safety_prob,td_prob,extra_point_prob,two_point_conversion_prob,ep,epa,total_home_epa,total_away_epa,total_home_rush_epa,total_away_rush_epa,total_home_pass_epa,total_away_pass_epa,air_epa,yac_epa,comp_air_epa,comp_yac_epa,total_home_comp_air_epa,total_away_comp_air_epa,total_home_comp_yac_epa,total_away_comp_yac_epa,total_home_raw_air_epa,total_away_raw_air_epa,total_home_raw_yac_epa,total_away_raw_yac_epa,wp,def_wp,home_wp,away_wp,wpa,vegas_wpa,vegas_home_wpa,home_wp_post,away_wp_post,vegas_wp,vegas_home_wp,total_home_rush_wpa,total_away_rush_wpa,total_home_pass_wpa,total_away_pass_wpa,air_wpa,yac_wpa,comp_air_wpa,comp_yac_wpa,total_home_comp_air_wpa,total_away_comp_air_wpa,total_home_comp_yac_wpa,total_away_comp_yac_wpa,total_home_raw_air_wpa,total_away_raw_air_wpa,total_home_raw_yac_wpa,total_away_raw_yac_wpa,punt_blocked,first_down_rush,first_down_pass,first_down_penalty,third_down_converted,third_down_failed,fourth_down_converted,fourth_down_failed,incomplete_pass,touchback,interception,punt_inside_twenty,punt_in_endzone,punt_out_of_bounds,punt_downed,punt_fair_catch,kickoff_inside_twenty,kickoff_in_endzone,kickoff_out_of_bounds,kickoff_downed,kickoff_fair_catch,fumble_forced,fumble_not_forced,fumble_out_of_bounds,solo_tackle,safety,penalty,tackled_for_loss,fumble_lost,own_kickoff_recovery,own_kickoff_recovery_td,qb_hit,rush_attem
pt,pass_attempt,sack,touchdown,pass_touchdown,rush_touchdown,return_touchdown,extra_point_attempt,two_point_attempt,field_goal_attempt,kickoff_attempt,punt_attempt,fumble,complete_pass,assist_tackle,lateral_reception,lateral_rush,lateral_return,lateral_recovery,passing_yards,receiving_yards,rushing_yards,lateral_receiving_yards,lateral_rushing_yards,solo_tackle_1_team,solo_tackle_2_team,assist_tackle_1_team,assist_tackle_2_team,assist_tackle_3_team,assist_tackle_4_team,tackle_with_assist,tackle_with_assist_1_team,tackle_with_assist_2_team,fumbled_1_team,fumbled_2_team,fumble_recovery_1_team,fumble_recovery_1_yards,fumble_recovery_2_team,fumble_recovery_2_yards,return_team,return_yards,penalty_team,penalty_yards,replay_or_challenge,replay_or_challenge_result,penalty_type,defensive_two_point_attempt,defensive_two_point_conv,defensive_extra_point_attempt,defensive_extra_point_conv,season,cp,cpoe,series,series_success,series_result,order_sequence,start_time,time_of_day,stadium,weather,nfl_api_id,play_clock,play_deleted,play_type_nfl,special_teams_play,st_play_type,end_clock_time,end_yard_line,fixed_drive,fixed_drive_result,drive_real_start_time,drive_play_count,drive_time_of_possession,drive_first_downs,drive_inside20,drive_ended_with_score,drive_quarter_start,drive_quarter_end,drive_yards_penalized,drive_start_transition,drive_end_transition,drive_game_clock_start,drive_game_clock_end,drive_start_yard_line,drive_end_yard_line,drive_play_id_started,drive_play_id_ended,away_score,home_score,location,result,total,spread_line,total_line,div_game,roof,surface,temp,wind,home_coach,away_coach,stadium_id,game_stadium,aborted_play,success,passer,rusher,receiver,pass,rush,first_down,special,play,passer_id,rusher_id,receiver_id,name,id,fantasy,fantasy_id,out_of_bounds,home_opening_kickoff,qb_epa,xyac_epa,xyac_mean_yardage,xyac_median_yardage,xyac_success,xyac_fd,xpass,pass_oe
2,2010_01_ARI_STL,LA,ARI,ARI,LA,78.0,895.0,1795.0,3595.0,Half1,1.0,0,1,1.0,0,10,53.0,(14:55) 3-D.Anderson pass short right to 83-S....,pass,0.0,0,0,1.0,0,0,0,short,right,0.0,0.0,,,,,,,,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.004996,0.163042,0.005487,0.296447,0.203153,0.003407,0.323469,0.0,0.0,0.305328,-0.564644,0.521062,-0.521062,0.000000,0.000000,0.564644,-0.564644,-0.564644,0.000000,-0.564644,0.000000,0.564644,-0.564644,0.000000,0.000000,0.564644,-0.564644,0.000000,0.000000,0.433909,0.566091,0.566091,0.433909,-0.017920,-0.016377,0.016377,0.584011,0.415989,0.614511,0.385489,0.000000,0.000000,0.017920,-0.017920,-0.017920,0.000000,-0.017920,0.000000,0.017920,-0.017920,0.000000,0.000000,0.017920,-0.017920,0.000000,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,LA,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2010,0.708026,29.197353,1,1,First down,58.0,16:15:00,20:16:37,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,ARI 22,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,0.0,D.Anderson,,S.Spach,1,0,0.0,0,1,00-0023645,,00-0023108,D.Anderson,00-0023645,S.Spach,00-0023108,0,0,-0.564644,0.900138,6.992027,6.0,0.690780,0.224250,0.502033,49.796659
4,2010_01_ARI_STL,LA,ARI,ARI,LA,73.0,823.0,1723.0,3523.0,Half1,1.0,0,1,3.0,0,5,53.0,(13:43) (Shotgun) 3-D.Anderson pass short righ...,pass,18.0,1,0,1.0,0,0,0,short,right,7.0,11.0,,,,,,,,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.005572,0.179726,0.005620,0.328602,0.194630,0.003174,0.282675,0.0,0.0,-0.281668,2.207573,-1.664158,1.664158,0.022353,-0.022353,-1.642929,1.642929,1.448659,0.758914,1.448659,0.758914,-0.884015,0.884015,-0.758914,0.758914,-0.884015,0.884015,-0.758914,0.758914,0.403295,0.596705,0.596705,0.403295,0.059132,0.047694,-0.047694,0.537573,0.462427,0.593957,0.406043,0.012694,-0.012694,-0.041212,0.041212,0.000000,0.059132,0.000000,0.059132,0.017920,-0.017920,-0.059132,0.059132,0.017920,-0.017920,-0.059132,0.059132,0.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,18.0,18.0,,,,LA,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2010,0.639793,36.020684,1,1,First down,103.0,16:15:00,20:17:53,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,ARI 45,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,1.0,D.Anderson,,L.Fitzgerald,1,0,1.0,0,1,00-0023645,,00-0022921,D.Anderson,00-0023645,L.Fitzgerald,00-0022921,1,0,2.207573,0.226114,3.390657,1.0,0.998045,0.998045,0.962868,3.713167
5,2010_01_ARI_STL,LA,ARI,ARI,LA,55.0,797.0,1697.0,3497.0,Half1,1.0,0,1,1.0,0,10,53.0,(13:17) (Shotgun) 3-D.Anderson pass short righ...,pass,17.0,1,0,1.0,0,0,0,short,right,0.0,17.0,,,,,,,,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.005242,0.108447,0.002357,0.198770,0.281381,0.004683,0.399121,0.0,0.0,1.925904,1.344403,-3.008562,3.008562,0.022353,-0.022353,-2.987332,2.987332,-0.471865,1.816268,-0.471865,1.816268,-0.412150,0.412150,-2.575182,2.575182,-0.412150,0.412150,-2.575182,2.575182,0.462427,0.537573,0.537573,0.462427,0.067985,0.033516,-0.033516,0.469588,0.530412,0.641651,0.358349,0.012694,-0.012694,-0.109197,0.109197,0.000000,0.067985,0.000000,0.067985,0.017920,-0.017920,-0.127117,0.127117,0.017920,-0.017920,-0.127117,0.127117,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,17.0,17.0,,,,LA,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2010,0.703441,29.655892,2,1,First down,132.0,16:15:00,20:18:36,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,LA 38,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,1.0,D.Anderson,,T.Hightower,1,0,1.0,0,1,00-0023645,,00-0026289,D.Anderson,00-0023645,T.Hightower,00-0026289,1,0,1.344403,0.853427,7.082316,6.0,0.692088,0.232673,0.477316,52.268356
7,2010_01_ARI_STL,LA,ARI,ARI,LA,36.0,727.0,1627.0,3427.0,Half1,1.0,0,1,2.0,0,8,53.0,(12:07) (Shotgun) 3-D.Anderson pass short righ...,pass,12.0,1,0,1.0,0,0,0,short,right,12.0,0.0,,,,,,,,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.003657,0.070950,0.001583,0.124296,0.392002,0.003218,0.404294,0.0,0.0,2.926410,1.438969,-4.103633,4.103633,0.366250,-0.366250,-4.426301,4.426301,1.438969,0.000000,1.438969,0.000000,-1.851119,1.851119,-2.575182,2.575182,-1.851119,1.851119,-2.575182,2.575182,0.514886,0.485114,0.485114,0.514886,0.036240,0.034314,-0.034314,0.448873,0.551127,0.654869,0.345131,0.028219,-0.028219,-0.145437,0.145437,0.036240,0.000000,0.036240,0.000000,-0.018320,0.018320,-0.127117,0.127117,-0.018320,0.018320,-0.127117,0.127117,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,12.0,12.0,,,,LA,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2010,0.602719,39.728147,3,1,First down,177.0,16:15:00,20:20:00,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,LA 24,1,Turnover,,8.0,4:10,3.0,0.0,0.0,1.0,1.0,0.0,KICKOFF,FUMBLE,15:00,10:50,ARI 22,LA 22,36.0,222.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,1.0,D.Anderson,,S.Breaston,1,0,1.0,0,1,00-0023645,,00-0025529,D.Anderson,00-0023645,S.Breaston,00-0025529,0,0,1.438969,0.179422,2.888769,1.0,1.000000,0.999149,0.680533,31.946689
10,2010_01_ARI_STL,LA,ARI,LA,ARI,32.0,650.0,1550.0,3350.0,Half1,2.0,0,1,1.0,0,10,16.0,(10:50) 8-S.Bradford pass short left to 89-M.C...,pass,13.0,0,0,1.0,0,0,0,short,left,13.0,0.0,,,,,,,,0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.003292,0.048256,0.000992,0.069164,0.366438,0.003338,0.508520,0.0,0.0,4.034723,0.770142,5.066611,-5.066611,8.766352,-8.766352,-3.656159,3.656159,0.770142,0.000000,0.770142,0.000000,-1.080977,1.080977,-2.575182,2.575182,-1.080977,1.080977,-2.575182,2.575182,0.698840,0.301160,0.698840,0.301160,0.021333,0.031345,0.031345,0.720173,0.279827,0.521188,0.521188,0.278186,-0.278186,-0.124104,0.124104,0.021333,0.000000,0.021333,0.000000,0.003013,-0.003013,-0.127117,0.127117,0.003013,-0.003013,-0.127117,0.127117,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,13.0,13.0,,,,ARI,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2010,0.617479,38.252097,5,1,First down,254.0,16:15:00,20:22:17,,,10160000-0548-7489-d2f7-f17689f7e9ba,,0,PASS,0,,,ARI 19,2,Missed field goal,,5.0,1:19,1.0,1.0,0.0,1.0,1.0,0.0,FUMBLE,BLOCKED_FG,10:50,09:31,ARI 32,ARI 16,254.0,350.0,17,13,Home,-4,30,-3.0,39.5,1,dome,astroplay,,,Steve Spagnuolo,Ken Whisenhunt,STL00,Edward Jones Dome,0,1.0,S.Bradford,,M.Clayton,1,0,1.0,0,1,00-0027854,,00-0023457,S.Bradford,00-0027854,M.Clayton,00-0023457,1,0,0.770142,0.216104,2.800977,1.0,1.000000,0.998657,0.434337,56.566343
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
531505,2020_21_KC_TB,TB,KC,KC,TB,26.0,141.0,141.0,141.0,Half2,21.0,0,4,2.0,0,4,48.0,"(2:21) (No Huddle, Shotgun) 15-P.Mahomes pass ...",pass,0.0,1,1,1.0,0,0,0,deep,left,21.0,,,,,,,,,31,9,9.0,31.0,-22.0,9.0,31.0,-22.0,0.281819,0.011616,0.000248,0.025908,0.214988,0.001252,0.464169,0.0,0.0,3.679951,-0.446031,18.598760,-18.598760,-6.075673,6.075673,19.255280,-19.255280,1.697329,-2.143360,0.000000,0.000000,-2.433932,2.433932,3.986400,-3.986400,-20.082850,20.082850,36.440450,-36.440450,0.000449,0.999551,0.999551,0.000449,-0.000083,-0.000068,0.000068,0.999635,0.000365,0.000237,0.999763,-0.127071,0.127071,0.406289,-0.406289,0.000000,-0.000083,0.000000,0.000000,0.068622,-0.068622,0.214622,-0.214622,0.068622,-0.068622,0.380808,-0.380808,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2020,,,64,1,First down,4164.0,18:30:00,03:03:58,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,28.0,0,PASS,0,,,TB 26,21,Turnover,,9.0,1:57,2.0,1.0,0.0,4.0,4.0,-10.0,PUNT,INTERCEPTION,03:30,01:33,KC 42,TB 10,4033.0,4280.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,P.Mahomes,,,1,0,0.0,0,1,00-0033873,,,P.Mahomes,00-0033873,,,0,1,-0.446031,,,,,,0.850396,14.960440
531506,2020_21_KC_TB,TB,KC,KC,TB,26.0,132.0,132.0,132.0,Half2,21.0,0,4,3.0,0,4,48.0,(2:12) (Shotgun) 15-P.Mahomes pass short left ...,pass,1.0,1,0,1.0,0,0,0,short,left,-5.0,6.0,,,,,,,,31,9,9.0,31.0,-22.0,9.0,31.0,-22.0,0.300224,0.014432,0.000254,0.030592,0.270820,0.001268,0.382410,0.0,0.0,3.233920,-1.224870,19.823630,-19.823630,-6.075673,6.075673,20.480150,-20.480150,-1.765566,0.540696,-1.765566,0.540696,-0.668366,0.668366,3.445704,-3.445704,-18.317290,18.317290,35.899760,-35.899760,0.000365,0.999635,0.999635,0.000365,-0.000076,-0.000049,0.000049,0.999710,0.000289,0.000168,0.999832,-0.127071,0.127071,0.406365,-0.406365,0.000000,-0.000076,0.000000,-0.000076,0.068622,-0.068622,0.214698,-0.214698,0.068622,-0.068622,0.380884,-0.380884,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,,,,TB,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2020,0.846163,15.383680,64,1,First down,4186.0,18:30:00,03:04:41,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,8.0,0,PASS,0,,,TB 25,21,Turnover,,9.0,1:57,2.0,1.0,0.0,4.0,4.0,-10.0,PUNT,INTERCEPTION,03:30,01:33,KC 42,TB 10,4033.0,4280.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,P.Mahomes,,Darr.Williams,1,0,0.0,0,1,00-0033873,,00-0034301,P.Mahomes,00-0033873,Darr.Williams,00-0034301,1,1,-1.224870,1.809247,9.628739,8.0,0.491400,0.491400,0.926345,7.365543
531507,2020_21_KC_TB,TB,KC,KC,TB,25.0,127.0,127.0,127.0,Half2,21.0,0,4,4.0,0,3,48.0,(2:07) (Shotgun) 15-P.Mahomes pass short right...,pass,11.0,1,0,1.0,0,0,0,short,right,7.0,4.0,,,,,,,,31,9,9.0,31.0,-22.0,9.0,31.0,-22.0,0.364957,0.025257,0.000329,0.040763,0.400389,0.001695,0.166609,0.0,0.0,2.009050,2.739866,17.083760,-17.083760,-6.075673,6.075673,17.740280,-17.740280,2.547732,0.192134,2.547732,0.192134,-3.216098,3.216098,3.253570,-3.253570,-20.865020,20.865020,35.707620,-35.707620,0.000289,0.999710,0.999710,0.000289,0.000385,0.000207,-0.000207,0.999325,0.000675,0.000119,0.999881,-0.127071,0.127071,0.405979,-0.405979,0.000000,0.000385,0.000000,0.000385,0.068622,-0.068622,0.214313,-0.214313,0.068622,-0.068622,0.380499,-0.380499,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,11.0,11.0,,,,TB,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2020,0.645270,35.472950,64,1,First down,4215.0,18:30:00,03:05:22,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,6.0,0,PASS,0,,,TB 14,21,Turnover,,9.0,1:57,2.0,1.0,0.0,4.0,4.0,-10.0,PUNT,INTERCEPTION,03:30,01:33,KC 42,TB 10,4033.0,4280.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,1.0,P.Mahomes,,D.Robinson,1,0,1.0,0,1,00-0033873,,00-0032775,P.Mahomes,00-0033873,D.Robinson,00-0032775,0,1,2.739866,0.217080,3.086721,1.0,0.999480,0.999480,0.971725,2.827460
531509,2020_21_KC_TB,TB,KC,KC,TB,14.0,120.0,120.0,120.0,Half2,21.0,0,4,1.0,0,10,48.0,(2:00) (Shotgun) 15-P.Mahomes pass short right...,pass,4.0,1,0,1.0,0,0,0,short,right,4.0,0.0,,,,,,,,31,9,9.0,31.0,-22.0,9.0,31.0,-22.0,0.184954,0.004177,0.000216,0.017178,0.166729,0.001057,0.625689,0.0,0.0,4.748916,-0.482870,17.566630,-17.566630,-6.075673,6.075673,18.223150,-18.223150,-0.482870,0.000000,-0.482870,0.000000,-2.733228,2.733228,3.253570,-3.253570,-20.382150,20.382150,35.707620,-35.707620,0.000675,0.999325,0.999325,0.000675,0.000586,-0.000051,0.000051,0.998739,0.001261,0.000326,0.999674,-0.127071,0.127071,0.405394,-0.405394,0.000586,0.000000,0.000586,0.000000,0.068036,-0.068036,0.214313,-0.214313,0.068036,-0.068036,0.380499,-0.380499,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,4.0,4.0,,,,TB,,,,,,0.0,,,,,,,,,,0.0,,,0,,,0.0,0.0,0.0,0.0,2020,0.753743,24.625670,65,0,Turnover,4256.0,18:30:00,03:08:48,Raymond James Stadium,"Clear Temp: 63� F, Humidity: 78%, Wind: NW 9 mph",10160000-0585-01aa-36fc-5a38a4f1dbb9,18.0,0,PASS,0,,,TB 10,21,Turnover,,9.0,1:57,2.0,1.0,0.0,4.0,4.0,-10.0,PUNT,INTERCEPTION,03:30,01:33,KC 42,TB 10,4033.0,4280.0,9,31,Neutral,22,40,-3.0,55.0,0,outdoors,grass,63.0,9.0,Bruce Arians,Andy Reid,TAM00,Raymond James Stadium,0,0.0,P.Mahomes,,T.Hill,1,0,0.0,0,1,00-0033873,,00-0033040,P.Mahomes,00-0033873,T.Hill,00-0033040,0,1,-0.482870,0.870086,3.174811,2.0,0.588944,0.216243,0.861989,13.801130


In [51]:
# list the play types present; rows for kickoffs, FGs, punts, and start of game (nan) still need to be removed
pbp['play_type'].unique()

array([nan, 'kickoff', 'pass', 'run', 'field_goal', 'punt', 'no_play',
       'extra_point', 'qb_kneel', 'qb_spike'], dtype=object)
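The removal described in the cell comment could be sketched as below. This is a minimal example on a toy frame, not the notebook's actual cleaning step. It keeps only `run` and `pass` rows, which also drops `no_play`, `extra_point`, `qb_kneel`, and `qb_spike`; that broader filter is an assumption based on the stated focus on rushing and passing plays.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real play-by-play frame; values are illustrative only.
toy_pbp = pd.DataFrame({
    "play_type": [np.nan, "kickoff", "pass", "run", "field_goal", "punt",
                  "no_play", "extra_point", "qb_kneel", "qb_spike"],
    "yards_gained": [0, 0, 7, 3, 0, 0, 0, 0, -1, 0],
})

# Assumption: only plays where the offense chose to run or pass are of interest.
KEEP_TYPES = ["run", "pass"]

# isin() evaluates to False for NaN, so the start-of-game rows are dropped too.
plays = toy_pbp[toy_pbp["play_type"].isin(KEEP_TYPES)].reset_index(drop=True)

print(plays["play_type"].tolist())
```

On the real data the same mask would be applied to `pbp` in place of `toy_pbp`.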

# Features to Include

The main features I expect to be useful in this model are:

- Drive number
- Quarter of the game
- Down number
- Time left in a quarter
- Yards to gain for a first down
- Yards to gain for a touchdown
- Current score in the game (for the possessing team and defensive team, which would be modeled as a difference)
- Offensive Team
- Defensive Team
- Offensive Ranking through the Season (offensive team)
- Defensive Ranking through the Season (defensive team)
- Week of season
- Augment with PFF data? Aggregate individual player rankings?
- ...

Depending on the outcome I want to focus on, some of these may not be needed.
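As a rough sketch of how the situational features in the list might be pulled into a model-ready frame: the column names below are assumed to match nflfastR's naming and should be checked against `pbp.columns` before use. The season rankings and PFF augmentation are not in the raw data, so they are left out here. The helper `select_features` and the toy rows are hypothetical.

```python
import pandas as pd

# Assumed nflfastR column names for the listed features (verify against pbp.columns).
FEATURE_COLUMNS = {
    "drive": "drive number",
    "qtr": "quarter of the game",
    "down": "down number",
    "quarter_seconds_remaining": "time left in the quarter (seconds)",
    "ydstogo": "yards to gain for a first down",
    "yardline_100": "yards to gain for a touchdown",
    "score_differential": "possessing team score minus defensive team score",
    "posteam": "offensive team",
    "defteam": "defensive team",
    "week": "week of season",
}

def select_features(pbp: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep the feature columns and one-hot encode teams."""
    missing = [c for c in FEATURE_COLUMNS if c not in pbp.columns]
    if missing:
        raise KeyError(f"missing expected columns: {missing}")
    features = pbp[list(FEATURE_COLUMNS)].copy()
    # Team identifiers are categorical, so expand them into indicator columns.
    return pd.get_dummies(features, columns=["posteam", "defteam"])

# Two made-up rows, only to show the shape of the result.
toy_pbp = pd.DataFrame({
    "drive": [1, 2], "qtr": [1, 1], "down": [1, 3],
    "quarter_seconds_remaining": [895, 650], "ydstogo": [10, 5],
    "yardline_100": [78, 32], "score_differential": [0, 0],
    "posteam": ["ARI", "LA"], "defteam": ["LA", "ARI"], "week": [1, 1],
    "desc": ["example pass", "example run"],  # extra column that gets dropped
})

print(select_features(toy_pbp).columns.tolist())
```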

# Potential Outcomes

- Predict probability of first down
    - Can we judge whether a team will get a first down if they run or pass?
    - Which is more likely to result in success?
    - Would fit separate models for each play type
- Predict yards gained per play
    - Similar problem to above but with a continuous outcome
    - Different methods and statistical techniques
    - Need to examine assumptions of each technique
- Predict play type (run/pass) to measure play-calling tendencies
    - Goal is to see how predictable a team is with their play calling
    - Does being predictable in play calling (i.e. if a machine can learn their tendencies) lead to fewer wins?
    - Quantify how advantageous it is to be creative
    - Would need to fit models to each team
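One simple way to put a number on "how predictable a team is" before fitting any model is a tendency-table baseline: for each team and situation, always guess the most common call and measure how often that guess is right. The sketch below uses that baseline as a stand-in for a learned model. The team labels, the toy plays, and the 3-yard short/long threshold are all assumptions for illustration.

```python
import pandas as pd

# Toy run/pass plays; teams "AAA" and "BBB" and all values are made up.
toy = pd.DataFrame({
    "posteam":   ["AAA"] * 6 + ["BBB"] * 6,
    "down":      [1, 1, 1, 3, 3, 3, 1, 1, 1, 3, 3, 3],
    "ydstogo":   [10, 10, 10, 8, 2, 9, 10, 10, 10, 7, 1, 8],
    "play_type": ["run", "run", "run", "pass", "pass", "pass",
                  "run", "pass", "run", "pass", "run", "pass"],
})

def distance_bucket(ydstogo: int) -> str:
    # Assumption: 3 yards or fewer counts as "short".
    return "short" if ydstogo <= 3 else "long"

def baseline_predictability(plays: pd.DataFrame) -> pd.Series:
    """Share of each team's plays matched by guessing the most common call per situation."""
    plays = plays.assign(distance=plays["ydstogo"].apply(distance_bucket))
    situation = ["posteam", "down", "distance"]
    # Count each play type within every team/situation cell.
    counts = plays.groupby(situation + ["play_type"]).size()
    # The majority call in a cell is the number of correct guesses for that cell.
    hits = counts.groupby(level=situation).max()
    hits_per_team = hits.groupby(level="posteam").sum()
    plays_per_team = plays.groupby("posteam").size()
    return hits_per_team / plays_per_team

predictability = baseline_predictability(toy)
print(predictability)
```

A score near 1.0 means the team almost always makes the same call in a given situation. Comparing these scores against win totals would be one way to start on the question of whether predictability costs wins.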

# Models

The choice of model depends on whether the outcome is continuous (yards gained) or categorical (first down, play type).
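To make the split concrete, here is a minimal sketch on synthetic data: ordinary least squares for a continuous outcome and logistic regression for a binary one. These two model families are assumed as common defaults, not choices the notebook has settled on. The features, coefficients, and the `fit_logistic` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in features (two standardized situation features plus an intercept).
n = 500
X = rng.normal(size=(n, 2))
X1 = np.column_stack([np.ones(n), X])

# Continuous outcome (e.g. yards gained): ordinary least squares.
true_beta = np.array([4.0, 1.5, -2.0])
yards = X1 @ true_beta + rng.normal(scale=0.5, size=n)
beta_hat, *_ = np.linalg.lstsq(X1, yards, rcond=None)

# Categorical outcome (e.g. pass vs. run): logistic regression.
true_w = np.array([0.5, 2.0, -1.0])
prob_pass = 1.0 / (1.0 + np.exp(-(X1 @ true_w)))
is_pass = (rng.random(n) < prob_pass).astype(float)

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic regression weights by plain gradient descent on the log loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        pred = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (pred - y) / len(y)
    return w

w_hat = fit_logistic(X1, is_pass)
# Classify as a pass when the fitted log-odds are positive.
accuracy = np.mean(((X1 @ w_hat) > 0) == (is_pass == 1))

print("OLS coefficients:", np.round(beta_hat, 2))
print("logistic accuracy:", round(float(accuracy), 3))
```

In practice a library implementation would replace the hand-written fit; the point here is only that the outcome type determines the loss and the model family.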

# Preliminary Results