# Play-by-Play Analysis
This notebook uses play-by-play data to compute career stats for every player found in the data.

The goal is to compute metrics that take advantage of the play-level granularity of the data. These metrics will feed several downstream analyses:

- Player Lifetime Performance Model: A model that describes how a player's performance evolves as they age. It first accounts for the fact that players must earn more opportunities as a function of their efficiency.
- Player Performance Dynamics: A model that describes how a player performs conditional on their prior week, season, etc.
- Player Lifetime Optimization Model: Using inputs from the other models, this model will help decide which players to draft, based on season-level projections, and which ones to start, based on weekly projections. For draft decisions, the lifetime performance model is a critical input because it determines how much volume a player will get given the conditions they find themselves in at the beginning of the season. The weekly dynamics model plays a similar role for start decisions, except that it can use very recent information to adjust expectations for a player.
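
To make "conditional on their prior week" concrete, here is a minimal sketch of a lagged feature. The column names `player_id`, `season`, `week`, and `yards` are hypothetical and the rows are toy values, not taken from the play-by-play file. This is one possible way to set up the input, not the model itself.

```python
import pandas as pd

# Toy weekly totals; all column names and values here are hypothetical.
weekly = pd.DataFrame({
    "player_id": ["A", "A", "A", "B", "B"],
    "season": [2018, 2018, 2018, 2018, 2018],
    "week": [1, 2, 3, 1, 2],
    "yards": [50.0, 70.0, 30.0, 10.0, 20.0],
})

# Order each player's rows chronologically before lagging.
weekly = weekly.sort_values(["player_id", "season", "week"])

# Previous week's yards for the same player (NaN on a player's first row).
weekly["prev_week_yards"] = weekly.groupby("player_id")["yards"].shift(1)
```

A dynamics model could then regress `yards` on `prev_week_yards`, or on longer lags built the same way.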

In [1]:
import pandas as pd

In [12]:
# Load only the first 100 plays while exploring the schema
DataFrame = pd.read_csv("NFL_PBP_1999_2019.csv", nrows=100)
DataFrame.head()

Unnamed: 0.1,Unnamed: 0,air_epa,air_wpa,air_yards,assist_tackle,assist_tackle_1_player_id,assist_tackle_1_player_name,assist_tackle_1_team,assist_tackle_2_player_id,assist_tackle_2_player_name,...,wp,wpa,yac_epa,yac_wpa,yardline_100,yards_after_catch,yards_gained,ydsnet,ydstogo,yrdln
0,0,,,,0.0,,,,,,...,0.000306,0.54176,,,30.0,,0.0,6.0,0,ARI 30
1,1,,,,0.0,,,,,,...,0.542066,-0.035199,,,77.0,,0.0,6.0,10,PHI 23
2,2,,,,0.0,,,,,,...,0.506867,-0.010762,,,77.0,,1.0,6.0,10,PHI 23
3,3,,,,0.0,,,,,,...,0.496105,-0.019362,,,76.0,,0.0,6.0,9,PHI 24
4,4,,,,0.0,,,,,,...,0.476743,0.002985,,,81.0,,10.0,6.0,14,PHI 19


In [63]:
DataFrame[['pass','passer','receiver','rush','rusher','yards_gained','yards_after_catch','air_yards']]

Unnamed: 0,pass,passer,receiver,rush,rusher,yards_gained,yards_after_catch,air_yards
0,0,,,0,,0.0,,
1,1,D.Pederson,D.Staley,0,,0.0,,
2,0,,,1,D.Staley,1.0,,
3,0,,,0,,0.0,,
4,1,D.Pederson,B.Finneran,0,,10.0,,
...,...,...,...,...,...,...,...,...
95,0,,,1,J.Plummer,10.0,,
96,0,,,0,,0.0,,
97,1,J.Plummer,F.Sanders,0,,0.0,,
98,1,J.Plummer,R.Moore,0,,0.0,,
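
The columns shown above are enough to sketch how play-level rows might roll up into career totals. The example below is a rough illustration on toy rows with placeholder IDs `P1` and `P2`. It uses the `rush`, `rusher_player_id`, and `yards_gained` columns from the data, and the derived column names (`rush_attempts`, `rush_yards`, `yards_per_attempt`) are made up for this sketch.

```python
import pandas as pd

# Toy plays with placeholder player IDs; the values are illustrative only.
plays = pd.DataFrame({
    "rush": [0, 1, 0, 1, 1],
    "rusher_player_id": [None, "P1", None, "P2", "P1"],
    "yards_gained": [0.0, 1.0, 10.0, 10.0, 4.0],
})

# Keep rushing plays only, then aggregate per rusher.
rushes = plays[plays["rush"] == 1]
career_rushing = (
    rushes.groupby("rusher_player_id")
    .agg(rush_attempts=("rush", "sum"), rush_yards=("yards_gained", "sum"))
    .reset_index()
)

# Efficiency metric derived from the two totals.
career_rushing["yards_per_attempt"] = (
    career_rushing["rush_yards"] / career_rushing["rush_attempts"]
)
```

The same pattern would apply to passing and receiving by swapping in the `pass` flag and the corresponding player ID column.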


In [74]:
# Select every column whose name contains 'player_id'
DataFrame.loc[:, DataFrame.columns.str.contains('player_id')]

Unnamed: 0,assist_tackle_1_player_id,assist_tackle_2_player_id,assist_tackle_3_player_id,assist_tackle_4_player_id,blocked_player_id,forced_fumble_player_1_player_id,forced_fumble_player_2_player_id,fumble_recovery_1_player_id,fumble_recovery_2_player_id,fumbled_1_player_id,...,punt_returner_player_id,punter_player_id,qb_hit_1_player_id,qb_hit_2_player_id,receiver_player_id,rusher_player_id,solo_tackle_1_player_id,solo_tackle_2_player_id,tackle_for_loss_1_player_id,tackle_for_loss_2_player_id
0,,,,,,,,,,,...,,,,,,,00-0014357,,,
1,,,,,,,,,,,...,,,,,00-0015523,,,,,
2,,,,,,,,,,,...,,,,,,00-0015523,00-0013644,,,
3,,,,,,,,,,,...,,,,,,,,,,
4,,,,,,,,,,,...,,,,,00-0005231,,00-0009360,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,,,,,,,,,,,...,,,,,,00-0013042,00-0016624,,,
96,,,,,,,,,,,...,,,,,,,,,,
97,,,,,,,,,,,...,,,,,00-0014328,,,,,
98,,,,,,,,,,,...,,,,,00-0011601,,00-0007660,,,


In [73]:
DataFrame[['passer_player_id','passer_player_name']].drop_duplicates().dropna()

Unnamed: 0,passer_player_id,passer_player_name
1,00-0012726,D.Pederson
8,00-0013042,J.Plummer
99,00-0015523,D.Staley
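
The same ID can appear in more than one role: `00-0015523` shows up above as a passer and earlier as a receiver and a rusher. A single ID-to-name lookup across roles may therefore be more useful than one table per role. The sketch below runs on toy rows with placeholder IDs and names. It assumes that `receiver_player_name` and `rusher_player_name` exist alongside `passer_player_name`, which has not been verified here.

```python
import pandas as pd

# Toy plays; IDs and names are placeholders, not real players.
plays = pd.DataFrame({
    "passer_player_id": ["P1", None, "P1"],
    "passer_player_name": ["A.Passer", None, "A.Passer"],
    "receiver_player_id": ["P2", None, "P3"],
    "receiver_player_name": ["B.Back", None, "C.Catcher"],
    "rusher_player_id": [None, "P2", None],
    "rusher_player_name": [None, "B.Back", None],
})

roles = ["passer", "receiver", "rusher"]

# Rename each role's id/name pair to common column names, then stack them.
frames = [
    plays[[f"{role}_player_id", f"{role}_player_name"]].rename(
        columns={
            f"{role}_player_id": "player_id",
            f"{role}_player_name": "player_name",
        }
    )
    for role in roles
]
lookup = (
    pd.concat(frames, ignore_index=True)
    .dropna()
    .drop_duplicates()
    .reset_index(drop=True)
)
```

With this toy input, `lookup` holds one row per distinct player regardless of how many roles they appeared in.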


In [66]:
pd.DataFrame({"Column":DataFrame.columns}).to_excel("Columns.xlsx")

In [61]:
pd.DataFrame(DataFrame.columns[DataFrame.columns.str.contains('rush')].str.split("_").tolist())

Unnamed: 0,0,1,2,3
0,first,down,rush,
1,lateral,rush,,
2,lateral,rusher,player,id
3,lateral,rusher,player,name
4,rush,,,
5,rush,attempt,,
6,rush,touchdown,,
7,rusher,,,
8,rusher,id,,
9,rusher,player,id,
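
Splitting on `_` shows that role names sit in front of a shared `_player_id` suffix. One possible way to exploit that pattern is to map each role prefix to its ID column, as sketched below on a small hand-picked subset of the column names. The variable names `id_cols`, `roles`, and `role_to_id_col` are introduced only for illustration.

```python
import pandas as pd

# A small subset of column names from the data, used for illustration.
columns = pd.Index([
    "passer_player_id", "passer_player_name",
    "rusher_player_id", "lateral_rusher_player_id",
    "yards_gained",
])

# Keep the ID columns and strip the shared suffix to recover the role.
id_cols = columns[columns.str.endswith("_player_id")]
roles = id_cols.str.replace("_player_id", "", regex=False)

# Map each role (e.g. 'rusher') to the column holding that role's player ID.
role_to_id_col = dict(zip(roles, id_cols))
```

On the full frame, `DataFrame.columns` would take the place of the hand-built `columns` index.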


In [None]:
pd.DataFrame(DataFrame.columns[DataFrame.columns.str.contains('player_name')].str.split