# Data Cleanup

## Necessary Steps

1. Understand the data structure: Look through the database, understand the main structure, i.e., what each row represents, and what columns and types of information are available
2. Merge game file to add in week identifier to the new dataset (games.csv is in '1. Additional Data File')
3. Create column categorizations to filter dataset for relevant purposes
4. Break down receiver into its individual row
5. Create playmaker column, and check id uniqueness - check that one ID corresponds to one name
6. Flag non-relevant plays - add a binary column that flags 1 for run, pass, reception, FG/XP, 0 for all others
7. Add any additional stat needed - reception (for plays that fall under 'reception'), target ('reception', 'pass')
8. Add position based off of the highest stat of a player. Position will be refined later with web scraping
9. Ensure that stat are correctly represented for a given position
10. Verify top 50 stats against the reported ones


In [3]:
'''
Remove all of my comments, and comment your code
remove all unused code
reformat titles - i.e., remove the 'Steven' or S1 stuff for part 5
if youve created a list automatically, delete the manual one
There is still so much missing in th main list
Like i said, in the passer list, we need to have all information that pertains to the passer and the points they score. 
That includes, id, name, attemps, touchdown, fumble lost, compltion interception, same with rushing and receiving. 
this is th third time im mentioning this, please get to it
Please have another look at S5.4 - as it is, this code does nothing for dataframe
need to add reception and two point conversion for flag
'''

"\nRemove all of my comments, and comment your code\nremove all unused code\nreformat titles - i.e., remove the 'Steven' or S1 stuff for part 5\nif youve created a list automatically, delete the manual one\nThere is still so much missing in th main list\nLike i said, in the passer list, we need to have all information that pertains to the passer and the points they score. \nThat includes, id, name, attemps, touchdown, fumble lost, compltion interception, same with rushing and receiving. \nthis is th third time im mentioning this, please get to it\nPlease have another look at S5.4 - as it is, this code does nothing for dataframe\nneed to add reception and two point conversion for flag\n"

## Setup Environment

In [307]:
import pandas as pd
import numpy as np
from pandas import ExcelWriter
pd.set_option('display.max_columns', None)

## 1. Upload Data

In [None]:
'''
Not sure why your folder names were changes. The working directory of this file is where its source code is located
so you can specify directories relatively
'''

In [271]:
# Store your files in the same folder as the source code, so you don't have to specify the directory
df_file_2019 = "../../1. Raw-Data/data2019.csv"
df_file_game = "../1. Additional-Data/games.csv"

# Use the convention df for dataframes
df = pd.read_csv(df_file_2019)
df_games = pd.read_csv(df_file_game)

## 2. Add in Weekly Identifier

In [None]:
'''
In each play row the week of that game will be displayed as a new column.
This will be done through a merge along the game_id column from the game.csv file. 
'''

In [None]:
df = df.merge(df_games[["game_id", "week"]], on = 'game_id')

## 3. Create Categorization Lists

In [None]:
'''
The columns in the 2019 data file are arranged into categorical lists.
Many of the lists can be automated due to the presence of key words in their title.
'''

### 3.1 Main Lists

In [378]:
# Create a "key" list that contains elements that will always be needed - game id, week, play type
key = ['Flag', 'Playmaker_id', 'Playmaker_name', 'play_id', 'game_id', 'home_team', 'away_team', 'week', 'game_date', 'posteam', 'posteam_type', 'defteam', 'side_of_field', 'play_type']

# There are so many elements here that are missing here
# touchdown, passing toucdown, rushing touchdown, pass attemps, rush attempts, interceptions, fumble lost, 2 point attempts and conversions and so on
# all elements that contribute to points in fantasy should be in the main lists

pass_play = ['pass_length','pass_location','air_yards']
run_play = ['run_location', 'run_gap']
yard_info = ['yrdln','ydstogo','ydsnet','yards_gained', "fumble_recovery_1_yards", "fumble_recovery_2_yards", "return_yards"]
receiver_stats = ["receiver_player_id", "receiver_player_name", "lateral_receiver_player_name", "lateral_receiver_player_name", "yards_after_catch"]
# Two point conv should not be in xp
# xp + fg
xp = ['field_goal_result', 'kick_distance', 'extra_point_result']








### 3.2 Other Lists

In [289]:
game_info = ['play_id','game_id','home_team','away_team','posteam','posteam_type', 'defteam', 'side_of_field', 'yardline_100','game_date', "year"]
game_time_info = ['quarter_seconds_remaining', 'half_seconds_remaining', 'game_seconds_remaining', 'game_half', 'quarter_end', 'time']

# Keep play type in the main lists
gen_play_info = ['drive', 'sp', 'down', 'goal_to_go','desc','play_type','shotgun','no_huddle','qb_dropback','qb_kneel','qb_spike','qb_scramble',]

# automate - see below for best example
timeout_info = ['home_timeouts_remaining','away_timeouts_remaining','timeout','timeout_team']

team_info = ["return_team", 'td_team', 'posteam_time', 'defteam_time', 'total_home_score','total_away_score', 'posteam_score_post','defteam_score_post', 'score_differential', "forced_fumble_player_1_team", "forced_fumble_player_2_team", "solo_tackle_1_team", "solo_tackle_2_team", "assist_tackle_1_team", "assist_tackle_2_team", "assist_tackle_3_team", "assist_tackle_4_team", "fumbled_1_team", "fumbled_2_team", "fumble_recovery_1_team", "fumble_recovery_2_team"]

# automate
probability_info = ['no_score_prob','opp_fg_prob', 'opp_safety_prob', 'opp_td_prob', 'fg_prob', 'safety_prob', 'td_prob', 'extra_point_prob', 'two_point_conversion_prob', 'ep', 'epa', 'total_home_epa','total_away_epa', 'total_home_rush_epa','total_away_rush_epa', 'total_home_pass_epa', 'total_away_pass_epa', 'air_epa', 'yac_epa', 'comp_air_epa', 'total_home_comp_air_epa', 'total_away_comp_air_epa', 'total_home_comp_yac_epa', 'total_away_comp_yac_epa', 'total_home_raw_air_epa', 'total_away_raw_air_epa', 'total_home_raw_yac_epa', 'total_away_raw_yac_epa', 'wp', 'def_wp', 'home_wp', 'away_wp', 'wpa', 'home_wp_post', 'away_wp_post', 'total_home_rush_wpa', 'total_away_rush_wpa', 'total_home_pass_wpa', 'total_away_pass_wpa', 'air_wpa', 'yac_wpa', 'comp_air_wpa', 'comp_yac_wpa', 'total_home_comp_air_wpa', 'total_away_comp_air_wpa', 'total_home_comp_yac_wpa', 'total_away_comp_yac_wpa', 'total_home_raw_air_wpa', 'total_away_raw_air_wpa', 'total_home_raw_yac_wpa', 'total_away_raw_yac_wpa']

# a lot of these need to be re categorized
# you can easily create a "down" list
# there's a bunch of defensive stats in there that you can add to the defensive column, samewith punts, safety and what not
miscellaneous_plays = ['punt_blocked', 'first_down_rush', 'first_down_pass', 'first_down_penalty', 'third_down_converted', 'third_down_failed', 'fourth_down_converted', 'fourth_down_failed', 'incomplete_pass', 'touchback', 'interception', 'fumble_forced', 'fumble_not_forced', 'fumble_out_of_bounds', 'solo_tackle', 'safety', 'penalty', 'tackled_for_loss', 'fumble_lost', 'qb_hit', 'rush_attempt', 'pass_attempt', 'sack', 'touchdown', 'pass_touchdown', 'rush_touchdown', 'return_touchdown', 'two_point_attempt', 'field_goal_attempt', 'kickoff_attempt', 'punt_attempt', 'fumble', "complete_pass", "assisted_tackle", "lateral_reception", "lateral_rush", "lateral_return", "lateral_recovery"]

# automate
# kickoff_punt_info = ['punt_inside_twenty', 'punt_in_endzone', 'punt_out_of_bounds', 'punt_downed', 'punt_fair_catch', 'kickoff_inside_twenty', 'kickoff_in_endzone', 'kickoff_out_of_bounds', 'kickoff_downed', 'kickoff_fair_catch', 'own_kickoff_recovery', 'own_kickoff_recovery_td']

# Add to passer/rusher/reception/kicker and so on columns
player_info = ["passer_player_id", "passer_player_name", "receiver_player_id", "receiver_player_name", "rusher_player_id", "rusher_player_name", "lateral_receiver_player_id", "lateral_receiver_player_name", "lateral_rusher_player_id", "lateral_rusher_player_name", "lateral_sack_player_id", "lateral_sack_player_name", "lateral_sack_player_name", "interception_player_id", "interception_player_name", "lateral_interception_player_id", "lateral_interception_player_name", "punt_returner_player_id", "punt_returner_player_name", "lateral_punt_returner_player_id", "lateral_punt_returner_player_name", "kickoff_returner_player_name", "kickoff_returner_player_id", "lateral_kickoff_returner_player_id", "lateral_kickoff_returner_player_name", "punter_player_id", "punter_player_name", "kicker_player_name", "kicker_player_id", "own_kickoff_recovery_player_id", "own_kickoff_recovery_player_name", "blocked_player_id", "tackle_for_loss_1_player_id", "tackle_for_loss_1_player_name", "tackle_for_loss_2_player_id", "tackle_for_loss_2_player_name", "qb_hit_1_player_id", "qb_hit_1_player_name", "qb_hit_2_player_id", "qb_hit_2_player_name", "forced_fumble_player_1_player_id", "forced_fumble_player_1_player_name", "forced_fumble_player_2_player_id", "forced_fumble_player_2_player_name", "solo_tackle_1_player_id", "solo_tackle_2_player_id", "solo_tackle_1_player_name", "solo_tackle_2_player_name", "assist_tackle_1_player_id", "assist_tackle_1_player_name", "assist_tackle_2_player_id", "assist_tackle_2_player_name", "assist_tackle_3_player_id", "assist_tackle_3_player_name",  "assist_tackle_4_player_id", "assist_tackle_4_player_name",  "pass_defense_1_player_id", "pass_defense_1_player_name", "pass_defense_2_player_id", "pass_defense_2_player_name", "fumbled_1_player_id", "fumbled_1_player_name", "fumbled_2_player_id", "fumbled_2_player_name", "fumble_recovery_1_player_id", "fumble_recovery_1_player_name",  "fumble_recovery_2_player_id", "fumble_recovery_2_player_name"]

# automate, why is there an empty column
# penalty_info = ["penalty_team", "penalty_player_id", "penalty_player_name", "penalty_yards", "replay_or_challenge", "replay_or_challenge_result", "penalty_type"]

# automate
defensive_points = ["defensive_two_point_attempt", "defensive_extra_point_attempt", "defensive_extra_point_conv"]

# Create a final list which includes all elements not currently grouped. Then examine the list and see if you
# can recategorize some elements

# giant list = sum of all list
# df.columns
# remaining = []

In [380]:
# We often use list comprehensions when building out list out of conditions
# They have a better performance that for loops, and provide for neater code

prob_cols = [col for col in df.columns if 'prob' in col]

penalty_info = [col for col in df.columns if 'penalty' in col]

# ask how to add or statement for punts
kickoff_punt_info = [col for col in df.columns if 'kickoff' in col]
kickoff_punt_info += [col for col in df.columns if 'kicker' in col]
kickoff_punt_info += [col for col in df.columns if 'punt' in col]


defensive_info = [col for col in df.columns if 'defensive' in col]
defensive_info += [col for col in df.columns if 'fumble' in col]
defensive_info += [col for col in df.columns if 'sack' in col]
defensive_info += [col for col in df.columns if 'interception' in col]
defensive_info += [col for col in df.columns if 'defense' in col]

epa_info = [col for col in df.columns if 'epa' in col]

wpa_info = [col for col in df.columns if 'wpa' in col]

touchdown_info = [col for col in df.columns if 'touchdown' in col]

timeout_info = [col for col in df.columns if 'timeout' in col]

passer_info = [col for col in df.columns if 'passer' in col]

receiver_info = [col for col in df.columns if 'receiver' in col]

rusher_info = [col for col in df.columns if 'rusher' in col]

score_info = [col for col in df.columns if 'score' in col]

tackle_info = [col for col in df.columns if 'tackle' in col]

all_lists = [xp, wpa_info, epa_info, touchdown_info, yard_info, run_play, pass_play, key, prob_cols, penalty_info, kickoff_punt_info, defensive_info, timeout_info, passer_info, receiver_info, rusher_info, score_info, tackle_info]

big_list = [item for elem in all_lists for item in elem]

remainder_list = [col for col in df.columns if col not in big_list]

# all_lists

print(len(remainder_list))
remainder_list
# final_list



# The above line does the same thing as the code blow below

62


['Unnamed: 0',
 'away_wp',
 'away_wp_post',
 'blocked_player_id',
 'blocked_player_name',
 'complete_pass',
 'def_wp',
 'desc',
 'down',
 'drive',
 'ep',
 'extra_point_attempt',
 'field_goal_attempt',
 'first_down_pass',
 'first_down_rush',
 'fourth_down_converted',
 'fourth_down_failed',
 'game_half',
 'game_seconds_remaining',
 'goal_to_go',
 'half_seconds_remaining',
 'home_wp',
 'home_wp_post',
 'incomplete_pass',
 'lateral_reception',
 'lateral_recovery',
 'lateral_return',
 'lateral_rush',
 'no_huddle',
 'pass_attempt',
 'qb_dropback',
 'qb_hit',
 'qb_hit_1_player_id',
 'qb_hit_1_player_name',
 'qb_hit_2_player_id',
 'qb_hit_2_player_name',
 'qb_kneel',
 'qb_scramble',
 'qb_spike',
 'qtr',
 'quarter_end',
 'quarter_seconds_remaining',
 'replay_or_challenge',
 'replay_or_challenge_result',
 'return_team',
 'rush_attempt',
 'safety',
 'shotgun',
 'sp',
 'td_team',
 'third_down_converted',
 'third_down_failed',
 'time',
 'touchback',
 'two_point_attempt',
 'two_point_conv_result',
 

## Step 4

In [None]:
'''
For each pass play, creat an additional row for the recepetion
so that there can be two play makers (the passer and the receiver).
'''

In [409]:
# Modify


new_df = pd.DataFrame()
new_df = data[(data['play_type']=="pass")]

new_df['play_type'] = 'reception'

# new_df['play_type'].replace({'pass': 'reception'}, inplace=True)

# new_df['play_type'] = np.where(new_df['play_type']=='pass', 'reception', 0)

# new_df['play_type'] = 'reception'
df = pd.concat([data, new_df], sort= True)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  import sys


## Step 5

In [None]:
'''
Step 5 creates a playmaker column, and also checks if each player 
id matchs only one name and corrects those that do not.

Steven's approach starts by establishing a playmaker id and name column and uses these columns to identify 
which ids have more than one name.

Using grouby you can cross check the playmake_id column and the playmaker_name column and isolate that id that have 
more than one name.

'''

### Roy's Step 5

In [None]:

data["Playmaker_id"] = ""

play_ls = []
id_dict = {}
row_num = 0

for i in data["play_type"]:
    
    if i == "pass":
        curr_id = data["passer_player_id"][row_num]
        if (data["passer_player_id"][row_num] not in play_ls):
            id_dict[curr_id] = data["passer_player_name"][row_num]
        play_ls.append(data["passer_player_id"][row_num])
        
        
    elif i == "kickoff":
        curr_id = data["kicker_player_id"][row_num]
        if (data["kicker_player_id"][row_num] not in play_ls):
            id_dict[curr_id] = data["kicker_player_name"][row_num]
        play_ls.append(data["kicker_player_id"][row_num])
        
        
    elif i == "run" or i == "qb_kneel":
        curr_id = data["rusher_player_id"][row_num]
        if (data["rusher_player_id"][row_num] not in play_ls):
            id_dict[curr_id] = data["rusher_player_name"][row_num]
        play_ls.append(data["rusher_player_id"][row_num])
        
        
    elif i == "punt":
        curr_id = data["punter_player_id"][row_num]
        if (data["punter_player_id"][row_num] not in play_ls):
            id_dict[curr_id] = data["punter_player_name"][row_num]
        play_ls.append(data["punter_player_id"][row_num])
        
        
    elif i == "field_goal":
        curr_id = data["kicker_player_id"][row_num]
        if (data["kicker_player_id"][row_num] not in play_ls):
            id_dict[curr_id] = data["kicker_player_name"][row_num]
        play_ls.append(data["kicker_player_id"][row_num])
        
        
    elif i == "extra_point":
        curr_id = data["kicker_player_id"][row_num]
        if (data["kicker_player_id"][row_num] not in play_ls):
            id_dict[curr_id] = data["kicker_player_name"][row_num]
        play_ls.append(data["kicker_player_id"][row_num])
        
        
    else:
        play_ls.append("N/A")
    
           
    row_num += 1


data["Playmaker_id"] = play_ls

data




In [None]:
# list of all the player ids in specified columns
passer_ls = []
kickers_ls = []
rusher_ls = []
punter_ls = []

# appends all the player id in the column to corresponding list
for passers in data["passer_player_id"]:
    passer_ls.append(passers)
    
for kickers in data["kicker_player_id"]:
    kickers_ls.append(kickers)
    
for rusher in data["rusher_player_id"]:
    rusher_ls.append(rusher)
    
for punter in data["punter_player_id"]:
    punter_ls.append(punter)

# these lists will contain the names of the players in the order they appear in the dataframe
passer_names = []
kickers_names = []
rusher_names = []
punter_names = []

# the player IDs are used to retrieve the corresponding name from the dictionary
# in the dictionary one name corresponds to one ID which ensures uniqueness

for i in passer_ls:
    passer_names.append(id_dict[i])

for i in kickers_ls:
    kickers_names.append(id_dict[i])

for i in rusher_ls:
    rusher_names.append(id_dict[i])

for i in punter_ls:
    punter_names.append(id_dict[i])

# the list of names formed above are used unique as only one name is assigned per player, and the list is assigned to each column
data["passer_player_name"] = passer_names
data["kicker_player_name"] = kickers_names
data["rusher_player_name"] = rusher_names
data["punter_player_name"] = punter_names





In [None]:
check_list = []
player_count = 0
ID_count = 0
for i in data["passer_player_name"].unique():
    player_count += 1
    check_list.append(i)
for i in data["kicker_player_name"].unique():
#     print(i)
    player_count += 1
    check_list.append(i)
for i in data["rusher_player_name"].unique():
    player_count += 1
    check_list.append(i)
for i in data["punter_player_name"].unique():
    player_count += 1
    check_list.append(i)

# print(player_count)
print(len(id_dict))
# print(len(check_list))
new_frame = pd.DataFrame()
new_frame['name'] = check_list

for i in new_frame['name'].unique():
#     print(i)
    ID_count += 1

print(ID_count)


# print(check_list)



# for i in new_frame['ID'].unique():
#     ID_count += 1

# print(ID_count)

new_frame
    

### Steven's Step 5

#### S5.1 Define Play Maker

In [374]:
# Create play maker column
# you will need to add a receiver segments as well
df['play_maker'] = np.where(df['play_type']=='rush',df['rusher_player_id'],np.NaN)
df['play_maker'] = np.where(df['play_type']=='pass', df['passer_player_id'], df['play_maker'])
df['play_maker'] = np.where(df['play_type']=='reception',df['receiver_player_id'],df['play_maker'])
df['play_maker'] = np.where((df['play_type']=='extra_point')|(df['play_type']=='field_goal'),df['kicker_player_id'],df['play_maker'])



In [375]:
# Add in corresponding play maker ID
df['play_maker_id'] = np.where(df['play_type']=='rush',df['rusher_player_id'],np.NaN)
df['play_maker_id'] = np.where(df['play_type']=='pass',df['passer_player_id'],df['play_maker_id'])
df['play_maker_id'] = np.where((df['play_type']=='extra_point')|(df['play_type']=='field_goal'),df['kicker_player_id'],df['play_maker_id'])

# Now that we have a single column to identify play makers, it is a lot easier to check for ID uniqueness

#### S5.2 Identify Non Unique Player Names

In [377]:
# Create a data frame that contains the unique count of each player name under a given ID
# Filter on the IDs that correspond to more than one name

nunique_id = df[df.groupby(['play_maker_id'])['play_maker'].transform('nunique') > 1]['play_maker_id'].unique()


array([], dtype=object)

In [None]:
# Identify all the duplicate names

df[df['play_maker_id'].isin(nunique_id)]['play_maker'].unique()

In [None]:
# Create a dictionary of what the corrected names should be

name_corrections = {'D.Chark Jr.': 'D.Chark',
'Jos.Allen':'J.Allen',
'M.Ingram II': 'M.Ingram',
'A.Levine Sr.': 'A.Levine',
'R.Griffin III': 'R.Griffin',
'G.Minshew II':'G.Minshew',
'B.Snell Jr.':'B.Snell', 
'Tr.Edmunds':'T.Edmunds',
'R.James Jr.': 'R.James',
'J.Ross III':'J.Ross',
'W.Snead IV':'W.Snead', 
'M.Jones Jr.': 'M.Jones', 
'M.Sanu Sr.':'M.Sanu', 
'O.Beckham Jr.':'O.Beckham', 
'P.Dorsett II':'P.Dorsett'}

#### S5.3 Correct Name Uniqueness

In [None]:
# Create a function to correct the typos

def typo_correction(name):
    if name in name_corrections.keys():
        return name_corrections[name]
    else:
        return name

In [None]:
# apply the function to the dataframe

df['play_maker'] = df['play_maker'].apply(typo_correction)

#### S5.4 New Attempt

In [53]:
test = df[df['passer_player_id'].isin(nunique_id)].groupby(['passer_player_id','passer_player_name']).size().reset_index()
test.set_index('passer_player_id', inplace=True)

In [56]:
d = {player_id:(test.loc[player_id,'passer_player_name'][0],test.loc[player_id,'passer_player_name'][1]) for player_id in nunique_id}

In [63]:
test2 = df.copy()

In [64]:
test2.set_index('passer_player_id', inplace=True)

In [65]:
for dup in d.keys():
    test2.loc[dup,'passer_player_name'] = d[dup][0]

## Step 6

In [None]:
'''
Using a where statement use a binary indicator to Flag relevant plays 
such as passes, receptions, runs, field goals, extra points, and qb_kneels. 
'''

In [None]:
# # step 6:
relevance_ls = []
data["Flag"] = ""
# row_num = 1
for i in data["play_type"]:
    if i == "run" or i == "pass":
        relevance_ls.append(1)
#         data["Play_relevance"] = 1
    else:
        relevance_ls.append(0)
#         data["Play_relevance"] = 0

# Add field goal, extra points, qb_kneels and reception when done with step 4
data["Flag"] = relevance_ls


    

In [384]:
df['Flag'] = np.where((df['play_type'] == 'pass') | (df['play_type'] == 'run') | (df['play_type'] == 'field_goal') | (df['play_type'] == 'qb_kneel') | (df['play_type'] == 'extra_point') , 1, 0)

df


Unnamed: 0.1,Flag,Playmaker_id,Playmaker_name,Unnamed: 0,air_epa,air_wpa,air_yards,assist_tackle,assist_tackle_1_player_id,assist_tackle_1_player_name,assist_tackle_1_team,assist_tackle_2_player_id,assist_tackle_2_player_name,assist_tackle_2_team,assist_tackle_3_player_id,assist_tackle_3_player_name,assist_tackle_3_team,assist_tackle_4_player_id,assist_tackle_4_player_name,assist_tackle_4_team,away_team,away_timeouts_remaining,away_wp,away_wp_post,blocked_player_id,blocked_player_name,comp_air_epa,comp_air_wpa,comp_yac_epa,comp_yac_wpa,complete_pass,def_wp,defensive_extra_point_attempt,defensive_extra_point_conv,defensive_two_point_attempt,defensive_two_point_conv,defteam,defteam_score,defteam_score_post,defteam_timeouts_remaining,desc,down,drive,ep,epa,extra_point_attempt,extra_point_prob,extra_point_result,fg_prob,field_goal_attempt,field_goal_result,first_down_pass,first_down_penalty,first_down_rush,forced_fumble_player_1_player_id,forced_fumble_player_1_player_name,forced_fumble_player_1_team,forced_fumble_player_2_player_id,forced_fumble_player_2_player_name,forced_fumble_player_2_team,fourth_down_converted,fourth_down_failed,fumble,fumble_forced,fumble_lost,fumble_not_forced,fumble_out_of_bounds,fumble_recovery_1_player_id,fumble_recovery_1_player_name,fumble_recovery_1_team,fumble_recovery_1_yards,fumble_recovery_2_player_id,fumble_recovery_2_player_name,fumble_recovery_2_team,fumble_recovery_2_yards,fumbled_1_player_id,fumbled_1_player_name,fumbled_1_team,fumbled_2_player_id,fumbled_2_player_name,fumbled_2_team,game_date,game_half,game_id,game_seconds_remaining,goal_to_go,half_seconds_remaining,home_team,home_timeouts_remaining,home_wp,home_wp_post,incomplete_pass,interception,interception_player_id,interception_player_name,kick_distance,kicker_player_id,kicker_player_name,kickoff_attempt,kickoff_downed,kickoff_fair_catch,kickoff_in_endzone,kickoff_inside_twenty,kickoff_out_of_bounds,kickoff_returner_player_id,kickoff_returner_player_name,lateral_interception_player_id,lateral_interception_player_name,lateral_kickoff_returner_player_id,lateral_kickoff_returner_player_name,lateral_punt_returner_player_id,lateral_punt_returner_player_name,lateral_receiver_player_id,lateral_receiver_player_name,lateral_reception,lateral_recovery,lateral_return,lateral_rush,lateral_rusher_player_id,lateral_rusher_player_name,lateral_sack_player_id,lateral_sack_player_name,no_huddle,no_score_prob,opp_fg_prob,opp_safety_prob,opp_td_prob,own_kickoff_recovery,own_kickoff_recovery_player_id,own_kickoff_recovery_player_name,own_kickoff_recovery_td,pass_attempt,pass_defense_1_player_id,pass_defense_1_player_name,pass_defense_2_player_id,pass_defense_2_player_name,pass_length,pass_location,pass_touchdown,passer_player_id,passer_player_name,penalty,penalty_player_id,penalty_player_name,penalty_team,penalty_type,penalty_yards,play_id,play_type,posteam,posteam_score,posteam_score_post,posteam_timeouts_remaining,posteam_type,punt_attempt,punt_blocked,punt_downed,punt_fair_catch,punt_in_endzone,punt_inside_twenty,punt_out_of_bounds,punt_returner_player_id,punt_returner_player_name,punter_player_id,punter_player_name,qb_dropback,qb_hit,qb_hit_1_player_id,qb_hit_1_player_name,qb_hit_2_player_id,qb_hit_2_player_name,qb_kneel,qb_scramble,qb_spike,qtr,quarter_end,quarter_seconds_remaining,receiver_player_id,receiver_player_name,replay_or_challenge,replay_or_challenge_result,return_team,return_touchdown,return_yards,run_gap,run_location,rush_attempt,rush_touchdown,rusher_player_id,rusher_player_name,sack,safety,safety_prob,score_differential,score_differential_post,shotgun,side_of_field,solo_tackle,solo_tackle_1_player_id,solo_tackle_1_player_name,solo_tackle_1_team,solo_tackle_2_player_id,solo_tackle_2_player_name,solo_tackle_2_team,sp,tackle_for_loss_1_player_id,tackle_for_loss_1_player_name,tackle_for_loss_2_player_id,tackle_for_loss_2_player_name,tackled_for_loss,td_prob,td_team,third_down_converted,third_down_failed,time,timeout,timeout_team,total_away_comp_air_epa,total_away_comp_air_wpa,total_away_comp_yac_epa,total_away_comp_yac_wpa,total_away_epa,total_away_pass_epa,total_away_pass_wpa,total_away_raw_air_epa,total_away_raw_air_wpa,total_away_raw_yac_epa,total_away_raw_yac_wpa,total_away_rush_epa,total_away_rush_wpa,total_away_score,total_home_comp_air_epa,total_home_comp_air_wpa,total_home_comp_yac_epa,total_home_comp_yac_wpa,total_home_epa,total_home_pass_epa,total_home_pass_wpa,total_home_raw_air_epa,total_home_raw_air_wpa,total_home_raw_yac_epa,total_home_raw_yac_wpa,total_home_rush_epa,total_home_rush_wpa,total_home_score,touchback,touchdown,two_point_attempt,two_point_conv_result,two_point_conversion_prob,week,wp,wpa,yac_epa,yac_wpa,yardline_100,yards_after_catch,yards_gained,ydsnet,ydstogo,year,yrdln,play_maker_id,play_maker
0,0,00-0034173,,1.0,,,,0.0,,,,,,,,,,,,,GB,3.0,,,,,0.000000,0.000000,0.000000,0.000000,0.0,,0.0,0.0,0.0,0.0,CHI,,0.0,3.0,E.Pineiro kicks 65 yards from CHI 35 to end zo...,,1.0,0.814998,0.000000,0.0,0.0,,0.233081,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-09-05,Half1,2.019090e+09,3600.0,0.0,1800.0,CHI,3.0,,,0.0,0.0,,,,00-0034173,E.Pineiro,1.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.001374,0.162632,0.004441,0.254179,0.0,,,0.0,0.0,,,,,,,0.0,,,0.0,,,,,,35.0,kickoff,GB,,0.0,3.0,away,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,0.0,0.0,,,,,0.0,0.0,0.0,1.0,0.0,900.0,,,0.0,,,0.0,0.0,,,0.0,0.0,,,0.0,0.0,0.003656,,0.0,0.0,CHI,0.0,,,,,,,0.0,,,,,0.0,0.340639,,0.0,0.0,15:00,0.0,,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,1.0,0.0,0.0,,0.0,1.0,,,,,35.0,,0.0,-10.0,0.0,2019.0,CHI 35,,
1,1,00-0033293,,2.0,,,,0.0,,,,,,,,,,,,,GB,3.0,0.500007,0.479346,,,0.000000,0.000000,0.000000,0.000000,0.0,0.499993,0.0,0.0,0.0,0.0,CHI,0.0,0.0,3.0,(15:00) A.Jones left tackle to GB 25 for no ga...,1.0,1.0,0.814998,-0.764363,0.0,0.0,,0.233081,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-09-05,Half1,2.019090e+09,3600.0,0.0,1800.0,CHI,3.0,0.499993,0.520654,0.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.001374,0.162632,0.004441,0.254179,0.0,,,0.0,0.0,,,,,,,0.0,,,0.0,,,,,,50.0,run,GB,0.0,0.0,3.0,away,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,0.0,0.0,,,,,0.0,0.0,0.0,1.0,0.0,900.0,,,0.0,,,0.0,0.0,tackle,left,1.0,0.0,00-0033293,A.Jones,0.0,0.0,0.003656,0.0,0.0,0.0,GB,1.0,00-0034874,R.Smith,CHI,,,,0.0,,,,,0.0,0.340639,,0.0,0.0,15:00,0.0,,0.000000,0.000000,0.000000,0.000000,-0.764363,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,-0.764363,-0.020660,0.0,0.000000,0.000000,0.000000,0.000000,0.764363,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.764363,0.020660,0.0,0.0,0.0,0.0,,0.0,1.0,0.500007,-0.020660,,,75.0,,0.0,-10.0,10.0,2019.0,GB 25,,
2,1,00-0023459,,3.0,-1.095212,-0.031647,-1.0,0.0,,,,,,,,,,,,,GB,3.0,0.479346,0.453258,,,-1.095212,-0.031647,0.107477,0.005559,1.0,0.520654,0.0,0.0,0.0,0.0,CHI,0.0,0.0,3.0,(14:33) A.Rodgers pass short left to A.Jones t...,2.0,1.0,0.050636,-0.987734,0.0,0.0,,0.213057,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-09-05,Half1,2.019090e+09,3573.0,0.0,1773.0,CHI,3.0,0.520654,0.546742,0.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.001569,0.188484,0.005696,0.295010,0.0,,,0.0,1.0,,,,,short,left,0.0,00-0023459,A.Rodgers,0.0,,,,,,71.0,pass,GB,0.0,0.0,3.0,away,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,1.0,0.0,,,,,0.0,0.0,0.0,1.0,0.0,873.0,00-0033293,A.Jones,0.0,,,0.0,0.0,,,0.0,0.0,,,0.0,0.0,0.003982,0.0,0.0,0.0,GB,1.0,00-0034874,R.Smith,CHI,,,,0.0,,,,,0.0,0.292202,,0.0,0.0,14:33,0.0,,-1.095212,-0.031647,0.107477,0.005559,-1.752097,-0.987734,-0.026088,-1.095212,-0.031647,0.107477,0.005559,-0.764363,-0.020660,0.0,1.095212,0.031647,-0.107477,-0.005559,1.752097,0.987734,0.026088,1.095212,0.031647,-0.107477,-0.005559,0.764363,0.020660,0.0,0.0,0.0,0.0,,0.0,1.0,0.479346,-0.026088,0.107477,0.005559,75.0,1.0,0.0,-10.0,10.0,2019.0,GB 25,00-0023459,00-0023459
3,1,00-0023459,,4.0,,,,0.0,,,,,,,,,,,,,GB,3.0,0.453258,0.386327,,,0.000000,0.000000,0.000000,0.000000,0.0,0.546742,0.0,0.0,0.0,0.0,CHI,0.0,0.0,3.0,(13:45) (Shotgun) A.Rodgers sacked at GB 15 fo...,3.0,1.0,-0.937099,-2.221273,0.0,0.0,,0.171216,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-09-05,Half1,2.019090e+09,3525.0,0.0,1725.0,CHI,3.0,0.546742,0.613673,0.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.001863,0.226273,0.006642,0.349574,0.0,,,0.0,1.0,,,,,,,0.0,00-0023459,A.Rodgers,0.0,,,,,,95.0,pass,GB,0.0,0.0,3.0,away,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,1.0,1.0,00-0032667,R.Robertson-Harris,,,0.0,0.0,0.0,1.0,0.0,825.0,,,0.0,,,0.0,0.0,,,0.0,0.0,,,1.0,0.0,0.004530,0.0,0.0,1.0,GB,1.0,00-0032667,R.Robertson-Harris,CHI,,,,0.0,00-0032667,R.Robertson-Harris,,,0.0,0.239902,,0.0,1.0,13:45,0.0,,-1.095212,-0.031647,0.107477,0.005559,-3.973370,-3.209007,-0.093020,-1.095212,-0.031647,0.107477,0.005559,-0.764363,-0.020660,0.0,1.095212,0.031647,-0.107477,-0.005559,3.973370,3.209007,0.093020,1.095212,0.031647,-0.107477,-0.005559,0.764363,0.020660,0.0,0.0,0.0,0.0,,0.0,1.0,0.453258,-0.066931,,,75.0,,-10.0,-10.0,10.0,2019.0,GB 25,00-0023459,00-0023459
4,0,00-0034162,,5.0,,,,0.0,,,,,,,,,,,,,GB,3.0,0.386327,0.443890,,,0.000000,0.000000,0.000000,0.000000,0.0,0.613673,0.0,0.0,0.0,0.0,CHI,0.0,0.0,3.0,(13:15) (Punt formation) J.Scott punts 53 yard...,4.0,1.0,-3.158372,0.714739,0.0,0.0,,0.054465,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-09-05,Half1,2.019090e+09,3495.0,0.0,1695.0,CHI,3.0,0.613673,0.556110,0.0,0.0,,,53.0,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.002114,0.327285,0.010957,0.466627,0.0,,,0.0,0.0,,,,,,,0.0,,,0.0,,,,,,125.0,punt,GB,0.0,0.0,3.0,away,1.0,0.0,0.0,0.0,0.0,0.0,0.0,00-0033556,T.Cohen,00-0034162,J.Scott,0.0,0.0,,,,,0.0,0.0,0.0,1.0,0.0,795.0,,,0.0,,CHI,0.0,11.0,,,0.0,0.0,,,0.0,0.0,0.004293,0.0,0.0,0.0,GB,1.0,00-0031584,A.Amos,GB,,,,0.0,,,,,0.0,0.134258,,0.0,0.0,13:15,0.0,,-1.095212,-0.031647,0.107477,0.005559,-3.258631,-3.209007,-0.093020,-1.095212,-0.031647,0.107477,0.005559,-0.764363,-0.020660,0.0,1.095212,0.031647,-0.107477,-0.005559,3.258631,3.209007,0.093020,1.095212,0.031647,-0.107477,-0.005559,0.764363,0.020660,0.0,0.0,0.0,0.0,,0.0,1.0,0.386327,0.057563,,,85.0,,0.0,-10.0,20.0,2019.0,GB 15,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
45541,1,00-0029263,,45542.0,2.888636,0.702366,5.0,0.0,,,,,,,,,,,,,SF,1.0,0.764104,0.931299,,,0.000000,0.000000,0.000000,0.000000,0.0,0.764104,0.0,0.0,0.0,0.0,SF,26.0,26.0,1.0,(:22) (Shotgun) R.Wilson pass incomplete short...,2.0,16.0,4.111364,-0.661444,0.0,0.0,,0.378260,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-12-29,Half2,2.019123e+09,22.0,1.0,22.0,SEA,0.0,0.235896,0.068701,1.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.174034,0.008484,0.000037,0.004921,0.0,,,0.0,1.0,00-0034730,M.Harris,,,short,left,0.0,00-0029263,R.Wilson,0.0,,,,,,3955.0,pass,SEA,21.0,21.0,0.0,home,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,1.0,1.0,00-0032378,D.Buckner,,,0.0,0.0,0.0,4.0,0.0,22.0,00-0032211,T.Lockett,0.0,,,0.0,0.0,,,0.0,0.0,,,0.0,0.0,0.000657,-5.0,-5.0,1.0,SF,0.0,,,,,,,0.0,,,,,0.0,0.433607,,0.0,0.0,00:22,0.0,,-5.247535,-0.708619,-0.817163,0.053939,3.064260,-0.644493,-0.041199,-23.272013,-3.265681,23.801664,3.277452,6.785425,0.263874,26.0,5.247535,0.708619,0.817163,-0.053939,-3.064260,0.644493,0.041199,23.272013,3.265681,-23.801664,-3.277452,-6.785425,-0.263874,21.0,0.0,0.0,0.0,,0.0,17.0,0.235896,-0.167196,-3.550080,-0.869562,5.0,,0.0,72.0,5.0,2019.0,SF 5,00-0029263,00-0029263
45542,1,00-0029263,,45543.0,3.550080,0.899287,5.0,0.0,,,,,,,,,,,,,SF,1.0,0.931299,0.973859,,,0.000000,0.000000,0.000000,0.000000,0.0,0.931299,0.0,0.0,0.0,0.0,SF,26.0,26.0,1.0,(:15) (Shotgun) R.Wilson pass incomplete short...,3.0,16.0,3.449920,-0.912051,0.0,0.0,,0.496176,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-12-29,Half2,2.019123e+09,15.0,1.0,15.0,SEA,0.0,0.068701,0.026141,1.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.197189,0.009679,0.000068,0.005978,0.0,,,0.0,1.0,,,,,short,middle,0.0,00-0029263,R.Wilson,0.0,,,,,,3977.0,pass,SEA,21.0,21.0,0.0,home,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,1.0,0.0,,,,,0.0,0.0,0.0,4.0,0.0,15.0,00-0033387,J.Hollister,0.0,,,0.0,0.0,,,0.0,0.0,,,0.0,0.0,0.000792,-5.0,-5.0,1.0,SF,0.0,,,,,,,0.0,,,,,0.0,0.290118,,0.0,1.0,00:15,0.0,,-5.247535,-0.708619,-0.817163,0.053939,3.976311,0.267558,0.001360,-26.822093,-4.164968,28.263795,4.219298,6.785425,0.263874,26.0,5.247535,0.708619,0.817163,-0.053939,-3.976311,-0.267558,-0.001360,26.822093,4.164968,-28.263795,-4.219298,-6.785425,-0.263874,21.0,0.0,0.0,0.0,,0.0,17.0,0.068701,-0.042559,-4.462131,-0.941846,5.0,,0.0,72.0,5.0,2019.0,SF 5,00-0029263,00-0029263
45543,1,00-0029263,,45544.0,-2.636491,0.010686,4.0,0.0,,,,,,,,,,,,,SF,1.0,0.973859,0.963173,,,-2.636491,0.010686,0.000000,0.000000,1.0,0.973859,0.0,0.0,0.0,0.0,SF,26.0,26.0,1.0,(:12) (Shotgun) R.Wilson pass short middle to ...,4.0,16.0,2.537869,-2.636491,0.0,0.0,,0.713258,0.0,,0.0,0.0,0.0,,,,,,,0.0,1.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-12-29,Half2,2.019123e+09,12.0,1.0,12.0,SEA,0.0,0.026141,0.036827,0.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.206216,0.008679,0.000180,0.005227,0.0,,,0.0,1.0,,,,,short,middle,0.0,00-0029263,R.Wilson,0.0,,,,,,3999.0,pass,SEA,21.0,21.0,0.0,home,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,1.0,0.0,,,,,0.0,0.0,0.0,4.0,0.0,12.0,00-0033387,J.Hollister,1.0,upheld,,0.0,0.0,,,0.0,0.0,,,0.0,0.0,0.000800,-5.0,-5.0,1.0,SF,1.0,00-0034982,D.Greenlaw,SF,,,,0.0,,,,,0.0,0.065640,,0.0,0.0,00:12,0.0,,-2.611044,-0.719304,-0.817163,0.053939,6.612802,2.904049,-0.009326,-24.185602,-4.175654,28.263795,4.219298,6.785425,0.263874,26.0,2.611044,0.719304,0.817163,-0.053939,-6.612802,-2.904049,0.009326,24.185602,4.175654,-28.263795,-4.219298,-6.785425,-0.263874,21.0,0.0,0.0,0.0,,0.0,17.0,0.026141,0.010686,0.000000,0.000000,5.0,0.0,4.0,72.0,5.0,2019.0,SF 5,00-0029263,00-0029263
45544,1,00-0031345,,45545.0,,,,0.0,,,,,,,,,,,,,SF,1.0,0.963173,1.000000,,,0.000000,0.000000,0.000000,0.000000,0.0,0.036827,0.0,0.0,0.0,0.0,SEA,21.0,21.0,0.0,(:09) J.Garoppolo up the middle to SF 3 for 2 ...,1.0,17.0,0.098622,,0.0,0.0,,0.051277,0.0,,0.0,0.0,0.0,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,2019-12-29,Half2,2.019123e+09,9.0,0.0,9.0,SEA,0.0,0.036827,0.000000,0.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,0.0,0.0,0.0,0.0,,,,,0.0,0.814185,0.044958,0.009874,0.032654,0.0,,,0.0,0.0,,,,,,,0.0,,,0.0,,,,,,4080.0,run,SF,26.0,26.0,1.0,away,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,0.0,0.0,,,,,0.0,0.0,0.0,4.0,0.0,9.0,,,0.0,,,0.0,0.0,,middle,1.0,0.0,00-0031345,J.Garoppolo,0.0,0.0,0.000275,5.0,5.0,0.0,SF,1.0,00-0034831,R.Green,SEA,,,,0.0,,,,,0.0,0.046777,,0.0,0.0,00:09,0.0,,-2.611044,-0.719304,-0.817163,0.053939,6.612802,2.904049,-0.009326,-24.185602,-4.175654,28.263795,4.219298,6.785425,0.300701,26.0,2.611044,0.719304,0.817163,-0.053939,-6.612802,-2.904049,0.009326,24.185602,4.175654,-28.263795,-4.219298,-6.785425,-0.300701,21.0,0.0,0.0,0.0,,0.0,17.0,0.963173,0.036827,,,99.0,,2.0,2.0,10.0,2019.0,SF 1,,


In [None]:
# step 7

In [None]:
# step 8