# Overview

The "DataCleaning_2" notebook combines the output of "DataCleaning_1" with play-by-play data to build the 'df_all' dataframe, which is ready for model building in "XGBoost_and_SHAP". It also creates the 'all_players_sorted_df_(1-9)' and 'players_within_3_5_wks_7_9' dataframes, which "Data_Viz_and_insights" uses to analyze the SHAP values and draw insights.

The main steps in this code are the following:

- A) Replicate some of the code from "DataCleaning_1".
- B) Create indicator columns for whether a player is within 3 or 5 yards of the ball carrier, filter the dataframe to the 15 closest players, reduce it to the ball carrier and the 3 closest defenders, and create the 'players_within_3_5_wks_7_9' dataframe for later insights.
- C) Get the positional coordinates of the 7 offensive players closest to the ball carrier, delete old variables, calculate the distance and angle from each of those 7 players to the ball carrier, calculate the distance and angle from each defender to its two closest offensive players (excluding the QB and the offensive player closest to the ball carrier), and reposition the data so that plays going left and plays going right share the same orientation.
- D) Flatten and reshape the data so that attributes become columns.
- E) Drop duplicates and save 'df_all' and 'all_players_sorted_df_(1-9)'.
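
As a rough illustration of the indicator columns described in step B, here is a minimal sketch on toy data. The column names (`is_ball_carrier`, `dist_to_carrier`, `within_3`, `within_5`) and the choice of `<=` for "within" are assumptions made for this example and may differ from what the notebook actually uses.

```python
import numpy as np
import pandas as pd

# Toy frame: one ball carrier and three defenders (coordinates in yards).
# Column names are illustrative, not the notebook's actual schema.
frame = pd.DataFrame({
    "nflId": [1, 2, 3, 4],
    "is_ball_carrier": [True, False, False, False],
    "x": [50.0, 52.0, 54.0, 58.0],
    "y": [20.0, 20.0, 23.0, 26.0],
})

carrier = frame.loc[frame["is_ball_carrier"]].iloc[0]

# Euclidean distance from every player to the ball carrier.
frame["dist_to_carrier"] = np.hypot(frame["x"] - carrier["x"], frame["y"] - carrier["y"])

# Indicator columns for the 3- and 5-yard radii (the carrier itself is excluded).
others = ~frame["is_ball_carrier"]
frame["within_3"] = (others & (frame["dist_to_carrier"] <= 3)).astype(int)
frame["within_5"] = (others & (frame["dist_to_carrier"] <= 5)).astype(int)
```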


In [8]:
import pandas as pd
import numpy as np

In [9]:
games = pd.read_csv("games.csv")
players = pd.read_csv("players.csv")
plays = pd.read_csv("plays.csv")
tackles = pd.read_csv("tackles.csv")
tracking_week_1 = pd.read_csv("tracking_week_1.csv")
tracking_week_2 = pd.read_csv("tracking_week_2.csv")
tracking_week_3 = pd.read_csv("tracking_week_3.csv")
tracking_week_4 = pd.read_csv("tracking_week_4.csv")
tracking_week_5 = pd.read_csv("tracking_week_5.csv")
tracking_week_6 = pd.read_csv("tracking_week_6.csv")
tracking_week_7 = pd.read_csv("tracking_week_7.csv")
tracking_week_8 = pd.read_csv("tracking_week_8.csv")
tracking_week_9 = pd.read_csv("tracking_week_9.csv")
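
The nine `read_csv` calls above follow one naming pattern, so they could also be written as a loop. The sketch below is one possible way to do that; the helper name `load_tracking_weeks` and the dictionary layout are assumptions for illustration, and the rest of the notebook still expects the individual `tracking_week_N` variables.

```python
from pathlib import Path

import pandas as pd


def load_tracking_weeks(data_dir=".", weeks=range(1, 10)):
    """Return {week: DataFrame} for the tracking_week_<n>.csv files in data_dir."""
    data_dir = Path(data_dir)
    return {week: pd.read_csv(data_dir / f"tracking_week_{week}.csv") for week in weeks}


# Possible usage (requires the CSV files to be present):
# tracking = load_tracking_weeks()
# tracking_week_1 = tracking[1]
```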

In [10]:
# Load the sorted_df_(1-9) dataframes produced by DataCleaning_1

sorted_df_1 = pd.read_pickle("sorted_df_1 (1).pkl")
sorted_df_2 = pd.read_pickle("sorted_df_2 (1).pkl")
sorted_df_3 = pd.read_pickle("sorted_df_3 (1).pkl")
sorted_df_4 = pd.read_pickle("sorted_df_4 (1).pkl")
sorted_df_5 = pd.read_pickle("sorted_df_5 (1).pkl")
sorted_df_6 = pd.read_pickle("sorted_df_6 (1).pkl")
sorted_df_7 = pd.read_pickle("sorted_df_7 (1).pkl")
sorted_df_8 = pd.read_pickle("sorted_df_8 (1).pkl")
sorted_df_9 = pd.read_pickle("sorted_df_9 (1).pkl")

**A) Replicate some of the code from "DataCleaning_1"**

In [11]:
p = plays[['gameId', 'playId', 'passResult', 'playResult', 'quarter', 'down', 'playDescription', 'ballCarrierId', 'ballCarrierDisplayName']]
p_p = p.merge(players[['nflId', 'position']], left_on='ballCarrierId', right_on='nflId', how='inner').drop(columns=['nflId'])
p_p_rushes = p_p[p_p['passResult'].isna()].drop(columns=['passResult'])

a1 = tracking_week_1[(tracking_week_1['displayName'] == 'football') &
 ((tracking_week_1['event'] == 'handoff') | (tracking_week_1['event'] == 'pass_outcome_caught') | (tracking_week_1['event'] == 'pass_arrived'))]
a2 = tracking_week_2[(tracking_week_2['displayName'] == 'football') &
 ((tracking_week_2['event'] == 'handoff') | (tracking_week_2['event'] == 'pass_outcome_caught') | (tracking_week_2['event'] == 'pass_arrived'))]
a3 = tracking_week_3[(tracking_week_3['displayName'] == 'football') &
 ((tracking_week_3['event'] == 'handoff') | (tracking_week_3['event'] == 'pass_outcome_caught') | (tracking_week_3['event'] == 'pass_arrived'))]
a4 = tracking_week_4[(tracking_week_4['displayName'] == 'football') &
 ((tracking_week_4['event'] == 'handoff') | (tracking_week_4['event'] == 'pass_outcome_caught') | (tracking_week_4['event'] == 'pass_arrived'))]
a5 = tracking_week_5[(tracking_week_5['displayName'] == 'football') &
 ((tracking_week_5['event'] == 'handoff') | (tracking_week_5['event'] == 'pass_outcome_caught') | (tracking_week_5['event'] == 'pass_arrived'))]
a6 = tracking_week_6[(tracking_week_6['displayName'] == 'football') &
 ((tracking_week_6['event'] == 'handoff') | (tracking_week_6['event'] == 'pass_outcome_caught') | (tracking_week_6['event'] == 'pass_arrived'))]
a7 = tracking_week_7[(tracking_week_7['displayName'] == 'football') &
 ((tracking_week_7['event'] == 'handoff') | (tracking_week_7['event'] == 'pass_outcome_caught') | (tracking_week_7['event'] == 'pass_arrived'))]
a8 = tracking_week_8[(tracking_week_8['displayName'] == 'football') &
 ((tracking_week_8['event'] == 'handoff') | (tracking_week_8['event'] == 'pass_outcome_caught') | (tracking_week_8['event'] == 'pass_arrived'))]
a9 = tracking_week_9[(tracking_week_9['displayName'] == 'football') &
 ((tracking_week_9['event'] == 'handoff') | (tracking_week_9['event'] == 'pass_outcome_caught') | (tracking_week_9['event'] == 'pass_arrived'))]

# Group by 'gameId' and 'playId', then count the start events ('handoff', 'pass_outcome_caught', 'pass_arrived') in each play
handoff_counts_1 = a1.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
# Find game/play combinations with more than one start event (the count is still named 'handoff_count')
multiple_handoffs_1 = handoff_counts_1[handoff_counts_1 > 1].reset_index()
# Rename the columns for clarity
multiple_handoffs_1.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_2 = a2.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_2 = handoff_counts_2[handoff_counts_2 > 1].reset_index()
multiple_handoffs_2.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_3 = a3.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_3 = handoff_counts_3[handoff_counts_3 > 1].reset_index()
multiple_handoffs_3.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_4 = a4.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_4 = handoff_counts_4[handoff_counts_4 > 1].reset_index()
multiple_handoffs_4.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_5 = a5.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_5 = handoff_counts_5[handoff_counts_5 > 1].reset_index()
multiple_handoffs_5.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_6 = a6.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_6 = handoff_counts_6[handoff_counts_6 > 1].reset_index()
multiple_handoffs_6.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_7 = a7.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_7 = handoff_counts_7[handoff_counts_7 > 1].reset_index()
multiple_handoffs_7.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_8 = a8.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_8 = handoff_counts_8[handoff_counts_8 > 1].reset_index()
multiple_handoffs_8.columns = ['gameId', 'playId', 'handoff_count']

handoff_counts_9 = a9.groupby(['gameId', 'playId'])['event'].apply(lambda events: ((events == 'handoff') | (events == 'pass_outcome_caught') | (events == 'pass_arrived')).sum())
multiple_handoffs_9 = handoff_counts_9[handoff_counts_9 > 1].reset_index()
multiple_handoffs_9.columns = ['gameId', 'playId', 'handoff_count']
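
The per-week filtering and counting above can be expressed as two small helpers. This is a sketch run on toy data rather than the notebook's own code: `START_EVENTS`, `ball_start_rows` and `plays_with_multiple_starts` are hypothetical names. Because the rows are already restricted to the three start events, counting rows per play with `size()` gives the same number as the lambda.

```python
import pandas as pd

# The three events the notebook treats as the ball reaching the carrier.
START_EVENTS = ["handoff", "pass_outcome_caught", "pass_arrived"]


def ball_start_rows(tracking):
    """Football rows whose event is one of START_EVENTS."""
    mask = (tracking["displayName"] == "football") & tracking["event"].isin(START_EVENTS)
    return tracking[mask]


def plays_with_multiple_starts(start_rows):
    """Plays with more than one start event, shaped like multiple_handoffs_N."""
    counts = start_rows.groupby(["gameId", "playId"]).size()
    return counts[counts > 1].rename("handoff_count").reset_index()


toy = pd.DataFrame({
    "gameId": [1, 1, 1, 1],
    "playId": [10, 10, 20, 20],
    "displayName": ["football"] * 4,
    "event": ["handoff", "pass_arrived", "handoff", None],
})
starts = ball_start_rows(toy)
multiple = plays_with_multiple_starts(starts)
```

In the toy data, play 10 has two start events and is flagged, while play 20 has one and is not.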

In [12]:
# Anti-join: merge with an indicator, then keep only the rows with no match in multiple_handoffs_N (plays with a single start event)
df_filtered_1 = a1.merge(multiple_handoffs_1, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_1 = df_filtered_1[df_filtered_1['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_2 = a2.merge(multiple_handoffs_2, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_2 = df_filtered_2[df_filtered_2['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_3 = a3.merge(multiple_handoffs_3, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_3 = df_filtered_3[df_filtered_3['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_4 = a4.merge(multiple_handoffs_4, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_4 = df_filtered_4[df_filtered_4['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_5 = a5.merge(multiple_handoffs_5, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_5 = df_filtered_5[df_filtered_5['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_6 = a6.merge(multiple_handoffs_6, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_6 = df_filtered_6[df_filtered_6['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_7 = a7.merge(multiple_handoffs_7, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_7 = df_filtered_7[df_filtered_7['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_8 = a8.merge(multiple_handoffs_8, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_8 = df_filtered_8[df_filtered_8['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

df_filtered_9 = a9.merge(multiple_handoffs_9, on=['gameId', 'playId'], how='left', indicator=True)
df_filtered_9 = df_filtered_9[df_filtered_9['_merge'] == 'left_only'].drop(columns='_merge').reset_index(drop=True)

w1_rushes = p_p_rushes.merge(df_filtered_1[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w2_rushes = p_p_rushes.merge(df_filtered_2[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w3_rushes = p_p_rushes.merge(df_filtered_3[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w4_rushes = p_p_rushes.merge(df_filtered_4[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w5_rushes = p_p_rushes.merge(df_filtered_5[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w6_rushes = p_p_rushes.merge(df_filtered_6[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w7_rushes = p_p_rushes.merge(df_filtered_7[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w8_rushes = p_p_rushes.merge(df_filtered_8[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
w9_rushes = p_p_rushes.merge(df_filtered_9[['gameId', 'playId', 'playDirection']], left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')

filtered_events = ['tackle', 'out_of_bounds', 'touchdown', 'fumble']
filtered_plays_1 = tracking_week_1[(tracking_week_1['displayName'] == 'football') & (tracking_week_1['event'].isin(filtered_events))]
filtered_plays_2 = tracking_week_2[(tracking_week_2['displayName'] == 'football') & (tracking_week_2['event'].isin(filtered_events))]
filtered_plays_3 = tracking_week_3[(tracking_week_3['displayName'] == 'football') & (tracking_week_3['event'].isin(filtered_events))]
filtered_plays_4 = tracking_week_4[(tracking_week_4['displayName'] == 'football') & (tracking_week_4['event'].isin(filtered_events))]
filtered_plays_5 = tracking_week_5[(tracking_week_5['displayName'] == 'football') & (tracking_week_5['event'].isin(filtered_events))]
filtered_plays_6 = tracking_week_6[(tracking_week_6['displayName'] == 'football') & (tracking_week_6['event'].isin(filtered_events))]
filtered_plays_7 = tracking_week_7[(tracking_week_7['displayName'] == 'football') & (tracking_week_7['event'].isin(filtered_events))]
filtered_plays_8 = tracking_week_8[(tracking_week_8['displayName'] == 'football') & (tracking_week_8['event'].isin(filtered_events))]
filtered_plays_9 = tracking_week_9[(tracking_week_9['displayName'] == 'football') & (tracking_week_9['event'].isin(filtered_events))]

w_a1 = w1_rushes[['gameId', 'playId']].merge(filtered_plays_1, on=['gameId', 'playId'], how='inner')
w_a2 = w2_rushes[['gameId', 'playId']].merge(filtered_plays_2, on=['gameId', 'playId'], how='inner')
w_a3 = w3_rushes[['gameId', 'playId']].merge(filtered_plays_3, on=['gameId', 'playId'], how='inner')
w_a4 = w4_rushes[['gameId', 'playId']].merge(filtered_plays_4, on=['gameId', 'playId'], how='inner')
w_a5 = w5_rushes[['gameId', 'playId']].merge(filtered_plays_5, on=['gameId', 'playId'], how='inner')
w_a6 = w6_rushes[['gameId', 'playId']].merge(filtered_plays_6, on=['gameId', 'playId'], how='inner')
w_a7 = w7_rushes[['gameId', 'playId']].merge(filtered_plays_7, on=['gameId', 'playId'], how='inner')
w_a8 = w8_rushes[['gameId', 'playId']].merge(filtered_plays_8, on=['gameId', 'playId'], how='inner')
w_a9 = w9_rushes[['gameId', 'playId']].merge(filtered_plays_9, on=['gameId', 'playId'], how='inner')
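
The indicator merge at the start of this cell is an anti-join: it keeps the rows of one table whose keys are absent from another. A minimal sketch of that pattern on toy data follows; the helper name `anti_join` is hypothetical. Unlike the cell above, it merges only the key columns, so the result does not carry an all-NaN `handoff_count` column.

```python
import pandas as pd


def anti_join(left, right, on):
    """Keep the rows of `left` whose key does not appear in `right`."""
    keys = right[on].drop_duplicates()
    merged = left.merge(keys, on=on, how="left", indicator=True)
    return merged[merged["_merge"] == "left_only"].drop(columns="_merge").reset_index(drop=True)


starts = pd.DataFrame({"gameId": [1, 1, 1], "playId": [10, 10, 20], "frameId": [5, 9, 7]})
ambiguous = pd.DataFrame({"gameId": [1], "playId": [10], "handoff_count": [2]})

# Only play 20 survives, because play 10 is listed in `ambiguous`.
clean = anti_join(starts, ambiguous, on=["gameId", "playId"])
```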

In [13]:
event_priority = ['fumble', 'out_of_bounds', 'touchdown', 'tackle']

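# Pick one row per play: the first row whose event ranks highest in event_priority
def find_prioritized_event(group, event_priority):
  """Return the first row of `group` matching the highest-priority event present.

  Falls back to the first row of `group` if none of the events occur.
  """
  for event in event_priority:
    if event in group['event'].values:
      return group.loc[group['event'] == event].iloc[0]
  return group.iloc[0]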
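
To make the priority rule concrete, here is a small self-contained example on toy data. The function body is repeated from the cell above so the block runs on its own, and the groups are iterated explicitly instead of using `groupby(...).apply`, which is a choice made for this sketch only.

```python
import pandas as pd

event_priority = ["fumble", "out_of_bounds", "touchdown", "tackle"]


def find_prioritized_event(group, event_priority):
    # Same logic as the cell above, repeated so the example is self-contained.
    for event in event_priority:
        if event in group["event"].values:
            return group.loc[group["event"] == event].iloc[0]
    return group.iloc[0]


toy = pd.DataFrame({
    "gameId": [1, 1, 1],
    "playId": [10, 10, 20],
    "frameId": [30, 34, 41],
    "event": ["fumble", "tackle", "tackle"],
})

# One row per play: play 10 has both a fumble and a tackle, and the fumble wins.
picked = pd.DataFrame(
    [find_prioritized_event(group, event_priority) for _, group in toy.groupby(["gameId", "playId"])]
).reset_index(drop=True)
```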

In [14]:
# Group by 'gameId' and 'playId' and apply the custom function
priority_frames_1 = w_a1.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_2 = w_a2.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_3 = w_a3.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_4 = w_a4.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_5 = w_a5.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_6 = w_a6.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_7 = w_a7.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_8 = w_a8.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)
priority_frames_9 = w_a9.groupby(['gameId', 'playId']).apply(find_prioritized_event, event_priority=event_priority).reset_index(drop=True)

params_df_1 = a1[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_1[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_2 = a2[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_2[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_3 = a3[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_3[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_4 = a4[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_4[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_5 = a5[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_5[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_6 = a6[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_6[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_7 = a7[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_7[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_8 = a8[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_8[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')
params_df_9 = a9[['gameId', 'playId', 'frameId', 'event']].merge(priority_frames_9[['gameId', 'playId', 'frameId', 'event']], on=['gameId', 'playId'], how='inner')

w1_rushes_params = params_df_1.merge(w1_rushes, on=['gameId', 'playId'], how= 'inner')
w2_rushes_params = params_df_2.merge(w2_rushes, on=['gameId', 'playId'], how= 'inner')
w3_rushes_params = params_df_3.merge(w3_rushes, on=['gameId', 'playId'], how= 'inner')
w4_rushes_params = params_df_4.merge(w4_rushes, on=['gameId', 'playId'], how= 'inner')
w5_rushes_params = params_df_5.merge(w5_rushes, on=['gameId', 'playId'], how= 'inner')
w6_rushes_params = params_df_6.merge(w6_rushes, on=['gameId', 'playId'], how= 'inner')
w7_rushes_params = params_df_7.merge(w7_rushes, on=['gameId', 'playId'], how= 'inner')
w8_rushes_params = params_df_8.merge(w8_rushes, on=['gameId', 'playId'], how= 'inner')
w9_rushes_params = params_df_9.merge(w9_rushes, on=['gameId', 'playId'], how= 'inner')


# spot to add in other player stats?


t_w1 = tracking_week_1[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w2 = tracking_week_2[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w3 = tracking_week_3[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w4 = tracking_week_4[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w5 = tracking_week_5[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w6 = tracking_week_6[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w7 = tracking_week_7[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w8 = tracking_week_8[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
t_w9 = tracking_week_9[['gameId', 'playId', 'frameId', 'nflId', 'x', 'y', 'event']]
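
The `params_df_N` merges above join two tables that both contain `frameId` and `event`, so pandas appends its default suffixes to tell them apart. The toy example below shows where the `frameId_x` and `frameId_y` names come from; the values are made up.

```python
import pandas as pd

# Toy start-of-play and end-of-play rows for a single play.
start = pd.DataFrame({"gameId": [1], "playId": [10], "frameId": [6], "event": ["handoff"]})
end = pd.DataFrame({"gameId": [1], "playId": [10], "frameId": [42], "event": ["tackle"]})

params = start.merge(end, on=["gameId", "playId"], how="inner")

# Overlapping non-key columns get the default suffixes:
# frameId_x / event_x come from `start`, frameId_y / event_y come from `end`.
print(params.columns.tolist())
# ['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y']
```

So `frameId_x` is the frame where the ball carrier gets the ball and `frameId_y` is the frame of the prioritized end-of-play event.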

In [15]:
import sqlite3

# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w1.to_sql('df1', conn, index=False, if_exists='replace')
w1_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Conditional join: keep the ball carrier's tracking rows between the start frame (frameId_x) and the end frame (frameId_y) of each play
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_1 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()
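
The same conditional join can probably be done without SQLite by merging on the equality keys and then filtering on the frame window. The sketch below is an alternative on toy data, not the notebook's method; `frames_between` is a hypothetical helper, and it assumes the `frameId_x` / `frameId_y` columns described above.

```python
import pandas as pd


def frames_between(tracking, params):
    """Ball-carrier tracking rows with frameId_x <= frameId <= frameId_y."""
    merged = tracking.merge(
        params,
        left_on=["gameId", "playId", "nflId"],
        right_on=["gameId", "playId", "ballCarrierId"],
        how="inner",
    )
    # Series.between is inclusive on both ends by default, matching the SQL WHERE clause.
    in_window = merged["frameId"].between(merged["frameId_x"], merged["frameId_y"])
    return merged[in_window].reset_index(drop=True)


tracking = pd.DataFrame({
    "gameId": [1] * 4, "playId": [10] * 4, "nflId": [7, 7, 7, 8],
    "frameId": [5, 6, 7, 6], "x": [30.0, 31.0, 32.0, 40.0],
})
params = pd.DataFrame({
    "gameId": [1], "playId": [10], "ballCarrierId": [7],
    "frameId_x": [6], "frameId_y": [7],
})

# Keeps frames 6 and 7 of player 7; frame 5 is before the window and player 8 is not the carrier.
result = frames_between(tracking, params)
```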

In [16]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w2.to_sql('df1', conn, index=False, if_exists='replace')
w2_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_2 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [17]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w3.to_sql('df1', conn, index=False, if_exists='replace')
w3_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_3 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [18]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w4.to_sql('df1', conn, index=False, if_exists='replace')
w4_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_4 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [19]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w5.to_sql('df1', conn, index=False, if_exists='replace')
w5_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_5 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [20]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w6.to_sql('df1', conn, index=False, if_exists='replace')
w6_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_6 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [21]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w7.to_sql('df1', conn, index=False, if_exists='replace')
w7_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_7 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [22]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w8.to_sql('df1', conn, index=False, if_exists='replace')
w8_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_8 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()

In [23]:
# Connect to a SQLite database (in-memory)
conn = sqlite3.connect(':memory:')

# Write the dataframes to the SQLite database
t_w9.to_sql('df1', conn, index=False, if_exists='replace')
w9_rushes_params.to_sql('df2', conn, index=False, if_exists='replace')

# Perform the SQL query to handle the conditional join
query = """
SELECT *
FROM df1
JOIN df2 ON df1.gameId = df2.gameId AND df1.playId = df2.playId AND df1.nflId = df2.ballCarrierId
WHERE df2.frameId_x <= df1.frameId AND df2.frameId_y >= df1.frameId
"""

# Read the query result into a new Pandas DataFrame
filtered_df_9 = pd.read_sql_query(query, conn)

# Close the connection to the database
conn.close()
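
Because the queries above use `SELECT *`, each result contains two columns named `gameId` and two named `playId`. The notebook removes repeats with `.T.drop_duplicates().T`, which compares column values and, on a mixed-type frame, leaves the columns with `object` dtype. The sketch below shows a name-based alternative on toy data that keeps the original dtypes. It is offered as an option, not as a drop-in replacement, since it will not remove differently named columns that happen to hold identical values.

```python
import pandas as pd

# Toy frame with repeated column names, as a SELECT * join can produce.
df = pd.DataFrame(
    [[1, 10, 5, 1, 10, 42]],
    columns=["gameId", "playId", "frameId", "gameId", "playId", "frameId_y"],
)

# Keep the first occurrence of each column name; values and dtypes are untouched.
deduped = df.loc[:, ~df.columns.duplicated()]
```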

In [24]:
finished_play_loc_1 = filtered_df_1[filtered_df_1['frameId_y'] == filtered_df_1['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_2 = filtered_df_2[filtered_df_2['frameId_y'] == filtered_df_2['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_3 = filtered_df_3[filtered_df_3['frameId_y'] == filtered_df_3['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_4 = filtered_df_4[filtered_df_4['frameId_y'] == filtered_df_4['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_5 = filtered_df_5[filtered_df_5['frameId_y'] == filtered_df_5['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_6 = filtered_df_6[filtered_df_6['frameId_y'] == filtered_df_6['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_7 = filtered_df_7[filtered_df_7['frameId_y'] == filtered_df_7['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_8 = filtered_df_8[filtered_df_8['frameId_y'] == filtered_df_8['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T
finished_play_loc_9 = filtered_df_9[filtered_df_9['frameId_y'] == filtered_df_9['frameId']][['gameId', 'playId', 'x', 'y', 'playDirection', 'frameId', 'event']].T.drop_duplicates().T

finished_play_loc_1.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_2.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_3.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_4.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_5.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_6.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_7.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_8.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)
finished_play_loc_9.rename(columns={'x': 'final_x', 'y': 'final_y'}, inplace=True)

filtered_df_1 = filtered_df_1.T.drop_duplicates().T
filtered_df_2 = filtered_df_2.T.drop_duplicates().T
filtered_df_3 = filtered_df_3.T.drop_duplicates().T
filtered_df_4 = filtered_df_4.T.drop_duplicates().T
filtered_df_5 = filtered_df_5.T.drop_duplicates().T
filtered_df_6 = filtered_df_6.T.drop_duplicates().T
filtered_df_7 = filtered_df_7.T.drop_duplicates().T
filtered_df_8 = filtered_df_8.T.drop_duplicates().T
filtered_df_9 = filtered_df_9.T.drop_duplicates().T

filtered_df_final_x_1 = filtered_df_1.merge(finished_play_loc_1[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_2 = filtered_df_2.merge(finished_play_loc_2[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_3 = filtered_df_3.merge(finished_play_loc_3[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_4 = filtered_df_4.merge(finished_play_loc_4[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_5 = filtered_df_5.merge(finished_play_loc_5[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_6 = filtered_df_6.merge(finished_play_loc_6[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_7 = filtered_df_7.merge(finished_play_loc_7[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_8 = filtered_df_8.merge(finished_play_loc_8[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')
filtered_df_final_x_9 = filtered_df_9.merge(finished_play_loc_9[['gameId', 'playId', 'final_x', 'final_y']], on=['gameId', 'playId'], how='inner')

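# On touchdown frames, snap x to the nearer goal line (x = 10 or x = 110); otherwise keep the tracked x
filtered_df_final_x_1['adjusted_x'] = np.where(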
    (filtered_df_final_x_1['event'] == 'touchdown') & (filtered_df_final_x_1['x'] < 50), 10,
    np.where((filtered_df_final_x_1['event'] == 'touchdown') & (filtered_df_final_x_1['x'] > 50), 110, filtered_df_final_x_1['x'])
)

filtered_df_final_x_2['adjusted_x'] = np.where(
    (filtered_df_final_x_2['event'] == 'touchdown') & (filtered_df_final_x_2['x'] < 50), 10,
    np.where((filtered_df_final_x_2['event'] == 'touchdown') & (filtered_df_final_x_2['x'] > 50), 110, filtered_df_final_x_2['x'])
)

filtered_df_final_x_3['adjusted_x'] = np.where(
    (filtered_df_final_x_3['event'] == 'touchdown') & (filtered_df_final_x_3['x'] < 50), 10,
    np.where((filtered_df_final_x_3['event'] == 'touchdown') & (filtered_df_final_x_3['x'] > 50), 110, filtered_df_final_x_3['x'])
)

filtered_df_final_x_4['adjusted_x'] = np.where(
    (filtered_df_final_x_4['event'] == 'touchdown') & (filtered_df_final_x_4['x'] < 50), 10,
    np.where((filtered_df_final_x_4['event'] == 'touchdown') & (filtered_df_final_x_4['x'] > 50), 110, filtered_df_final_x_4['x'])
)

filtered_df_final_x_5['adjusted_x'] = np.where(
    (filtered_df_final_x_5['event'] == 'touchdown') & (filtered_df_final_x_5['x'] < 50), 10,
    np.where((filtered_df_final_x_5['event'] == 'touchdown') & (filtered_df_final_x_5['x'] > 50), 110, filtered_df_final_x_5['x'])
)

filtered_df_final_x_6['adjusted_x'] = np.where(
    (filtered_df_final_x_6['event'] == 'touchdown') & (filtered_df_final_x_6['x'] < 50), 10,
    np.where((filtered_df_final_x_6['event'] == 'touchdown') & (filtered_df_final_x_6['x'] > 50), 110, filtered_df_final_x_6['x'])
)

filtered_df_final_x_7['adjusted_x'] = np.where(
    (filtered_df_final_x_7['event'] == 'touchdown') & (filtered_df_final_x_7['x'] < 50), 10,
    np.where((filtered_df_final_x_7['event'] == 'touchdown') & (filtered_df_final_x_7['x'] > 50), 110, filtered_df_final_x_7['x'])
)

filtered_df_final_x_8['adjusted_x'] = np.where(
    (filtered_df_final_x_8['event'] == 'touchdown') & (filtered_df_final_x_8['x'] < 50), 10,
    np.where((filtered_df_final_x_8['event'] == 'touchdown') & (filtered_df_final_x_8['x'] > 50), 110, filtered_df_final_x_8['x'])
)

filtered_df_final_x_9['adjusted_x'] = np.where(
    (filtered_df_final_x_9['event'] == 'touchdown') & (filtered_df_final_x_9['x'] < 50), 10,
    np.where((filtered_df_final_x_9['event'] == 'touchdown') & (filtered_df_final_x_9['x'] > 50), 110, filtered_df_final_x_9['x'])
)


# max_distance: yards from the current position to the goal line the offense is moving toward
filtered_df_final_x_1['max_distance']  = np.where(filtered_df_final_x_1['playDirection'] == 'left', filtered_df_final_x_1['adjusted_x'] - 10, 110 - filtered_df_final_x_1['adjusted_x'])
filtered_df_final_x_2['max_distance']  = np.where(filtered_df_final_x_2['playDirection'] == 'left', filtered_df_final_x_2['adjusted_x'] - 10, 110 - filtered_df_final_x_2['adjusted_x'])
filtered_df_final_x_3['max_distance']  = np.where(filtered_df_final_x_3['playDirection'] == 'left', filtered_df_final_x_3['adjusted_x'] - 10, 110 - filtered_df_final_x_3['adjusted_x'])
filtered_df_final_x_4['max_distance']  = np.where(filtered_df_final_x_4['playDirection'] == 'left', filtered_df_final_x_4['adjusted_x'] - 10, 110 - filtered_df_final_x_4['adjusted_x'])
filtered_df_final_x_5['max_distance']  = np.where(filtered_df_final_x_5['playDirection'] == 'left', filtered_df_final_x_5['adjusted_x'] - 10, 110 - filtered_df_final_x_5['adjusted_x'])
filtered_df_final_x_6['max_distance']  = np.where(filtered_df_final_x_6['playDirection'] == 'left', filtered_df_final_x_6['adjusted_x'] - 10, 110 - filtered_df_final_x_6['adjusted_x'])
filtered_df_final_x_7['max_distance']  = np.where(filtered_df_final_x_7['playDirection'] == 'left', filtered_df_final_x_7['adjusted_x'] - 10, 110 - filtered_df_final_x_7['adjusted_x'])
filtered_df_final_x_8['max_distance']  = np.where(filtered_df_final_x_8['playDirection'] == 'left', filtered_df_final_x_8['adjusted_x'] - 10, 110 - filtered_df_final_x_8['adjusted_x'])
filtered_df_final_x_9['max_distance']  = np.where(filtered_df_final_x_9['playDirection'] == 'left', filtered_df_final_x_9['adjusted_x'] - 10, 110 - filtered_df_final_x_9['adjusted_x'])
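
The touchdown clamp and the `max_distance` calculation above can be checked on a toy frame. This is a minimal sketch, assuming the goal lines sit at x = 10 and x = 110 as the constants in the cell suggest. The helper name `add_adjusted_x_and_max_distance` and the sample rows are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

def add_adjusted_x_and_max_distance(df):
    """Return a copy of df with 'adjusted_x' and 'max_distance' added (hypothetical helper)."""
    out = df.copy()
    is_td = out['event'] == 'touchdown'
    # Snap touchdown frames to the nearer goal line (x = 10 or x = 110).
    out['adjusted_x'] = np.where(
        is_td & (out['x'] < 50), 10,
        np.where(is_td & (out['x'] > 50), 110, out['x'])
    )
    # Yards remaining to the goal line the offense is moving toward.
    out['max_distance'] = np.where(
        out['playDirection'] == 'left',
        out['adjusted_x'] - 10,
        110 - out['adjusted_x']
    )
    return out

# Made-up rows: a touchdown going left, a tackle going left, a touchdown going right.
toy = pd.DataFrame({
    'event': ['touchdown', 'tackle', 'touchdown'],
    'x': [8.7, 42.0, 111.3],
    'playDirection': ['left', 'left', 'right'],
})
result = add_adjusted_x_and_max_distance(toy)
```

The same helper could be applied to each weekly dataframe in a loop instead of repeating the statement nine times.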

In [25]:
filtered_df_final_x_1['actual_dist_from_final'] = np.where(filtered_df_final_x_1['playDirection'] == 'left', filtered_df_final_x_1['adjusted_x'] - filtered_df_final_x_1['final_x'], filtered_df_final_x_1['final_x'] - filtered_df_final_x_1['adjusted_x'])
filtered_df_final_x_2['actual_dist_from_final'] = np.where(filtered_df_final_x_2['playDirection'] == 'left', filtered_df_final_x_2['adjusted_x'] - filtered_df_final_x_2['final_x'], filtered_df_final_x_2['final_x'] - filtered_df_final_x_2['adjusted_x'])
filtered_df_final_x_3['actual_dist_from_final'] = np.where(filtered_df_final_x_3['playDirection'] == 'left', filtered_df_final_x_3['adjusted_x'] - filtered_df_final_x_3['final_x'], filtered_df_final_x_3['final_x'] - filtered_df_final_x_3['adjusted_x'])
filtered_df_final_x_4['actual_dist_from_final'] = np.where(filtered_df_final_x_4['playDirection'] == 'left', filtered_df_final_x_4['adjusted_x'] - filtered_df_final_x_4['final_x'], filtered_df_final_x_4['final_x'] - filtered_df_final_x_4['adjusted_x'])
filtered_df_final_x_5['actual_dist_from_final'] = np.where(filtered_df_final_x_5['playDirection'] == 'left', filtered_df_final_x_5['adjusted_x'] - filtered_df_final_x_5['final_x'], filtered_df_final_x_5['final_x'] - filtered_df_final_x_5['adjusted_x'])
filtered_df_final_x_6['actual_dist_from_final'] = np.where(filtered_df_final_x_6['playDirection'] == 'left', filtered_df_final_x_6['adjusted_x'] - filtered_df_final_x_6['final_x'], filtered_df_final_x_6['final_x'] - filtered_df_final_x_6['adjusted_x'])
filtered_df_final_x_7['actual_dist_from_final'] = np.where(filtered_df_final_x_7['playDirection'] == 'left', filtered_df_final_x_7['adjusted_x'] - filtered_df_final_x_7['final_x'], filtered_df_final_x_7['final_x'] - filtered_df_final_x_7['adjusted_x'])
filtered_df_final_x_8['actual_dist_from_final'] = np.where(filtered_df_final_x_8['playDirection'] == 'left', filtered_df_final_x_8['adjusted_x'] - filtered_df_final_x_8['final_x'], filtered_df_final_x_8['final_x'] - filtered_df_final_x_8['adjusted_x'])
filtered_df_final_x_9['actual_dist_from_final'] = np.where(filtered_df_final_x_9['playDirection'] == 'left', filtered_df_final_x_9['adjusted_x'] - filtered_df_final_x_9['final_x'], filtered_df_final_x_9['final_x'] - filtered_df_final_x_9['adjusted_x'])

handoffs_1 = w1_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_2 = w2_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_3 = w3_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_4 = w4_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_5 = w5_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_6 = w6_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_7 = w7_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_8 = w8_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]
handoffs_9 = w9_rushes_params[['gameId', 'playId', 'frameId_x', 'event_x', 'frameId_y', 'event_y', 'ballCarrierId', 'ballCarrierDisplayName', 'playDirection']]

handoffs_xy_1 = handoffs_1.merge(filtered_df_final_x_1[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_2 = handoffs_2.merge(filtered_df_final_x_2[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_3 = handoffs_3.merge(filtered_df_final_x_3[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_4 = handoffs_4.merge(filtered_df_final_x_4[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_5 = handoffs_5.merge(filtered_df_final_x_5[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_6 = handoffs_6.merge(filtered_df_final_x_6[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_7 = handoffs_7.merge(filtered_df_final_x_7[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_8 = handoffs_8.merge(filtered_df_final_x_8[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')
handoffs_xy_9 = handoffs_9.merge(filtered_df_final_x_9[['gameId', 'playId', 'frameId', 'x', 'y']], left_on=['gameId', 'playId', 'frameId_x'], right_on=['gameId', 'playId', 'frameId'], how='inner').drop(columns='frameId')

handoffs_xy_1.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_2.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_3.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_4.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_5.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_6.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_7.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_8.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)
handoffs_xy_9.rename(columns={'x': 'bc_x', 'y': 'bc_y'}, inplace=True)

t_w1_pos = tracking_week_1[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w2_pos = tracking_week_2[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w3_pos = tracking_week_3[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w4_pos = tracking_week_4[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w5_pos = tracking_week_5[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w6_pos = tracking_week_6[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w7_pos = tracking_week_7[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w8_pos = tracking_week_8[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
t_w9_pos = tracking_week_9[['gameId', 'playId', 'nflId', 'displayName', 'frameId', 'x', 'y', 's', 'a', 'dis', 'o', 'dir']].merge(players[['nflId', 'position']], on='nflId', how='inner')
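
Each step in this cell is repeated verbatim for nine weeks. One possible way to cut the repetition is to keep the weekly frames in dictionaries keyed by week number and loop over them. The sketch below assumes that layout. `handoffs_by_week`, `tracking_by_week` and `attach_bc_xy` are hypothetical names, and the toy rows are made up for illustration.

```python
import pandas as pd

def attach_bc_xy(handoffs, tracking):
    """Attach the x/y found at the handoff frame (frameId_x) as bc_x/bc_y."""
    merged = handoffs.merge(
        tracking[['gameId', 'playId', 'frameId', 'x', 'y']],
        left_on=['gameId', 'playId', 'frameId_x'],
        right_on=['gameId', 'playId', 'frameId'],
        how='inner',
    ).drop(columns='frameId')
    return merged.rename(columns={'x': 'bc_x', 'y': 'bc_y'})

# Hypothetical per-week containers replacing handoffs_1..9 and filtered_df_final_x_1..9.
handoffs_by_week = {
    1: pd.DataFrame({'gameId': [1], 'playId': [101], 'frameId_x': [19]}),
    2: pd.DataFrame({'gameId': [2], 'playId': [55], 'frameId_x': [7]}),
}
tracking_by_week = {
    1: pd.DataFrame({'gameId': [1, 1], 'playId': [101, 101], 'frameId': [18, 19],
                     'x': [70.1, 69.8], 'y': [26.0, 26.2]}),
    2: pd.DataFrame({'gameId': [2, 2], 'playId': [55, 55], 'frameId': [7, 8],
                     'x': [33.0, 33.4], 'y': [20.5, 20.9]}),
}

# One dictionary comprehension instead of nine near-identical statements.
handoffs_xy_by_week = {
    week: attach_bc_xy(handoffs_by_week[week], tracking_by_week[week])
    for week in handoffs_by_week
}
```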

In [26]:
position_conditions_1 = [
    t_w1_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w1_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w1_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w1_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w1_pos['position'].isin(['TE']),
    t_w1_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w1_pos['position'].isin(['QB'])
]



position_conditions_2 = [
    t_w2_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w2_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w2_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w2_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w2_pos['position'].isin(['TE']),
    t_w2_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w2_pos['position'].isin(['QB'])
]

position_conditions_3 = [
    t_w3_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w3_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w3_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w3_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w3_pos['position'].isin(['TE']),
    t_w3_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w3_pos['position'].isin(['QB'])
]


position_conditions_4 = [
    t_w4_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w4_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w4_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w4_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w4_pos['position'].isin(['TE']),
    t_w4_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w4_pos['position'].isin(['QB'])
]

position_conditions_5 = [
    t_w5_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w5_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w5_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w5_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w5_pos['position'].isin(['TE']),
    t_w5_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w5_pos['position'].isin(['QB'])
]

position_conditions_6 = [
    t_w6_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w6_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w6_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w6_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w6_pos['position'].isin(['TE']),
    t_w6_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w6_pos['position'].isin(['QB'])
]

position_conditions_7 = [
    t_w7_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w7_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w7_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w7_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w7_pos['position'].isin(['TE']),
    t_w7_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w7_pos['position'].isin(['QB'])
]

position_conditions_8 = [
    t_w8_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w8_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w8_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w8_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w8_pos['position'].isin(['TE']),
    t_w8_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w8_pos['position'].isin(['QB'])
]

position_conditions_9 = [
    t_w9_pos['position'].isin(['DT', 'DE', 'NT']),
    t_w9_pos['position'].isin(['CB', 'DB', 'FS', 'SS']),
    t_w9_pos['position'].isin(['OLB', 'ILB', 'MLB']),
    t_w9_pos['position'].isin(['T', 'G', 'C', 'OL']),
    t_w9_pos['position'].isin(['TE']),
    t_w9_pos['position'].isin(['RB', 'FB', 'WR']),
    t_w9_pos['position'].isin(['QB'])
]



position_choices = ['DL',
                    'DB',
                    'LB',
                    'OL',
                    'TE',
                    'SK',
                    'QB']

In [27]:
t_w1_pos['new_pos'] = np.select(position_conditions_1, position_choices, default='OTHER')
t_w2_pos['new_pos'] = np.select(position_conditions_2, position_choices, default='OTHER')
t_w3_pos['new_pos'] = np.select(position_conditions_3, position_choices, default='OTHER')
t_w4_pos['new_pos'] = np.select(position_conditions_4, position_choices, default='OTHER')
t_w5_pos['new_pos'] = np.select(position_conditions_5, position_choices, default='OTHER')
t_w6_pos['new_pos'] = np.select(position_conditions_6, position_choices, default='OTHER')
t_w7_pos['new_pos'] = np.select(position_conditions_7, position_choices, default='OTHER')
t_w8_pos['new_pos'] = np.select(position_conditions_8, position_choices, default='OTHER')
t_w9_pos['new_pos'] = np.select(position_conditions_9, position_choices, default='OTHER')

t_w1_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w2_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w3_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w4_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w5_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w6_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w7_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w8_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
t_w9_pos.rename(columns={'x': 'x_nb', 'y': 'y_nb','s': 's_nb','a': 'a_nb','dis': 'dis_nb','o': 'o_nb','dir': 'dir_nb', }, inplace=True)
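
The position grouping above is the same for every week, so it can also be written once as a lookup table. This is a sketch of that alternative, not the notebook's method. `POSITION_GROUPS`, `POSITION_LOOKUP` and `add_new_pos` are hypothetical names, and the toy positions are made up; the final lines cross-check the result against the `np.select` formulation used in the cell.

```python
import numpy as np
import pandas as pd

# Same grouping as position_conditions_N / position_choices above.
POSITION_GROUPS = {
    'DL': ['DT', 'DE', 'NT'],
    'DB': ['CB', 'DB', 'FS', 'SS'],
    'LB': ['OLB', 'ILB', 'MLB'],
    'OL': ['T', 'G', 'C', 'OL'],
    'TE': ['TE'],
    'SK': ['RB', 'FB', 'WR'],
    'QB': ['QB'],
}
# Invert to a flat lookup: raw position -> grouped position.
POSITION_LOOKUP = {raw: group for group, raws in POSITION_GROUPS.items() for raw in raws}

def add_new_pos(df):
    """Return a copy of df with 'new_pos'; unmapped positions become 'OTHER'."""
    out = df.copy()
    out['new_pos'] = out['position'].map(POSITION_LOOKUP).fillna('OTHER')
    return out

toy = pd.DataFrame({'position': ['DE', 'FS', 'MLB', 'G', 'TE', 'WR', 'QB', 'P']})
mapped = add_new_pos(toy)

# Cross-check against np.select with one boolean condition per group.
conditions = [toy['position'].isin(raws) for raws in POSITION_GROUPS.values()]
selected = np.select(conditions, list(POSITION_GROUPS.keys()), default='OTHER')
```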

In [28]:
handoffs_w1 = handoffs_xy_1.merge(t_w1_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w2 = handoffs_xy_2.merge(t_w2_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w3 = handoffs_xy_3.merge(t_w3_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w4 = handoffs_xy_4.merge(t_w4_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w5 = handoffs_xy_5.merge(t_w5_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w6 = handoffs_xy_6.merge(t_w6_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w7 = handoffs_xy_7.merge(t_w7_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w8 = handoffs_xy_8.merge(t_w8_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')
handoffs_w9 = handoffs_xy_9.merge(t_w9_pos, left_on=['gameId', 'playId'], right_on=['gameId', 'playId'], how='inner')



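# Keep only the frames between the start event (frameId_x, e.g. handoff) and the end event (frameId_y, e.g. tackle), inclusive
handoffs_w_others_1 = handoffs_w1[(handoffs_w1['frameId'] >= handoffs_w1['frameId_x']) & (handoffs_w1['frameId'] <= handoffs_w1['frameId_y'])]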
handoffs_w_others_2 = handoffs_w2[(handoffs_w2['frameId'] >= handoffs_w2['frameId_x']) & (handoffs_w2['frameId'] <= handoffs_w2['frameId_y'])]
handoffs_w_others_3 = handoffs_w3[(handoffs_w3['frameId'] >= handoffs_w3['frameId_x']) & (handoffs_w3['frameId'] <= handoffs_w3['frameId_y'])]
handoffs_w_others_4 = handoffs_w4[(handoffs_w4['frameId'] >= handoffs_w4['frameId_x']) & (handoffs_w4['frameId'] <= handoffs_w4['frameId_y'])]
handoffs_w_others_5 = handoffs_w5[(handoffs_w5['frameId'] >= handoffs_w5['frameId_x']) & (handoffs_w5['frameId'] <= handoffs_w5['frameId_y'])]
handoffs_w_others_6 = handoffs_w6[(handoffs_w6['frameId'] >= handoffs_w6['frameId_x']) & (handoffs_w6['frameId'] <= handoffs_w6['frameId_y'])]
handoffs_w_others_7 = handoffs_w7[(handoffs_w7['frameId'] >= handoffs_w7['frameId_x']) & (handoffs_w7['frameId'] <= handoffs_w7['frameId_y'])]
handoffs_w_others_8 = handoffs_w8[(handoffs_w8['frameId'] >= handoffs_w8['frameId_x']) & (handoffs_w8['frameId'] <= handoffs_w8['frameId_y'])]
handoffs_w_others_9 = handoffs_w9[(handoffs_w9['frameId'] >= handoffs_w9['frameId_x']) & (handoffs_w9['frameId'] <= handoffs_w9['frameId_y'])]
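
The frame filter above keeps every row whose `frameId` lies between `frameId_x` and `frameId_y`, inclusive. A minimal sketch of the same filter on made-up rows, using `Series.between` as an equivalent spelling, is shown here. The helper name `keep_frames_in_play_window` is hypothetical.

```python
import pandas as pd

def keep_frames_in_play_window(df):
    """Keep rows whose frameId lies in [frameId_x, frameId_y], inclusive."""
    in_window = df['frameId'].between(df['frameId_x'], df['frameId_y'])
    return df[in_window]

# Made-up play: start event at frame 19, end event at frame 45.
toy = pd.DataFrame({
    'frameId':   [18, 19, 30, 45, 46],
    'frameId_x': [19, 19, 19, 19, 19],
    'frameId_y': [45, 45, 45, 45, 45],
})
windowed = keep_frames_in_play_window(toy)
```

Frames 18 and 46 fall outside the window and are dropped; both boundary frames are kept.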

In [29]:
# Replace the ball carrier's handoff-frame coordinates (bc_x, bc_y) with the ball carrier's coordinates at every frame of the play
handoffs_w_others_1 = handoffs_w_others_1.merge(tracking_week_1[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_2 = handoffs_w_others_2.merge(tracking_week_2[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_3 = handoffs_w_others_3.merge(tracking_week_3[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_4 = handoffs_w_others_4.merge(tracking_week_4[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_5 = handoffs_w_others_5.merge(tracking_week_5[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_6 = handoffs_w_others_6.merge(tracking_week_6[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_7 = handoffs_w_others_7.merge(tracking_week_7[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_8 = handoffs_w_others_8.merge(tracking_week_8[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
handoffs_w_others_9 = handoffs_w_others_9.merge(tracking_week_9[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
            left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
                          right_on=['frameId', 'nflId', 'gameId', 'playId']).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)

In [30]:
handoffs_w_others_1 = handoffs_w_others_1.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_2 = handoffs_w_others_2.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_3 = handoffs_w_others_3.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_4 = handoffs_w_others_4.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_5 = handoffs_w_others_5.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_6 = handoffs_w_others_6.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_7 = handoffs_w_others_7.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_8 = handoffs_w_others_8.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
handoffs_w_others_9 = handoffs_w_others_9.rename(columns={"x": "bc_x", "y": "bc_y", "nflId_x": "nflId"})
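
The two cells above swap the frozen handoff-frame `bc_x`/`bc_y` for the ball carrier's position at each frame. The sketch below reproduces that merge, drop and rename on made-up rows so the suffix handling (`nflId_x`, `nflId_y`) is easier to follow. `refresh_bc_xy` and the toy data are hypothetical.

```python
import pandas as pd

def refresh_bc_xy(players_df, tracking):
    """Replace static bc_x/bc_y with the ball carrier's x/y at each frame."""
    merged = players_df.merge(
        tracking[['x', 'y', 'nflId', 'gameId', 'playId', 'frameId']],
        left_on=['frameId', 'ballCarrierId', 'gameId', 'playId'],
        right_on=['frameId', 'nflId', 'gameId', 'playId'],
    # nflId exists on both sides, so pandas suffixes it: _x (player), _y (ball carrier).
    ).drop(['nflId_y', 'bc_x', 'bc_y'], axis=1)
    return merged.rename(columns={'x': 'bc_x', 'y': 'bc_y', 'nflId_x': 'nflId'})

# Made-up tracking rows for a ball carrier (47857) and one other player (43335).
tracking = pd.DataFrame({
    'gameId': [1] * 4, 'playId': [101] * 4,
    'nflId': [47857, 47857, 43335, 43335],
    'frameId': [19, 20, 19, 20],
    'x': [70.0, 69.2, 66.0, 66.5],
    'y': [26.0, 26.1, 24.0, 24.3],
})
players_df = pd.DataFrame({
    'gameId': [1, 1], 'playId': [101, 101],
    'ballCarrierId': [47857, 47857],
    'nflId': [43335, 43335],
    'frameId': [19, 20],
    'x_nb': [66.0, 66.5], 'y_nb': [24.0, 24.3],
    'bc_x': [70.0, 70.0], 'bc_y': [26.0, 26.0],  # frozen at the handoff frame
})
refreshed = refresh_bc_xy(players_df, tracking)
```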

In [31]:
import math

# Euclidean distance between the ball carrier and another player
def calculate_distance(row):
    return ((row['bc_x'] - row['x_nb']) ** 2 + (row['bc_y'] - row['y_nb']) ** 2) ** 0.5


# Wrap an angle in degrees into the range [0, 360)
def normalize_angle(angle):
    return angle % 360

# Angle in degrees from the ball carrier to another player, oriented so that
# plays moving left and plays moving right share the same reference direction
def calculate_angle(row):
    # Differences in coordinates (player minus ball carrier)
    delta_x = row['x_nb'] - row['bc_x']
    delta_y = row['y_nb'] - row['bc_y']

    # Angle in radians
    angle_rad = math.atan2(delta_y, delta_x)

    # Convert the angle to degrees
    angle_deg = math.degrees(angle_rad)

    # Rotate plays moving left by 180 degrees so they line up with plays moving right
    if row['playDirection'] == 'left':
        angle_deg = angle_deg + 180

    return normalize_angle(angle_deg)


In [32]:
handoffs_w_others_1['dist_to_ball'] = handoffs_w_others_1.apply(calculate_distance, axis=1)
handoffs_w_others_2['dist_to_ball'] = handoffs_w_others_2.apply(calculate_distance, axis=1)
handoffs_w_others_3['dist_to_ball'] = handoffs_w_others_3.apply(calculate_distance, axis=1)
handoffs_w_others_4['dist_to_ball'] = handoffs_w_others_4.apply(calculate_distance, axis=1)
handoffs_w_others_5['dist_to_ball'] = handoffs_w_others_5.apply(calculate_distance, axis=1)
handoffs_w_others_6['dist_to_ball'] = handoffs_w_others_6.apply(calculate_distance, axis=1)
handoffs_w_others_7['dist_to_ball'] = handoffs_w_others_7.apply(calculate_distance, axis=1)
handoffs_w_others_8['dist_to_ball'] = handoffs_w_others_8.apply(calculate_distance, axis=1)
handoffs_w_others_9['dist_to_ball'] = handoffs_w_others_9.apply(calculate_distance, axis=1)

handoffs_w_others_1['ang_to_ball'] = handoffs_w_others_1.apply(calculate_angle, axis=1)
handoffs_w_others_2['ang_to_ball'] = handoffs_w_others_2.apply(calculate_angle, axis=1)
handoffs_w_others_3['ang_to_ball'] = handoffs_w_others_3.apply(calculate_angle, axis=1)
handoffs_w_others_4['ang_to_ball'] = handoffs_w_others_4.apply(calculate_angle, axis=1)
handoffs_w_others_5['ang_to_ball'] = handoffs_w_others_5.apply(calculate_angle, axis=1)
handoffs_w_others_6['ang_to_ball'] = handoffs_w_others_6.apply(calculate_angle, axis=1)
handoffs_w_others_7['ang_to_ball'] = handoffs_w_others_7.apply(calculate_angle, axis=1)
handoffs_w_others_8['ang_to_ball'] = handoffs_w_others_8.apply(calculate_angle, axis=1)
handoffs_w_others_9['ang_to_ball'] = handoffs_w_others_9.apply(calculate_angle, axis=1)
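
Row-wise `apply` can be slow on frame-level tracking data. A vectorized version of the same distance and angle calculation might look like the sketch below. It is an alternative under the same column-name assumptions (`bc_x`, `bc_y`, `x_nb`, `y_nb`, `playDirection`), not the notebook's own code, and `add_dist_and_angle` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

def add_dist_and_angle(df):
    """Return a copy of df with 'dist_to_ball' and 'ang_to_ball' computed column-wise."""
    out = df.copy()
    dx = out['x_nb'] - out['bc_x']
    dy = out['y_nb'] - out['bc_y']
    # Euclidean distance from the ball carrier.
    out['dist_to_ball'] = np.hypot(dx, dy)
    # Angle from the ball carrier to the player, in degrees.
    angle = np.degrees(np.arctan2(dy, dx))
    # Rotate plays moving left by 180 degrees, then wrap into [0, 360).
    angle = np.where(out['playDirection'] == 'left', angle + 180, angle)
    out['ang_to_ball'] = np.mod(angle, 360)
    return out

# Made-up rows: ball carrier at (50, 25) with three nearby players.
toy = pd.DataFrame({
    'bc_x': [50.0, 50.0, 50.0],
    'bc_y': [25.0, 25.0, 25.0],
    'x_nb': [53.0, 47.0, 50.0],
    'y_nb': [29.0, 25.0, 27.0],
    'playDirection': ['right', 'left', 'left'],
})
result = add_dist_and_angle(toy)
```

In the first row the player is 3 yards ahead and 4 yards to the side, giving a distance of 5 yards. In the third row the raw angle of 90 degrees becomes 270 degrees after the left-play rotation.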

**B) Create new columns as indicators for a player being within 3 or 5 yards of the ball carrier, filter dataframe to closest 15 players, reduce dataframe to the ball carrier and 3 closest defenders, and create dataframe 'players_within_3_5_wks_7_9' to be used for later insights**

In [33]:
# Flag whether each player is within 3 or 5 yards of the ball carrier (1 = yes, 0 = no)

dist_to_ball = 3

handoffs_w_others_1[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_1['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_2[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_2['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_3[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_3['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_4[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_4['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_5[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_5['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_6[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_6['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_7[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_7['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_8[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_8['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_9[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_9['dist_to_ball'] <= dist_to_ball, 1, 0)



dist_to_ball = 5

handoffs_w_others_1[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_1['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_2[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_2['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_3[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_3['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_4[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_4['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_5[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_5['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_6[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_6['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_7[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_7['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_8[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_8['dist_to_ball'] <= dist_to_ball, 1, 0)
handoffs_w_others_9[f"within_{dist_to_ball}_yds"] = np.where(handoffs_w_others_9['dist_to_ball'] <= dist_to_ball, 1, 0)
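
The cell above reassigns `dist_to_ball` and repeats the block for each threshold. A loop over the thresholds is one way to express the same flags. This is a sketch on made-up distances, and `add_within_flags` is a hypothetical helper name.

```python
import pandas as pd

def add_within_flags(df, thresholds=(3, 5)):
    """Add a 0/1 column within_{N}_yds for each threshold N (in yards)."""
    out = df.copy()
    for yards in thresholds:
        # True where the player is at or inside the threshold, stored as 1/0.
        out[f"within_{yards}_yds"] = (out['dist_to_ball'] <= yards).astype(int)
    return out

toy = pd.DataFrame({'dist_to_ball': [0.0, 2.9, 3.0, 4.2, 5.1]})
flagged = add_within_flags(toy)
```

As in the original, the comparison is inclusive, so a player exactly 3.0 yards away is flagged as within 3 yards.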

In [34]:
# Convert column to integer

handoffs_w_others_1["frameId_x"] = handoffs_w_others_1["frameId_x"].astype(int)
handoffs_w_others_2["frameId_x"] = handoffs_w_others_2["frameId_x"].astype(int)
handoffs_w_others_3["frameId_x"] = handoffs_w_others_3["frameId_x"].astype(int)
handoffs_w_others_4["frameId_x"] = handoffs_w_others_4["frameId_x"].astype(int)
handoffs_w_others_5["frameId_x"] = handoffs_w_others_5["frameId_x"].astype(int)
handoffs_w_others_6["frameId_x"] = handoffs_w_others_6["frameId_x"].astype(int)
handoffs_w_others_7["frameId_x"] = handoffs_w_others_7["frameId_x"].astype(int)
handoffs_w_others_8["frameId_x"] = handoffs_w_others_8["frameId_x"].astype(int)
handoffs_w_others_9["frameId_x"] = handoffs_w_others_9["frameId_x"].astype(int)


# Create column for the number of frames elapsed since the ball carrier received the handoff

handoffs_w_others_1["frame_since_bc"] = handoffs_w_others_1["frameId"] - handoffs_w_others_1["frameId_x"]
handoffs_w_others_2["frame_since_bc"] = handoffs_w_others_2["frameId"] - handoffs_w_others_2["frameId_x"]
handoffs_w_others_3["frame_since_bc"] = handoffs_w_others_3["frameId"] - handoffs_w_others_3["frameId_x"]
handoffs_w_others_4["frame_since_bc"] = handoffs_w_others_4["frameId"] - handoffs_w_others_4["frameId_x"]
handoffs_w_others_5["frame_since_bc"] = handoffs_w_others_5["frameId"] - handoffs_w_others_5["frameId_x"]
handoffs_w_others_6["frame_since_bc"] = handoffs_w_others_6["frameId"] - handoffs_w_others_6["frameId_x"]
handoffs_w_others_7["frame_since_bc"] = handoffs_w_others_7["frameId"] - handoffs_w_others_7["frameId_x"]
handoffs_w_others_8["frame_since_bc"] = handoffs_w_others_8["frameId"] - handoffs_w_others_8["frameId_x"]
handoffs_w_others_9["frame_since_bc"] = handoffs_w_others_9["frameId"] - handoffs_w_others_9["frameId_x"]

In [39]:
# Reduce dataframe to the ball carrier and 3 closest defenders (ranks 1-4) and the 7 closest offensive players (ranks 9-15)

filtered_list = [1,2,3,4,9,10,11,12,13,14,15]

reduce_sorted_df_1 = sorted_df_1[sorted_df_1['rank'].isin(filtered_list)]
reduce_sorted_df_2 = sorted_df_2[sorted_df_2['rank'].isin(filtered_list)]
reduce_sorted_df_3 = sorted_df_3[sorted_df_3['rank'].isin(filtered_list)]
reduce_sorted_df_4 = sorted_df_4[sorted_df_4['rank'].isin(filtered_list)]
reduce_sorted_df_5 = sorted_df_5[sorted_df_5['rank'].isin(filtered_list)]
reduce_sorted_df_6 = sorted_df_6[sorted_df_6['rank'].isin(filtered_list)]
reduce_sorted_df_7 = sorted_df_7[sorted_df_7['rank'].isin(filtered_list)]
reduce_sorted_df_8 = sorted_df_8[sorted_df_8['rank'].isin(filtered_list)]
reduce_sorted_df_9 = sorted_df_9[sorted_df_9['rank'].isin(filtered_list)]

In [40]:
# Reduce to ball carrier and 3 closest defenders

reduce_sorted_df_1_4_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] < 4.5]
reduce_sorted_df_1_4_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] < 4.5]
reduce_sorted_df_1_4_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] < 4.5]
reduce_sorted_df_1_4_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] < 4.5]
reduce_sorted_df_1_4_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] < 4.5]
reduce_sorted_df_1_4_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] < 4.5]
reduce_sorted_df_1_4_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] < 4.5]
reduce_sorted_df_1_4_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] < 4.5]
reduce_sorted_df_1_4_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] < 4.5]
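
The two rank filters above can be illustrated on a toy frame. The rank meanings used here (1 = ball carrier, 2 to 4 = closest defenders, 9 to 15 = closest offensive players) are inferred from the comments in this notebook, since the cell that builds `rank` is not shown in this part; treat them as an assumption. The constant names are hypothetical.

```python
import pandas as pd

# Assumed rank layout, inferred from the notebook's comments.
BALL_CARRIER_AND_DEFENDERS = [1, 2, 3, 4]
CLOSEST_OFFENSE = [9, 10, 11, 12, 13, 14, 15]

# Made-up frame with one row per rank 1..22 (stored as floats, as in the notebook).
toy = pd.DataFrame({'rank': [float(r) for r in range(1, 23)]})

# First reduction: keep ranks 1-4 and 9-15.
kept = toy[toy['rank'].isin(BALL_CARRIER_AND_DEFENDERS + CLOSEST_OFFENSE)]

# Second reduction: ball carrier and 3 closest defenders only.
bc_and_defenders = kept[kept['rank'] <= 4]
```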

In [41]:
reduce_sorted_df_1[reduce_sorted_df_1['rank'] < 10].head(10).T

| | 0 | 1 | 2 | 3 | 8 | 22 | 23 | 24 | 25 | 30 |
|---|---|---|---|---|---|---|---|---|---|---|
| gameId | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 |
| playId | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 | 101 |
| frameId_x | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 | 19 |
| event_x | handoff | handoff | handoff | handoff | handoff | handoff | handoff | handoff | handoff | handoff |
| frameId_y | 45 | 45 | 45 | 45 | 45 | 45 | 45 | 45 | 45 | 45 |
| event_y | tackle | tackle | tackle | tackle | tackle | tackle | tackle | tackle | tackle | tackle |
| ballCarrierId | 47857 | 47857 | 47857 | 47857 | 47857 | 47857 | 47857 | 47857 | 47857 | 47857 |
| ballCarrierDisplayName | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary |
| playDirection | left | left | left | left | left | left | left | left | left | left |
| nflId | 47857.0 | 43335.0 | 41239.0 | 47917.0 | 53079.0 | 47857.0 | 43335.0 | 41239.0 | 47917.0 | 53079.0 |


In [None]:
# Build the dataframe used later in "Data_Viz_and_insights"; it is created here so the large weekly dataframes can be deleted soon after to free memory

within_cols = ['gameId', 'playId', 'frameId', 'nflId', 'displayName', 'new_pos', 'within_3_yds', 'within_5_yds']

players_within_3_5_wks_7_9 = pd.concat([handoffs_w_others_7[within_cols],
                                        handoffs_w_others_8[within_cols],
                                        handoffs_w_others_9[within_cols]])

players_within_3_5_wks_7_9 = players_within_3_5_wks_7_9.merge(plays[['gameId', 'playId', 'defensiveTeam', 'possessionTeam']], on=['gameId', 'playId'])

#players_within_3_5_wks_7_9.to_pickle('players_within_3_5_wks_7_9.pkl')
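
When stacking weekly frames and joining play-level columns, it can help to state the join keys and the expected cardinality explicitly. The sketch below uses made-up rows and the `validate` argument of `DataFrame.merge`, which raises an error if a play appears more than once in the play-level table. `weekly` and `plays_toy` are hypothetical stand-ins for the weekly dataframes and `plays`.

```python
import pandas as pd

# Made-up weekly frames, one row each.
weekly = [
    pd.DataFrame({'gameId': [7], 'playId': [10], 'nflId': [111],
                  'within_3_yds': [1], 'within_5_yds': [1]}),
    pd.DataFrame({'gameId': [8], 'playId': [20], 'nflId': [222],
                  'within_3_yds': [0], 'within_5_yds': [1]}),
]
# Made-up play-level table with one row per (gameId, playId).
plays_toy = pd.DataFrame({
    'gameId': [7, 8], 'playId': [10, 20],
    'defensiveTeam': ['BUF', 'LA'], 'possessionTeam': ['LA', 'BUF'],
})

combined = pd.concat(weekly, ignore_index=True).merge(
    plays_toy[['gameId', 'playId', 'defensiveTeam', 'possessionTeam']],
    on=['gameId', 'playId'],
    how='inner',
    validate='many_to_one',  # many player rows per play, one play row per key
)
```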

**C) Get the positional coordinates of the 7 closest offensive players to the ball carrier, clean up by deleting old variables, calculate the distance and angle of the 7 closest offensive players to the ball carrier, calculate the distance and angle of the two closest offensive players (besides QB and the closest offensive player to the ball carrier) to each defender, and reposition data so it's the same for plays that went both left and right**

In [42]:
# Get the coordinates of the 7 closest offensive players to the ball carrier, in preparation for finding the offensive players closest to each defender
# Rank 9 is the closest offensive player (excluding the QB) to the ball carrier

xy_sorted_df_9_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_1.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_2.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_3.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_4.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_5.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_6.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_7.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_8.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

xy_sorted_df_9_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 9.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_9_9.rename(columns={'x_nb': 'x_no_1', 'y_nb': 'y_no_1'}, inplace=True)

In [43]:
# Rank 10 is the 2nd-closest offensive player, and so on through rank 15 (the 7th closest); get their x/y coordinates
xy_sorted_df_10_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_1.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_2.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_3.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_4.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_5.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_6.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_7.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_8.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

xy_sorted_df_10_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 10.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_10_9.rename(columns={'x_nb': 'x_no_2', 'y_nb': 'y_no_2'}, inplace=True)

# 11
xy_sorted_df_11_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_1.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_2.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_3.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_4.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_5.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_6.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_7.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_8.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)

xy_sorted_df_11_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 11.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_11_9.rename(columns={'x_nb': 'x_no_3', 'y_nb': 'y_no_3'}, inplace=True)


# 12
xy_sorted_df_12_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_1.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_2.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_3.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_4.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_5.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_6.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_7.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_8.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)

xy_sorted_df_12_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 12.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_12_9.rename(columns={'x_nb': 'x_no_4', 'y_nb': 'y_no_4'}, inplace=True)



# 13
xy_sorted_df_13_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_1.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_2.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_3.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_4.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_5.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_6.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_7.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_8.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)

xy_sorted_df_13_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 13.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_13_9.rename(columns={'x_nb': 'x_no_5', 'y_nb': 'y_no_5'}, inplace=True)


# 14
xy_sorted_df_14_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_1.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_2.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_3.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_4.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_5.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_6.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_7.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_8.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)

xy_sorted_df_14_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 14.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_14_9.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)


# 15
xy_sorted_df_15_1 = reduce_sorted_df_1[reduce_sorted_df_1['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_1.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_2 = reduce_sorted_df_2[reduce_sorted_df_2['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_2.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_3 = reduce_sorted_df_3[reduce_sorted_df_3['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_3.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_4 = reduce_sorted_df_4[reduce_sorted_df_4['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_4.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_5 = reduce_sorted_df_5[reduce_sorted_df_5['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_5.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_6 = reduce_sorted_df_6[reduce_sorted_df_6['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_6.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_7 = reduce_sorted_df_7[reduce_sorted_df_7['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_7.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_8 = reduce_sorted_df_8[reduce_sorted_df_8['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_8.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

xy_sorted_df_15_9 = reduce_sorted_df_9[reduce_sorted_df_9['rank'] == 15.0][['gameId', 'playId', 'frameId', 'x_nb', 'y_nb']]
xy_sorted_df_15_9.rename(columns={'x_nb': 'x_no_7', 'y_nb': 'y_no_7'}, inplace=True)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  xy_sorted_df_14_7.rename(columns={'x_nb': 'x_no_6', 'y_nb': 'y_no_6'}, inplace=True)
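
The warning above is raised when `rename(..., inplace=True)` is called on a frame that pandas treats as a slice of another frame. A minimal sketch of one way to avoid it is shown here: select with `.loc` and use the non-inplace form of `rename`, which returns a new DataFrame. The helper name `rank_xy`, the `KEYS` constant, and the toy values are illustrative assumptions and are not part of the notebook.

```python
import pandas as pd

KEYS = ['gameId', 'playId', 'frameId']

def rank_xy(df, rank, suffix):
    """Return the x/y of the player with the given rank as x_no_<suffix>/y_no_<suffix>."""
    out = df.loc[df['rank'] == rank, KEYS + ['x_nb', 'y_nb']]
    # rename without inplace=True returns a new DataFrame, so nothing is written to a slice
    return out.rename(columns={'x_nb': f'x_no_{suffix}', 'y_nb': f'y_no_{suffix}'})

# Toy stand-in for one week's reduce_sorted_df_N (values invented for illustration)
toy = pd.DataFrame({
    'gameId': [1, 1, 1],
    'playId': [10, 10, 10],
    'frameId': [1, 1, 1],
    'rank': [9.0, 10.0, 11.0],
    'x_nb': [30.0, 31.5, 28.0],
    'y_nb': [20.0, 22.0, 25.0],
})

# Ranks 9..11 map to suffixes 1..3 here; the notebook uses ranks 9..15 for suffixes 1..7
xy_by_rank = {rank: rank_xy(toy, float(rank), rank - 8) for rank in range(9, 12)}
```

Keeping the per-rank frames in a dictionary keyed by rank (and, in the full notebook, by week) would replace the individually named `xy_sorted_df_<rank>_<week>` variables, at the cost of changing how later cells refer to them.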


In [44]:
xy_sorted_df_9_15_1 = xy_sorted_df_9_1.merge(xy_sorted_df_10_1, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_1, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_1, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_1, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_1, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_1, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_2 = xy_sorted_df_9_2.merge(xy_sorted_df_10_2, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_2, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_2, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_2, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_2, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_2, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_3 = xy_sorted_df_9_3.merge(xy_sorted_df_10_3, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_3, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_3, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_3, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_3, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_3, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_4 = xy_sorted_df_9_4.merge(xy_sorted_df_10_4, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_4, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_4, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_4, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_4, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_4, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_5 = xy_sorted_df_9_5.merge(xy_sorted_df_10_5, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_5, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_5, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_5, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_5, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_5, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_6 = xy_sorted_df_9_6.merge(xy_sorted_df_10_6, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_6, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_6, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_6, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_6, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_6, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_7 = xy_sorted_df_9_7.merge(xy_sorted_df_10_7, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_7, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_7, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_7, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_7, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_7, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_8 = xy_sorted_df_9_8.merge(xy_sorted_df_10_8, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_8, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_8, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_8, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_8, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_8, on=['gameId', 'playId', 'frameId'], how='inner')

xy_sorted_df_9_15_9 = xy_sorted_df_9_9.merge(xy_sorted_df_10_9, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_11_9, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_12_9, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_13_9, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_14_9, on=['gameId', 'playId', 'frameId'],
                        how='inner').merge(xy_sorted_df_15_9, on=['gameId', 'playId', 'frameId'], how='inner')
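
The chained merges above all join on the same three keys, so they can also be written as a fold over a list of frames. The following is a sketch under that assumption; `merge_on_keys` and the toy frames are hypothetical names and data, not part of the notebook.

```python
from functools import reduce
import pandas as pd

KEYS = ['gameId', 'playId', 'frameId']

def merge_on_keys(frames):
    """Inner-join a list of DataFrames on the frame keys, left to right."""
    return reduce(lambda left, right: left.merge(right, on=KEYS, how='inner'), frames)

# Toy per-rank frames (values invented for illustration)
f1 = pd.DataFrame({'gameId': [1, 1], 'playId': [10, 10], 'frameId': [1, 2],
                   'x_no_1': [30.0, 30.5], 'y_no_1': [20.0, 20.2]})
f2 = pd.DataFrame({'gameId': [1, 1], 'playId': [10, 10], 'frameId': [1, 2],
                   'x_no_2': [31.5, 31.9], 'y_no_2': [22.0, 22.1]})
# This frame has no row for frameId 2
f3 = pd.DataFrame({'gameId': [1], 'playId': [10], 'frameId': [1],
                   'x_no_3': [28.0], 'y_no_3': [25.0]})

wide = merge_on_keys([f1, f2, f3])
```

Because every join is `how='inner'`, a frame that is missing from any one of the inputs is dropped from the result; in the toy data only `frameId` 1 survives.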



In [45]:
# clean up

import gc

del sorted_df_1, sorted_df_2, sorted_df_3, sorted_df_4, sorted_df_5, sorted_df_6, sorted_df_7, sorted_df_8, sorted_df_9

gc.collect()

0

In [46]:
import gc

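# Parentheses keep the whole list in one del statement; without them the second line
# is a separate expression and data frames 7-9 are never deleted
del (xy_sorted_df_15_1, xy_sorted_df_15_2, xy_sorted_df_15_3, xy_sorted_df_15_4, xy_sorted_df_15_5, xy_sorted_df_15_6,
     xy_sorted_df_15_7, xy_sorted_df_15_8, xy_sorted_df_15_9)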

gc.collect()

0

In [47]:
sorted_df_1 = reduce_sorted_df_1_4_1.merge(xy_sorted_df_9_15_1, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_2 = reduce_sorted_df_1_4_2.merge(xy_sorted_df_9_15_2, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_3 = reduce_sorted_df_1_4_3.merge(xy_sorted_df_9_15_3, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_4 = reduce_sorted_df_1_4_4.merge(xy_sorted_df_9_15_4, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_5 = reduce_sorted_df_1_4_5.merge(xy_sorted_df_9_15_5, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_6 = reduce_sorted_df_1_4_6.merge(xy_sorted_df_9_15_6, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_7 = reduce_sorted_df_1_4_7.merge(xy_sorted_df_9_15_7, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_8 = reduce_sorted_df_1_4_8.merge(xy_sorted_df_9_15_8, on=['gameId', 'playId', 'frameId'], how='inner')
sorted_df_9 = reduce_sorted_df_1_4_9.merge(xy_sorted_df_9_15_9, on=['gameId', 'playId', 'frameId'], how='inner')
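
The merges above attach the per-frame offensive coordinates to every player row that shares the same `gameId`, `playId`, and `frameId`. If the right-hand frame is expected to hold at most one row per key (an assumption, not something the notebook states), the `validate` argument of `merge` can check it. The sketch below uses invented toy frames named `players_toy` and `offense_xy`.

```python
import pandas as pd

KEYS = ['gameId', 'playId', 'frameId']

# Two player rows in the same frame (toy values)
players_toy = pd.DataFrame({'gameId': [1, 1], 'playId': [10, 10], 'frameId': [1, 1],
                            'nflId': [101, 202]})
# One row of offensive coordinates for that frame
offense_xy = pd.DataFrame({'gameId': [1], 'playId': [10], 'frameId': [1],
                           'x_no_1': [30.0], 'y_no_1': [20.0]})

# Each player row receives the same offensive coordinates
merged = players_toy.merge(offense_xy, on=KEYS, how='inner', validate='many_to_one')

# A duplicated key on the right-hand side violates many_to_one and raises MergeError
dup = pd.concat([offense_xy, offense_xy], ignore_index=True)
try:
    players_toy.merge(dup, on=KEYS, how='inner', validate='many_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True
```

Without `validate`, duplicated keys on the right would silently multiply the player rows instead of raising.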

In [48]:
import math

def calculate_distance_2(row, x1, x2, y1, y2):
    # Euclidean distance between the points (x1, y1) and (x2, y2) of a row
    return ((row[x1] - row[x2]) ** 2 + (row[y1] - row[y2]) ** 2) ** 0.5

def calculate_angle_2(row, x1, x2, y1, y2):
    # Calculate the differences in coordinates (from point 1 to point 2)
    delta_x = row[x2] - row[x1]
    delta_y = row[y2] - row[y1]

    # Calculate the angle in radians
    angle_rad = math.atan2(delta_y, delta_x)

    # Convert the angle to degrees
    angle_deg = math.degrees(angle_rad)

    # Rotate by 180 degrees for plays moving left so both directions share one orientation
    if row['playDirection'] == 'left':
        angle_deg = (angle_deg + 180) % 360

    # normalize_angle is defined in an earlier cell
    return normalize_angle(angle_deg)


# sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

0        158.976154
1         36.209457
2         76.796334
3         55.583454
4        157.215882
            ...    
86251    347.691984
86252    305.380272
86253    272.267955
86254    350.362462
86255    344.538782
Length: 86256, dtype: float64
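
Row-wise `apply` calls a Python function once per row, which can be slow on frames of this size. As a hedged alternative, the same distance and angle can be computed on whole columns with NumPy. This sketch assumes that `normalize_angle` maps an angle into [0, 360), which matches the output above but is not shown in this section; `distance_and_angle` and the toy frame are illustrative names and data.

```python
import numpy as np
import pandas as pd

def distance_and_angle(df, x1, x2, y1, y2):
    """Column-wise counterpart of calculate_distance_2 and calculate_angle_2."""
    dx = df[x2] - df[x1]
    dy = df[y2] - df[y1]
    dist = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx))
    # Rotate by 180 degrees on plays moving left, as calculate_angle_2 does
    ang = np.where(df['playDirection'] == 'left', (ang + 180) % 360, ang)
    # ASSUMPTION: normalize_angle wraps angles into [0, 360)
    return dist, np.mod(ang, 360)

# Toy frame: one ball carrier position and two offensive players per row (values invented)
toy = pd.DataFrame({
    'x_nb': [13.0, 13.0], 'y_nb': [24.0, 24.0],
    'x_no_1': [10.0, 10.0], 'y_no_1': [20.0, 20.0],
    'x_no_2': [13.0, 13.0], 'y_no_2': [20.0, 20.0],
    'playDirection': ['right', 'left'],
})

# The notebook would loop over range(1, 8); the toy frame only has players 1 and 2
for k in range(1, 3):
    tag = 'closest' if k == 1 else f'{k}_closest'
    dist, ang = distance_and_angle(toy, f'x_no_{k}', 'x_nb', f'y_no_{k}', 'y_nb')
    toy[f'dist_{tag}_o_bc'] = dist
    toy[f'ang_{tag}_o_bc'] = ang
```

The loop builds the same column names used in the following cell (`dist_closest_o_bc`, `dist_2_closest_o_bc`, and so on), so one loop per weekly frame could stand in for the fourteen assignments written out for each week.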

In [49]:
# Distance and angle from each of the 7 closest offensive players (non-QB) to the ball carrier
sorted_df_1['dist_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_1['ang_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_1['dist_2_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_1['ang_2_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_1['dist_3_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_1['ang_3_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_1['dist_4_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_1['ang_4_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_1['dist_5_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_1['ang_5_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_1['dist_6_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_1['ang_6_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_1['dist_7_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_1['ang_7_closest_o_bc'] = sorted_df_1.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)


# df 2
sorted_df_2['dist_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_2['ang_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_2['dist_2_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_2['ang_2_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_2['dist_3_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_2['ang_3_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_2['dist_4_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_2['ang_4_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_2['dist_5_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_2['ang_5_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_2['dist_6_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_2['ang_6_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_2['dist_7_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_2['ang_7_closest_o_bc'] = sorted_df_2.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)



# 3
sorted_df_3['dist_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_3['ang_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_3['dist_2_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_3['ang_2_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_3['dist_3_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_3['ang_3_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_3['dist_4_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_3['ang_4_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_3['dist_5_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_3['ang_5_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_3['dist_6_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_3['ang_6_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_3['dist_7_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_3['ang_7_closest_o_bc'] = sorted_df_3.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)


# 4
sorted_df_4['dist_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_4['ang_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_4['dist_2_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_4['ang_2_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_4['dist_3_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_4['ang_3_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_4['dist_4_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_4['ang_4_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_4['dist_5_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_4['ang_5_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_4['dist_6_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_4['ang_6_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_4['dist_7_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_4['ang_7_closest_o_bc'] = sorted_df_4.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)


# 5
sorted_df_5['dist_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_5['ang_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_5['dist_2_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_5['ang_2_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_5['dist_3_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_5['ang_3_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_5['dist_4_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_5['ang_4_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_5['dist_5_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_5['ang_5_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_5['dist_6_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_5['ang_6_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_5['dist_7_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_5['ang_7_closest_o_bc'] = sorted_df_5.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)



# 6
sorted_df_6['dist_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_6['ang_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_6['dist_2_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_6['ang_2_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_6['dist_3_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_6['ang_3_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_6['dist_4_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_6['ang_4_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_6['dist_5_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_6['ang_5_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_6['dist_6_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_6['ang_6_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_6['dist_7_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_6['ang_7_closest_o_bc'] = sorted_df_6.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)


# 7
sorted_df_7['dist_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_7['ang_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_7['dist_2_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_7['ang_2_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_7['dist_3_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_7['ang_3_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_7['dist_4_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_7['ang_4_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_7['dist_5_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_7['ang_5_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_7['dist_6_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_7['ang_6_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_7['dist_7_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_7['ang_7_closest_o_bc'] = sorted_df_7.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)



# 8
sorted_df_8['dist_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_8['ang_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_8['dist_2_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_8['ang_2_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_8['dist_3_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_8['ang_3_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_8['dist_4_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_8['ang_4_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_8['dist_5_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_8['ang_5_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_8['dist_6_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_8['ang_6_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_8['dist_7_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_8['ang_7_closest_o_bc'] = sorted_df_8.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)



# 9
sorted_df_9['dist_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)
sorted_df_9['ang_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_1', 'x_nb', 'y_no_1', 'y_nb'), axis=1)

sorted_df_9['dist_2_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)
sorted_df_9['ang_2_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_2', 'x_nb', 'y_no_2', 'y_nb'), axis=1)

sorted_df_9['dist_3_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)
sorted_df_9['ang_3_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_3', 'x_nb', 'y_no_3', 'y_nb'), axis=1)

sorted_df_9['dist_4_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)
sorted_df_9['ang_4_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_4', 'x_nb', 'y_no_4', 'y_nb'), axis=1)

sorted_df_9['dist_5_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)
sorted_df_9['ang_5_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_5', 'x_nb', 'y_no_5', 'y_nb'), axis=1)

sorted_df_9['dist_6_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)
sorted_df_9['ang_6_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_6', 'x_nb', 'y_no_6', 'y_nb'), axis=1)

sorted_df_9['dist_7_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_distance_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
sorted_df_9['ang_7_closest_o_bc'] = sorted_df_9.apply(lambda row: calculate_angle_2(row, 'x_no_7', 'x_nb', 'y_no_7', 'y_nb'), axis=1)
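
The nine blocks above differ only in the week's dataframe and the rank of the offensive player, so the same columns could be produced with a loop. The sketch below assumes the helpers keep the call signature used above, `(row, x_col_a, x_col_b, y_col_a, y_col_b)`. The names `add_offense_features`, `toy_distance`, and `toy_angle` are hypothetical, and the two toy helpers are stand-ins for illustration, not the notebook's `calculate_distance_2` and `calculate_angle_2`.

```python
import numpy as np
import pandas as pd


def add_offense_features(df, dist_fn, ang_fn, n_players=7):
    """Add dist_*/ang_* columns for the n closest offensive players, in place."""
    for k in range(1, n_players + 1):
        # Rank 1 has no number in its column name: 'dist_closest_o_bc'
        suffix = 'closest_o_bc' if k == 1 else f'{k}_closest_o_bc'
        x_col, y_col = f'x_no_{k}', f'y_no_{k}'
        # Columns are appended as (dist, ang) pairs, matching the order used above
        df[f'dist_{suffix}'] = df.apply(
            lambda row: dist_fn(row, x_col, 'x_nb', y_col, 'y_nb'), axis=1)
        df[f'ang_{suffix}'] = df.apply(
            lambda row: ang_fn(row, x_col, 'x_nb', y_col, 'y_nb'), axis=1)
    return df


# Stand-in helpers (assumptions, only here to make the sketch runnable)
def toy_distance(row, x1, x2, y1, y2):
    return np.hypot(row[x1] - row[x2], row[y1] - row[y2])


def toy_angle(row, x1, x2, y1, y2):
    return np.degrees(np.arctan2(row[y1] - row[y2], row[x1] - row[x2]))


toy = pd.DataFrame({'x_nb': [0.0], 'y_nb': [0.0],
                    'x_no_1': [3.0], 'y_no_1': [4.0],
                    'x_no_2': [6.0], 'y_no_2': [8.0]})
toy = add_offense_features(toy, toy_distance, toy_angle, n_players=2)
```

In the notebook this would be called once per week, for example by iterating over `sorted_df_1` through `sorted_df_9` and passing `calculate_distance_2` and `calculate_angle_2` as `dist_fn` and `ang_fn`.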

In [50]:
import gc

# Parentheses are needed for a multi-line `del`; without them only the names on the first line are deleted
del (handoffs_w4, handoffs_w5, handoffs_w3, tracking_week_5, handoffs_w8, handoffs_w2,
     tracking_week_4, tracking_week_3, tracking_week_1, tracking_week_8, handoffs_w1,
     tracking_week_2, handoffs_w7, tracking_week_7, handoffs_w6, tracking_week_6, handoffs_w9,
     tracking_week_9, t_w5_pos, t_w4_pos, t_w3_pos, t_w1_pos, t_w8_pos, t_w2_pos, handoffs_w_others_4,
     handoffs_w_others_5, t_w7_pos, handoffs_w_others_3, handoffs_w_others_8, t_w6_pos,
     handoffs_w_others_2, handoffs_w_others_1, handoffs_w_others_7, t_w9_pos, handoffs_w_others_6,
     handoffs_w_others_9, reduce_sorted_df_4, reduce_sorted_df_5, reduce_sorted_df_3,
     reduce_sorted_df_8, reduce_sorted_df_2, reduce_sorted_df_1, reduce_sorted_df_7,
     reduce_sorted_df_6, reduce_sorted_df_9, t_w5, t_w4, t_w3, t_w1, t_w8, t_w2, t_w7,
     t_w6, t_w9, reduce_sorted_df_1_4_4, reduce_sorted_df_1_4_5, reduce_sorted_df_1_4_3,
     reduce_sorted_df_1_4_8, reduce_sorted_df_1_4_2, reduce_sorted_df_1_4_1,
     reduce_sorted_df_1_4_7, reduce_sorted_df_1_4_6, reduce_sorted_df_1_4_9)

gc.collect()

0

In [51]:
del players_within_3_5_wks_7_9

gc.collect()

0

In [52]:
# For each defender, find the distance and angle of the two nearest offensive players,
# excluding the closest offensive player to the ball carrier (and the QB and ball carrier themselves)

def calculate_min_and_second_min_distances(df):
    # The last 14 columns are the (distance, angle) pairs for the 7 closest offensive players.
    # The first pair (closest offensive player) is skipped, which leaves the last 12 columns.
    distance_columns = df.columns[-12::2]  # Every second column starting from the 12th-to-last (distances)
    angle_columns = df.columns[-11::2]  # Every second column starting from the 11th-to-last (angles)

    # Calculate the minimum distance and its corresponding angle
    min_distance = df[distance_columns].min(axis=1)
    min_distance_idx = df[distance_columns].idxmin(axis=1)
    df['min_distance'] = min_distance

    # Map distance columns to their corresponding angle columns
    distance_angle_map = dict(zip(distance_columns, angle_columns))
    angle_for_min_distance = min_distance_idx.map(distance_angle_map)
    df['angle_for_min_distance'] = df.lookup(min_distance_idx.index, angle_for_min_distance)

    # Mask the minimum distance values and find the second minimum
    mask = df[distance_columns].eq(min_distance, axis=0)
    df['second_min_distance'] = df[distance_columns].mask(mask).min(axis=1)
    second_min_distance_idx = df[distance_columns].mask(mask).idxmin(axis=1)
    angle_for_second_min_distance = second_min_distance_idx.map(distance_angle_map)
    df['angle_for_second_min_distance'] = df.lookup(second_min_distance_idx.index, angle_for_second_min_distance)

    return df
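
`DataFrame.lookup`, used in the function above, was deprecated in pandas 1.2 and removed in pandas 2.0. A minimal sketch of one possible replacement follows: it sorts the distances with NumPy and picks the matching angles by position. The function name `min_and_second_min` and its explicit column arguments are hypothetical. One behavioural difference: when two players are tied for the minimum distance, this version reports the tied value as the second minimum, whereas the masking approach above skips to the next distinct value.

```python
import numpy as np
import pandas as pd


def min_and_second_min(df, distance_columns, angle_columns):
    """Return a copy of df with the two smallest distances and their angles.

    distance_columns[i] and angle_columns[i] must describe the same player,
    and at least two distance columns are required.
    """
    dist = df[list(distance_columns)].to_numpy(dtype=float)
    ang = df[list(angle_columns)].to_numpy(dtype=float)

    # Column positions sorted by distance within each row (NaN sorts last)
    order = np.argsort(dist, axis=1)
    rows = np.arange(len(df))

    out = df.copy()
    out['min_distance'] = dist[rows, order[:, 0]]
    out['angle_for_min_distance'] = ang[rows, order[:, 0]]
    out['second_min_distance'] = dist[rows, order[:, 1]]
    out['angle_for_second_min_distance'] = ang[rows, order[:, 1]]
    return out


# Toy data: three offensive players, two defender rows
toy = pd.DataFrame({
    'dist_2': [3.0, 1.0], 'ang_2': [30.0, 10.0],
    'dist_3': [2.0, 5.0], 'ang_3': [20.0, 50.0],
    'dist_4': [4.0, 2.0], 'ang_4': [40.0, 25.0],
})
dist_cols = ['dist_2', 'dist_3', 'dist_4']
ang_cols = ['ang_2', 'ang_3', 'ang_4']
result = min_and_second_min(toy, dist_cols, ang_cols)
```

With the notebook's column layout, `distance_columns` and `angle_columns` would be the same `df.columns[-12::2]` and `df.columns[-11::2]` slices used above.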

In [53]:
sorted_df_1 = calculate_min_and_second_min_distances(sorted_df_1)
sorted_df_2 = calculate_min_and_second_min_distances(sorted_df_2)
sorted_df_3 = calculate_min_and_second_min_distances(sorted_df_3)
sorted_df_4 = calculate_min_and_second_min_distances(sorted_df_4)
sorted_df_5 = calculate_min_and_second_min_distances(sorted_df_5)
sorted_df_6 = calculate_min_and_second_min_distances(sorted_df_6)
sorted_df_7 = calculate_min_and_second_min_distances(sorted_df_7)
sorted_df_8 = calculate_min_and_second_min_distances(sorted_df_8)
sorted_df_9 = calculate_min_and_second_min_distances(sorted_df_9)


In [54]:
sorted_df_1.head(5).T.head(30)

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| gameId | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 |
| playId | 101 | 101 | 101 | 101 | 101 |
| frameId_x | 19 | 19 | 19 | 19 | 19 |
| event_x | handoff | handoff | handoff | handoff | handoff |
| frameId_y | 45 | 45 | 45 | 45 | 45 |
| event_y | tackle | tackle | tackle | tackle | tackle |
| ballCarrierId | 47857 | 47857 | 47857 | 47857 | 47857 |
| ballCarrierDisplayName | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary |
| playDirection | left | left | left | left | left |
| nflId | 47857.0 | 43335.0 | 41239.0 | 47917.0 | 47857.0 |


In [55]:
# get the variables 'max_distance' (ball carrier distance to the end zone) and 'actual_dist_from_final' (ball carrier distance from end of play)

dropped_cols_filtered_df_final_1 = filtered_df_final_x_1[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_2 = filtered_df_final_x_2[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_3 = filtered_df_final_x_3[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_4 = filtered_df_final_x_4[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_5 = filtered_df_final_x_5[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_6 = filtered_df_final_x_6[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_7 = filtered_df_final_x_7[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_8 = filtered_df_final_x_8[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]
dropped_cols_filtered_df_final_9 = filtered_df_final_x_9[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final']]



all_players_1 = sorted_df_1.merge(dropped_cols_filtered_df_final_1, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_2 = sorted_df_2.merge(dropped_cols_filtered_df_final_2, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_3 = sorted_df_3.merge(dropped_cols_filtered_df_final_3, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_4 = sorted_df_4.merge(dropped_cols_filtered_df_final_4, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_5 = sorted_df_5.merge(dropped_cols_filtered_df_final_5, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_6 = sorted_df_6.merge(dropped_cols_filtered_df_final_6, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_7 = sorted_df_7.merge(dropped_cols_filtered_df_final_7, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_8 = sorted_df_8.merge(dropped_cols_filtered_df_final_8, on=['gameId', 'playId', 'frameId'], how='inner')
all_players_9 = sorted_df_9.merge(dropped_cols_filtered_df_final_9, on=['gameId', 'playId', 'frameId'], how='inner')
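
The merges above attach frame-level values (`max_distance`, `actual_dist_from_final`) to player-level rows, so each key on the right-hand side would normally be expected to appear once. That expectation is an assumption about the data, not something the notebook states. The sketch below uses toy data and the `validate` argument of `DataFrame.merge` to show how such an assumption could be checked; `players_toy` and `frames_toy` are hypothetical names.

```python
import pandas as pd

keys = ['gameId', 'playId', 'frameId']

# Toy player-level rows: two players in the same frame
players_toy = pd.DataFrame({'gameId': [1, 1], 'playId': [101, 101],
                            'frameId': [19, 19], 'nflId': [47857, 43335]})
# Toy frame-level rows: one row per (gameId, playId, frameId)
frames_toy = pd.DataFrame({'gameId': [1], 'playId': [101], 'frameId': [19],
                           'max_distance': [35.0],
                           'actual_dist_from_final': [4.2]})

# many_to_one: keys may repeat on the left but must be unique on the right
merged = players_toy.merge(frames_toy, on=keys, how='inner',
                           validate='many_to_one')

# Duplicate keys on the frame-level side raise instead of silently
# multiplying the player rows
dup_frames = pd.concat([frames_toy, frames_toy], ignore_index=True)
try:
    players_toy.merge(dup_frames, on=keys, how='inner', validate='many_to_one')
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True
```

If duplicates are legitimate in the real data, calling `drop_duplicates(subset=keys)` on the right-hand frame before merging is one way to keep the row count stable.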

In [56]:
# Reposition the data so that every play faces the same direction (plays going right are mirrored)

all_players_1['std_x_nb'] = np.where(all_players_1['playDirection'] == 'right', 120 - all_players_1['x_nb'] , all_players_1['x_nb'])
all_players_1['std_y_nb'] = np.where(all_players_1['playDirection'] == 'right', 53.3 - all_players_1['y_nb'] , all_players_1['y_nb'])
all_players_1['std_bc_x'] = np.where(all_players_1['playDirection'] == 'right', 120 - all_players_1['bc_x'] , all_players_1['bc_x'])
all_players_1['std_bc_y'] = np.where(all_players_1['playDirection'] == 'right', 53.3 - all_players_1['bc_y'] , all_players_1['bc_y'])

all_players_2['std_x_nb'] = np.where(all_players_2['playDirection'] == 'right', 120 - all_players_2['x_nb'] , all_players_2['x_nb'])
all_players_2['std_y_nb'] = np.where(all_players_2['playDirection'] == 'right', 53.3 - all_players_2['y_nb'] , all_players_2['y_nb'])
all_players_2['std_bc_x'] = np.where(all_players_2['playDirection'] == 'right', 120 - all_players_2['bc_x'] , all_players_2['bc_x'])
all_players_2['std_bc_y'] = np.where(all_players_2['playDirection'] == 'right', 53.3 - all_players_2['bc_y'] , all_players_2['bc_y'])

all_players_3['std_x_nb'] = np.where(all_players_3['playDirection'] == 'right', 120 - all_players_3['x_nb'] , all_players_3['x_nb'])
all_players_3['std_y_nb'] = np.where(all_players_3['playDirection'] == 'right', 53.3 - all_players_3['y_nb'] , all_players_3['y_nb'])
all_players_3['std_bc_x'] = np.where(all_players_3['playDirection'] == 'right', 120 - all_players_3['bc_x'] , all_players_3['bc_x'])
all_players_3['std_bc_y'] = np.where(all_players_3['playDirection'] == 'right', 53.3 - all_players_3['bc_y'] , all_players_3['bc_y'])

all_players_4['std_x_nb'] = np.where(all_players_4['playDirection'] == 'right', 120 - all_players_4['x_nb'] , all_players_4['x_nb'])
all_players_4['std_y_nb'] = np.where(all_players_4['playDirection'] == 'right', 53.3 - all_players_4['y_nb'] , all_players_4['y_nb'])
all_players_4['std_bc_x'] = np.where(all_players_4['playDirection'] == 'right', 120 - all_players_4['bc_x'] , all_players_4['bc_x'])
all_players_4['std_bc_y'] = np.where(all_players_4['playDirection'] == 'right', 53.3 - all_players_4['bc_y'] , all_players_4['bc_y'])

all_players_5['std_x_nb'] = np.where(all_players_5['playDirection'] == 'right', 120 - all_players_5['x_nb'] , all_players_5['x_nb'])
all_players_5['std_y_nb'] = np.where(all_players_5['playDirection'] == 'right', 53.3 - all_players_5['y_nb'] , all_players_5['y_nb'])
all_players_5['std_bc_x'] = np.where(all_players_5['playDirection'] == 'right', 120 - all_players_5['bc_x'] , all_players_5['bc_x'])
all_players_5['std_bc_y'] = np.where(all_players_5['playDirection'] == 'right', 53.3 - all_players_5['bc_y'] , all_players_5['bc_y'])


all_players_6['std_x_nb'] = np.where(all_players_6['playDirection'] == 'right', 120 - all_players_6['x_nb'] , all_players_6['x_nb'])
all_players_6['std_y_nb'] = np.where(all_players_6['playDirection'] == 'right', 53.3 - all_players_6['y_nb'] , all_players_6['y_nb'])
all_players_6['std_bc_x'] = np.where(all_players_6['playDirection'] == 'right', 120 - all_players_6['bc_x'] , all_players_6['bc_x'])
all_players_6['std_bc_y'] = np.where(all_players_6['playDirection'] == 'right', 53.3 - all_players_6['bc_y'] , all_players_6['bc_y'])


all_players_7['std_x_nb'] = np.where(all_players_7['playDirection'] == 'right', 120 - all_players_7['x_nb'] , all_players_7['x_nb'])
all_players_7['std_y_nb'] = np.where(all_players_7['playDirection'] == 'right', 53.3 - all_players_7['y_nb'] , all_players_7['y_nb'])
all_players_7['std_bc_x'] = np.where(all_players_7['playDirection'] == 'right', 120 - all_players_7['bc_x'] , all_players_7['bc_x'])
all_players_7['std_bc_y'] = np.where(all_players_7['playDirection'] == 'right', 53.3 - all_players_7['bc_y'] , all_players_7['bc_y'])

all_players_8['std_x_nb'] = np.where(all_players_8['playDirection'] == 'right', 120 - all_players_8['x_nb'] , all_players_8['x_nb'])
all_players_8['std_y_nb'] = np.where(all_players_8['playDirection'] == 'right', 53.3 - all_players_8['y_nb'] , all_players_8['y_nb'])
all_players_8['std_bc_x'] = np.where(all_players_8['playDirection'] == 'right', 120 - all_players_8['bc_x'] , all_players_8['bc_x'])
all_players_8['std_bc_y'] = np.where(all_players_8['playDirection'] == 'right', 53.3 - all_players_8['bc_y'] , all_players_8['bc_y'])

all_players_9['std_x_nb'] = np.where(all_players_9['playDirection'] == 'right', 120 - all_players_9['x_nb'] , all_players_9['x_nb'])
all_players_9['std_y_nb'] = np.where(all_players_9['playDirection'] == 'right', 53.3 - all_players_9['y_nb'] , all_players_9['y_nb'])
all_players_9['std_bc_x'] = np.where(all_players_9['playDirection'] == 'right', 120 - all_players_9['bc_x'] , all_players_9['bc_x'])
all_players_9['std_bc_y'] = np.where(all_players_9['playDirection'] == 'right', 53.3 - all_players_9['bc_y'] , all_players_9['bc_y'])

all_players_df = pd.concat([all_players_1, all_players_2, all_players_3, all_players_4, all_players_5,
                            all_players_6, all_players_7, all_players_8, all_players_9], axis=0)


all_players_sorted_1 = all_players_1.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_2 = all_players_2.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_3 = all_players_3.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_4 = all_players_4.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_5 = all_players_5.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_6 = all_players_6.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_7 = all_players_7.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_8 = all_players_8.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
all_players_sorted_9 = all_players_9.sort_values(by=['gameId', 'playId', 'frameId', 'rank'])
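
The repositioning above mirrors both axes for plays whose `playDirection` is `'right'` (x becomes 120 - x, y becomes 53.3 - y) and leaves `'left'` plays unchanged. The sketch below wraps the same `np.where` expressions in a function so they could be applied to each week in a loop. `standardize_direction` is a hypothetical name; the field dimensions are the constants used above.

```python
import numpy as np
import pandas as pd

FIELD_LENGTH = 120.0  # yards, including both end zones
FIELD_WIDTH = 53.3    # yards


def standardize_direction(df):
    """Return a copy of df with std_* coordinates mirrored for plays going right."""
    out = df.copy()
    going_right = out['playDirection'] == 'right'
    # Same column order as the cell above: std_x_nb, std_y_nb, std_bc_x, std_bc_y
    for col, extent in [('x_nb', FIELD_LENGTH), ('y_nb', FIELD_WIDTH),
                        ('bc_x', FIELD_LENGTH), ('bc_y', FIELD_WIDTH)]:
        out[f'std_{col}'] = np.where(going_right, extent - out[col], out[col])
    return out


toy = pd.DataFrame({'playDirection': ['right', 'left'],
                    'x_nb': [100.0, 100.0], 'y_nb': [10.0, 10.0],
                    'bc_x': [90.0, 90.0], 'bc_y': [20.0, 20.0]})
standardized = standardize_direction(toy)
```

For the row going right, the player at x = 100 ends up at x = 20, while the row going left keeps x = 100.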

**D) Shift data so attributes are columns by flattening and reshaping them**

This process consists of:
*   Pivot x-coordinates
*   Flatten and rename columns
*   Reset index to make 'gameId' and 'playId' columns again
*   Reorder columns to interleave x- and y-coordinates
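
The four steps can be sketched on a toy frame as follows. This is an illustration under assumptions, not the notebook's exact code: the `player{rank}_y` naming follows the flattening cell further down, while `player{rank}_x` is assumed by analogy and the toy values are made up.

```python
import pandas as pd

idx = ['gameId', 'playId', 'frameId']

# Long format: one row per player (rank) per frame
toy = pd.DataFrame({
    'gameId': [1, 1, 1, 1],
    'playId': [101, 101, 101, 101],
    'frameId': [19, 19, 20, 20],
    'rank': [1, 2, 1, 2],
    'std_x_nb': [40.0, 42.0, 40.5, 41.8],
    'std_y_nb': [25.0, 27.5, 25.4, 27.1],
})

# 1) Pivot: one row per frame, one column per rank
wide_x = toy.pivot(index=idx, columns='rank', values='std_x_nb')
wide_y = toy.pivot(index=idx, columns='rank', values='std_y_nb')

# 2) Flatten and rename the rank labels
ranks = list(wide_x.columns)
wide_x.columns = [f'player{r}_x' for r in ranks]
wide_y.columns = [f'player{r}_y' for r in ranks]

# 3) Combine and reset the index so the key columns are ordinary columns again
wide = pd.concat([wide_x, wide_y], axis=1).reset_index()

# 4) Interleave x and y per player: player1_x, player1_y, player2_x, ...
interleaved = [c for r in ranks for c in (f'player{r}_x', f'player{r}_y')]
wide = wide[idx + interleaved]
```

Note that `pivot` raises a `ValueError` when the same (index, rank) combination occurs more than once, so duplicates have to be removed before reshaping.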

In [57]:
# shift the data so the player attributes are columns

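# Pivot each attribute (y-coordinate, speed, acceleration, distance and angle features) so that every rank becomes its own column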
all_players_sorted_y_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')


all_players_sorted_dis_closest_o_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_1 = all_players_sorted_1.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')


all_players_sorted_y_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_2 = all_players_sorted_2.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')



all_players_sorted_y_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_3 = all_players_sorted_3.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')




all_players_sorted_y_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_4 = all_players_sorted_4.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')



all_players_sorted_y_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_5 = all_players_sorted_5.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')



all_players_sorted_y_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_6 = all_players_sorted_6.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')




all_players_sorted_y_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_7 = all_players_sorted_7.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')




all_players_sorted_y_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_8 = all_players_sorted_8.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')




all_players_sorted_y_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='std_y_nb')
all_players_sorted_s_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='s_nb')
all_players_sorted_dist_ball_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_to_ball')
all_players_sorted_ang_ball_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_to_ball')
all_players_sorted_a_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='a_nb')

all_players_sorted_dis_closest_o_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='dist_closest_o_bc')
all_players_sorted_ang_closest_o_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='ang_closest_o_bc')
all_players_sorted_min_dist_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='min_distance')
all_players_sorted_ang_min_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_min_distance')
all_players_sorted_2_min_dist_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='second_min_distance')
all_players_sorted_2_ang_min_9 = all_players_sorted_9.pivot(index=['gameId', 'playId', 'frameId'], columns='rank', values='angle_for_second_min_distance')


In [58]:
# Flatten the columns and rename them
all_players_sorted_y_1.columns = [f'player{rank}_y' for rank in all_players_sorted_y_1.columns]
all_players_sorted_s_1.columns = [f'player{rank}_s' for rank in all_players_sorted_s_1.columns]
all_players_sorted_dist_ball_1.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_1.columns]
all_players_sorted_ang_ball_1.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_1.columns]
all_players_sorted_a_1.columns = [f'player{rank}_a' for rank in all_players_sorted_a_1.columns]

all_players_sorted_dis_closest_o_1.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_1.columns]
all_players_sorted_ang_closest_o_1.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_1.columns]
all_players_sorted_min_dist_1.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_1.columns]
all_players_sorted_ang_min_1.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_1.columns]
all_players_sorted_2_min_dist_1.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_1.columns]
all_players_sorted_2_ang_min_1.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_1.columns]


# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_1.reset_index(inplace=True)
all_players_sorted_s_1.reset_index(inplace=True)
all_players_sorted_dist_ball_1.reset_index(inplace=True)
all_players_sorted_ang_ball_1.reset_index(inplace=True)
all_players_sorted_a_1.reset_index(inplace=True)

all_players_sorted_dis_closest_o_1.reset_index(inplace=True)
all_players_sorted_ang_closest_o_1.reset_index(inplace=True)
all_players_sorted_min_dist_1.reset_index(inplace=True)
all_players_sorted_ang_min_1.reset_index(inplace=True)
all_players_sorted_2_min_dist_1.reset_index(inplace=True)
all_players_sorted_2_ang_min_1.reset_index(inplace=True)


all_players_sorted_combined_1 = pd.merge(all_players_sorted_y_1, all_players_sorted_s_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_dist_ball_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_ang_ball_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_a_1, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_dis_closest_o_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_ang_closest_o_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_min_dist_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_ang_min_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_2_min_dist_1, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_1 = pd.merge(all_players_sorted_combined_1, all_players_sorted_2_ang_min_1, on=['gameId', 'playId', 'frameId'])


all_players_sorted_y_2.columns = [f'player{rank}_y' for rank in all_players_sorted_y_2.columns]
all_players_sorted_s_2.columns = [f'player{rank}_s' for rank in all_players_sorted_s_2.columns]
all_players_sorted_dist_ball_2.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_2.columns]
all_players_sorted_ang_ball_2.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_2.columns]
all_players_sorted_a_2.columns = [f'player{rank}_a' for rank in all_players_sorted_a_2.columns]

all_players_sorted_dis_closest_o_2.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_2.columns]
all_players_sorted_ang_closest_o_2.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_2.columns]
all_players_sorted_min_dist_2.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_2.columns]
all_players_sorted_ang_min_2.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_2.columns]
all_players_sorted_2_min_dist_2.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_2.columns]
all_players_sorted_2_ang_min_2.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_2.columns]


# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_2.reset_index(inplace=True)
all_players_sorted_s_2.reset_index(inplace=True)
all_players_sorted_dist_ball_2.reset_index(inplace=True)
all_players_sorted_ang_ball_2.reset_index(inplace=True)
all_players_sorted_a_2.reset_index(inplace=True)

all_players_sorted_dis_closest_o_2.reset_index(inplace=True)
all_players_sorted_ang_closest_o_2.reset_index(inplace=True)
all_players_sorted_min_dist_2.reset_index(inplace=True)
all_players_sorted_ang_min_2.reset_index(inplace=True)
all_players_sorted_2_min_dist_2.reset_index(inplace=True)
all_players_sorted_2_ang_min_2.reset_index(inplace=True)



all_players_sorted_combined_2 = pd.merge(all_players_sorted_y_2, all_players_sorted_s_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_dist_ball_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_ang_ball_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_a_2, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_dis_closest_o_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_ang_closest_o_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_min_dist_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_ang_min_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_2_min_dist_2, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_2 = pd.merge(all_players_sorted_combined_2, all_players_sorted_2_ang_min_2, on=['gameId', 'playId', 'frameId'])

all_players_sorted_y_3.columns = [f'player{rank}_y' for rank in all_players_sorted_y_3.columns]
all_players_sorted_s_3.columns = [f'player{rank}_s' for rank in all_players_sorted_s_3.columns]
all_players_sorted_dist_ball_3.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_3.columns]
all_players_sorted_ang_ball_3.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_3.columns]
all_players_sorted_a_3.columns = [f'player{rank}_a' for rank in all_players_sorted_a_3.columns]

all_players_sorted_dis_closest_o_3.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_3.columns]
all_players_sorted_ang_closest_o_3.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_3.columns]
all_players_sorted_min_dist_3.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_3.columns]
all_players_sorted_ang_min_3.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_3.columns]
all_players_sorted_2_min_dist_3.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_3.columns]
all_players_sorted_2_ang_min_3.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_3.columns]


# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_3.reset_index(inplace=True)
all_players_sorted_s_3.reset_index(inplace=True)
all_players_sorted_dist_ball_3.reset_index(inplace=True)
all_players_sorted_ang_ball_3.reset_index(inplace=True)
all_players_sorted_a_3.reset_index(inplace=True)

all_players_sorted_dis_closest_o_3.reset_index(inplace=True)
all_players_sorted_ang_closest_o_3.reset_index(inplace=True)
all_players_sorted_min_dist_3.reset_index(inplace=True)
all_players_sorted_ang_min_3.reset_index(inplace=True)
all_players_sorted_2_min_dist_3.reset_index(inplace=True)
all_players_sorted_2_ang_min_3.reset_index(inplace=True)


all_players_sorted_combined_3 = pd.merge(all_players_sorted_y_3, all_players_sorted_s_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_dist_ball_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_ang_ball_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_a_3, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_dis_closest_o_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_ang_closest_o_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_min_dist_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_ang_min_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_2_min_dist_3, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_3 = pd.merge(all_players_sorted_combined_3, all_players_sorted_2_ang_min_3, on=['gameId', 'playId', 'frameId'])


all_players_sorted_y_4.columns = [f'player{rank}_y' for rank in all_players_sorted_y_4.columns]
all_players_sorted_s_4.columns = [f'player{rank}_s' for rank in all_players_sorted_s_4.columns]
all_players_sorted_dist_ball_4.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_4.columns]
all_players_sorted_ang_ball_4.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_4.columns]
all_players_sorted_a_4.columns = [f'player{rank}_a' for rank in all_players_sorted_a_4.columns]

all_players_sorted_dis_closest_o_4.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_4.columns]
all_players_sorted_ang_closest_o_4.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_4.columns]
all_players_sorted_min_dist_4.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_4.columns]
all_players_sorted_ang_min_4.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_4.columns]
all_players_sorted_2_min_dist_4.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_4.columns]
all_players_sorted_2_ang_min_4.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_4.columns]


# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_4.reset_index(inplace=True)
all_players_sorted_s_4.reset_index(inplace=True)
all_players_sorted_dist_ball_4.reset_index(inplace=True)
all_players_sorted_ang_ball_4.reset_index(inplace=True)
all_players_sorted_a_4.reset_index(inplace=True)

all_players_sorted_dis_closest_o_4.reset_index(inplace=True)
all_players_sorted_ang_closest_o_4.reset_index(inplace=True)
all_players_sorted_min_dist_4.reset_index(inplace=True)
all_players_sorted_ang_min_4.reset_index(inplace=True)
all_players_sorted_2_min_dist_4.reset_index(inplace=True)
all_players_sorted_2_ang_min_4.reset_index(inplace=True)


all_players_sorted_combined_4 = pd.merge(all_players_sorted_y_4, all_players_sorted_s_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_dist_ball_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_ang_ball_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_a_4, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_dis_closest_o_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_ang_closest_o_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_min_dist_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_ang_min_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_2_min_dist_4, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_4 = pd.merge(all_players_sorted_combined_4, all_players_sorted_2_ang_min_4, on=['gameId', 'playId', 'frameId'])



all_players_sorted_y_5.columns = [f'player{rank}_y' for rank in all_players_sorted_y_5.columns]
all_players_sorted_s_5.columns = [f'player{rank}_s' for rank in all_players_sorted_s_5.columns]
all_players_sorted_dist_ball_5.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_5.columns]
all_players_sorted_ang_ball_5.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_5.columns]
all_players_sorted_a_5.columns = [f'player{rank}_a' for rank in all_players_sorted_a_5.columns]

all_players_sorted_dis_closest_o_5.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_5.columns]
all_players_sorted_ang_closest_o_5.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_5.columns]
all_players_sorted_min_dist_5.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_5.columns]
all_players_sorted_ang_min_5.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_5.columns]
all_players_sorted_2_min_dist_5.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_5.columns]
all_players_sorted_2_ang_min_5.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_5.columns]


# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_5.reset_index(inplace=True)
all_players_sorted_s_5.reset_index(inplace=True)
all_players_sorted_dist_ball_5.reset_index(inplace=True)
all_players_sorted_ang_ball_5.reset_index(inplace=True)
all_players_sorted_a_5.reset_index(inplace=True)

all_players_sorted_dis_closest_o_5.reset_index(inplace=True)
all_players_sorted_ang_closest_o_5.reset_index(inplace=True)
all_players_sorted_min_dist_5.reset_index(inplace=True)
all_players_sorted_ang_min_5.reset_index(inplace=True)
all_players_sorted_2_min_dist_5.reset_index(inplace=True)
all_players_sorted_2_ang_min_5.reset_index(inplace=True)




all_players_sorted_combined_5 = pd.merge(all_players_sorted_y_5, all_players_sorted_s_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_dist_ball_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_ang_ball_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_a_5, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_dis_closest_o_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_ang_closest_o_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_min_dist_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_ang_min_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_2_min_dist_5, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_5 = pd.merge(all_players_sorted_combined_5, all_players_sorted_2_ang_min_5, on=['gameId', 'playId', 'frameId'])



all_players_sorted_y_6.columns = [f'player{rank}_y' for rank in all_players_sorted_y_6.columns]
all_players_sorted_s_6.columns = [f'player{rank}_s' for rank in all_players_sorted_s_6.columns]
all_players_sorted_dist_ball_6.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_6.columns]
all_players_sorted_ang_ball_6.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_6.columns]
all_players_sorted_a_6.columns = [f'player{rank}_a' for rank in all_players_sorted_a_6.columns]

all_players_sorted_dis_closest_o_6.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_6.columns]
all_players_sorted_ang_closest_o_6.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_6.columns]
all_players_sorted_min_dist_6.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_6.columns]
all_players_sorted_ang_min_6.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_6.columns]
all_players_sorted_2_min_dist_6.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_6.columns]
all_players_sorted_2_ang_min_6.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_6.columns]

# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_6.reset_index(inplace=True)
all_players_sorted_s_6.reset_index(inplace=True)
all_players_sorted_dist_ball_6.reset_index(inplace=True)
all_players_sorted_ang_ball_6.reset_index(inplace=True)
all_players_sorted_a_6.reset_index(inplace=True)

all_players_sorted_dis_closest_o_6.reset_index(inplace=True)
all_players_sorted_ang_closest_o_6.reset_index(inplace=True)
all_players_sorted_min_dist_6.reset_index(inplace=True)
all_players_sorted_ang_min_6.reset_index(inplace=True)
all_players_sorted_2_min_dist_6.reset_index(inplace=True)
all_players_sorted_2_ang_min_6.reset_index(inplace=True)


all_players_sorted_combined_6 = pd.merge(all_players_sorted_y_6, all_players_sorted_s_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_dist_ball_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_ang_ball_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_a_6, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_dis_closest_o_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_ang_closest_o_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_min_dist_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_ang_min_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_2_min_dist_6, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_6 = pd.merge(all_players_sorted_combined_6, all_players_sorted_2_ang_min_6, on=['gameId', 'playId', 'frameId'])


all_players_sorted_y_7.columns = [f'player{rank}_y' for rank in all_players_sorted_y_7.columns]
all_players_sorted_s_7.columns = [f'player{rank}_s' for rank in all_players_sorted_s_7.columns]
all_players_sorted_dist_ball_7.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_7.columns]
all_players_sorted_ang_ball_7.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_7.columns]
all_players_sorted_a_7.columns = [f'player{rank}_a' for rank in all_players_sorted_a_7.columns]

all_players_sorted_dis_closest_o_7.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_7.columns]
all_players_sorted_ang_closest_o_7.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_7.columns]
all_players_sorted_min_dist_7.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_7.columns]
all_players_sorted_ang_min_7.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_7.columns]
all_players_sorted_2_min_dist_7.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_7.columns]
all_players_sorted_2_ang_min_7.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_7.columns]



# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_7.reset_index(inplace=True)
all_players_sorted_s_7.reset_index(inplace=True)
all_players_sorted_dist_ball_7.reset_index(inplace=True)
all_players_sorted_ang_ball_7.reset_index(inplace=True)
all_players_sorted_a_7.reset_index(inplace=True)

all_players_sorted_dis_closest_o_7.reset_index(inplace=True)
all_players_sorted_ang_closest_o_7.reset_index(inplace=True)
all_players_sorted_min_dist_7.reset_index(inplace=True)
all_players_sorted_ang_min_7.reset_index(inplace=True)
all_players_sorted_2_min_dist_7.reset_index(inplace=True)
all_players_sorted_2_ang_min_7.reset_index(inplace=True)



all_players_sorted_combined_7 = pd.merge(all_players_sorted_y_7, all_players_sorted_s_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_dist_ball_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_ang_ball_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_a_7, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_dis_closest_o_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_ang_closest_o_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_min_dist_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_ang_min_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_2_min_dist_7, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_7 = pd.merge(all_players_sorted_combined_7, all_players_sorted_2_ang_min_7, on=['gameId', 'playId', 'frameId'])


all_players_sorted_y_8.columns = [f'player{rank}_y' for rank in all_players_sorted_y_8.columns]
all_players_sorted_s_8.columns = [f'player{rank}_s' for rank in all_players_sorted_s_8.columns]
all_players_sorted_dist_ball_8.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_8.columns]
all_players_sorted_ang_ball_8.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_8.columns]
all_players_sorted_a_8.columns = [f'player{rank}_a' for rank in all_players_sorted_a_8.columns]

all_players_sorted_dis_closest_o_8.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_8.columns]
all_players_sorted_ang_closest_o_8.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_8.columns]
all_players_sorted_min_dist_8.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_8.columns]
all_players_sorted_ang_min_8.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_8.columns]
all_players_sorted_2_min_dist_8.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_8.columns]
all_players_sorted_2_ang_min_8.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_8.columns]

# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_8.reset_index(inplace=True)
all_players_sorted_s_8.reset_index(inplace=True)
all_players_sorted_dist_ball_8.reset_index(inplace=True)
all_players_sorted_ang_ball_8.reset_index(inplace=True)
all_players_sorted_a_8.reset_index(inplace=True)

all_players_sorted_dis_closest_o_8.reset_index(inplace=True)
all_players_sorted_ang_closest_o_8.reset_index(inplace=True)
all_players_sorted_min_dist_8.reset_index(inplace=True)
all_players_sorted_ang_min_8.reset_index(inplace=True)
all_players_sorted_2_min_dist_8.reset_index(inplace=True)
all_players_sorted_2_ang_min_8.reset_index(inplace=True)



all_players_sorted_combined_8 = pd.merge(all_players_sorted_y_8, all_players_sorted_s_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_dist_ball_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_ang_ball_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_a_8, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_dis_closest_o_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_ang_closest_o_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_min_dist_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_ang_min_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_2_min_dist_8, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_8 = pd.merge(all_players_sorted_combined_8, all_players_sorted_2_ang_min_8, on=['gameId', 'playId', 'frameId'])


all_players_sorted_y_9.columns = [f'player{rank}_y' for rank in all_players_sorted_y_9.columns]
all_players_sorted_s_9.columns = [f'player{rank}_s' for rank in all_players_sorted_s_9.columns]
all_players_sorted_dist_ball_9.columns = [f'player{rank}_dist_ball' for rank in all_players_sorted_dist_ball_9.columns]
all_players_sorted_ang_ball_9.columns = [f'player{rank}_ang_ball' for rank in all_players_sorted_ang_ball_9.columns]
all_players_sorted_a_9.columns = [f'player{rank}_a' for rank in all_players_sorted_a_9.columns]

all_players_sorted_dis_closest_o_9.columns = [f'player{rank}_dis_closest_o' for rank in all_players_sorted_dis_closest_o_9.columns]
all_players_sorted_ang_closest_o_9.columns = [f'player{rank}_ang_closest_o' for rank in all_players_sorted_ang_closest_o_9.columns]
all_players_sorted_min_dist_9.columns = [f'player{rank}_min_dist' for rank in all_players_sorted_min_dist_9.columns]
all_players_sorted_ang_min_9.columns = [f'player{rank}_ang_min' for rank in all_players_sorted_ang_min_9.columns]
all_players_sorted_2_min_dist_9.columns = [f'player{rank}_2_min_dist' for rank in all_players_sorted_2_min_dist_9.columns]
all_players_sorted_2_ang_min_9.columns = [f'player{rank}_2_ang_min' for rank in all_players_sorted_2_ang_min_9.columns]


# Reset the index to make 'gameId', 'playId', and 'frameId' columns again
all_players_sorted_y_9.reset_index(inplace=True)
all_players_sorted_s_9.reset_index(inplace=True)
all_players_sorted_dist_ball_9.reset_index(inplace=True)
all_players_sorted_ang_ball_9.reset_index(inplace=True)
all_players_sorted_a_9.reset_index(inplace=True)

all_players_sorted_dis_closest_o_9.reset_index(inplace=True)
all_players_sorted_ang_closest_o_9.reset_index(inplace=True)
all_players_sorted_min_dist_9.reset_index(inplace=True)
all_players_sorted_ang_min_9.reset_index(inplace=True)
all_players_sorted_2_min_dist_9.reset_index(inplace=True)
all_players_sorted_2_ang_min_9.reset_index(inplace=True)


all_players_sorted_combined_9 = pd.merge(all_players_sorted_y_9, all_players_sorted_s_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_dist_ball_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_ang_ball_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_a_9, on=['gameId', 'playId', 'frameId'])

all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_dis_closest_o_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_ang_closest_o_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_min_dist_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_ang_min_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_2_min_dist_9, on=['gameId', 'playId', 'frameId'])
all_players_sorted_combined_9 = pd.merge(all_players_sorted_combined_9, all_players_sorted_2_ang_min_9, on=['gameId', 'playId', 'frameId'])
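
The pivot, rename, reset-index, and merge steps above are repeated by hand for each of the nine weeks. As a rough sketch only, the same pattern could be wrapped in a helper and called once per week. The names `widen_ranked_players`, `KEYS`, `FEATURES`, and the toy dataframe are hypothetical and not part of the notebook; the source column names and suffixes are copied from the cells above. The sketch assumes each (gameId, playId, frameId, rank) combination appears at most once, since `pivot` raises an error on duplicates.

```python
from functools import reduce

import pandas as pd

KEYS = ['gameId', 'playId', 'frameId']

# Source column -> suffix of the wide column name (names taken from the cells above).
# The dict order also sets the per-player column order in the result.
FEATURES = {
    'std_y_nb': 'y',
    's_nb': 's',
    'dist_to_ball': 'dist_ball',
    'ang_to_ball': 'ang_ball',
    'dist_closest_o_bc': 'dis_closest_o',
    'ang_closest_o_bc': 'ang_closest_o',
    'min_distance': 'min_dist',
    'angle_for_min_distance': 'ang_min',
    'second_min_distance': '2_min_dist',
    'angle_for_second_min_distance': '2_ang_min',
    'a_nb': 'a',
}


def widen_ranked_players(df, features=FEATURES, keys=KEYS):
    """Hypothetical helper: one row per frame, one column per (rank, feature)."""
    wide_frames = []
    for source_col, suffix in features.items():
        # One column per rank for this feature
        wide = df.pivot(index=keys, columns='rank', values=source_col)
        # Flatten the rank labels into names such as 'player1_y'
        wide.columns = [f'player{rank}_{suffix}' for rank in wide.columns]
        # Turn the key index back into ordinary columns so the frames can be merged
        wide_frames.append(wide.reset_index())

    # Merge all per-feature frames on the shared keys
    combined = reduce(lambda left, right: pd.merge(left, right, on=keys), wide_frames)

    # Group the columns player by player instead of feature by feature
    ranks = sorted(df['rank'].unique())
    ordered = [f'player{rank}_{suffix}' for rank in ranks for suffix in features.values()]
    return combined[keys + ordered]


# Made-up toy data: one play, two frames, two ranked players, two features
toy = pd.DataFrame({
    'gameId': [1, 1, 1, 1],
    'playId': [10, 10, 10, 10],
    'frameId': [1, 1, 2, 2],
    'rank': [1, 2, 1, 2],
    'std_y_nb': [20.0, 25.5, 20.5, 25.0],
    's_nb': [3.1, 4.2, 3.3, 4.0],
})
toy_wide = widen_ranked_players(toy, features={'std_y_nb': 'y', 's_nb': 's'})
print(toy_wide.columns.tolist())
```

With a helper like this, each week would reduce to a single call such as `widen_ranked_players(all_players_sorted_1)`, and every week would be guaranteed the same column set and order. Whether that is preferable to the explicit per-week code is a judgment call; the explicit version is easier to inspect one intermediate dataframe at a time.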

In [59]:
# Reorder columns so that each player's features sit next to each other (interleaved by rank)
column_order_1 = [col for octet in zip(all_players_sorted_y_1.columns[3:],
                                       all_players_sorted_s_1.columns[3:],
                                      #  all_players_sorted_dis_1.columns[3:],
                                      #  all_players_sorted_o_1.columns[3:],
                                       all_players_sorted_dist_ball_1.columns[3:],
                                       all_players_sorted_ang_ball_1.columns[3:],
                                       all_players_sorted_dis_closest_o_1.columns[3:],
                                       all_players_sorted_ang_closest_o_1.columns[3:],
                                       all_players_sorted_min_dist_1.columns[3:],
                                       all_players_sorted_ang_min_1.columns[3:],
                                       all_players_sorted_2_min_dist_1.columns[3:],
                                       all_players_sorted_2_ang_min_1.columns[3:],
                                      #  all_players_sorted_dir_1.columns[3:],
                                       all_players_sorted_a_1.columns[3:])
                  for col in octet]


all_players_sorted_final_1 = all_players_sorted_combined_1[['gameId', 'playId', 'frameId'] + column_order_1]


column_order_2 = [col for octet in zip(all_players_sorted_y_2.columns[3:],
                                       all_players_sorted_s_2.columns[3:],
                                      #  all_players_sorted_dis_2.columns[3:],
                                      #  all_players_sorted_o_2.columns[3:],
                                       all_players_sorted_dist_ball_2.columns[3:],
                                       all_players_sorted_ang_ball_2.columns[3:],
                                       all_players_sorted_dis_closest_o_2.columns[3:],
                                       all_players_sorted_ang_closest_o_2.columns[3:],
                                       all_players_sorted_min_dist_2.columns[3:],
                                       all_players_sorted_ang_min_2.columns[3:],
                                       all_players_sorted_2_min_dist_2.columns[3:],
                                       all_players_sorted_2_ang_min_2.columns[3:],
                                      #  all_players_sorted_dir_2.columns[3:],
                                       all_players_sorted_a_2.columns[3:])
                  for col in octet]

all_players_sorted_final_2 = all_players_sorted_combined_2[['gameId', 'playId', 'frameId'] + column_order_2]



column_order_3 = [col for octet in zip(all_players_sorted_y_3.columns[3:],
                                       all_players_sorted_s_3.columns[3:],
                                      #  all_players_sorted_dis_3.columns[3:],
                                      #  all_players_sorted_o_3.columns[3:],
                                       all_players_sorted_dist_ball_3.columns[3:],
                                       all_players_sorted_ang_ball_3.columns[3:],
                                       all_players_sorted_dis_closest_o_3.columns[3:],
                                       all_players_sorted_ang_closest_o_3.columns[3:],
                                       all_players_sorted_min_dist_3.columns[3:],
                                       all_players_sorted_ang_min_3.columns[3:],
                                       all_players_sorted_2_min_dist_3.columns[3:],
                                       all_players_sorted_2_ang_min_3.columns[3:],
                                      #  all_players_sorted_dir_3.columns[3:],
                                       all_players_sorted_a_3.columns[3:])
                  for col in octet]


all_players_sorted_final_3 = all_players_sorted_combined_3[['gameId', 'playId', 'frameId'] + column_order_3]



column_order_4 = [col for octet in zip(all_players_sorted_y_4.columns[3:],
                                       all_players_sorted_s_4.columns[3:],
                                      #  all_players_sorted_dis_4.columns[3:],
                                      #  all_players_sorted_o_4.columns[3:],
                                       all_players_sorted_dist_ball_4.columns[3:],
                                       all_players_sorted_ang_ball_4.columns[3:],
                                       all_players_sorted_dis_closest_o_4.columns[3:],
                                       all_players_sorted_ang_closest_o_4.columns[3:],
                                       all_players_sorted_min_dist_4.columns[3:],
                                       all_players_sorted_ang_min_4.columns[3:],
                                       all_players_sorted_2_min_dist_4.columns[3:],
                                       all_players_sorted_2_ang_min_4.columns[3:],
                                      #  all_players_sorted_dir_4.columns[3:],
                                       all_players_sorted_a_4.columns[3:])
                  for col in octet]

all_players_sorted_final_4 = all_players_sorted_combined_4[['gameId', 'playId', 'frameId'] + column_order_4]



column_order_5 = [col for octet in zip(all_players_sorted_y_5.columns[3:],
                                       all_players_sorted_s_5.columns[3:],
                                      #  all_players_sorted_dis_5.columns[3:],
                                      #  all_players_sorted_o_5.columns[3:],
                                       all_players_sorted_dist_ball_5.columns[3:],
                                       all_players_sorted_ang_ball_5.columns[3:],
                                       all_players_sorted_dis_closest_o_5.columns[3:],
                                       all_players_sorted_ang_closest_o_5.columns[3:],
                                       all_players_sorted_min_dist_5.columns[3:],
                                       all_players_sorted_ang_min_5.columns[3:],
                                       all_players_sorted_2_min_dist_5.columns[3:],
                                       all_players_sorted_2_ang_min_5.columns[3:],
                                      #  all_players_sorted_dir_5.columns[3:],
                                       all_players_sorted_a_5.columns[3:])
                  for col in octet]

all_players_sorted_final_5 = all_players_sorted_combined_5[['gameId', 'playId', 'frameId'] + column_order_5]




column_order_6 = [col for octet in zip(all_players_sorted_y_6.columns[3:],
                                       all_players_sorted_s_6.columns[3:],
                                      #  all_players_sorted_dis_6.columns[3:],
                                      #  all_players_sorted_o_6.columns[3:],
                                       all_players_sorted_dist_ball_6.columns[3:],
                                       all_players_sorted_ang_ball_6.columns[3:],
                                       all_players_sorted_dis_closest_o_6.columns[3:],
                                       all_players_sorted_ang_closest_o_6.columns[3:],
                                       all_players_sorted_min_dist_6.columns[3:],
                                       all_players_sorted_ang_min_6.columns[3:],
                                       all_players_sorted_2_min_dist_6.columns[3:],
                                       all_players_sorted_2_ang_min_6.columns[3:],
                                      #  all_players_sorted_dir_6.columns[3:],
                                       all_players_sorted_a_6.columns[3:])
                  for col in octet]


all_players_sorted_final_6 = all_players_sorted_combined_6[['gameId', 'playId', 'frameId'] + column_order_6]



column_order_7 = [col for octet in zip(all_players_sorted_y_7.columns[3:],
                                       all_players_sorted_s_7.columns[3:],
                                      #  all_players_sorted_dis_7.columns[3:],
                                      #  all_players_sorted_o_7.columns[3:],
                                       all_players_sorted_dist_ball_7.columns[3:],
                                       all_players_sorted_ang_ball_7.columns[3:],
                                       all_players_sorted_dis_closest_o_7.columns[3:],
                                       all_players_sorted_ang_closest_o_7.columns[3:],
                                       all_players_sorted_min_dist_7.columns[3:],
                                       all_players_sorted_ang_min_7.columns[3:],
                                       all_players_sorted_2_min_dist_7.columns[3:],
                                       all_players_sorted_2_ang_min_7.columns[3:],
                                      #  all_players_sorted_dir_7.columns[3:],
                                       all_players_sorted_a_7.columns[3:])
                  for col in octet]

all_players_sorted_final_7 = all_players_sorted_combined_7[['gameId', 'playId', 'frameId'] + column_order_7]


column_order_8 = [col for octet in zip(all_players_sorted_y_8.columns[3:],
                                       all_players_sorted_s_8.columns[3:],
                                      #  all_players_sorted_dis_8.columns[3:],
                                      #  all_players_sorted_o_8.columns[3:],
                                       all_players_sorted_dist_ball_8.columns[3:],
                                       all_players_sorted_ang_ball_8.columns[3:],
                                       all_players_sorted_dis_closest_o_8.columns[3:],
                                       all_players_sorted_ang_closest_o_8.columns[3:],
                                       all_players_sorted_min_dist_8.columns[3:],
                                       all_players_sorted_ang_min_8.columns[3:],
                                       all_players_sorted_2_min_dist_8.columns[3:],
                                       all_players_sorted_2_ang_min_8.columns[3:],
                                      #  all_players_sorted_dir_8.columns[3:],
                                       all_players_sorted_a_8.columns[3:])
                  for col in octet]

all_players_sorted_final_8 = all_players_sorted_combined_8[['gameId', 'playId', 'frameId'] + column_order_8]



column_order_9 = [col for octet in zip(all_players_sorted_y_9.columns[3:],
                                       all_players_sorted_s_9.columns[3:],
                                      #  all_players_sorted_dis_9.columns[3:],
                                      #  all_players_sorted_o_9.columns[3:],
                                       all_players_sorted_dist_ball_9.columns[3:],
                                       all_players_sorted_ang_ball_9.columns[3:],
                                       all_players_sorted_dis_closest_o_9.columns[3:],
                                       all_players_sorted_ang_closest_o_9.columns[3:],
                                       all_players_sorted_min_dist_9.columns[3:],
                                       all_players_sorted_ang_min_9.columns[3:],
                                       all_players_sorted_2_min_dist_9.columns[3:],
                                       all_players_sorted_2_ang_min_9.columns[3:],
                                      #  all_players_sorted_dir_9.columns[3:],
                                       all_players_sorted_a_9.columns[3:])
                  for col in octet]


all_players_sorted_final_9 = all_players_sorted_combined_9[['gameId', 'playId', 'frameId'] + column_order_9]
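The nine near-identical `column_order_N` blocks above all apply the same interleaving rule, so the rule can be captured once. This is a sketch only: `interleave_columns` is a hypothetical helper, and the empty toy frames simply borrow the `playerK.0_<attr>` naming seen in the outputs of this notebook.

```python
import pandas as pd

KEYS = ['gameId', 'playId', 'frameId']


def interleave_columns(frames, n_keys=3):
    """Return column names grouped player by player:
    every attribute of player 1, then every attribute of player 2, and so on."""
    per_frame = (f.columns[n_keys:] for f in frames)
    return [col for group in zip(*per_frame) for col in group]


# Empty toy frames: only the column names matter for the ordering
y = pd.DataFrame(columns=KEYS + ['player1.0_y', 'player2.0_y'])
s = pd.DataFrame(columns=KEYS + ['player1.0_s', 'player2.0_s'])
a = pd.DataFrame(columns=KEYS + ['player1.0_a', 'player2.0_a'])

order = interleave_columns([y, s, a])
print(order)
```

Note that `zip` stops at the shortest input, so if one attribute frame has fewer player columns than the others, the trailing players are silently left out of the ordering.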

**E) drop duplicates and save 'df_all' and 'all_players_sorted_df_(1-9)'**

In [60]:
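# Drop columns whose values duplicate an earlier column:
# transpose, drop duplicate rows, transpose back
all_players_sorted_final_1 = all_players_sorted_final_1.T.drop_duplicates().T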
all_players_sorted_final_2 = all_players_sorted_final_2.T.drop_duplicates().T
all_players_sorted_final_3 = all_players_sorted_final_3.T.drop_duplicates().T

all_players_sorted_final_4 = all_players_sorted_final_4.T.drop_duplicates().T
all_players_sorted_final_5 = all_players_sorted_final_5.T.drop_duplicates().T
all_players_sorted_final_6 = all_players_sorted_final_6.T.drop_duplicates().T

all_players_sorted_final_7 = all_players_sorted_final_7.T.drop_duplicates().T
all_players_sorted_final_8 = all_players_sorted_final_8.T.drop_duplicates().T
all_players_sorted_final_9 = all_players_sorted_final_9.T.drop_duplicates().T

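# Attach 'max_distance', 'actual_dist_from_final' (the model target) and 'frame_since_bc' to the reshaped features
df_1 = all_players_1[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_1, on=['gameId', 'playId', 'frameId'], how='inner')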
df_2 = all_players_2[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_2, on=['gameId', 'playId', 'frameId'], how='inner')
df_3 = all_players_3[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_3, on=['gameId', 'playId', 'frameId'], how='inner')

df_4 = all_players_4[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_4, on=['gameId', 'playId', 'frameId'], how='inner')
df_5 = all_players_5[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_5, on=['gameId', 'playId', 'frameId'], how='inner')
df_6 = all_players_6[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_6, on=['gameId', 'playId', 'frameId'], how='inner')

df_7 = all_players_7[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_7, on=['gameId', 'playId', 'frameId'], how='inner')
df_8 = all_players_8[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_8, on=['gameId', 'playId', 'frameId'], how='inner')
df_9 = all_players_9[['gameId', 'playId', 'frameId', 'max_distance', 'actual_dist_from_final', 'frame_since_bc']].merge(all_players_sorted_final_9, on=['gameId', 'playId', 'frameId'], how='inner')
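The `.T.drop_duplicates().T` idiom used in this cell removes columns whose values repeat an earlier column, regardless of the column name. The toy example below (made-up values) illustrates this and shows one possible alternative, a boolean mask built from `DataFrame.duplicated`, that keeps the original dtypes. It is a sketch and not what the notebook itself runs.

```python
import pandas as pd

df = pd.DataFrame({
    'gameId': [1, 1, 1],
    'frameId': [1, 2, 3],
    'player1.0_y': [32.0, 32.5, 33.0],
    'player1.0_y_copy': [32.0, 32.5, 33.0],  # same values under a different name
})

# Idiom from the notebook: transpose, drop duplicate rows, transpose back
via_transpose = df.T.drop_duplicates().T

# Alternative: flag duplicated columns and select the rest without transposing the data back
via_mask = df.loc[:, ~df.T.duplicated()]

print(via_transpose.dtypes)
print(via_mask.dtypes)
```

Both variants keep the same three columns. The difference is in the dtypes: transposing a frame with mixed integer and float columns upcasts everything to a common type, so the integer key columns come back as floats in `via_transpose`, while `via_mask` leaves them as integers.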

In [61]:
# Large files saved for XGBoost and SHAP

# all_players_sorted_1.to_pickle('all_players_sorted_1.pkl')
# all_players_sorted_2.to_pickle('all_players_sorted_2.pkl')
# all_players_sorted_3.to_pickle('all_players_sorted_3.pkl')
# all_players_sorted_4.to_pickle('all_players_sorted_4.pkl')
# all_players_sorted_5.to_pickle('all_players_sorted_5.pkl')
# all_players_sorted_6.to_pickle('all_players_sorted_6.pkl')
# all_players_sorted_7.to_pickle('all_players_sorted_7.pkl')
# all_players_sorted_8.to_pickle('all_players_sorted_8.pkl')
# all_players_sorted_9.to_pickle('all_players_sorted_9.pkl')

In [70]:
print(f"Week 1 'all_players_sorted_1' has {all_players_sorted_1.shape[0]} rows and {all_players_sorted_1.shape[1]} columns")
all_players_sorted_1.head(5).T.head(27)

Week 1 'all_players_sorted_1' has 86256 rows and 67 columns


| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| gameId | 2022090800 | 2022090800 | 2022090800 | 2022090800 | 2022090800 |
| playId | 101 | 101 | 101 | 101 | 101 |
| frameId_x | 19 | 19 | 19 | 19 | 19 |
| event_x | handoff | handoff | handoff | handoff | handoff |
| frameId_y | 45 | 45 | 45 | 45 | 45 |
| event_y | tackle | tackle | tackle | tackle | tackle |
| ballCarrierId | 47857 | 47857 | 47857 | 47857 | 47857 |
| ballCarrierDisplayName | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary | Devin Singletary |
| playDirection | left | left | left | left | left |
| nflId | 47857.0 | 43335.0 | 41239.0 | 47917.0 | 47857.0 |


In [68]:
all_players_sorted_1.head(5).T.tail(40)

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| frame_since_bc | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| rank | 1.0 | 2.0 | 3.0 | 4.0 | 1.0 |
| x_no_1 | 73.54 | 73.54 | 73.54 | 73.54 | 73.32 |
| y_no_1 | 34.19 | 34.19 | 34.19 | 34.19 | 34.9 |
| x_no_2 | 72.52 | 72.52 | 72.52 | 72.52 | 72.46 |
| y_no_2 | 33.21 | 33.21 | 33.21 | 33.21 | 33.51 |
| x_no_3 | 71.85 | 71.85 | 71.85 | 71.85 | 71.62 |
| y_no_3 | 34.01 | 34.01 | 34.01 | 34.01 | 34.29 |
| x_no_4 | 71.66 | 71.66 | 71.66 | 71.66 | 73.15 |
| y_no_4 | 30.42 | 30.42 | 30.42 | 30.42 | 37.37 |


In [62]:
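# Stack the nine weekly frames; pd.concat aligns columns by name
df_all = pd.concat([df_1, df_2, df_3, df_4, df_5, df_6, df_7, df_8, df_9], axis=0)
# player1.0 appears to be the ball carrier (rank 1), so its distance and angle to the ball carry no information
df_all = df_all.drop(['player1.0_dist_ball', 'player1.0_ang_ball'], axis=1)
# Remove fully duplicated rows and rebuild a clean 0..n-1 index
df_all = df_all.drop_duplicates()
df_all = df_all.reset_index(drop=True)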
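When weekly frames are stacked with `pd.concat`, columns are matched by name rather than by position, so a frame whose columns come in a different order still lines up correctly. The toy frames below (hypothetical values) sketch the concat, de-duplicate, and re-index sequence on a small scale.

```python
import pandas as pd

week_a = pd.DataFrame({'gameId': [1], 'player1.0_y': [32.0], 'player1.0_s': [4.9]})
# Same columns as week_a, listed in a different order
week_b = pd.DataFrame({'gameId': [2], 'player1.0_s': [5.1], 'player1.0_y': [30.0]})

stacked = (
    pd.concat([week_a, week_b], axis=0)  # align by column name, stack rows
    .drop_duplicates()                   # remove fully identical rows
    .reset_index(drop=True)              # rebuild a clean 0..n-1 index
)
print(stacked)
```

The result takes its column order from the first frame, and the values of `week_b` end up under the matching names.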

In [63]:
df_all['max_distance'] = df_all['max_distance'].astype('float64')
df_all['actual_dist_from_final'] = df_all['actual_dist_from_final'].astype('float64')

In [64]:
# Large file saved for XGBoost and SHAP

# df_all.to_pickle('df_all.pkl')

In [81]:
print(f"'df_all' for all weeks has {df_all.shape[0]} rows and {df_all.shape[1]} columns\n")

print("First two variables are for future reference and will be removed\n")
print("'actual_dist_from_final' is used as target variable for model\n")
df_all.head(5).T

'df_all' for all weeks has 194924 rows and 48 columns

First two variables are for future reference and will be removed

'actual_dist_from_final' is used as target variable for model



| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| gameId | 2022090800.0 | 2022090800.0 | 2022090800.0 | 2022090800.0 | 2022090800.0 |
| playId | 101.0 | 101.0 | 101.0 | 101.0 | 101.0 |
| frameId | 19.0 | 20.0 | 21.0 | 22.0 | 23.0 |
| max_distance | 69.16 | 69.01 | 68.82 | 68.6 | 68.34 |
| actual_dist_from_final | 12.47 | 12.32 | 12.13 | 11.91 | 11.65 |
| frame_since_bc | 0.0 | 1.0 | 2.0 | 3.0 | 4.0 |
| player1.0_y | 32.03 | 32.51 | 33.01 | 33.53 | 34.05 |
| player1.0_s | 4.89 | 5.16 | 5.41 | 5.73 | 6.01 |
| player1.0_dis_closest_o | 6.020797 | 6.171564 | 6.287329 | 6.326642 | 6.13 |
| player1.0_ang_closest_o | 158.976154 | 157.215882 | 155.473084 | 174.739978 | 176.726846 |
