Link to the Kaggle competition and my notebook :D

https://www.kaggle.com/competitions/nfl-big-data-bowl-2022

https://www.kaggle.com/code/tianmin/how-to-improve-the-valid-metric-kick-length


# Table of contents

* [Introduction](#intro)
* [Kickoff as a special team play](#kickoff)
* [Outcomes of a kickoff](#outcome)
* [Gained yards by kicking teams](#gainedyards)
* [Kick the ball far, if you want to gain more yards](#far)
* [Initial analysis - How to kick the ball far](#initial)
* [Variable 1 - Foot speed](#speed)
* [Variable 2 - Body orientation](#orientation)
* [Conclusion and future research](#conclusion)
* [Reference](#reference)

# Introduction <a id="intro"></a>

**"Kick the ball harder if you want to kick it further."** This is something I have heard from a lot of people.

But how true is it? And what exactly do we mean by "kicking it harder"? Data from NFL games offer some insight, which I'd like to share with you in this notebook.

Kickoffs were the most common type of special teams play from the 2018 to the 2020 season. Although a kickoff can end in seven different outcomes, the kicking team has essentially one ultimate goal: gaining as many yards as possible by the time the play ends. In this notebook, I propose the **kick length** of the ball as an actionable and robust metric for evaluating how good a kickoff is. This metric has a strong positive association with the number of yards gained, since the receiving team has to start its return from deep in its own territory when the ball is kicked a long distance. Using multiple linear regression, I also find two relevant variables that might affect the kick length:

1. **Foot speed** in the approach, usually seen with a long approach distance. It is also a predictor of the last step length, which is related to how much energy is transmitted to the ball, according to the formula for kinetic energy.

2. The **direction in which the ball is kicked**. If the ball is kicked to the right side of the kicker, the kick is usually shorter than one to the center or to the left. The hypothesis is that kicking to the right takes more energy from the body because of a larger pivoting angle, which leaves less energy to be transferred to the ball.
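
For reference, the kinetic energy formula mentioned in the first point can be written as below. Treating the kicking foot as a single mass $m$ moving at speed $v$ is a simplifying assumption made here for illustration, not a result derived from the data:

$$E_k = \frac{1}{2} m v^2$$

Because energy grows with the square of speed, a foot that is 10% faster carries about 21% more kinetic energy under this simplification ($1.1^2 = 1.21$).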



In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import statsmodels.api as sm # using linear regression library for analysis
from scipy.stats import pearsonr # calculating pearson correlation

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

# plotly is the main tool for visualization in this notebook
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# make sure the code in plotly is able to run properly
from plotly.offline import plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)


# Style for the visualization
fonts = ['Rockwell','Oswald','Lato']
colors = ['#013369','#013f82','#0264ce','#d50a0a','#ffffff']

# remove unnecessary warnings in the output
pd.options.mode.chained_assignment = None  # default='warn'


# Kickoff as a special team play <a id="kickoff"></a>

**Special teams** are units that are on the field during kicking plays. While many players who appear on offensive or defensive squads also play similar roles on special teams (offensive linemen to block or defensive players to tackle), there are some specialist roles that are unique to the kicking game([1](https://en.wikipedia.org/wiki/American_football_positions#Special_teams)).

Special teams plays usually fall into four types:
* Kickoff
* Punt
* Extra Point
* Field Goal

**Kickoff** is the most common one in terms of number of plays. According to the NFL's official website, it is a kick that puts the ball in play at the start of each half, at the start of overtime, after each try, and after a successful field goal([2](https://operations.nfl.com/learn-the-game/nfl-basics/terms-glossary/)). Because it is so common, I'd like to dig into it further to understand how to evaluate the performance of a kicking team.
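
As a rough illustration of how such a share can be computed, the sketch below counts plays per type within each season and divides by the season total. The data frame `toy_plays` and all of its values are invented for illustration; only the column names `season` and `specialTeamsPlayType` are taken from the competition data.

```python
import pandas as pd

# Hypothetical toy data: one row per play (values are invented)
toy_plays = pd.DataFrame({
    "season": [2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019],
    "specialTeamsPlayType": ["Kickoff", "Kickoff", "Punt", "Field Goal",
                             "Kickoff", "Punt", "Punt", "Extra Point"],
})

# Count plays per (season, play type)
counts = (
    toy_plays.groupby(["season", "specialTeamsPlayType"])
    .size()
    .reset_index(name="n_plays")
)

# Broadcast the season total back to every row, then compute the share in percent
counts["total"] = counts.groupby("season")["n_plays"].transform("sum")
counts["perc_type"] = (counts["n_plays"] * 100 / counts["total"]).round(1)

print(counts)
```

Using `transform("sum")` keeps one row per season and play type, which is a convenient shape for a stacked bar chart.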

In [None]:
# fetch play data
play_data = pd.read_csv("/kaggle/input/nfl-big-data-bowl-2022/plays.csv")

# fetch game data
game_data = pd.read_csv("/kaggle/input/nfl-big-data-bowl-2022/games.csv")

# join play_data and game_data on gameId
gamePlay_data = pd.merge(
    play_data,
    game_data,
    how="inner",
    left_on=["gameId"],
    right_on=["gameId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
)

# select 'season','specialTeamsPlayType','gameId','playId' columns into a new table season_play
season_play = gamePlay_data[['season','specialTeamsPlayType','gameId','playId']]

# create a unique id for each play per game
season_play['uniqueId'] = season_play['gameId'].astype(str) + season_play['playId'].astype(str)

# get number of plays per specialTeamsPlayType during the three seasons
season_play_number = season_play.groupby(['season','specialTeamsPlayType']).nunique().reset_index()[['season','specialTeamsPlayType','uniqueId']]
season_play_number['total'] = season_play_number.groupby('season')['uniqueId'].transform('sum')

# get the percentage of each type per year
season_play_number['perc_type'] = round(season_play_number['uniqueId'] * 100 / season_play_number['total'],1)

# Visualize data into a stacked bar chart
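# sort the seasons so that x lines up with the season-ordered y values (a set has no guaranteed order)
x = sorted(season_play_number['season'].astype(str).unique())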
y_ExtraPoint = season_play_number.loc[season_play_number['specialTeamsPlayType'] == 'Extra Point']['perc_type']
y_FieldGoal = season_play_number.loc[season_play_number['specialTeamsPlayType'] == 'Field Goal']['perc_type']
y_Kickoff = season_play_number.loc[season_play_number['specialTeamsPlayType'] == 'Kickoff']['perc_type']
y_Punt = season_play_number.loc[season_play_number['specialTeamsPlayType'] == 'Punt']['perc_type']

fig = go.Figure(go.Bar(x=x, y=y_Kickoff, 
                       name='Kickoff',
                       width = 0.35,
                       hovertemplate =
                            '<i>Season</i>: %{x} <br>' + 
                            '<i>% of total plays</i>: %{y}',
                      marker_color=colors[3]))
fig.add_trace(go.Bar(x=x, y=y_Punt, 
                     name='Punt',
                     width = 0.35,
                    hovertemplate =
                    '<i>Season</i>: %{x} <br>' + 
                    '<i>% of total plays</i>: %{y}',
                    marker_color=colors[0]))
fig.add_trace(go.Bar(x=x, y=y_ExtraPoint, 
                     name='Extra Point',
                     width = 0.35,
                    hovertemplate =
                    '<i>Season</i>: %{x} <br>' + 
                    '<i>% of total plays</i>: %{y}',
                    marker_color=colors[1]))
fig.add_trace(go.Bar(x=x, y=y_FieldGoal, 
                     name='Field Goal',
                     width = 0.35,
                    hovertemplate =
                    '<i>Season</i>: %{x} <br>' + 
                    '<i>% of total plays</i>: %{y}',
                    marker_color=colors[2]))

# update layout of the chart
fig.update_yaxes(tickvals=[25, 50, 75, 100])

fig.update_layout(title='Share of special teams plays per type from 2018 to 2020 <br><sup>Kickoff is the most common type, with a yearly average percentage over 37%.</sup>',
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  autosize = True,
                  font=dict(
                            family=fonts[0],
                            size=14
                  ),
                  xaxis_tickfont_size=14,
                  yaxis=dict(
                          title='% of total plays',
                          titlefont_size=16,
                          tickfont_size=14,
                    ),
                  hoverlabel=dict(
                    bgcolor="white",
                    font_size=14,
                    font_family="Rockwell"
                    ),
                  legend=dict(
                    #x=0,
                    #y=1.0,
                    bgcolor='rgba(255, 255, 255, 0)',
                    bordercolor='rgba(255, 255, 255, 0)',
                    font=dict(
                            size=12
                        ),
                    ),
                  bargap=0.05,
                  bargroupgap=0.05,
                  barmode='stack', 
                  xaxis={'categoryorder':'category ascending'})

fig.show()

# Outcomes of a kickoff <a id="outcome"></a>

There are seven outcomes in general, of which **touchback** and **return** are the two most common. In 2018, to make the game safer for both sides and avoid unnecessary collisions, the NFL introduced a new rule on touchbacks:

> Kickoffs that hit the end zone without being touched by a member of the receiving team automatically become touchbacks([3](https://profootballtalk.nbcsports.com/2018/07/06/another-tweak-to-the-kickoff-rule-promotes-more-touchbacks/)).

For years, the automatic touchback rule has applied to punts that enter the end zone, with or without being touched. For kickoffs, the touchback becomes automatic only if it strikes the ground in the end zone without being touched by a member of the receiving team; the player can still catch the kickoff and choose to return it. It’s not a change that will come into play very often, but it’s another example of the league’s broader effort to encourage touchbacks on kickoffs.


This rule has made touchbacks more common since 2018, as the chart below shows.

In [None]:
# get data where play type is kickoff
kickoff_data = gamePlay_data.loc[gamePlay_data['specialTeamsPlayType'] == 'Kickoff']

# create a unique id
kickoff_data['uniqueId'] = kickoff_data['gameId'].astype(str) + kickoff_data['playId'].astype(str)
kickoff_data_result = kickoff_data[['season','uniqueId','specialTeamsResult']]

# get the number of unique ids per specialTeamsResult
kickoff_data_result_aggr = kickoff_data_result.groupby(['specialTeamsResult']).nunique().reset_index()[['specialTeamsResult','uniqueId']]

# get the total number of plays 
kickoff_data_result_aggr['total'] = kickoff_data_result_aggr['uniqueId'].sum()

# get the percentage of plays out of total per result
kickoff_data_result_aggr['share_of_total'] = round(kickoff_data_result_aggr['uniqueId'] * 100 / kickoff_data_result_aggr['total'],1)

# get the number of unique ids per specialTeamsResult per year
kickoff_data_result_year_aggr = kickoff_data_result.groupby(['season','specialTeamsResult']).nunique().reset_index()[['season','specialTeamsResult','uniqueId']]

# get the number of plays per year
kickoff_data_result_year_aggr['total'] = kickoff_data_result_year_aggr.groupby('season')['uniqueId'].transform('sum')
kickoff_data_result_year_aggr['share_of_total'] = round(kickoff_data_result_year_aggr['uniqueId'] * 100 / kickoff_data_result_year_aggr['total'],1)

# get the share of plays that are touchbacks in the three seasons
kickoff_data_result_year_touchback = kickoff_data_result_year_aggr.loc[kickoff_data_result_year_aggr['specialTeamsResult'] == 'Touchback']
# turn season data into categorical
kickoff_data_result_year_touchback['season'] = kickoff_data_result_year_touchback['season'].astype(str)

In [None]:
# visualize the chart 
fig = make_subplots(rows=1, cols=2,
                    subplot_titles=("Touchback is the most common one", 
                                    "...and its popularity has slightly increased in the three years"),
                    vertical_spacing=0.1,
                    column_widths=[0.55, 0.45])

fig.add_trace(
    go.Bar(
        x=kickoff_data_result_aggr['specialTeamsResult'], 
        y=kickoff_data_result_aggr['uniqueId'],
        name = '',
        hovertemplate =
                    '<i>Play result</i>: %{x} <br>' + 
                    '<i>Nr. of plays</i>: %{y}',
        marker_color=colors[0]
          ),
    row=1, col=1
)

fig.add_trace(
    go.Scatter(
        x=kickoff_data_result_year_touchback['season'], 
        y=kickoff_data_result_year_touchback['share_of_total'],
        name = '',
        hovertemplate =
                    '<i>Season</i>: %{x} <br>' + 
                    '<i>% of plays end up as touchback</i>: %{y}',
        marker_color=colors[3]
              ),
    row=1, col=2
)

# update chart layout
fig.update_layout(
                  title_text="Out of the seven results in kickoff plays,",
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(
                            family=fonts[0],
                            size=14
                  ),
                  xaxis_tickfont_size=14,
                  yaxis=dict(
                          #title='Nr of plays',
                          titlefont_size=16,
                          tickfont_size=14,
                    ),
                  hoverlabel=dict(
                    bgcolor="white",
                    font_size=14,
                    font_family="Rockwell"
                    ),
                  #showlegend=True,
                  bargap=0.05,
                  bargroupgap=0.05,
                  barmode='stack', 
                  xaxis={'categoryorder':'category ascending'}
)

fig.update_yaxes(title_text="Nr of plays", 
                 tickvals=[2000, 3500, 5000],
                 row=1, col=1)
fig.update_yaxes(title_text="touchback perc. (%)", 
                 #tickvals=[60.5, 61, 61.5],
                 row=1, col=2)

fig.show()

To understand what each outcome exactly means, I've created an interactive visualization below that shows the ball route per outcome, followed by a table containing the definitions.
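
Before the routes can be drawn, the code that follows turns each play's timestamps into elapsed seconds by dense-ranking them within the play and multiplying by 0.1 s per frame. The sketch below shows that idea on a tiny invented data frame; `toy_frames` and its timestamps are hypothetical, and the 0.1 s spacing is assumed from the factor used in the notebook's code.

```python
import pandas as pd

# Hypothetical frames for two plays (timestamps invented), 0.1 s apart
toy_frames = pd.DataFrame({
    "uniqueId": ["A", "A", "A", "B", "B"],
    "time": pd.to_datetime([
        "2018-09-07 01:07:14.6", "2018-09-07 01:07:14.7", "2018-09-07 01:07:14.8",
        "2018-09-07 02:15:00.0", "2018-09-07 02:15:00.1",
    ]),
})

# Dense rank within each play: first frame -> 1, second -> 2, ...
toy_frames["rank"] = toy_frames.groupby("uniqueId")["time"].rank(method="dense", ascending=True)

# Convert the frame index into seconds since the first frame of the play
toy_frames["sec"] = (toy_frames["rank"] - 1) * 0.1

print(toy_frames)
```

A dense rank gives frames with identical timestamps the same elapsed time, which is one reason to prefer it over a simple running count here.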

In [None]:
# Kickoff data
kickoff_play = play_data.loc[play_data['specialTeamsPlayType'] == 'Kickoff']

# load tracking data
tracking_datafile = [
                    "/kaggle/input/nfl-big-data-bowl-2022/tracking2018.csv",
                    "/kaggle/input/nfl-big-data-bowl-2022/tracking2019.csv",
                    "/kaggle/input/nfl-big-data-bowl-2022/tracking2020.csv"
                    ]

# merge tracking data with play data
kickoff_tracking_dataset = []
for file in tracking_datafile:
    data = pd.read_csv(file)
    # select rows relevant to the route of the ball
    data = data.loc[data['displayName'] == 'football']
    # join kickoff_play and tracking_data on gameId and playId
    kickoff_tracking_data = pd.merge(
    kickoff_play,
    data,
    how="inner",
    left_on=["gameId","playId"],
    right_on=["gameId","playId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
    )
    kickoff_tracking_dataset.append(kickoff_tracking_data)

# merge three seasons into one dataset
kickoff_tracking_dataset = pd.concat(kickoff_tracking_dataset)

# create unique id
kickoff_tracking_dataset['uniqueId'] = kickoff_tracking_dataset['gameId'].astype(str) + kickoff_tracking_dataset['playId'].astype(str)
# select relevant columns
ball_route_dataset = kickoff_tracking_dataset[['uniqueId','specialTeamsResult','playResult',
                         'time', 'x', 'y', 's', 'a', 'dis','event','playDirection']]

# create a new column with the elapsed time of each frame, starting from zero
# (consecutive tracking frames are 0.1 s apart)
ball_route_dataset['rank'] = ball_route_dataset.groupby("uniqueId")["time"].rank(method="dense", ascending=True)
ball_route_dataset['sec'] = ball_route_dataset['rank'] * 0.1 - 0.1

In [None]:
# visualize the ball route per kickoff outcome

# taking one sample play per one outcome
ball_route_right = ball_route_dataset.loc[ball_route_dataset['playDirection'] == 'right']

unique_ids = [
"20210103152847", # Touchback
"20210103154182", # Return
"2020120611395", # Muffed
"202101030239", # Out of Bounds
"20181111003659", # Kickoff Team Recovery
"20190922044363", # Fair Catch
"20201108043603" # Downed
            ]

# visualizing ball routes and the field
# make the chart interactive so that the reader can choose the outcome he or she is interested in
fig = go.Figure()

figs = []
for count, i in enumerate(unique_ids):
    # copy the slice so that shifting x does not touch ball_route_right
    sample = ball_route_right.loc[ball_route_right['uniqueId'] == i].copy()
    sample['x'] = sample['x'] - 10
    # add ball route; only the first trace is visible once the fig is loaded
    f = fig.add_trace(go.Scatter(x=sample['x'],
                                 y=sample['y'],
                                 mode='lines+markers+text',
                                 text=[None if x == 'None' else x for x in sample['event'].values],
                                 textposition="top center",
                                 textfont=dict(
                                     family=fonts[0],
                                     size=12,
                                     color=colors[0]
                                 ),
                                 name="",
                                 hovertemplate=
                                 '<i>x</i>: %{x} <br>' +
                                 '<i>y</i>: %{y} <br>',
                                 # frames with an event are highlighted in red
                                 marker_color=np.where(sample['event'] == 'None', colors[0], colors[3]),
                                 line_color=colors[0],
                                 visible=(count == 0),
                                 opacity=0.75))

    figs.append(f)



# add field visualization
fig.add_shape(type="rect",
    x0=0, y0=0, x1=10, y1=53.5,
    line_width = 0,
    layer="below",
    fillcolor="#717785")

fig.add_shape(type="rect",
    x0=110, y0=0, x1=120, y1=53.5,
    line_width = 0,
    layer="below",
    fillcolor="#717785")

# add yard line
for i in range(11):
    fig.add_shape(type="line",
    x0= (i + 1) * 10, y0= 0, 
    x1= (i + 1) * 10, y1= 53.5,
    layer="below",
    line=dict(color="white",width=2))
    
for i in range(10):
    fig.add_shape(type="line",
    x0= (i + 1) * 10 + 5, y0= 0, 
    x1= (i + 1) * 10 + 5, y1= 53.5,
    layer="below",
    line=dict(color="white",width=1.2))

# add annotation
fig.add_annotation(text="Home Endzone",
                  #xref="paper", yref="paper",
                  x=5, y=26.5, showarrow=False,
                  font = dict(
                      size = 20,
                      family=fonts[0],
                      color= colors[0]
                  ),
                   opacity = 0.75,
                  textangle=-90)

fig.add_annotation(text="Visitor Endzone",
                  #xref="paper", yref="paper",
                  x=115, y=26.5, showarrow=False,
                  font = dict(
                      size = 20,
                      family=fonts[0],
                      color= colors[0]
                  ),
                  opacity = 0.75,
                  textangle=90)

# add yard line number
for i in range(9):
    fig.add_annotation(text="0",
                       x= (i+2) * 10 + 1, 
                       y = 3, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'))

    fig.add_annotation(text="0",
                       x= (i+2) * 10 - 1, 
                       y = 50.5, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'))

for i in range(5):
        fig.add_annotation(text= str(i+1),
                       x= (i+2) * 10 - 1, 
                       y = 3, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'))
        fig.add_annotation(text= str(i+1),
                       x= (i+2) * 10 + 1, 
                       y = 50.25, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'),
                          textangle=180)
        
# add yard line number
for i in range(4):
        fig.add_annotation(text= str(5 - (i+1)),
                       x= (i+7) * 10 - 1, 
                       y = 3, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'))
        fig.add_annotation(text= str(5 - (i+1)),
                       x= (i+7) * 10 + 1, 
                       y = 50.25, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'),
                          textangle=180)
# add goal zone line
fig.add_annotation(text= "G",
                       x=  11,
                       y = 3,
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'))
fig.add_annotation(text= "G",
                       x= 11, 
                       y = 50, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'),
                  textangle=180)

fig.add_annotation(text= "G",
                       x=  109,
                       y = 3,
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'))
fig.add_annotation(text= "G",
                       x= 109, 
                       y = 50, 
                       showarrow=False, 
                       font = dict(size = 15, color = 'white'),
                  textangle=180)



# remove x,y axis
fig.update_yaxes(showticklabels=False, showgrid=False, zeroline=False)
fig.update_xaxes(showticklabels=False, showgrid=False, zeroline=False)

fig.update_layout(
                  title_text="Ball route of seven kickoff results",
                  autosize=True,
                  paper_bgcolor='#96b78c',
                  plot_bgcolor='#96b78c',
                  font=dict(
                            family=fonts[0],
                            size=14
                  ),
                  xaxis_tickfont_size=14,
                  yaxis=dict(
                          #title='Nr of plays',
                          titlefont_size=16,
                          tickfont_size=14,
                    ),
                  hoverlabel=dict(
                    bgcolor="white",
                    font_size=14,
                    font_family="Rockwell"
                    ),
                  showlegend=False,
                  #bargap=0.05,
                  #bargroupgap=0.05,
                  #barmode='stack', 
                  xaxis={'visible':False},
                  updatemenus=[
                    dict(
                        type="buttons",
                        #direction="right",
                        active=0,
                        #x=0.57,
                        #y=1.2,
                        buttons=list([
                            dict(label="Touchback",
                                 method="update",
                                 args=[{"visible": [True, False, False,False,False,False,False]},
                                       {"annotations": figs[0]}]),
                            dict(label="Return",
                                 method="update",
                                 args=[{"visible": [False, True, False,False,False,False,False]},
                                       {"annotations": figs[1]}]),
                            dict(label="Muffed",
                                 method="update",
                                 args=[{"visible": [False, False, True,False,False,False,False]},
                                       {"annotations": figs[2]}]),
                            dict(label="Out of Bounds",
                                 method="update",
                                 args=[{"visible": [False, False, False,True,False,False,False]},
                                       {"annotations": figs[3]}]),
                            dict(label="Kickoff Team Recovery",
                                 method="update",
                                 args=[{"visible": [False, False, False,False,True,False,False]},
                                       {"annotations": figs[4]}]),
                            dict(label="Fair Catch",
                                 method="update",
                                 args=[{"visible": [False, False, False,False,False,True,False]},
                                       {"annotations": figs[5]}]),
                            dict(label="Downed",
                                 method="update",
                                 args=[{"visible": [False, False, False,False,False,False,True]},
                                       {"annotations": figs[6]}])
                        ]),
                    )
                ]
)

fig.show()

| Outcome      | Definition |
| ----------- | ----------- |
| **Touchback** | When a player downs the ball behind his team’s own goal line after a free kick, or the ball is kicked <br>through the back of the end zone, the play is dead and the ball is spotted on the 25-yard line. |
| **Return** | Once the receiving team possesses the ball, its objective is to score a touchdown, <br>i.e. to return the ball to the end zone of the kicking team, though that is very unlikely on return plays. |
| **Muffed** | A player touches a loose ball while unsuccessfully attempting to gain possession. <br>Muffs most frequently occur when a kick or punt returner fails to catch a free kick or a punt. |
| **Out of Bounds** | A player is out of bounds when he touches any boundary line or touches anything (except a player, <br>an official, or a pylon) that is on or outside a boundary line. |
| **Kickoff Team Recovery** | The kickoff team gains possession of the ball. |
| **Fair Catch** | A player in position to receive a punt can signal for a fair catch by raising one arm above his head and waving it <br>from side to side. Once the receiver signals for a fair catch, he cannot advance the ball, the play is over when he <br>catches the ball, and the opponent may not interfere with or tackle him. |
| **Downed** | The ball is put out of play. |

# Gained yards by kicking teams <a id="gainedyards"></a>

How do we evaluate whether a kickoff is good or not? Considering that the ultimate goal is to move the ball towards the opponent's end zone, one relevant metric is how many yards the kicking team gained on the play. In other words, it is the net yards gained by the kicking team, including penalty yardage, which is recorded as **playResult** in the **Play** dataset.

Conversely, the receiving team's goal is to keep the kicking team's gained yards as low as possible; in other words, it should return the ball as far as it can once it has possession.

If we break down the performance by kickoff outcome, we see that the **Return** is the most exciting one because of the wide range of yards the kicking team can gain. Unlike with the other outcomes, a kicking team can even lose yards on the play, which makes it a bit risky. On the other hand, it can also gain more than 40 yards on a return.
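
One way to put a number on that "wide range" is to summarize `playResult` per outcome. The sketch below does so on a handful of invented plays; `toy_kickoffs` and its values are hypothetical, and only the column names come from the competition data.

```python
import pandas as pd

# Hypothetical kickoff plays (values invented);
# playResult is the net yards gained by the kicking team
toy_kickoffs = pd.DataFrame({
    "specialTeamsResult": ["Touchback", "Touchback", "Return", "Return", "Return", "Fair Catch"],
    "playResult": [40, 40, 45, 38, -5, 50],
})

# Basic summary statistics of gained yards per outcome
summary = (
    toy_kickoffs.groupby("specialTeamsResult")["playResult"]
    .agg(["count", "min", "median", "max"])
)

# Spread between the best and the worst play for each outcome
summary["range"] = summary["max"] - summary["min"]

print(summary.sort_values("range", ascending=False))
```

On the real data, the same table could be read alongside the chart: an outcome with a large `range` and a negative `min` is one where the kicking team can both win and lose a lot of field position.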

In [None]:
# select the play identifiers, the description, playResult and specialTeamsResult
kickoff_gained_yards = kickoff_data[['gameId','playId','playDescription','playResult','specialTeamsResult']]

# create subdataset about each outcome
kickoff_gained_touchback_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Touchback']
kickoff_gained_return_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Return']
kickoff_gained_muffed_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Muffed']
kickoff_gained_recovery_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Kickoff Team Recovery']
kickoff_gained_outbound_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Out of Bounds']
kickoff_gained_fair_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Fair Catch']
kickoff_gained_downed_yards = kickoff_gained_yards.loc[kickoff_gained_yards['specialTeamsResult'] == 'Downed']

# create a visualization to show the gained yards per play grouping by different kickoff outcomes
fig = go.Figure()

# one strip of jittered points per outcome; Return is highlighted in red
outcome_yards = [
    ('Touchback', kickoff_gained_touchback_yards, colors[0]),
    ('Return', kickoff_gained_return_yards, colors[3]),
    ('Muffed', kickoff_gained_muffed_yards, colors[0]),
    ('Recovery', kickoff_gained_recovery_yards, colors[0]),
    ('Out of Bounds', kickoff_gained_outbound_yards, colors[0]),
    ('Fair Catch', kickoff_gained_fair_yards, colors[0]),
    ('Downed', kickoff_gained_downed_yards, colors[0]),
]

for outcome_name, outcome_data, outcome_color in outcome_yards:
    fig.add_trace(go.Box(y=outcome_data['playResult'],
                         marker=dict(color=outcome_color, size=2),
                         pointpos=0,
                         # make the box itself transparent so that only the points show
                         line=dict(color='rgba(0,0,0,0)'),
                         fillcolor='rgba(0,0,0,0)',
                         boxpoints='all',
                         jitter=0.5,
                         name=outcome_name,
                         hoverinfo='skip'))

# add a line where y equals zero 
fig.add_shape(type="line",
    x0=0, y0=0, x1=6, y1=0,
    line=dict(
        color="grey",
        width=1,
        dash="dashdot",
    )
)


fig.update_layout(title_text="Gained yards of the kicking team per result <br><sup>Return is the most exciting play for its wide range of yards that a kicking team could gain or lose</sup>",
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(
                            family=fonts[0],
                            size=14
                  ),
                  xaxis_tickfont_size=14,
                  yaxis=dict(
                          title='Gained yards',
                          titlefont_size=16,
                          tickfont_size=14,
                    ),
                  hoverlabel=dict(
                    bgcolor="white",
                    font_size=14,
                    font_family="Rockwell"
                    ),
                  showlegend=False)


fig.show()

# Kick the ball far, if you want to gain more yards <a id="far"></a>
Kicking the ball far, close to the end zone, means the receiving team takes possession at a lower yard line and has to cover a longer distance on the return. It also gives the kicking team's players more time and space to tackle the returner.
The regression output below implies that

* Comparing two kicks whose kick lengths differ by one yard, the longer kick gains on average **~0.36** more yards than the shorter one.
* The association is statistically **significant** (p < 0.05), which is strong evidence of a real association between kick length and number of gained yards in this population.
* The R-squared value is 0.238, indicating that **23.8%** of the variation in the number of gained yards can be explained by the kick length.

Based on these findings, I suggest using the kick length of the ball as a metric for evaluating how good a kickoff is.

In [None]:
# relationship between the kickLength and number of gained yards
model = sm.OLS.from_formula("playResult ~ kickLength", data=kickoff_data)
result = model.fit()
result.summary()
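
To make the slope and R-squared figures above more concrete, here is a minimal sketch of how both are computed for a one-predictor regression. It uses made-up numbers rather than the NFL data, and the function name `simple_ols` and the toy lists are hypothetical, introduced only for illustration.

```python
# Minimal sketch of simple linear regression (y = intercept + slope * x).
# The toy numbers below are illustrative only; they are NOT from the NFL dataset.

def simple_ols(x, y):
    """Return (intercept, slope, r_squared) of a least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return intercept, slope, r_squared

toy_kick_length = [50, 55, 60, 65, 70]   # hypothetical kick lengths (yards)
toy_play_result = [38, 41, 40, 45, 46]   # hypothetical gained yards
intercept, slope, r_squared = simple_ols(toy_kick_length, toy_play_result)
print(f"slope={slope:.2f}, R^2={r_squared:.3f}")
```

The slope is read as "extra gained yards per extra yard of kick length", which is how the ~0.36 figure above should be interpreted.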

# Initial analysis - How to kick the ball far <a id="initial"></a>
The next question is how to kick a ball a long distance, which is a skill kickers should master. Before diving into what the data reveals, let's recap some physics first. Kicking a ball is a transfer of energy from the kicker's body to the ball. The energy an object has due to its motion is known as **kinetic energy**. Here is its formula ([4](https://en.wikipedia.org/wiki/Kinetic_energy)) -
> $KE = \frac{1}{2} m v^2$

indicating that, given that the mass of the kicker's body is unchanged, **the higher the velocity of the kicking leg, the more energy is generated and transferred to the ball**, which contributes to a longer kick length. 
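
As a quick worked example of the formula, the sketch below shows that doubling the speed quadruples the kinetic energy when the mass stays the same. The mass and speeds are hypothetical round numbers chosen for illustration, not measurements of a real kicker.

```python
def kinetic_energy(mass_kg, speed_m_s):
    """Kinetic energy in joules: KE = 1/2 * m * v^2."""
    return 0.5 * mass_kg * speed_m_s ** 2

leg_mass_kg = 10.0  # hypothetical moving mass, for illustration only
slow = kinetic_energy(leg_mass_kg, 10.0)
fast = kinetic_energy(leg_mass_kg, 20.0)
print(slow, fast, fast / slow)
```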

Based on the knowledge above, we come up with two questions to test -
* The part of the body that contacts the ball is the kicker's foot, so ultimately, **is the speed of the kicking foot a useful predictor of kick length?** 
* **Does the body orientation make a difference to the kick length?** The assumption is that pivoting the body through a larger angle absorbs energy that would otherwise be transferred to the ball.


Using multiple linear regression, we can see from the output that 

* The **kicker_s_50**, i.e. the kicker's median speed before the kick, which we use as a proxy for foot speed, is positively associated with the kick length. Comparing two kicks sent in the same direction, a kicker who is faster by one unit of speed (1 yard per second in the tracking data) has an estimated kick length **~7.1** yards longer than the slower one.

* The **direction in which the ball is kicked** is also associated with the kick length. Comparing kicks with the same foot speed, a ball kicked to the right is estimated to travel **~6.8** yards less than one kicked to the center and **~5.2** yards less than one kicked to the left.

* The model has an R-squared value of 0.257, which indicates that **25.7%** of the variation in kick length can be explained by the two independent variables, kicking foot speed and direction, together.
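
The way these coefficients combine can be sketched as follows. The sketch assumes the rounded coefficients quoted above and treats "R" (right) as the reference level of the direction variable; the intercept is not quoted in the text, so only differences between two kicks are computed. The names `COEF_SPEED`, `COEF_DIRECTION` and `predicted_length_difference` are hypothetical.

```python
# Rounded coefficients quoted in the text; the intercept is unknown here,
# so this sketch only compares two kicks with each other.
COEF_SPEED = 7.1
COEF_DIRECTION = {"R": 0.0, "C": 6.8, "L": 5.2}  # "R" is the reference level

def predicted_length_difference(speed_a, dir_a, speed_b, dir_b):
    """Predicted kick length of kick A minus kick B under the fitted model."""
    return (COEF_SPEED * (speed_a - speed_b)
            + COEF_DIRECTION[dir_a] - COEF_DIRECTION[dir_b])

# same speed, center versus right
print(predicted_length_difference(3.0, "C", 3.0, "R"))
# same direction, half a unit of speed faster
print(predicted_length_difference(3.5, "L", 3.0, "L"))
```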

In [None]:
# get tracking data of the kicker and the ball
# load tracking data
tracking_datafile = [
                    "/kaggle/input/nfl-big-data-bowl-2022/tracking2018.csv",
                    "/kaggle/input/nfl-big-data-bowl-2022/tracking2019.csv",
                    "/kaggle/input/nfl-big-data-bowl-2022/tracking2020.csv"
                    ]
kickoff_tracking_dis_dir = []

# merge kickoff tracking data with kickoff play data
for file in tracking_datafile:
    data = pd.read_csv(file)
    # join kickoff_play and tracking_data on gameId and playId
    kickoff_tracking_data = pd.merge(
    kickoff_play,
    data,
    how="inner",
    left_on=["gameId","playId"],
    right_on=["gameId","playId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
    )
    # select rows relevent with the route of ball and the kicker
    kickoff_tracking_data = kickoff_tracking_data.loc[(kickoff_tracking_data['nflId'] == kickoff_tracking_data['kickerId']) |
                                                     (kickoff_tracking_data['displayName'] == 'football')]

    kickoff_tracking_dis_dir.append(kickoff_tracking_data)

# merge three seasons
kickoff_tracking_dis_dir_merged = pd.concat(kickoff_tracking_dis_dir)

# Load pff data 
pff_data = pd.read_csv("/kaggle/input/nfl-big-data-bowl-2022/PFFScoutingData.csv")

# join tracking data with pff data
kickoff_tracking_pff_data = pd.merge(
    kickoff_tracking_dis_dir_merged,
    pff_data,
    how="inner",
    left_on=["gameId","playId"],
    right_on=["gameId","playId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
    )

# create unique id to identify play
kickoff_tracking_pff_data['uniqueId'] = kickoff_tracking_pff_data['gameId'].astype(str) + kickoff_tracking_pff_data['playId'].astype(str)
#kickoff_tracking_pff_data.columns
kickoff_tracking_pff_data_select = kickoff_tracking_pff_data[['uniqueId','kickerId',
                                                      'time', 'x', 'y',  'dis', 'o', 's','a',
                                                       'event', 'displayName']]
# choose kickoff events
kickoff_events = ['kickoff',
                 'onside_kick',
                 'free_kick',
                 #'kickoff_play',
                 'autoevent_kickoff']

# select rows which are kickoff events
kickoff_tracking_pff_events = kickoff_tracking_pff_data_select.loc[kickoff_tracking_pff_data_select['event'].isin(kickoff_events)]
# create a list containing uniqueId
unique_ids_pre_kick = kickoff_tracking_pff_events['uniqueId'].unique()

# select relevant columns
prekick_dataset = pd.DataFrame(columns=['uniqueId', 'event','kickerId','kicker_x', 'kicker_y','kicker_o',
                                        'kicker_s_25','kicker_s_50','kicker_s_75','kicker_s_max',
                                        'kicker_a_25','kicker_a_50','kicker_a_75','kicker_a_max',
                                        'ball_x','ball_y','kicker_kick_dis','kicker_approach_dis','kicker_last_step'])
count = 0

# create a new dataset containing aggregated metrics, one row per play
for i in unique_ids_pre_kick:
    dataset = kickoff_tracking_pff_data_select.loc[kickoff_tracking_pff_data_select['uniqueId'] == i]
    # time stamp of the kick, read from the football's row at the kickoff event
    kick_time = dataset.loc[(dataset['event'].isin(kickoff_events)) & (dataset['displayName'] == 'football')]['time'].iloc[0]
    # all frames up to and including the kick
    pre_data = dataset.loc[dataset['time'] <= kick_time]
    kicker_frames = pre_data.loc[pre_data['displayName'] != 'football']
    # kicker and ball rows at the kickoff event itself
    kicker_at_kick = kicker_frames.loc[kicker_frames['event'].isin(kickoff_events)].iloc[0]
    ball_at_kick = pre_data.loc[(pre_data['event'].isin(kickoff_events)) & (pre_data['displayName'] == 'football')].iloc[0]
    event = kicker_at_kick['event']
    kickerId = kicker_at_kick['kickerId']
    kicker_x = kicker_at_kick['x']
    kicker_y = kicker_at_kick['y']
    kicker_o = kicker_at_kick['o']
    # distribution of the kicker's speed and acceleration during the approach
    kicker_s_25 = kicker_frames['s'].quantile(0.25)
    kicker_s_50 = kicker_frames['s'].quantile(0.5)
    kicker_s_75 = kicker_frames['s'].quantile(0.75)
    kicker_s_max = kicker_frames['s'].max()
    kicker_a_25 = kicker_frames['a'].quantile(0.25)
    kicker_a_50 = kicker_frames['a'].quantile(0.5)
    kicker_a_75 = kicker_frames['a'].quantile(0.75)
    kicker_a_max = kicker_frames['a'].max()
    # distance covered in the frame of the kick and over the whole approach
    kicker_last_step = kicker_at_kick['dis']
    kicker_approach_dis = kicker_frames['dis'].sum()
    ball_x = ball_at_kick['x']
    ball_y = ball_at_kick['y']
    # straight-line distance between kicker and ball at the kick
    kicker_kick_dis = ((kicker_x - ball_x) ** 2 + (kicker_y - ball_y) ** 2) ** 0.5
    row = [i,event,kickerId,kicker_x,kicker_y,kicker_o,
           kicker_s_25,kicker_s_50,kicker_s_75,kicker_s_max,
           kicker_a_25,kicker_a_50,kicker_a_75,kicker_a_max,
           ball_x,ball_y,kicker_kick_dis,kicker_approach_dis,kicker_last_step]
    prekick_dataset.loc[count] = row
    count += 1
    
kickoff_pff_info = kickoff_tracking_pff_data[['uniqueId','kickLength','specialTeamsResult','playResult',
                                        'playDirection','kickDirectionActual']]
kickoff_pff_info_aggr = kickoff_pff_info.groupby(['uniqueId','kickLength','specialTeamsResult',
                            'playDirection','kickDirectionActual']).max().reset_index()[['uniqueId','kickLength','specialTeamsResult','playResult',
                                        'playDirection','kickDirectionActual']]

# merge pff data with kicking data
kickoff_move_data = pd.merge(
    kickoff_pff_info_aggr,
    prekick_dataset,
    how="inner",
    left_on=["uniqueId"],
    right_on=["uniqueId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
    )

#Load player data
player_data = pd.read_csv("../input/nfl-big-data-bowl-2022/players.csv")

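# convert player height to meters: 'feet-inches' strings become inches first,
# then inches are divided by 39.37 (inches per meter)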
h_ft_in = (player_data.height.str.contains('-'), 'height')
player_data.loc[h_ft_in] = player_data.loc[h_ft_in].str.split('-').str[0].astype(int)*12 \
    + player_data.loc[h_ft_in].str.split('-').str[1].astype(int)
player_data['height'] = player_data.height.astype(int) / 39.37

# merge kicking data with player data
kickoff_move_player_data = pd.merge(
    kickoff_move_data,
    player_data,
    how="inner",
    left_on=["kickerId"],
    right_on=["nflId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
    )

kickoff_move_player_data = kickoff_move_player_data[['uniqueId', 'kickLength', 'specialTeamsResult', 
       'playDirection', 'kickDirectionActual', 'event', 'kickerId', 'kicker_x',
       'kicker_y', 'kicker_o', 'kicker_s_25', 'kicker_s_50', 'kicker_s_75',
       'kicker_s_max', 'kicker_a_25', 'kicker_a_50', 'kicker_a_75',
       'kicker_a_max', 'ball_x', 'ball_y', 'kicker_kick_dis',
       'kicker_approach_dis', 'kicker_last_step','height', 'weight', 'birthDate']]
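
The height conversion in the cell above can be sketched on its own as a small helper. This assumes, as the code above implies, that a height is either a 'feet-inches' string such as '6-2' or a plain number of inches; the function name `height_to_meters` is hypothetical. Note that dividing inches by 39.37 yields meters.

```python
INCHES_PER_METER = 39.37

def height_to_meters(height):
    """Convert '6-2' (feet-inches) or a plain inch count to meters."""
    text = str(height)
    if "-" in text:
        feet, inches = text.split("-")
        total_inches = int(feet) * 12 + int(inches)
    else:
        total_inches = int(text)
    return total_inches / INCHES_PER_METER

# toy values, not rows from players.csv
print(round(height_to_meters("6-2"), 2))
print(round(height_to_meters(74), 2))
```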

In [None]:
# sort the categorical values in column kickDirectionActual 
kickoff_move_player_data.kickDirectionActual = pd.Categorical(kickoff_move_player_data.kickDirectionActual, 
                      categories=["R","C","L"],
                      ordered=True)

model = sm.OLS.from_formula("kickLength ~ kicker_s_50 + kickDirectionActual", data=kickoff_move_player_data)
result = model.fit()
result.summary()

We also confirmed that foot speed and direction are nearly uncorrelated (a correlation of ~0.16 is negligible). Thus we expect the foot speed coefficient to be largely unaffected when kicking direction is added to the model.

In [None]:
# need to switch the direction from text into numeric variable
kickoff_move_player_data["kickDirectionActual"] = kickoff_move_player_data.kickDirectionActual.replace({"R": 1, "L":2,"C":3})

# calculate Pearson's correlation
corr, _ = pearsonr(kickoff_move_player_data['kicker_s_50'], kickoff_move_player_data['kickDirectionActual'])
print('The correlation between foot speed and the kicking direction: %.3f' % corr)
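
For readers who want to see what the correlation coefficient measures, here is a minimal pure-Python sketch of Pearson's r on toy data (the function name `pearson_r` and the toy lists are hypothetical). It also illustrates one caveat, stated here as an observation about the toy data only: when a category such as direction is coded as numbers, the resulting value depends on the chosen coding, so it is best read as a rough check.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)

# toy data, not taken from the tracking files
toy_speed = [2.0, 2.5, 3.0, 3.5]
toy_direction = ["R", "L", "C", "L"]

coding_a = {"R": 1, "L": 2, "C": 3}  # the coding used in the cell above
coding_b = {"L": 1, "C": 2, "R": 3}  # an alternative, equally arbitrary coding

r_a = pearson_r(toy_speed, [coding_a[d] for d in toy_direction])
r_b = pearson_r(toy_speed, [coding_b[d] for d in toy_direction])
print(f"coding A: {r_a:.3f}, coding B: {r_b:.3f}")
```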

# Variable 1 - Foot speed <a id="speed"></a>
The positive relation between the kicking foot speed and the kick length has also been found in Australian Rules football ([5](https://acephysed.files.wordpress.com/2015/01/atricle-1-reference-list.pdf)). Inspired by that paper, I looked at whether there is a strong association between the foot speed and the last step length. I used **dis**, the distance traveled from the prior time point in yards (numeric), to estimate the length of the last step of each kick. The outcome is promising: the longer the last step, the faster the kicker's foot. The foot speed has a very high positive correlation (**~0.956**) with the last step length before the kick, and a longer last step usually indicates a larger angle of the pelvis. 

> So for NFL professionals and fans: if you see a kicker take a very large step before kicking the ball, it usually means he is running fast towards the ball, and the kick is unlikely to be short.

In [None]:
# calculate Pearson's correlation
corr, _ = pearsonr(kickoff_move_player_data['kicker_s_50'], kickoff_move_player_data['kicker_last_step'])
print('The correlation between foot speed and the last step length: %.3f' % corr)
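
One possible reason for such a tight relationship is that **dis** is the distance covered between two consecutive tracking frames, so dividing it by the frame interval gives the speed during that frame. The sketch below assumes frames are 0.1 second apart, which is an assumption about the sampling rate of the tracking data and not something computed in this notebook; the function name is hypothetical.

```python
FRAME_SECONDS = 0.1  # assumed time between two tracking frames

def speed_from_frame_distance(dis_yards, frame_seconds=FRAME_SECONDS):
    """Average speed (yards per second) over one tracking frame."""
    return dis_yards / frame_seconds

# toy per-frame distances in yards, not real tracking data
for dis in [0.25, 0.5, 0.75]:
    print(dis, "->", speed_from_frame_distance(dis), "yd/s")
```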

Apart from that, a longer approach run also helps the kicker accelerate to a high speed before contacting the ball: the correlation between the two is clearly positive, **~0.55**.

In [None]:
# calculate Pearson's correlation
corr, _ = pearsonr(kickoff_move_player_data['kicker_s_50'], kickoff_move_player_data['kicker_approach_dis'])
print('The correlation between foot speed and the approach distance: %.3f' % corr)

# Variable 2 - Body orientation <a id="orientation"></a>
Another interesting finding is that a ball kicked to the right usually travels a shorter distance than one kicked to the left or to the middle. 

In [None]:
# merge the pff data with play and tracking data
kickoff_pff_data = pd.merge(
    kickoff_data,
    pff_data,
    how="inner",
    left_on=["gameId","playId"],
    right_on=["gameId","playId"],
    sort=True,
    suffixes=("_x", "_y"),
    copy=True,
    indicator=False,
    validate=None,
    )

# select relevant columns (copy so that the assignments below do not act on a view)
kickoff_direction_length = kickoff_pff_data[['season','uniqueId','specialTeamsResult','kickLength','kickDirectionActual']].copy()
# turn the season data type from int to str
kickoff_direction_length['season'] = kickoff_direction_length['season'].astype(str)
# sorting the L,C,R values in the kickDirectionActual column
kickoff_direction_length.kickDirectionActual = pd.Categorical(kickoff_direction_length.kickDirectionActual, 
                      categories=["L","C","R"],
                      ordered=True)

# replace values into full
kickoff_direction_length['kickDirectionActual'] = kickoff_direction_length['kickDirectionActual'].replace('C', 'Central')
kickoff_direction_length['kickDirectionActual'] = kickoff_direction_length['kickDirectionActual'].replace('L', 'Left')
kickoff_direction_length['kickDirectionActual'] = kickoff_direction_length['kickDirectionActual'].replace('R', 'Right')

# get data on number of plays per direction of a kicking ball in the three seasons
kickoff_direction_length_aggr = kickoff_direction_length.groupby(['season','kickDirectionActual']).nunique().reset_index()[['season','kickDirectionActual','uniqueId']]
kickoff_direction_length_aggr['total'] = kickoff_direction_length_aggr.groupby('season')['uniqueId'].transform('sum')
kickoff_direction_length_aggr['share_total'] = round(kickoff_direction_length_aggr['uniqueId'] * 100 / kickoff_direction_length_aggr['total'],1)
# median kick length per season and direction (select the numeric column before aggregating)
kickoff_direction_length_median = kickoff_direction_length.groupby(['season','kickDirectionActual'])['kickLength'].median().reset_index()

# visualization of the kick direction data
fig = make_subplots(rows=2, cols=1,
                    shared_xaxes=True,
                    vertical_spacing=0.02)

directions = kickoff_direction_length_aggr['kickDirectionActual'].unique()
box_colors = ['#013369',
          '#d50a0a',
         '#0264ce']
count = 0
for d in directions:
    dataset_1 = kickoff_direction_length.loc[kickoff_direction_length['kickDirectionActual'] == d]
    
    fig.add_trace(go.Box(x=dataset_1['season'], 
                             y=dataset_1['kickLength'],
                         marker = dict(color = box_colors[count]),
                        pointpos = 0,
                        #line = dict(color = 'rgba(0,0,0,0)'),
                     #fillcolor = 'rgba(0,0,0,0)',
                     boxpoints='all',
                     name = d,
                     hovertemplate =
                        '<i>Kickoff Length (yrd)</i>: %{y}',
                         #hoverinfo = 'skip',
                     jitter=0.5,
                     showlegend=False,
                     marker_size=2),
              row=1, col=1)
    
    dataset_2 = kickoff_direction_length_aggr.loc[kickoff_direction_length_aggr['kickDirectionActual'] == d]
    #dataset_3 = kickoff_direction_length_median.loc[kickoff_direction_length_median['kickDirectionActual'] == d]
    fig.add_trace(go.Bar(x=dataset_2['season'], 
                         y=dataset_2['share_total'],
                         marker_color=box_colors[count],
                         name = d,
                         text = d,
                         showlegend=True,
                         hovertemplate =
                        '<i>% of Kickoffs</i>: %{y}',
                    ),
              row=2, col=1)

    count += 1



fig.update_layout(title_text="Number of kickoffs per direction and kick length scatter plot from 2018 to 2020<br><sup>Most kicks fly in a central direction. </sup>" 
                  + "<sup>Kicks to the left are more common than kicks to the right.</sup>",
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(
                            family=fonts[0],
                            size=14
                  ),
                  xaxis_tickfont_size=14,
                  yaxis2=dict(
                          title='% of Kickoffs',
                          titlefont_size=12,
                          tickfont_size=12,
                    ),
                  yaxis=dict(
                          title='Length (yrd)',
                          titlefont_size=12,
                          tickfont_size=12,
                    ),
                  hoverlabel=dict(
                    bgcolor="white",
                    font_size=14,
                    font_family="Rockwell"
                    ),
                  #showlegend=False,
                  bargap=0.05,
                  bargroupgap=0.05,
                  boxmode='group',
                  boxgap = 0.05,
                  legend_title_text='Ball Direction'
                  #barmode='stack'
                 )

fig.show()

The hypothesis is that, as in the general population, most players are right-footed ([6](https://www.psychologytoday.com/us/blog/the-asymmetric-brain/202002/5-scientific-facts-about-left-footedness#:~:text=Most%20people%20are%20right%2Dfooted,et%20al.%2C%202020)), and they most likely position themselves behind and to the left of the ball before the approach.

This is because players need to plant their support leg, most likely the left leg, on the left side near the ball ([7](https://ftvs.cuni.cz/FTVS-2332-version1-the_biomechanics_of_kicking_in_soccer_a_review.pdf)). If they approached from the right side, the route would be more curved and the kicking action awkward.

In [None]:
# kicker's offset from the ball, using the positions recorded at the kickoff event
kickoff_move_player_data['diff_x'] = kickoff_move_player_data['kicker_x'] - kickoff_move_player_data['ball_x']
kickoff_move_player_data['diff_y'] = kickoff_move_player_data['kicker_y'] - kickoff_move_player_data['ball_y']
# keep only kicks where the kicker is within 1.5 yards of the ball on both axes
kickoff_move_player_data = kickoff_move_player_data.loc[kickoff_move_player_data['diff_x'].between(-1.5, 1.5)
                                                        & kickoff_move_player_data['diff_y'].between(-1.5, 1.5)]

# visualization
# fig = make_subplots(rows=1, cols=2)

fig = go.Figure(data=go.Scatter(x=kickoff_move_player_data.loc[kickoff_move_player_data['playDirection'] == 'left']['diff_x'], 
                                y=kickoff_move_player_data.loc[kickoff_move_player_data['playDirection'] == 'left']['diff_y'], 
                                mode='markers',
                                name = 'Left',
                                marker_color = ['#013369','#d50a0a','#0264ce'][0]
                               ))

fig.add_trace(go.Scatter(x=kickoff_move_player_data.loc[kickoff_move_player_data['playDirection'] == 'right']['diff_x'], 
                                y=kickoff_move_player_data.loc[kickoff_move_player_data['playDirection'] == 'right']['diff_y'], 
                                mode='markers',
                                name = 'Right',
                                marker_color = ['#013369','#d50a0a','#0264ce'][1]
                               ))
# add the ball
fig.add_shape(type="circle",
    xref="x", yref="y",
    x0=0.02, y0=0.05,
    x1=-0.02, y1=-0.05,
    opacity=0.75,
    fillcolor="orange",
    line_color="orange",
)

fig.add_annotation(x=0.01, y=0.015,
            text="Ball",
            showarrow=True,
            arrowhead=1)

# left back side
fig.add_shape(type="circle",
    xref="x", yref="y",
    x0=0.05, y0=0.5,
    x1=-1.45, y1=1.45,
    opacity=0.75,
    #fillcolor=False,
    line_color="orange",
)

fig.add_shape(type="circle",
    xref="x", yref="y",
    x0=0.05, y0=-0.5,
    x1=1.5, y1=-1.4,
    opacity=0.75,
    #fillcolor=False,
    line_color="orange",
)

fig.update_layout(title_text="Position of the kicker before approaching the ball<br><sup>Kickers most likely stand behind and to the left of the ball.</sup>",
                  paper_bgcolor='rgba(0,0,0,0)',
                  plot_bgcolor='rgba(0,0,0,0)',
                  font=dict(
                            family=fonts[0],
                            size=14
                  ),
                  xaxis_tickfont_size=14,
                  xaxis=dict(
                          title='horizontal distance between ball and kicker',
                          titlefont_size=12,
                          tickfont_size=12,
                    ),
                  yaxis=dict(
                          title='vertical distance between ball and kicker',
                          titlefont_size=12,
                          tickfont_size=12,
                    ),
                  hoverlabel=dict(
                    bgcolor="white",
                    font_size=14,
                    font_family="Rockwell"
                    ),
                  #showlegend=False,
                  legend_title_text='Play Direction'
                  #barmode='stack'
                 )


fig.show()

In this case, if the player wants to kick the ball to his right, his body needs to pivot through a larger angle, which is consistent with our data as shown below.

In [None]:
# median kick length and kicker orientation per play direction and kick direction
# (select the numeric columns before aggregating)
kickoff_move_player_data_o = kickoff_move_player_data.groupby(['playDirection','kickDirectionActual'])[['kickLength','kicker_o']].median().reset_index()
# replace values into full
kickoff_move_player_data_o["kickDirectionActual"] = kickoff_move_player_data_o.kickDirectionActual.replace({1:"R", 2 : "L", 3:"C"})
kickoff_move_player_data_o['kickDirectionActual'] = kickoff_move_player_data_o['kickDirectionActual'].replace('C', 'Central')
kickoff_move_player_data_o['kickDirectionActual'] = kickoff_move_player_data_o['kickDirectionActual'].replace('L', 'Left')
kickoff_move_player_data_o['kickDirectionActual'] = kickoff_move_player_data_o['kickDirectionActual'].replace('R', 'Right')

# get the difference in degrees between left and central; left and right
kickoff_move_player_data_o_pivoted = pd.pivot_table(kickoff_move_player_data_o, values = 'kicker_o', index=['playDirection'], columns = 'kickDirectionActual').reset_index()
kickoff_move_player_data_o_pivoted['Right_Central'] = round(kickoff_move_player_data_o_pivoted['Right'] - kickoff_move_player_data_o_pivoted['Central'],1)
kickoff_move_player_data_o_pivoted['Right_Left'] = round(kickoff_move_player_data_o_pivoted['Right'] - kickoff_move_player_data_o_pivoted['Left'],1)

left_right_to_central = kickoff_move_player_data_o_pivoted['Right_Central'][0]
left_right_to_left = kickoff_move_player_data_o_pivoted['Right_Left'][0]
right_right_to_central = kickoff_move_player_data_o_pivoted['Right_Central'][1]
right_right_to_left = kickoff_move_player_data_o_pivoted['Right_Left'][1]

print("Kicking the ball to the right side of the field needs to orient the kicker's body by ")
print("    - " + str(left_right_to_central) + " more degrees than kicking to the central, and " + str(left_right_to_left) + " more degrees than kicking to the left, if the play direction is to the left.")
print("    - " + str(right_right_to_central) + " more degrees than kicking to the central, and " + str(right_right_to_left) + " more degrees than kicking to the left, if the play direction is to the right.")
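
One detail worth keeping in mind when subtracting orientations: angles wrap around, so a plain subtraction can be misleading when two orientations lie on either side of the 0/360 boundary. The sketch below shows a wrap-aware difference, assuming the orientation is measured in degrees from 0 to 360; the function name `signed_angle_difference` is hypothetical and this is not the calculation used in the cell above.

```python
def signed_angle_difference(angle_a, angle_b):
    """Smallest signed difference angle_a - angle_b in degrees, in [-180, 180)."""
    return (angle_a - angle_b + 180.0) % 360.0 - 180.0

# toy orientations in degrees, not medians from the tracking data
print(signed_angle_difference(10, 350))   # wraps across the 0/360 boundary
print(signed_angle_difference(90, 45))    # no wrap needed
```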


This larger angle absorbs some of the body's energy, so less energy is transferred to the ball.

# Conclusion and future research <a id="conclusion"></a>

* This notebook suggests taking the kick length as a valid metric for evaluating the performance of the kicking team on a kickoff play. The data also reveal two potential drivers of a long kick length:
    * foot speed
    * kicking direction
* Adding new tracking measures could benefit research at a more granular level. To be more specific, the current competition doesn't provide data on the movement of kickers' toes, knees, hips, support legs, shoulders and feet.

# Reference <a id="reference"></a>

1. [The definition of special teams](https://en.wikipedia.org/wiki/American_football_positions#Special_teams)
2. [The definition of kickoff play](https://operations.nfl.com/learn-the-game/nfl-basics/terms-glossary/)
3. [Another tweak to the kickoff rule promotes more touchbacks](https://profootballtalk.nbcsports.com/2018/07/06/another-tweak-to-the-kickoff-rule-promotes-more-touchbacks/)
4. [The formula of kinetic energy](https://en.wikipedia.org/wiki/Kinetic_energy)
5. [Biomechanical considerations of distance kicking in Australian Rules football](https://acephysed.files.wordpress.com/2015/01/atricle-1-reference-list.pdf)
6. [5 Scientific Facts About Left-Footedness](https://www.psychologytoday.com/us/blog/the-asymmetric-brain/202002/5-scientific-facts-about-left-footedness)
7. [The biomechanics of kicking in soccer: A review](https://ftvs.cuni.cz/FTVS-2332-version1-the_biomechanics_of_kicking_in_soccer_a_review.pdf)