In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from joblib import dump, load

# Field goals: total accuracy over expectation and attempts per kicker.
PATH = '../input/model/FG.csv'
FG = pd.read_csv(PATH, delimiter='\t')
fg = FG.groupby('name')[['FGAOE', 'fg_attempt']].sum()
fg = fg.loc[fg.fg_attempt >= 15]  # keep kickers with at least 15 FG attempts
fg['avg_FGAOE'] = fg['FGAOE'] / fg['fg_attempt']

# Extra points: same aggregation, without a minimum-attempt filter.
PATH = '../input/model/XP.csv'
XP = pd.read_csv(PATH, delimiter='\t')
xp = XP.groupby('name')[['FGAOE', 'xp_attempt']].sum()
xp['avg_FGAOE'] = xp['FGAOE'] / xp['xp_attempt']

KA_model = load('../input/model/expected_kick_accuracy_model.joblib') 
dump(KA_model, '/kaggle/working/expected_kick_accuracy_model.joblib') 

# Kick Accuracy - Beyond Field Goal Percentage 👟 🏈

## Introduction

Place kicker accuracy has traditionally been measured by field goal percentage. It's a simple outcome-based metric that treats all makes equally and all misses equally. However, not all field goal attempts are created equal. The biggest factors that influence a kick are distance and weather. The goal of this effort was to account for these factors in order to gain more insight into place kicker accuracy.

## Deviation from Center 🎯

First, I establish a new method to measure kick accuracy. I make one important assumption in this work: that a place kicker is always aiming for the center of the goal posts. Any deviation from the center is therefore unintentional and due to inaccuracy. Under this assumption, deviation from center can serve as a measure of accuracy. It can be computed from the ball tracking data provided in this dataset.
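
A minimal sketch of how such a deviation could be computed from a ball track. This is not the exact code behind the model. The coordinate convention (a 120 x 53.3 yard field with goal posts on the end lines), the default `post_x`, and the function name are all assumptions made for illustration.

```python
import numpy as np

# Assumed coordinate convention (not stated in the write-up): the field is
# 120 x 53.3 yards, x runs along its length, and the goal posts stand on the
# end lines at x = 0 and x = 120.
FIELD_WIDTH_YD = 53.3
CENTER_Y = FIELD_WIDTH_YD / 2


def deviation_from_center(ball_x, ball_y, post_x=120.0):
    """Return |y - center| where the ball track crosses the goal-post plane.

    ball_x, ball_y: ball positions in yards, ordered in time from the kick on.
    post_x: x coordinate of the targeted goal posts (hypothetical default).
    """
    ball_x = np.asarray(ball_x, dtype=float)
    ball_y = np.asarray(ball_y, dtype=float)
    # Signed distance to the plane; a sign change marks the crossing.
    gap = ball_x - post_x
    crossing = np.where(np.sign(gap[:-1]) != np.sign(gap[1:]))[0]
    if crossing.size == 0:
        return np.nan  # the track never reaches the posts
    i = crossing[0]
    # Linear interpolation between the two frames around the crossing.
    t = (post_x - ball_x[i]) / (ball_x[i + 1] - ball_x[i])
    y_at_posts = ball_y[i] + t * (ball_y[i + 1] - ball_y[i])
    return abs(y_at_posts - CENTER_Y)


# Toy track, not a real kick.
toy_dev = deviation_from_center([100, 110, 118, 122], [26.65, 27.0, 27.5, 28.5])
```

In the toy track the ball crosses the plane halfway between the last two frames, so the deviation comes out at about 1.35 yards.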

## Expected Accuracy 🏹

Now that we've established an accuracy metric that goes beyond make / miss, we can train an expected accuracy model. It is intuitive that longer field goals in wind, rain or cold are more difficult, so the deviation from center might be higher. On the other hand (or should I say foot, hahaha), shorter field goals in a dome with no wind are easier, so the deviation from center might be lower.

I used second-order polynomial regression to train an expected accuracy model on the following features:
1. X position of ball at time of kick
2. Y position of ball at time of kick
3. Average wind speed for the game
4. Average temperature for the game
5. Average precipitation for the game
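
A second-order polynomial regression on these five features could be set up roughly as follows with scikit-learn. This is a sketch on synthetic stand-in data, not the trained model from this notebook. The value ranges, the made-up target, and the variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data, NOT the competition data. Columns follow the
# feature list: kick x, kick y, wind speed, temperature, precipitation.
X = np.column_stack([
    rng.uniform(60, 110, n),   # x position at kick (assumed range, yards)
    rng.uniform(20, 33, n),    # y position at kick (assumed range, yards)
    rng.uniform(0, 20, n),     # average wind speed
    rng.uniform(20, 90, n),    # average temperature
    rng.uniform(0, 0.3, n),    # average precipitation
])
# Made-up target: deviation grows with distance from posts assumed at x = 120.
distance = 120 - X[:, 0]
y = 0.5 + 0.002 * distance**2 + rng.normal(0, 0.3, n)

# degree=2 expands the 5 inputs into all terms up to second order
# (bias, linear, squares and pairwise products) before the linear fit.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
expected_deviation = model.predict(X)
```

The pipeline keeps the feature expansion and the linear fit together, so the same object can be saved with `joblib.dump` and reused for prediction.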

Games played in a dome or under a retractable roof were assumed to have zero wind speed, zero precipitation, and a temperature of 70 degrees.
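
The indoor defaults can be applied with a boolean mask in pandas. The column names and roof labels below are hypothetical and only illustrate the rule.

```python
import pandas as pd

# Hypothetical column names and values, for illustration only.
games = pd.DataFrame({
    "roof": ["outdoor", "dome", "retractable", "outdoor"],
    "wind_speed": [12.0, None, 5.0, 3.0],
    "temperature": [38.0, None, 55.0, 71.0],
    "precipitation": [0.1, None, 0.0, 0.0],
})

# Indoor games get no wind, no precipitation, and 70 degrees.
indoor = games["roof"].isin(["dome", "retractable"])
games.loc[indoor, ["wind_speed", "precipitation"]] = 0.0
games.loc[indoor, "temperature"] = 70.0
```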

The [weather data](https://github.com/ThompsonJamesBliss/WeatherData) is publicly available, thanks to Tom Bliss.

## Kick Accuracy Over Expectation

Once we've established an expected accuracy model, we can use the difference between the *actual* kick deviation from center and the *expected* deviation from center to create a new contextualized kick accuracy metric - Kick Accuracy Over Expectation (KAOE). The metric can be applied to both field goals and extra points.
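
As a sketch, the per-kick metric and a kicker's average might look like this. The sign convention (positive means closer to center than expected) is an assumption inferred from the plotting cell, which flips the sign of the stored averages. The function name and the numbers are made up.

```python
import numpy as np


def kick_accuracy_over_expectation(actual_dev, expected_dev):
    """Positive values mean the kick landed closer to center than expected."""
    actual_dev = np.asarray(actual_dev, dtype=float)
    expected_dev = np.asarray(expected_dev, dtype=float)
    return expected_dev - actual_dev


# Toy deviations in yards, not real kicks.
actual = [0.5, 2.0, 1.0]
expected = [1.0, 1.5, 1.0]
kaoe = kick_accuracy_over_expectation(actual, expected)
avg_kaoe = kaoe.mean()  # per-kicker average, analogous to total / attempts
```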

In [None]:
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(13, 11))

# Kickers not on an NFL roster in 2021 (plotted as red X markers).
Unsigned = ['Josh Lambo', 'Stephen Hauschka', 'Eddy Pineiro', 'Dan Bailey', 'Stephen Gostkowski', 'Cody Parkey', 'Mike Nugent', 'Matt Bryant', 'Aldrick Rosas', 'Adam Vinatieri', 'Kai Forbath', 'Brett Maher', 'Sebastian Janikowski', 'Chandler Catanzaro']

for P in fg.index.values:

    name = P

    # Only plot kickers who have both field goal and extra point data.
    if P not in xp.index.values:
        continue

    # Flip the sign so that more accurate kickers get positive values.
    A = -1 * fg.loc[P, 'avg_FGAOE']
    B = -1 * xp.loc[P, 'avg_FGAOE']

    S = 150
    color = 'blue'
    marker = 'o'
    fillcolor = 'none'

    if name in Unsigned:
        marker = 'X'
        fillcolor = 'red'
        color = 'none'
    # Kickers with AP All-Pro honors get a green outline.
    if name in ("Jason Sanders", "Josh Lambo", "Aldrick Rosas", "Justin Tucker"):
        color = 'green'

    plt.scatter(A, B, s=S, marker=marker, alpha=0.7, linewidth=2.5, color=fillcolor, edgecolors=color)

    plt.annotate(name.split()[1],
                 (A, B),
                 textcoords="offset points",
                 xytext=(0,12),
                 ha='center',
                 color='black',
                 fontsize=12,
                 alpha=0.7)

font_color = 'black'
background_color = 'white'#'#f1e7da'

plt.suptitle('Kicker Accuracy Over Expectation', fontsize=32, color=font_color)
plt.title('2018-2020 seasons. Minimum 15 FG attempts\n', fontsize=18, color=font_color)
plt.xlabel('Avg FG KAOE', fontsize=18, color=font_color)
plt.ylabel('Avg XP KAOE', fontsize=18, color=font_color)
ax.tick_params(axis='both', which='major', labelsize=15, color=background_color, labelcolor=font_color)
plt.grid(color=font_color, linestyle='--', linewidth=1, alpha=0.15)

plt.figtext(.75,.03,'Plot: @Pavel_Vab\nData: @BDB2022',fontsize=15, color=font_color)

fig.patch.set_facecolor(background_color) #f6f0e6
ax.set_facecolor(background_color) #f6f0e6
ax.spines['bottom'].set_color(background_color)
ax.spines['top'].set_color(background_color)
ax.spines['left'].set_color(background_color)
ax.spines['right'].set_color(background_color)
ax.set_aspect(1)

plt.axvline(0, color='red', linestyle='--', linewidth=2, alpha=0.2, zorder=0) # x = 0
plt.axhline(0, color='red', linestyle='--', linewidth=2, alpha=0.2, zorder=0) # y = 0

plt.show()

*Figure 1: Average kick accuracy over expectation. Green: AP All-Pro honors. Blue: on an NFL roster. Red X: currently not on an NFL roster, or retired.*

## Who is the most accurate kicker of them all? 🏆

To nobody's surprise, **Justin Tucker** 🥇. He has been widely considered the best kicker in the NFL over the past several years, and this metric supports that. From 2018-2020, Tucker was 1st in KAOE on extra points and 2nd in KAOE on field goals. Tucker was named to the AP All-Pro first team in 2018 and 2019, and to the second team in 2020.

**Josh Lambo** 🥈, who I will point out is currently not on an NFL roster, actually beat out Tucker for 1st place in FG KAOE, but fell short in XP KAOE. Lambo was named to the AP All-Pro second team in 2019. The other All-Pro kickers were Jason Sanders (1st team, 2020) and Aldrick Rosas (2nd team, 2018).

Inaccurate kickers do not last long in the NFL. The red X's represent kickers who were no longer on an NFL roster in 2021, and 12 of the 14 had a negative FG KAOE from 2018-2020.

## Does pre-snap motion affect the kicker?

No. Since player tracking data is available, I explored a few other features as inputs for the model, but they unfortunately did not have predictive power.  One of those features was pre-snap motion by the defending special teams.  I used average and max velocity within 1 second of the snap to see if pre-snap motion had any impact on the kicker's accuracy.  It did not.  I really had hoped it would because this would be easy and actionable information for defensive special teams.  
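
A feature of this kind could be computed along these lines. This is a sketch, not the code used here. It assumes a 10 Hz tracking feed, takes the window to be the second leading up to the snap, and uses hypothetical column names (`frameId`, `event`, `is_defense`, `s`) and a hypothetical `ball_snap` event label.

```python
import pandas as pd


def presnap_speed_features(tracking, window_s=1.0, frame_rate_hz=10):
    """Average and max defender speed in the window before the snap.

    tracking: one play, one row per player per frame, with hypothetical
    columns 'frameId', 'event', 'is_defense' (bool) and 's' (speed).
    """
    snap_frame = tracking.loc[tracking["event"] == "ball_snap", "frameId"].min()
    first_frame = snap_frame - int(window_s * frame_rate_hz)
    in_window = tracking["frameId"].between(first_frame, snap_frame)
    speeds = tracking.loc[in_window & tracking["is_defense"], "s"]
    return {"avg_speed": speeds.mean(), "max_speed": speeds.max()}


# Toy play: one defender and one kicking-team player over three frames.
toy = pd.DataFrame({
    "frameId": [1, 1, 12, 12, 20, 20],
    "event": ["None", "None", "None", "None", "ball_snap", "ball_snap"],
    "is_defense": [True, False, True, False, True, False],
    "s": [9.0, 0.0, 1.0, 0.0, 3.0, 0.5],
})
features = presnap_speed_features(toy)
```

In the toy play only frames 12 and 20 fall inside the window, so the fast movement at frame 1 is ignored.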

The other feature I explored was the minimum distance from any defender to the ball at the time of the kick. Does applying pressure to the kicker, without blocking the kick, impact the kicker's accuracy? No, unfortunately this feature also did not have predictive power.
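
The pressure feature reduces to a nearest-defender distance at a single frame. A minimal sketch, with hypothetical column names (`x`, `y`, `is_defense`) and toy positions:

```python
import numpy as np
import pandas as pd


def min_defender_distance(frame, ball_x, ball_y):
    """Smallest distance from any defender to the ball in a single frame.

    frame: player rows for the frame of the kick, with hypothetical
    columns 'x', 'y' and 'is_defense' (bool).
    """
    defenders = frame[frame["is_defense"]]
    dist = np.hypot(defenders["x"] - ball_x, defenders["y"] - ball_y)
    return dist.min()


# Toy frame with the ball at the origin, not real tracking data.
kick_frame = pd.DataFrame({
    "x": [3.0, 6.0, 1.0],
    "y": [4.0, 8.0, 0.0],
    "is_defense": [True, True, False],
})
pressure = min_defender_distance(kick_frame, ball_x=0.0, ball_y=0.0)
```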

## Limitations of study

The lack of a Z-component for the ball tracking means that kicks that were short misses were not penalized by this metric.  To my surprise, the three weather features had very little predictive power in this model, but I left them in regardless. However, I remain convinced that weather plays a factor in kick accuracy, and that more precise weather data (i.e. at the time of the kick as opposed to game average) would be valuable.

There is also a degree of error in the tracking data, which I discovered when plotting Justin Tucker's kick trajectories. In week 7 of the 2018 season, Tucker uncharacteristically missed an extra point attempt against the Saints which would have sent the game into overtime. The video evidence shows a clear miss, but the tracking data for that play suggests that Tucker made the extra point. The image below displays all of Tucker's XP trajectories from the 2018 season. His lone miss appears to pass inside the right upright.

![](https://i.imgur.com/NDvSHam.png)
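
One way to flag such discrepancies is to compare a kick's lateral offset from the center of the posts with half the gap between the uprights (18 feet 6 inches in the NFL) and then check the result against the recorded outcome. The sketch below ignores the ball's width and height, and the function name is made up.

```python
# NFL uprights are 18 ft 6 in apart; tracking coordinates are in yards.
UPRIGHT_GAP_YD = 18.5 / 3


def passes_between_uprights(deviation_yd):
    """True if a lateral offset from center falls inside the uprights."""
    return abs(deviation_yd) < UPRIGHT_GAP_YD / 2


inside = passes_between_uprights(1.0)
outside = passes_between_uprights(3.5)
```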

## Conclusion

I trained a simple model that uses ball tracking data to contextualize and better measure place kicker accuracy, producing a new metric: kick accuracy over expectation (KAOE). This method and metric can be used by teams to help decide on a kicker during kicking competitions in training camp. Yes, both kickers may have tied and gone 10/10 on their field goals in practice, but KAOE will tell you which one of them was more accurate. I also expect this work will lead to Josh Lambo's future employment as an NFL place kicker as soon as he gets healthy.

Python | scikit-learn | Matplotlib