In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

(Disclaimer: I work for Pro Football Focus, one of the data providers for this competition, and am therefore not eligible for the prize.)

Punting is a huge part of the special teams game in the NFL, with about four punts per team per game. However, the strategy behind punts, especially the best decisions for the punter and the returner, is not well understood. This notebook discusses the following questions:

1. What is the best location for a punt to land?
2. How much does execution error affect the result?
3. How can data be used to evaluate returner performance?

Generally speaking, as the punt lands there are two decisions a returner can make:
1. Field the ball, either for a fair catch or to return it for yardage
2. Let the ball bounce

Because an American football is oval rather than spherical, the path of its bounce is much less predictable than in sports with a spherical ball such as soccer or tennis, so a model needs to capture the uncertainty of the bounce. A multi-layer perceptron model is therefore used for this purpose.

By fitting a bivariate normal distribution conditioned on the x-y locations of the punter, the returner, and the spot where the ball lands, we can train a model, sample from its output distribution of ball locations, and build more detailed analysis on top of it.

$$PuntBounceYard \sim Normal(xy_{punter},xy_{returner},xy_{punt})$$

The yards the ball has bounced are measured as the difference between the yard line where the ball lands and the yard line where its velocity and acceleration fall below 1 $ms^{-1}$ and 1 $ms^{-2}$, respectively.

![](https://imgur.com/EJyakEr.png)
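The stopping rule described above can be sketched in code. This is a minimal illustration rather than the notebook's implementation: the function name `bounce_yards`, the per-frame arrays, and the landing-frame index are all hypothetical, and the threshold of 1 is applied in whatever units the tracking data uses.

```python
import numpy as np

def bounce_yards(x, speed, accel, landing_idx, threshold=1.0):
    """Yards travelled along the field between landing and coming to rest.

    x, speed, accel: per-frame values for the football (hypothetical layout).
    landing_idx: index of the frame where the punt lands (assumed known).
    Returns None if the ball never drops below both thresholds.
    """
    x = np.asarray(x, dtype=float)
    speed = np.asarray(speed, dtype=float)
    accel = np.asarray(accel, dtype=float)

    # Frames after landing where both speed and acceleration are below 1
    at_rest = (speed[landing_idx:] < threshold) & (np.abs(accel[landing_idx:]) < threshold)
    if not at_rest.any():
        return None

    stop_idx = landing_idx + int(np.argmax(at_rest))  # first frame at rest
    return float(x[stop_idx] - x[landing_idx])

# Toy frames, not real tracking data
x = [60.0, 62.0, 64.0, 65.5, 66.0, 66.0]
speed = [20.0, 12.0, 6.0, 3.0, 0.5, 0.0]
accel = [5.0, 4.0, 3.0, 2.0, 0.8, 0.0]
print(bounce_yards(x, speed, accel, landing_idx=1))
```

The same difference could be taken on the lateral coordinate as well to get a two-dimensional bounce displacement.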

Then we can calculate the opponent's average starting yard line if the returner lets the ball bounce. For every 1x1 yard grid cell on the field, 10,000 samples of the yard line where the ball eventually stops are drawn. From these we calculate the probability of a touchback, which puts the ball on the 20-yard line, and of the ball bouncing out of bounds, which puts the ball at the yard line where it crosses the boundary. Since on most long-distance kicks punters aim for power rather than accuracy, all models below assume the punt happens at the 50-yard line, where punters aim to maximize accuracy. The result is shown below:

![](https://imgur.com/Qw0qi12.png)
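The per-cell calculation described above can be sketched as a small Monte Carlo routine. This is an illustration under stated assumptions, not the notebook's code: `x` is taken as yards from the receiving team's goal line, `y` as yards from one sideline, the landing spot is assumed to be in the field of play, and the `mean` and `cov` of the bounce displacement are hypothetical stand-ins for the network's output.

```python
import numpy as np

FIELD_WIDTH = 160 / 3        # yards between the sidelines
TOUCHBACK_YARDLINE = 20.0    # ball is placed here after a touchback

def expected_start_yardline(landing_xy, mean, cov, n_samples=10_000, seed=0):
    """Average starting yard line when the ball is left to bounce.

    Returns (average yard line, touchback rate, out-of-bounds rate).
    """
    rng = np.random.default_rng(seed)
    x0, y0 = landing_xy
    d = rng.multivariate_normal(mean, cov, size=n_samples)
    dx, dy = d[:, 0], d[:, 1]

    # Fraction of the straight-line bounce at which the ball would cross
    # the goal line (x = 0) or a sideline (y = 0 or y = FIELD_WIDTH)
    with np.errstate(divide="ignore", invalid="ignore"):
        t_goal = np.where(dx < 0, -x0 / dx, np.inf)
        t_side = np.where(dy > 0, (FIELD_WIDTH - y0) / dy,
                          np.where(dy < 0, -y0 / dy, np.inf))

    touchback = (t_goal <= 1) & (t_goal <= t_side)
    out_of_bounds = (t_side <= 1) & (t_side < t_goal)

    x_at_sideline = x0 + np.minimum(t_side, 1.0) * dx
    yardline = np.where(touchback, TOUCHBACK_YARDLINE,
                        np.where(out_of_bounds, x_at_sideline, x0 + dx))
    return float(yardline.mean()), float(touchback.mean()), float(out_of_bounds.mean())

# Hypothetical bounce distribution for a punt landing at the 10-yard line, mid-field
avg, tb_rate, oob_rate = expected_start_yardline(
    landing_xy=(10.0, FIELD_WIDTH / 2),
    mean=[-8.0, 0.0],
    cov=[[36.0, 0.0], [0.0, 16.0]],
)
print(avg, tb_rate, oob_rate)
```

Repeating this call for every 1x1 yard cell, with the distribution predicted for that cell, would fill in a map like the one above.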

The darkest color indicates the lowest starting yard line, which benefits the punting team. Given that the returner lets the ball bounce, the optimal landing spot is at about the 10-yard line, or out toward the sideline, where the ball is most likely to bounce out of bounds.

After establishing how the ball bounces, the next step is to find how often the returner lets the ball bounce, from the punter's perspective. While it is possible to use a gradient boosting model with similar features as before (location of the punter, the returner, etc.), a tree-based model often does not generate output that is as easy to interpret as that of a neural network with similar performance. Therefore, a simple multi-layer perceptron model is used here to predict return probability and the other quantities.

![](https://imgur.com/VQUrx6V.png)

![](https://imgur.com/nGToQRO.png)

For a complete picture, fair catch probability and average return yards (including penalty yards from both sides) are also considered.

$$NetReturnYard = ReturnYard-NetPenaltyYardByReceiving$$

![](https://imgur.com/NCyHCK1.png)

$$fairCatchProbability = P(fairCatch|FairCatch,Return,Muffed)$$

![](https://imgur.com/AxmJRrI.png)

With all the information above, the value of an average return is

$$ CatchProbability \cdot (fairCatchProbability + (1-fairCatchProbability) \cdot NetReturnYard) + catchYardline $$

The difference between catching the ball and letting it bounce can then be found:

![](https://imgur.com/4It67kx.png)

As seen above, the cutoff point is at about the 5-yard line: if the punt lands deeper than that, it is better to let the ball bounce instead of catching it. Assuming the returner makes the optimal decision, here is the average starting yard line as a function of where the punt lands:

![](https://imgur.com/LaxpkLS.png)
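The optimal-decision step above amounts to comparing two value maps cell by cell. A minimal sketch, assuming both maps hold the receiving team's expected starting yard line (so higher is better for the returner); the function name and the toy numbers are hypothetical and are not taken from the models in this notebook:

```python
import numpy as np

def optimal_decision(catch_value, bounce_value):
    """Pick, per landing spot, the option with the better expected start.

    Both inputs are expected starting yard lines for the receiving team.
    Returns (should_catch mask, value under the optimal decision).
    """
    catch_value = np.asarray(catch_value, dtype=float)
    bounce_value = np.asarray(bounce_value, dtype=float)
    should_catch = catch_value >= bounce_value
    best_value = np.maximum(catch_value, bounce_value)
    return should_catch, best_value

# Toy values for four landing spots, deepest first (illustrative only)
catch_value = [4.0, 8.0, 14.0, 24.0]
bounce_value = [17.0, 12.0, 11.0, 18.0]
should_catch, best_value = optimal_decision(catch_value, bounce_value)
print(should_catch)
print(best_value)
```

From the punting team's side, the best landing spot is then the cell where `best_value` is lowest.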

It shows that the optimal punt location is near the sideline, which coincides with the traditional wisdom of the "coffin corner" punt. However, this assumes the punter executes the punt perfectly, while in real life there is some room for error. To address this, Gaussian random noise is added to the punt location to observe how the resulting yard line changes when the returner lets the ball bounce:

![](https://imgur.com/Qw0qi12.png)

![](https://imgur.com/d0Qkl7J.png)

![](https://imgur.com/VqPWPBg.png)
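The effect of execution error can be sketched by averaging a value function over noisy landing spots. This is only an illustration: the noise is assumed to be isotropic with a single `sigma` in yards, and `toy_value` is a hypothetical stand-in for the bounce model's value map (a touchback when the ball lands in the end zone, otherwise the landing yard line).

```python
import numpy as np

def value_with_execution_error(value_fn, target_xy, sigma, n_samples=10_000, seed=0):
    """Average starting yard line when the punt misses its target by Gaussian noise."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples, 2))
    landing = np.asarray(target_xy, dtype=float) + noise
    return float(np.mean(value_fn(landing[:, 0], landing[:, 1])))

def toy_value(x, y):
    # Hypothetical value map: x is yards from the goal line, y is ignored here
    return np.where(x <= 0, 20.0, x)

perfect = value_with_execution_error(toy_value, (3.0, 26.0), sigma=0.0)
noisy = value_with_execution_error(toy_value, (3.0, 26.0), sigma=5.0)
print(perfect, noisy)
```

With this toy map, aiming at the 3-yard line looks ideal without error, but once noise is added a share of punts reach the end zone and the average starting yard line rises, which is the same mechanism that erodes the corner punt.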

The larger the error, the more the advantage of the corner punt tends to disappear, since there is a higher chance of giving up a touchback. After accounting for returns, we can similarly calculate the average starting yard line, assuming the returner makes the optimal decision:

![](https://imgur.com/faHqvCR.png)

Compared with the optimal punt location without error shown before, it is unfavourable to punt to the corner. Aiming for the sideline or for the middle of the field gives much the same result, and aiming at about the 9-yard line is optimal.

Finally, we can use the above information to evaluate how returners perform. First of all, a really big factor for a returner is how often he muffs the ball, which becomes much more likely for returners who do not set their feet.

![](https://imgur.com/UPnlwRU.png)

The tracking data show that returner speed is correlated with muff risk: the higher his speed, the more likely he is to lose the ball. Given that a lost possession close to one's own end zone can give up ~80 yards of value, the risk is not worth the few yards given up by letting the ball bounce. We can measure how often returners make dangerous catches by defining a dangerous catch as one where the returner's speed is greater than 5 yards per second. Here are the top 10 returners with the highest dangerous catch rate (min. 20 returns):

In [None]:
# Top 10 returners by dangerous catch rate (min. 20 returns)
pd.DataFrame({
    'player': ['Tim White', 'Donovan Peoples-Jones', 'Mecole Hardman', 'Tyreek Hill', 'D.J. Moore',
               'CeeDee Lamb', 'Cooper Kupp', 'Micah Hyde', 'David Moore', 'Gunner Olszewski'],
    'dangerous catch rate': [0.266667, 0.200000, 0.183333, 0.166667, 0.157895,
                             0.153846, 0.142857, 0.142857, 0.142857, 0.138462],
}).set_index('player')

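A rate like the one in the table above could be computed along the following lines. This is a minimal sketch, not the notebook's code: the column names `player` and `speed_at_catch` and the toy rows are hypothetical, while the 5 yards per second threshold and the minimum number of returns come from the text.

```python
import pandas as pd

def dangerous_catch_rate(returns, speed_threshold=5.0, min_returns=20):
    """Share of each returner's catches made above the speed threshold."""
    dangerous = returns["speed_at_catch"] > speed_threshold
    summary = dangerous.groupby(returns["player"]).agg(["mean", "size"])
    summary = summary.rename(columns={"mean": "dangerous catch rate", "size": "returns"})
    # Keep only returners with enough returns, highest rate first
    summary = summary[summary["returns"] >= min_returns]
    return summary.sort_values("dangerous catch rate", ascending=False)

# Toy rows, not real tracking data
toy_returns = pd.DataFrame({
    "player": ["A", "A", "A", "A", "B", "B"],
    "speed_at_catch": [6.0, 2.0, 3.0, 7.0, 1.0, 1.0],
})
toy_rates = dangerous_catch_rate(toy_returns, min_returns=3)
print(toy_rates)
```

The minimum-returns filter matters because a returner with only a handful of catches can show an extreme rate by chance.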

Secondly, as shown above, punts that land inside the 5-yard line are likely to become touchbacks, and hence returners should not try to catch the ball. Here are the returners who caught the ball inside the 5-yard line more than once:

In [None]:
# Returners with more than one catch inside the 5-yard line
pd.DataFrame({
    'player': ['Nyheim Hines', 'Tarik Cohen', 'Tyreek Hill', 'JoJo Natson', 'Christian Kirk',
               'DaeSean Hamilton', "De'Anthony Thomas", 'Diontae Johnson', 'Adam Humphries',
               'Pharoh Cooper', 'Micah Hyde'],
    'Catch Inside 5 yards': [3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2],
}).set_index('player')


As discussed above, most returners are able to judge this optimally, and generally it is not an issue.

Finally, with the expected catch rate model established before, one can find how often a returner catches the ball over expected. Since in most situations even a fair catch gains a few more yards than letting the ball bounce, returners who make more catches are generally better, provided, of course, that this does not come at the cost of catching the ball when their feet are not set. Here are the top and bottom five returners by catch rate over expected:

In [None]:
# Bottom five (negative) and top five (positive) returners by catch rate over expected
pd.DataFrame({
    'player': ['Troymaine Pope', 'Ted Ginn', "Da'Mari Scott", 'Brandon Zylstra', 'K.J. Osborn',
               'D.J. Reed', 'Chad Beebe', 'CeeDee Lamb', 'Danny Amendola', 'DaeSean Hamilton'],
    'Catch rate over expected': [-0.346902, -0.321041, -0.319027, -0.279114, -0.264701,
                                 0.008246, 0.011463, 0.051884, 0.060763, 0.061008],
}).set_index('player')

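One plausible way to compute a catch rate over expected is to average, per returner, the difference between the observed outcome and the model's predicted catch probability. The sketch below assumes that definition; the column names `player`, `caught`, and `expected_catch_prob` and the toy rows are hypothetical.

```python
import pandas as pd

def catch_rate_over_expected(punts):
    """Mean of (caught - predicted catch probability) per returner."""
    diff = punts["caught"].astype(float) - punts["expected_catch_prob"]
    return diff.groupby(punts["player"]).mean().sort_values()

# Toy rows, not model output
toy_punts = pd.DataFrame({
    "player": ["A", "A", "B", "B"],
    "caught": [1, 1, 0, 1],
    "expected_catch_prob": [0.5, 0.7, 0.8, 0.6],
})
toy_croe = catch_rate_over_expected(toy_punts)
print(toy_croe)
```

A positive value means the returner caught more punts than the model expected for the situations he faced.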
I hope the above discussion answers some of the questions you may have regarding punts. The notebook with the code is available here: https://www.kaggle.com/s903124/punt-submission-notebook