![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/summary.jpg)

## [Summary Slides PDF](https://drive.google.com/open?id=1xl3C1QIme3scYFvwFgRW2mZ_IhwOHzqF)

# Introduction
Since 2002 alone, the NFL has made 50 rule changes intended to eliminate potentially dangerous tactics and reduce the risk of injuries.
Yet the number of concussions reported each year has not decreased. We are glad that player safety is getting more and more attention.
Big Data and Data Science were probably used to improve NFL athlete performance before the terms became *sexy*.
Now it's our turn to make punts safer!
  
Besides the challenging problem and dataset, the competition gave us a good excuse to watch as many games as possible.
Punts used to be far from our favorite part of the game, but as we watched them more closely we had to admit they [could be pretty cool](https://www.kaggle.com/gaborfodor/don-t-forget-to-watch-the-games)!
Blocked punts, fake punts, punt return TDs and even the lucky out-of-bounds kicks were exciting.

Unfortunately, injuries and concussions happen far more often than blocked punts or TDs.
Obviously we would not want to [eliminate punts](https://www.kaggle.com/c/NFL-Punt-Analytics-Competition/discussion/74599) or turn them into a ceremonial play.
However, things have to change.

## Low sample high diversity
We got detailed descriptions and video footage for 37 concussions that happened during more than 6600 punt plays.
Such a low number of events made the task more challenging.
While 6600 punt plays might feel like a lot, with such a low overall concussion probability you have to be careful not to overfit.
This is especially true when you crunch the data for a month: you will likely find patterns that are not significant.

### The tale of the left and right gunners
We found that 4 left gunners and 1 right gunner suffered concussions.
Does it really mean that being a left gunner is more dangerous?
It would be surprising, right? 
The truth is that we don't know. With only five cases, the difference is not statistically significant.
When we check all the collisions on the field we see symmetric distributions.

![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/CollisionCoords.png)
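
The left versus right gunner question can be checked with an exact two-sided binomial test. The sketch below is only an illustration, not the analysis from our kernels: it assumes that, under the null hypothesis, each concussed gunner is equally likely to be a left or a right gunner, and the helper name `binomial_two_sided_p` is made up for this example.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n trials.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # The small relative tolerance guards against floating point ties.
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))


# 4 of the 5 concussed gunners were left gunners (counts from the text above).
p_value = binomial_two_sided_p(k=4, n=5)
print(f"p-value = {p_value:.3f}")
```

A p-value this far above the usual 0.05 threshold is consistent with the statement that the 4 to 1 split could easily be chance.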

On the other hand we watched the injury videos over and over again to understand the dataset and the problem better.
![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/VideoInjuries.gif)
It became clear that no single rule fits all the incidents.
Many of the injuries were unfortunate accidents (e.g. friendly fire), and the injured players differed in activity (blocked, blocking, tackled, tackling), impact type (body or helmet) and role (returner, gunner, wing, guard, line).

We also hope that the recently introduced rule against using the helmet as a weapon has already reduced the most serious impacts.
We could not confirm this, though, as we don't have concussion results for the current season.

## External data
The competition provided only punt plays. However, punts are not isolated events; they are often the last desperate play after an exhausting drive.
How successful is the punt? That depends a lot on the next drive.

We used additional external data to better understand the game overall.
Special thanks to Maksim Horowitz for providing [detailed play data for seasons 2009-2017](https://www.kaggle.com/maxhorowitz/nflplaybyplay2009to2016/home) with more than 400,000 plays.
We find it crucial to use all available data to answer questions related to game integrity.

Since we had to deal with different data sources and sometimes noisy free-text fields, we often used available aggregated statistics as a sanity check.

## Methodology
* Review the collisions frame by frame
* Collect more data
* Understand relevant medical studies
* Calculate important features (speed, acceleration, collision) from NGS
* Derive insights from detailed play descriptions and overall statistics
* Check the current NFL and NCAA rules and recent modifications
* Consult with coaches
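
The feature calculation step can be illustrated with finite differences over the tracking samples. This is a minimal sketch rather than the code from our kernels: it assumes each NGS sample is a `(time_s, x, y)` tuple with positions in yards, and the function name `kinematics` and the toy track are hypothetical.

```python
from math import hypot


def kinematics(samples):
    """Return (speeds, accelerations) from time-ordered (time_s, x, y) samples.

    speeds[i] is the average speed between sample i and i + 1.
    accelerations[i] is the change in speed between consecutive intervals,
    divided by the time between the interval midpoints.
    """
    speeds, mid_times = [], []
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("samples must be strictly increasing in time")
        speeds.append(hypot(x1 - x0, y1 - y0) / dt)
        mid_times.append((t0 + t1) / 2)
    accelerations = [
        (v1 - v0) / (m1 - m0)
        for v0, v1, m0, m1 in zip(speeds, speeds[1:], mid_times, mid_times[1:])
    ]
    return speeds, accelerations


# Toy track (hypothetical values): a player speeding up along the x-axis.
track = [(0.0, 0.0, 0.0), (0.5, 1.0, 0.0), (1.0, 3.0, 0.0)]
speeds, accelerations = kinematics(track)
print(speeds, accelerations)
```

Real tracking data is noisy, so in practice some smoothing before differencing is likely needed; that step is left out here.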

## Kernels
Reproducibility is crucial for research and we took it seriously.
Please check out our kernels for more details with interactive plots and source code.

 * [Exploratory Data Analysis & External Data](https://www.kaggle.com/gaborfodor/exploratory-data-analysis-external-data)
 * [The Speed, the Acceleration and the Collision](https://www.kaggle.com/gaborfodor/the-speed-the-acceleration-and-the-collision)

### Example injury video

In [None]:
from IPython.display import HTML
HTML('''
<video width="800" height="450" controls>
<source src="https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/BudapesPythonsFromRawDataToInsights.mp4" type="video/mp4">
Your browser does not support the video tag.</video>
''')

# How to make the game safer without breaking game integrity? 

## Punt Types
With the help of the detailed play descriptions we were able to classify each punt event into the following categories:
![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/PuntType.png)

Official statistics have similar categories. One important difference is that we kept muffed catches (MUFFS) separate.
After iteratively fine-tuning our classification rules we ended up with only a handful of uncategorized plays (OTHER).
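
A rule-based classifier over free-text play descriptions might look like the sketch below. It is only an illustration of the approach: the labels other than MUFFS and OTHER, the keyword patterns and the sample descriptions are all assumptions, not the rules we actually used.

```python
import re

# Ordered (label, pattern) rules: the first match wins, so rarer and more
# specific outcomes are checked before the generic return pattern.
# Labels and keywords are illustrative guesses, not the actual rules.
RULES = [
    ("MUFFS", re.compile(r"\bmuffs?\b", re.IGNORECASE)),
    ("BLOCKED", re.compile(r"\bblocked\b", re.IGNORECASE)),
    ("TOUCHBACK", re.compile(r"\btouchback\b", re.IGNORECASE)),
    ("FAIR_CATCH", re.compile(r"\bfair catch\b", re.IGNORECASE)),
    ("OUT_OF_BOUNDS", re.compile(r"\bout of bounds\b", re.IGNORECASE)),
    ("DOWNED", re.compile(r"\bdowned\b", re.IGNORECASE)),
    ("RETURN", re.compile(r"\bfor (?:-?\d+ yards?|no gain)\b", re.IGNORECASE)),
]


def classify_punt(description: str) -> str:
    """Return the label of the first matching rule, or OTHER if none match."""
    for label, pattern in RULES:
        if pattern.search(description):
            return label
    return "OTHER"


# Hypothetical descriptions written for this example.
examples = [
    "J.Doe punts 45 yards to XYZ 20, fair catch by A.Smith.",
    "J.Doe punts 50 yards to XYZ 15. A.Smith to XYZ 23 for 8 yards (B.Jones).",
    "J.Doe punts 38 yards to end zone, Touchback.",
    "A.Smith MUFFS catch, recovered by ABC at XYZ 12.",
]
for text in examples:
    print(classify_punt(text), "|", text)
```

Keeping the rules in an ordered list makes the iterative fine-tuning easy: inspect what falls into OTHER, add or reorder a rule, and rerun.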
 
## Return is 9x more dangerous than fair catch 
The vast majority of confirmed concussions happened during punt returns.
This is not a groundbreaking result: it was already known that, after kickoff returns, punt returns are the most dangerous part of the game.
To measure important risk factors we collected additional KPIs from the provided Next Gen Stats player movements and from the punt play descriptions.
Besides the confirmed concussions, we checked serious penalties, injuries and collisions.

![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/returnvsfaircatch.png)
*The y-axis and color indicate risk, while the size shows the volume of punt plays.
All plots are sorted by risk level. Returns and muffs were the most dangerous plays.*
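
The comparison between punt types comes down to a ratio of per-play event rates. The sketch below shows the calculation only; the counts are made up for illustration and are not the competition figures, and the function names are hypothetical.

```python
def event_rate(events: int, plays: int) -> float:
    """Events per play for one punt type."""
    if plays <= 0:
        raise ValueError("plays must be positive")
    return events / plays


def relative_risk(events_a: int, plays_a: int, events_b: int, plays_b: int) -> float:
    """How many times riskier type A is than type B, per play."""
    rate_b = event_rate(events_b, plays_b)
    if rate_b == 0:
        raise ZeroDivisionError("reference type has no events")
    return event_rate(events_a, plays_a) / rate_b


# Made-up counts, NOT taken from the dataset:
# 12 events in 400 plays of type A versus 3 events in 300 plays of type B.
ratio = relative_risk(12, 400, 3, 300)
print(f"type A is {ratio:.1f}x riskier per play than type B")
```

With counts as small as ours, such a ratio carries wide uncertainty, which is why we looked at penalties, injuries and collisions alongside confirmed concussions.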

Muffs are the least frequent events (3%), and any part of the game involving a turnover is expected to be risky.
We would like to focus on returns (41%) and fair catches (24%), as they account for a large share of punts and differ hugely in risk.

# Suggested Rule Modifications

## Fair Catch is FAIR
Given the speed of punt returns and the fact that these plays already have higher penalty rates, we believe any new rule that tries to penalize specific details about tackles or blocks would have limited effect.

Although returns are known to be more dangerous, the expected rewards currently encourage risky behavior.
The four-yard difference between average gross (45) and net (41) punt yards may not seem like much.
But if we focus only on returns, the average of 8.7 returned yards gets you almost a first down.
We would like to encourage and reward safety more, to reduce the share of punt returns and increase the share of fair catches.

#### The Carrot
**Reward any fair catch after a scrimmage kick with 5 yards**


The median return yardage is close to the proposed five-yard reward.
We estimate that this single rule could reduce punt returns by 30%.
We would like to monitor the actual effects in the following season and optimize the rule based on data.

#### The Fair Zone
**Treat a fair catch inside the 20-yard line by the receiving team on a scrimmage kick as a touchback**

37% of punt returns start inside the 20-yard line. This option could reduce punt returns by 20%.

### Game Integrity

#### Similar Rules
In 2011 the NFL had similar motivations when it changed kickoffs by moving the starting position from the 30- to the 35-yard line.
A [study](https://mpra.ub.uni-muenchen.de/90314/1/MPRA_paper_90314.pdf) showed that it increased touchback rates from the previous 16% to 45% and effectively reduced player injuries.

#### Overall effect in points is negligible
![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/DS.png)
*For the most common punt results, a 10-yard gain in line of scrimmage (LOS) is worth 0.3 points*

We believe that even a 50% reduction in punt returns would have a small impact on overall game integrity, and our proposed changes would have smaller effects.
In an average game a team has 2.1 returns out of 4.7 punts.

The rules could affect offensive and defensive teams differently, but the overall effect of either option would be less than 1 point.
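
A rough back-of-the-envelope check of the "less than 1 point" claim, using only the figures quoted above (0.3 points per 10 yards, 2.1 returns per team per game, a 5-yard reward). It assumes, as a simplification, that point value scales linearly with yards and that every single return becomes a rewarded fair catch, which deliberately overstates the effect. The constant and function names are hypothetical.

```python
POINTS_PER_YARD = 0.3 / 10     # from the caption: 10 yards of LOS is worth 0.3 points
RETURNS_PER_TEAM_GAME = 2.1    # from the text: returns per team in an average game
FAIR_CATCH_REWARD_YARDS = 5    # the proposed carrot


def generous_points_shift(returns, reward_yards, points_per_yard=POINTS_PER_YARD):
    """Points moved per team per game if every return became a rewarded fair catch.

    Assumes a linear yards-to-points conversion and ignores the return yards
    given up, so it overstates the real effect.
    """
    return returns * reward_yards * points_per_yard


shift = generous_points_shift(RETURNS_PER_TEAM_GAME, FAIR_CATCH_REWARD_YARDS)
print(f"{shift:.3f} points per team per game")
```

Even under these generous assumptions the shift stays well below one point per team per game.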

#### Additional risks are low
Muffs could increase a bit, but every other punt type is much safer than returns.
The rule changes would further increase the share of out-of-bounds punts and touchbacks, so the overall effect would remain positive.
As the fair zone would change the game around the goal line, we expect higher turbulence and risks there.

We prefer the carrot over the fair zone due to the easier implementation, bigger estimated impact and lower risks.
  
We are confident that this is the area where the most significant immediate success could be achieved.
We are happy to discuss any other ideas that aim to reduce the share of punt returns.
Other teams have already suggested similar ideas, such as 15-yard touchbacks (TB15) or a 10-yard fair catch reward (FC10).
We think TB15 would have lower impact due to punting distance limitations,
while FC10 could hurt game integrity by rewarding a fair catch with more yards than the average return.

## Additional rule modifications with lower impact
[Illegal Block Above the Waist](https://operations.nfl.com/the-rules/nfl-video-rulebook/illegal-block-above-the-waist/) is not necessarily a dangerous foul; the penalty is 10 yards.
It could mean a simple push; however, when all the players are chasing the ball at full speed, that push could certainly cause injury.
We decided to keep it on our list of serious penalties, as it is certainly more serious than the most common penalties like False Start, Delay of Game or Offensive Holding.

![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/IBAW.png)

#### IBAW 15
**Increase the penalty to 15 yards for Illegal Block Above the Waist**

#### Emphasize

* Emphasize the importance of rules protecting players in a defenseless posture
* Emphasize the importance of the existing rules on helmet-to-helmet collisions and use of the helmet as a weapon

# Further Research
Additional data would enable researchers and data scientists to evaluate more sophisticated aspects of the game.

A few ideas for additional data collection:

* Ball position
* Referee position
* Realtime head acceleration sensors 
* Detailed college data
* More precise position tracking
* Mapping sensor time to game time

# Acknowledgments
We would like to thank István Kovács, Budapest Wolves Head Coach, and Márton Iványi, Special Teams Coordinator, for their valuable feedback.

![](https://s3-eu-west-1.amazonaws.com/nfl-punt-analytics/ABPFHD.png)
We are two friends from Hungary who decided to join the NFL Punt Analytics Competition to make the game safer.

**Gabor** is passionate about data science and has a long history with Kaggle competitions. He usually watches just the Super Bowl and maybe a few playoff games. He does not have a strong team preference but prefers Python over R.

**Laszlo** is a huge fan of the NFL, KC and NE, and yes, unfortunately in the AFC final he must choose a side. Laszlo works as a senior communication expert at the largest bank in Hungary. He uses data analysis to build better communication strategies; his main research fields are crisis communication and social media. It was his first competition on Kaggle, but definitely not the last.

In [None]:
! cp ../input/externalnfl/summary/summary/* .