# From Contact to Context: Evolving Tackling Metrics in NFL Analytics


# Introduction

American football, a game of strategy and physical prowess, presents unique challenges in understanding and quantifying defensive performance. Vince Lombardi once emphasized the essence of tackling: stopping the opponent by any means necessary. This principle underlies our innovative approach to analyzing tackling, a critical aspect of football defense. 

We have crafted novel metrics to provide a more nuanced understanding of defensive play. Our goal is to transcend traditional statistics by offering insightful, data-driven evaluations of tackling.

### Our Metrics

1. **Time Saved:** This metric evaluates a defender's efficiency in reducing the time to tackle the ball carrier. Utilizing a sophisticated BiLSTM neural network, we analyze players' positions and movements to predict the expected time for a tackle and compare it with the actual time taken.
2. **Optimal Path Deviation:** This quantifies the variance between a defender's actual path and the calculated optimal path to the ball carrier. It's a measure of path efficiency, highlighting players who adeptly navigate the field.
3. **PURSUIT:** Building on previous research, we introduce **PURSUIT**, a dynamic measure of a defender's effectiveness in chasing and engaging with the ball carrier. It considers both the rate of closing distance and the angle of pursuit.

These metrics offer a comprehensive view of defensive strategies and individual contributions. They shed light on the subtle, often overlooked aspects of defense, such as constraining the ball carrier's options or steering them into disadvantageous positions.

### **Time Saved Calculation:**

We evaluate a defender's ability to reduce the expected tackle time (**Time Saved**) using a neural network trained on the positions and movements of all players. The Bidirectional Long Short-Term Memory (BiLSTM) network produces an expectation of how quickly each defender should reach the ball carrier, accounting for the sequential nature of player movements during a play. We use a feature set similar to the one the 2020 NFL Big Data Bowl winners used to predict rushing yards after handoff. At each time frame, the model receives the current and previous relative positions (x, y coordinates), directions, and velocities of all players on the field. These dynamic features are crucial because they change over time and provide context on how players move relative to one another.
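
As a minimal sketch of the kind of per-frame input described above, the snippet below builds relative position and velocity features for one defender with respect to the ball carrier. The `PlayerFrame` record and the `relative_features` helper are hypothetical names introduced for illustration, and the exact feature list used by the model (**Table 1**) may differ.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class PlayerFrame:
    """One player's tracking state at a single frame (hypothetical schema)."""
    x: float   # field position, yards
    y: float   # field position, yards
    vx: float  # velocity component, yards per second
    vy: float  # velocity component, yards per second


def relative_features(defender: PlayerFrame, carrier: PlayerFrame) -> List[float]:
    """Position, distance, and velocity of the ball carrier relative to a defender."""
    dx = carrier.x - defender.x
    dy = carrier.y - defender.y
    return [
        dx,
        dy,
        math.hypot(dx, dy),       # straight-line distance in yards
        carrier.vx - defender.vx,  # relative velocity, x component
        carrier.vy - defender.vy,  # relative velocity, y component
    ]


features = relative_features(
    PlayerFrame(x=10.0, y=20.0, vx=3.0, vy=0.0),
    PlayerFrame(x=13.0, y=24.0, vx=0.0, vy=-2.0),
)
```

Expressing each defender relative to the ball carrier keeps the features independent of where on the field the play happens.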

The network is trained on sequences of these features across multiple time frames.  It learns the patterns and tendencies of how defenders close in on ball carriers. By capturing the interaction of players, the network makes informed predictions of the time it will take for a defender to make contact with the ball carrier given the current trajectory and motion of all players.
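
The sequences the network consumes can be sketched as fixed-length windows over consecutive frames. The window length and the `sliding_windows` helper are assumptions for illustration; the write-up does not state how the sequences were cut.

```python
from typing import List, Sequence


def sliding_windows(frames: Sequence[Sequence[float]], window: int) -> List[List[Sequence[float]]]:
    """Return every run of `window` consecutive frames, oldest frame first."""
    if window <= 0:
        raise ValueError("window must be positive")
    # A play shorter than the window yields no sequences.
    return [list(frames[i:i + window]) for i in range(len(frames) - window + 1)]


# Four frames of a one-feature play, cut into windows of three frames.
example_frames = [[0.0], [1.0], [2.0], [3.0]]
windows = sliding_windows(example_frames, window=3)
```

Each window would be paired with the time until contact at its final frame as the prediction target.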

During training, the network learns the intricacies of player interactions and how defenders navigate the field to reach the ball carrier. It analyzes past plays to recognize patterns and applies this knowledge to predict future events. The network uses player movements (**Table 1**) to compute the estimated time until contact with the ball carrier.

**Time Saved** represents how quickly the defender reaches the ball carrier relative to the expected time, given the positions and movements of all players on the field. Positive values indicate that the defender reached the ball carrier faster than expected.
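
One reading of this definition, consistent with the sign convention above, is the difference between the expected and the actual time to reach the ball carrier. The symbols $t_{\text{expected}}$ and $t_{\text{actual}}$ are shorthand introduced here for illustration:

$$\large \text{Time Saved} = t_{\text{expected}} - t_{\text{actual}}$$

Under this reading, a defender who arrives sooner than the model expected has a positive **Time Saved**, and one who arrives later has a negative value.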

We find that **Time Saved** accurately identifies the best defenders and is positively correlated with conventional metrics such as the total number of tackles.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/TUCfinalfeatures.png?raw=true)
**Table 1: Features Used in BiLSTM to Predict Time Until Contact.**

The BiLSTM model has an RMSE of 0.04 seconds, meaning that the predicted time until contact typically differs from the actual time by about 0.04 seconds. This model outperformed others, including gradient boosting and other neural networks, highlighting the importance of players' earlier states on the play and the inherently sequential nature of the data. The model excels because it processes sequential player data in both forward and backward directions. This dual-direction approach lets it capture the dynamic interactions, evolving spatial relationships, and movement strategies of all players on the field, which is crucial for real-time analysis. Because its predictions update with each new frame as the play progresses, it is well suited to predicting tackle timing and paths and, in turn, to evaluating tackling efficiency. See **Appendix** for additional model evaluation.
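
For reference, RMSE is the square root of the mean squared error, so large misses weigh more heavily than in a plain average of absolute differences. The helper below is a generic sketch of that calculation, not the evaluation code used for the model, and the inputs are illustrative.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Root-mean-square error between predicted and actual values."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative times until contact in seconds (not real model output).
example_rmse = rmse([1.0, 2.0], [1.0, 4.0])
```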

![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/BiLSTMmodel.jpg?raw=true)
**Figure 1: BiLSTM Network Architecture**
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/timeuntilcontactfinal.jpg?raw=true)
**Figure 2: Time Until Contact Calculation**

Defensive players need to constantly adjust their direction of motion in order to reach the ball carrier. The time until contact (**TUC**, **Figure 2**) allows us to compute the optimal angle of pursuit (**AOP**, **Figure 3**) each defender should take to efficiently reach the ball carrier. Updated every tenth of a second for every player on defense, **AOP** represents the angle between the direction a defender is facing and the direction to the ball carrier's projected future location. The angle, ranging from 0 to 180 degrees, indicates the directional adjustment a defender must make to effectively reach the ball carrier. A lower angle suggests that the defender is already on or near the optimal path, increasing the likelihood of a successful tackle.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/angleofpursuitfinal.jpg?raw=true)
**Figure 3:  Angle of Pursuit Calculation**
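
The angle of pursuit can be sketched with basic vector geometry. This example assumes the ball carrier's future location is projected at constant velocity over the time until contact and that the defender's heading is given as a 2D vector; both are simplifying assumptions, and `projected_location` and `angle_of_pursuit` are hypothetical helper names.

```python
import math
from typing import Tuple

Vec = Tuple[float, float]


def projected_location(position: Vec, velocity: Vec, seconds: float) -> Vec:
    """Constant-velocity projection of a player's position (an assumption)."""
    return (position[0] + velocity[0] * seconds, position[1] + velocity[1] * seconds)


def angle_of_pursuit(defender_pos: Vec, defender_heading: Vec, target: Vec) -> float:
    """Unsigned angle in degrees (0 to 180) between the heading and the line to the target."""
    to_target = (target[0] - defender_pos[0], target[1] - defender_pos[1])
    dot = defender_heading[0] * to_target[0] + defender_heading[1] * to_target[1]
    cross = defender_heading[0] * to_target[1] - defender_heading[1] * to_target[0]
    # atan2 of |cross| and dot stays within [0, pi] and avoids acos rounding issues.
    return math.degrees(math.atan2(abs(cross), dot))


# Carrier at (10, 0) moving up the field at 5 yards/second, contact expected in 2 seconds.
target = projected_location((10.0, 0.0), (0.0, 5.0), seconds=2.0)
aop = angle_of_pursuit((0.0, 0.0), (1.0, 0.0), target)
```

In this toy setup the defender faces straight along the x-axis while the projected target sits on the diagonal, so the defender would need to turn 45 degrees.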

### **Optimal Path Deviation:** 

We quantify how much a defender's actual path to the ball carrier deviates from the calculated optimal path. The optimal path represents the most direct and efficient trajectory a defender can take to reach the ball carrier, calculated based on the angle of pursuit (**AOP**). Optimal path deviation (**OPD**) is measured as a distance, with smaller values indicating a path closely aligned with the optimal trajectory. Defenders who stay closer to the optimal path towards the ball carrier are more likely to register a tackle.
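
One way to turn this into a number is to average the distance from each tracked position to the optimal path. The sketch below assumes the optimal path is a straight segment from the defender's starting point to the intercept point, which may be simpler than the calculation actually used; `distance_to_segment` and `optimal_path_deviation` are hypothetical helpers.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the segment running from a to b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    length_sq = abx * abx + aby * aby
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Fraction along the segment of the closest point, clamped to [0, 1].
    t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * abx), p[1] - (a[1] + t * aby))


def optimal_path_deviation(actual_path: Sequence[Point], start: Point, intercept: Point) -> float:
    """Mean distance in yards between the actual path and the straight optimal path."""
    if len(actual_path) == 0:
        raise ValueError("actual_path must contain at least one position")
    total = sum(distance_to_segment(p, start, intercept) for p in actual_path)
    return total / len(actual_path)


# A defender who bows 2 yards off a straight 10-yard path (illustrative positions).
opd = optimal_path_deviation([(0.0, 0.0), (5.0, 2.0), (10.0, 0.0)], (0.0, 0.0), (10.0, 0.0))
```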

### **PURSUIT:** 

The best tacklers pursue the ball carrier at an optimal angle and quickly reduce the distance between themselves and the ball carrier. We extend and improve on the work of Quang Nguyen et al. (2023) in the context of tackling. We measure the **PURSUIT** of defenders by the rate of change of the distance to the ball carrier, weighted by the angle of pursuit. A defender's **PURSUIT** value increases when rapidly closing in on the ball carrier and decreases when moving away. A defender directly pursuing the ball carrier at the ideal angle has a higher **PURSUIT** value than one moving tangentially or in the opposite direction. As such, **PURSUIT** serves as a dynamic measure of how effective a defender is in a play. A perfect angle of pursuit (0 degrees) yields the maximum **PURSUIT**, while a completely off-direction pursuit (180 degrees) reduces it to zero.

**PURSUIT** is computed for each defender frame-by-frame during a play according to:

$$\large \text{PURSUIT}_{ij}(t) =\begin{cases}
0 & \text{for } t = 1 \\
-\frac{f'_{dij}(t)}{d_{ij}(t)} \cdot \left(1 - \frac{\text{angle of pursuit}_{ij}(t)}{180}\right) & \text{for } t > 1
\end{cases}$$

Where:
- $\large \text{PURSUIT}_{ij}(t)$ represents the **PURSUIT** metric for defender $i$ at frame $t$.
- $\large f'_{dij}(t)$ is the rate of change (derivative) of the defender's distance to the ball carrier at frame $t$.
- $\large d_{ij}(t)$ is the absolute distance between the defender $i$ and the ball carrier at frame $t$.
- $\large \text{angle of pursuit}_{ij}(t)$ is the angle of pursuit between defender $i$ and the ball carrier at frame $t$.
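
The formula can be sketched frame by frame as follows. This example approximates the derivative with a backward difference over 0.1-second frames, which is an assumption about the discretization; `pursuit_series` and `FRAME_SECONDS` are hypothetical names.

```python
from typing import List, Sequence

FRAME_SECONDS = 0.1  # metrics are updated every tenth of a second


def pursuit_series(
    distances: Sequence[float],
    angles_deg: Sequence[float],
    dt: float = FRAME_SECONDS,
) -> List[float]:
    """PURSUIT per frame from distances (yards) and angles of pursuit (degrees)."""
    if len(distances) != len(angles_deg):
        raise ValueError("distances and angles_deg must have equal length")
    values: List[float] = []
    for t, (dist, angle) in enumerate(zip(distances, angles_deg)):
        if t == 0:
            values.append(0.0)  # the first frame is defined as 0
            continue
        if dist <= 0.0:
            raise ValueError("distance must be positive for the ratio to be defined")
        rate = (dist - distances[t - 1]) / dt  # backward difference (assumption)
        values.append(-(rate / dist) * (1.0 - angle / 180.0))
    return values


# Illustrative inputs: closing 1 yard per frame, first head-on, then facing away.
values = pursuit_series([10.0, 9.0, 8.0], [90.0, 0.0, 180.0])
```

The leading minus sign makes a shrinking distance produce a positive value, and the angle term scales that value down to zero as the angle of pursuit approaches 180 degrees.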

**PURSUIT** is calculated based on the rate at which the distance between the defender and the ball carrier changes, the current distance, and the angle of pursuit. Here are some examples to illustrate its applications:

- Defenders with a higher average **PURSUIT** throughout a game tend to be better tacklers.
- On individual plays, defenders with a higher **PURSUIT** have a significantly higher likelihood of making the tackle.
- Defenders who increase their **PURSUIT** during a play are more likely to reach and engage ball carriers who are attempting to evade them.
- **PURSUIT** identifies unsung defenders who make a positive contribution to plays but are not given credit in the stat sheet.

# Example Play

A play from the Buffalo Bills versus LA Rams game (**Figure 4**) illustrates our analysis. At the start of the play, Ernest Jones is 11.3 yards away from Josh Allen, is moving away from him, and is facing 125 degrees away from the ideal pursuit angle. As the play progresses, Jones identifies the ball carrier and moves into position to make the tackle. His **PURSUIT** increases as he rapidly closes in on the ball carrier at a more optimal angle. He reaches the ball carrier 0.13 seconds slower than expected at the start of the play and is credited with the tackle.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/actualplay.gif?raw=true)
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/defenderanimation_optimalpath.gif?raw=true)
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/defenderexampleplayanimation_metrics.gif?raw=true)
**Figure 4: Play Animation Showing Metrics of Tackler**

One benefit of our approach is that we can measure the contributions of all defenders involved in a play, not just those who are credited with a tackle or an assisted tackle. This enables us to identify defenders who play an essential role in the outcome of a play but are not typically recognized for it. To demonstrate this, we present the play again, this time showing all players on the field along with each defender's **PURSUIT**. This highlights the many defenders who made a positive contribution to the play, as indicated by their positive **PURSUIT** values.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/actualplay.gif?raw=true)
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/alldefendersplayanimationoptimalclose.gif?raw=true)
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/CombinedAnimation.gif?raw=True)
**Figure 5: Play Animation Showing PURSUIT and Optimal Path of All Defenders**

# Results

We find that **PURSUIT** is highly stable from the first part of the regular season (Weeks 1-5) to the second (Weeks 6-9). The strong correlation (r = 0.79) and substantial proportion of explained variability (R-squared = 0.62) between the two periods imply that **PURSUIT** is predictive of future tackling performance. A player with a high or low **PURSUIT** value in the early weeks of the season will most likely maintain a similar level in the following weeks, making it a reliable metric for predicting tackling performance over time.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/PURSUITstabilityovertime.png?raw=true)
**Figure 6: Stability of PURSUIT from Weeks 1-5 to Weeks 6-9**
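
A stability check of this kind can be sketched as a Pearson correlation between each player's average in the two periods. For a simple linear fit with one predictor, R-squared equals r squared, which is consistent with the reported values (0.79 squared is about 0.62). The `pearson_r` helper and the toy inputs are illustrative and are not the data behind **Figure 6**.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Toy per-player averages for an early and a late period (made-up numbers).
early = [1.0, 2.0, 3.0]
late = [2.0, 4.0, 6.0]
r = pearson_r(early, late)
r_squared_reported = round(0.79 ** 2, 2)
```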

We present teams' run and pass tackling performance using the **PURSUIT** metric, dividing them into four quadrants. 
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/team_pass_run_pursuitfinal.png?raw=true)
**Figure 7: Team Pass and Run Tackling by PURSUIT**

There is a strong negative correlation between Run Time Saved and Rushing Yards Allowed Per Game (r = -0.7). This indicates that teams with defenders who consistently reach ball carriers faster than expected tend to give up fewer rushing yards, highlighting the crucial role of efficient tackling in limiting offensive production.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/runcorr.png?raw=true)
**Figure 8: Run Time Saved and Rushing Yards Allowed Per Game**

We show the top ten defenders by position ranked by **Time Saved**, **Optimal Path Deviation (OPD)**, and **PURSUIT**. For each defender, we average each metric across all of their frames to arrive at a single value per metric. These three metrics all measure tackling performance and are correlated, so the best tacklers typically rank highly on all three. These defenders pursue effectively and reduce the time it takes to reach the ball carrier. Many of them have received Pro Bowl and All-Pro honors.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/lb.png?raw=true)
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/dt.png?raw=true)
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/cb.png?raw=true)
**Figure 9: Top 10 Defenders by Position**
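
The averaging and ranking step can be sketched as below, under the assumption that frame-level values are averaged per defender before ranking. `mean_by_defender` and `top_n` are hypothetical helpers, and the single-letter labels are placeholders rather than real players.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def mean_by_defender(rows: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average a frame-level metric for each defender."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for defender, value in rows:
        totals[defender] += value
        counts[defender] += 1
    return {defender: totals[defender] / counts[defender] for defender in totals}


def top_n(averages: Dict[str, float], n: int = 10, higher_is_better: bool = True) -> List[str]:
    """Defenders ordered best to worst; set higher_is_better=False for OPD."""
    ranked = sorted(averages, key=lambda name: averages[name], reverse=higher_is_better)
    return ranked[:n]


# Placeholder frame-level values of a metric for three defenders.
averages = mean_by_defender([("A", 1.0), ("A", 3.0), ("B", 5.0), ("C", 0.0)])
best_two = top_n(averages, n=2)
```

The `higher_is_better` flag matters because smaller **OPD** values are better, while larger **Time Saved** and **PURSUIT** values are better.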


# Conclusion

The traditional metrics used to evaluate defensive impact and tackling ability in the NFL are limited in scope because they focus only on outcomes and so fail to provide a complete picture of a player's on-field contributions. Our new metrics better measure the efforts of defenders who may not make the tackle but play pivotal roles in the outcome of a play. With these metrics, we can identify the vital contributions of players who might otherwise go unnoticed in the stat sheet. We believe that these metrics will revolutionize how we assess, understand, and appreciate the art of defense in the NFL.

However, it's important to acknowledge the limitations of this analysis. First, these metrics focus on individual actions and may overlook the intricate web of collaboration and tactical maneuvering that defines successful defensive plays. Second, these metrics currently focus on tackling; expanding them to encompass other defensive actions such as pass deflections, interception attempts, and coverage effectiveness would provide a broader picture of a player's impact. Lastly, the effectiveness of these metrics may vary depending on the specific game situation and the offensive and defensive schemes, requiring further analysis and refinement for broader applicability.


# Appendix

We show the learning curve of the BiLSTM model per epoch. This helps to identify if the model is overfitting the data. The close proximity of the final RMSE values for both training and validation sets at the end of the observed epochs shows that the model has achieved a good balance between learning the training data patterns and maintaining performance on validation data. 
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/learrningcurve.png?raw=true)
**Figure 10: BiLSTM Learning Curve**

Unsurprisingly, the model's accuracy improves as the play progresses because more relative and historical information about the players on the field becomes available. The model reaches an RMSE of around 0.05 on average 1 second into a given play.
![Sample GIF](https://github.com/BenRossJenkins/NFLBDB2024/blob/main/RMSE.png?raw=true)
**Figure 11: BiLSTM Predicted vs Actual Time Until Contact RMSE**
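
An error-over-time curve like this can be sketched by grouping prediction errors by frame index before taking the RMSE. The `rmse_by_frame` helper and the record layout `(frame, predicted, actual)` are assumptions for illustration, and the values are made up.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rmse_by_frame(records: Iterable[Tuple[int, float, float]]) -> Dict[int, float]:
    """RMSE of predicted vs actual time until contact, computed per frame index."""
    squared: Dict[int, List[float]] = defaultdict(list)
    for frame, predicted, actual in records:
        squared[frame].append((predicted - actual) ** 2)
    # Sort by frame so the result reads in play order.
    return {
        frame: math.sqrt(sum(errors) / len(errors))
        for frame, errors in sorted(squared.items())
    }


# Illustrative records: errors shrink from frame 1 to frame 2.
curve = rmse_by_frame([
    (2, 1.0, 1.1),
    (1, 1.0, 1.4),
    (1, 2.0, 1.6),
    (2, 2.0, 1.9),
])
```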