# **Defensive Entropy (DENT): Determining Effectiveness of Pre-Snap Randomness**

## *Introduction*

In the NFL, success and failure are often separated by mere inches and milliseconds. As teams seek every possible competitive advantage, the pre-snap phase of each play presents a crucial window where strategic decisions can dramatically impact the outcome. Enter *entropy* – a fundamental concept that measures the degree of disorder or unpredictability in a system. By analyzing the spatial entropy of defensive players before the snap, we can quantify the apparent chaos or organization in their positioning and potentially gain valuable insights.
<br/> <br/> Our hypothesis is that defensive entropy serves as more than just a measure of randomness; it may be a key indicator of defensive effectiveness and play outcome. Through this analysis, we aim to uncover whether pre-snap defensive entropy correlates with defensive success, potentially providing teams with a new metric for evaluating and optimizing their pre-snap strategies. We named our metric DENT (Defensive ENTropy) in honor of *Richard Dent*, the great NFL Hall of Fame defensive end who helped lead the 1985 Chicago Bears to a Super Bowl victory.

# **Calculating Entropy & DENT**
## *Physical Entropy*
In thermodynamics, entropy is a measure of the disorder or randomness within a system. Originally formulated by *Rudolf Clausius* and later expanded by *Ludwig Boltzmann*, entropy quantifies the number of possible microscopic arrangements (microstates) that could yield the observed macroscopic state of a system. A higher entropy indicates more disorder and more possible arrangements, while lower entropy suggests more order and fewer possible arrangements. The classic example is an ice cube melting into water – the highly ordered crystal structure of ice transforms into the more disordered liquid state, increasing entropy.

## *Measuring Player Entropy in Football*
In the context of NFL player tracking data, we can adapt these principles to quantify the spatial organization of defensive players before the snap. Our proposed entropy metric considers the following components:

## *Base Entropy Equation*
The spatial entropy $(S)$ for a defensive formation at any given moment can be calculated as: <br/>  <br/> 

$$S = -\sum(p_i \cdot \log_2(p_i))$$
<br/> 
Where:

- $S$ is the spatial entropy measured in bits.
- $p_i$ represents the probability of finding a defensive player in a particular spatial region.
- The summation is taken over all defined regions of the field.
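
As a minimal sketch of the base equation, the snippet below computes $S$ from a list of region probabilities. The function name `shannon_entropy` and the example probability lists are illustrative assumptions, not part of our pipeline.

```python
import math

def shannon_entropy(probabilities):
    """Return S = -sum(p_i * log2(p_i)) in bits, skipping empty regions."""
    # Regions with p_i = 0 contribute nothing (the limit of p*log2(p) is 0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical example: four regions, each equally likely to hold a defender.
uniform = [0.25, 0.25, 0.25, 0.25]
# Hypothetical example: every defender concentrated in a single region.
concentrated = [1.0, 0.0, 0.0, 0.0]

print(shannon_entropy(uniform))       # 2.0
print(shannon_entropy(concentrated))  # 0.0
```

The evenly spread case reaches the maximum of $\log_2(4) = 2$ bits, while the fully concentrated case carries no uncertainty at all.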

## *Implementation Methodology*
1. **Field Discretization:** We divide the defensive half of the field into a grid of 1-yard by 1-yard squares.

2. **Player Position Probability:** For each frame, we calculate the probability density of defensive players across these grid squares.

3. **Time Window:** We analyze the frames between *line set* and *ball snap*, capturing the final defensive adjustments.
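
The discretization and probability steps above can be sketched for a single frame as follows. This is a simplified illustration that treats $p_i$ as the fraction of defenders in each 1-yard cell; the helper names (`cell_probabilities`, `frame_entropy`) and the sample coordinates are hypothetical.

```python
import math
from collections import Counter

def cell_probabilities(positions, cell_size=1.0):
    """Map (x, y) positions in yards to the share of players in each grid cell."""
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in positions
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

def frame_entropy(positions, cell_size=1.0):
    """Shannon entropy (bits) of the cell occupancy for one frame."""
    probs = cell_probabilities(positions, cell_size)
    return sum(-p * math.log2(p) for p in probs.values())

# Hypothetical frames: four defenders spread out vs. three stacked in one cell.
spread = [(40.2, 10.5), (41.7, 20.1), (43.3, 26.6), (45.9, 33.0)]
stacked = [(40.2, 26.1), (40.8, 26.4), (40.5, 26.9), (45.9, 33.0)]

print(frame_entropy(spread))   # each defender in its own cell
print(frame_entropy(stacked))  # lower: three defenders share one cell
```

Repeating this for every frame in the time window yields an entropy trace for the play, which can then be averaged or otherwise summarized.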

## *Additional Considerations*
After careful consideration, we incorporated the following additional factors into our entropy calculation:

1. **Player Orientation $(\theta)$:**
    - *Rationale:* A defender's facing direction significantly impacts their ability to react. <br/>
    - *Modified term:* &nbsp; $p_i(1 + w_\theta \cos(\theta_{relative}))$ <br/>

    &nbsp;&nbsp;where $w_\theta$ is a weighting factor and $\theta_{relative}$ is the angle relative to the ball.
<br/> 

2. **Player Velocity $(v)$:**
    - *Rationale:* Moving players create more uncertainty for the offense.
    - *Additional term:* &nbsp; $w_v \cdot \frac{v}{v_{max}}$ <br/>

   &nbsp;&nbsp; where $w_v$ is a velocity weight factor and $v_{max}$ is a normalization constant.
<br/>


Our enhanced entropy equation becomes:


$$S = -\sum(p_i \cdot (1 + w_\theta \cdot \cos(\theta_{relative})) \cdot (1 + w_v \cdot \frac{v}{v_{max}}) \cdot \log_2(p_i))$$
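
A minimal sketch of the enhanced equation is shown below. It assumes each occupied region carries one probability, one relative angle (in radians), and one speed; how orientation and speed are aggregated per region, and the default values for `w_theta`, `w_v`, and `v_max`, are illustrative assumptions rather than the tuned values used in our analysis.

```python
import math

def enhanced_entropy(regions, w_theta=0.5, w_v=0.5, v_max=10.0):
    """Weighted entropy over regions given as (p, theta_relative, speed) tuples.

    Implements S = -sum(p * (1 + w_theta*cos(theta)) * (1 + w_v*v/v_max) * log2(p)).
    The default weights and v_max are placeholders, not calibrated values.
    """
    total = 0.0
    for p, theta_rel, v in regions:
        if p <= 0:
            continue  # empty regions contribute nothing
        orientation = 1 + w_theta * math.cos(theta_rel)
        velocity = 1 + w_v * v / v_max
        total += -p * orientation * velocity * math.log2(p)
    return total

# Hypothetical frame: two equally likely regions, one defender facing the
# ball (theta = 0) and one facing directly away (theta = pi), both stationary.
regions = [(0.5, 0.0, 0.0), (0.5, math.pi, 0.0)]
print(enhanced_entropy(regions))
# With both weights set to zero the result reduces to plain Shannon entropy.
print(enhanced_entropy(regions, w_theta=0.0, w_v=0.0))
```

Setting both weights to zero recovers the base equation, which gives a convenient sanity check when experimenting with different weightings.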
   
## *Why These Factors Matter*

- **Position $(x,y)$:** Forms the foundation of spatial entropy.

- **Orientation $(\theta)$:** A defender facing the wrong direction is less effective, regardless of position.

- **Velocity $(v)$:** Moving defenders create more uncertainty and can mask their true coverage intentions.

We deliberately exclude acceleration from our calculations because it tends to be noisier in tracking data and may not contribute significantly to pre-snap deception. The combination of position, orientation, and velocity provides a robust measure of defensive unpredictability without over-complicating the metric.


# **DENT Demonstration:**
Below is a demonstration of the DENT metric applied to a sample play from the 2022 season. The plot shows the entropy values for *two* defensive players pre-snap.

<br/>
<div align="center">
    <img src="https://github.com/bualimov/nfl_project/blob/baha/combined_animation.gif?raw=true" alt="DENT demonstration" style="border: 1px solid black;" />
</div>

<br/>


# **Data Filtering & Cleaning**

As part of our analysis, we focused on the critical pre-snap period between *line set* and *ball snap*, when defensive players are making their final reads and adjustments. We specifically chose this window because movement before *line set* primarily involves defenders getting into their initial positions, which is less relevant for entropy analysis. This targeted approach allows us to capture and analyze the strategic defensive positioning and reactions to offensive formations. To ensure data quality, we first examined the timing distribution across *all plays* to identify and establish appropriate filtering criteria for our analysis.

## *Initial Data Overview:*
- **Total plays analyzed:** 15,916 (Weeks 1-9)

    - **Mean:** 5.60 sec
    - **Median:** 5.30 sec
    - **Standard deviation:** 3.30 sec
    - **Range:** -0.60 to 95.20 sec

Our initial examination of the timing distribution across all plays confirmed our understanding of typical pre-snap sequences, leading us to implement several data quality measures to refine our dataset. The resulting filtered dataset, shown below, provides a more accurate representation of standard NFL pre-snap timing patterns.


## *Data Quality Assessment and Filtering:*

Our team identified anomalies in the data and implemented the following filtering criteria:

- Removed plays with negative durations (snap recorded before line set).
- Removed plays with unreasonably long durations (>40 seconds), which exceed realistic play clock scenarios.
- Removed plays with extremely short durations (<1 second), which leave insufficient time for meaningful pre-snap reads.
<br/>
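
The three filters above amount to keeping plays whose line-set-to-snap duration lies between 1 and 40 seconds. A minimal sketch, with a hypothetical `is_valid_duration` helper and made-up play IDs:

```python
def is_valid_duration(seconds, min_s=1.0, max_s=40.0):
    """Keep plays whose line-set-to-snap duration lies within [min_s, max_s]."""
    # Negative durations fail the lower bound along with sub-second plays.
    return min_s <= seconds <= max_s

# Hypothetical durations in seconds, keyed by made-up play IDs.
durations = {"play_a": -0.6, "play_b": 0.4, "play_c": 5.3, "play_d": 95.2}
kept = {pid: d for pid, d in durations.items() if is_valid_duration(d)}
print(kept)  # only play_c survives the filter
```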

## *Final Filtered Dataset:*

The histogram below shows the distribution of pre-snap duration (from *line set* to *ball snap*) for the filtered dataset:

<img 
  src="https://github.com/bualimov/nfl_project/blob/baha/snap_timing_distribution.png?raw=true" 
  alt="histogram" 
  width="600" 
  style="display: block; margin: 0 auto; padding: 20px;"
/>


- **Valid plays:** 14,981 (94.1% of original data)

    - **Mean:** 5.89 sec
    - **Median:** 5.50 sec
    - **Standard deviation:** 2.99 sec
    - **Range:** 1.00 to 36.40 sec
    - **Interquartile range, 25th percentile:** 3.80 sec
    - **Interquartile range, 75th percentile:** 7.50 sec


Our analysis revealed that the middle 50% of pre-snap durations fall between 3.8 and 7.5 seconds, with a median of 5.5 seconds. By filtering anomalous data (approximately 5.9% of plays) from our dataset, we established a reliable base for analyzing defensive pre-snap entropy patterns.
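
As a quick arithmetic check on the retention figures, using only the play counts reported in this section:

$$\frac{14{,}981}{15{,}916} \approx 0.941, \qquad \frac{15{,}916 - 14{,}981}{15{,}916} = \frac{935}{15{,}916} \approx 0.059$$

That is, 935 plays were removed and roughly 94.1% of the original plays were retained.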

# **Setting Defensive Success Criteria**

To determine the success of a defensive play, we needed to define criteria that quantify the effectiveness of the defense. We proposed that a defensive play counts as a *success* if it meets any of the following criteria:

- Zero or negative yards gained by the offense  
- Passes broken up or incomplete
- Quarterback pressures (hits and sacks)
- Tackles for loss
- Interceptions
- Forced fumbles

The criteria above allowed us to create a binary variable that indicates whether a play was successful or not for the defense. We then used this variable to analyze defensive entropy.
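
The binary success label can be sketched as a simple any-of check. The `PlayOutcome` class and its field names are hypothetical stand-ins for whatever columns carry this information in the play-level data.

```python
from dataclasses import dataclass

@dataclass
class PlayOutcome:
    """Hypothetical per-play summary; field names are illustrative only."""
    yards_gained: int
    pass_incomplete: bool = False  # includes passes broken up
    qb_hit: bool = False
    sack: bool = False
    tackle_for_loss: bool = False
    interception: bool = False
    forced_fumble: bool = False

def is_defensive_success(play: PlayOutcome) -> bool:
    """Return True if the play meets any of the listed success criteria."""
    return (
        play.yards_gained <= 0
        or play.pass_incomplete
        or play.qb_hit
        or play.sack
        or play.tackle_for_loss
        or play.interception
        or play.forced_fumble
    )

print(is_defensive_success(PlayOutcome(yards_gained=7)))                # False
print(is_defensive_success(PlayOutcome(yards_gained=12, qb_hit=True)))  # True
```

Note that under this definition a play can gain positive yardage and still count as a defensive success, for example when the quarterback is hit on the throw.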

# **Defensive Entropy Analysis**

We computed the DENT metric on the filtered dataset and ran statistical tests to determine whether the difference in entropy between successful and unsuccessful plays is statistically significant. Our key findings over the 14,981 plays are shown below:

## *Overall Defensive Entropy Results*
- **Success Average:** $47.68$
- **Failure Average:** $50.03$
- **Net Difference:** $-2.35$ $(p < 0.001)$

Our findings indicate that successful defensive plays generally exhibited **lower** entropy.
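
This write-up does not name the specific test behind the reported p-value. As one illustrative option, the sketch below runs a two-sided permutation test on the difference of group means. The function `permutation_test` and the sample entropy values are hypothetical and are not our actual data or necessarily the test we ran.

```python
import random
from statistics import mean

def permutation_test(success, failure, n_permutations=2000, seed=0):
    """Two-sided permutation test on mean(success) - mean(failure)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(success) - mean(failure)
    pooled = list(success) + list(failure)
    n_success = len(success)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels and recompute the difference of means.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_success]) - mean(pooled[n_success:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Hypothetical per-play entropy values for each outcome group.
success = [46.1, 47.9, 48.3, 47.2, 48.0, 46.8]
failure = [50.4, 49.7, 51.0, 49.9, 50.6, 49.2]

observed, p_value = permutation_test(success, failure)
print(observed, p_value)
```

A negative observed difference with a small p-value would point in the same direction as the result above: lower entropy on successful plays.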


## *Position-Specific Entropy Results*

Position-specific results were normalized to a 0-100 scale for easier comparison. Results of the entropy comparison between *successful* and *unsuccessful* defensive plays are shown below:

<img 
  src="https://github.com/bualimov/nfl_project/blob/baha/entropy_difference_by_position.png?raw=true" 
  alt="position specific results" 
  width="600" 
  style="display: block; margin: 0 auto; padding: 20px;"
/>


##### Positions Benefiting from Higher Entropy:
- **Outside Linebackers (OLB):** $+1.71$&nbsp;$(p < 0.001)$

##### Positions Benefiting from Lower Entropy:
- **Free Safeties (FS):** $-5.66$&nbsp;$(p < 0.001)$
- **Defensive Tackles (DT):** $-5.17$&nbsp;$(p < 0.001)$
- **Cornerbacks (CB):** $-3.44$&nbsp;$(p < 0.001)$
- **Defensive Ends (DE):** $-2.25$&nbsp;$(p < 0.001)$
- **Inside Linebackers (ILB):** $-1.69$&nbsp;$(p < 0.001)$
- **Middle Linebackers (MLB):** $-1.35$&nbsp;$(p < 0.001)$
- **Strong Safeties (SS):** $-0.90$&nbsp;$(p < 0.001)$
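
The exact procedure behind the 0-100 normalization of position-specific values is not spelled out in this write-up. One common choice is min-max scaling, sketched below as an assumption; `min_max_scale` and the sample values are hypothetical.

```python
def min_max_scale(values, lo=0.0, hi=100.0):
    """Linearly rescale values so the minimum maps to lo and the maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # Degenerate case: every value is identical, so place them mid-scale.
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - v_min) * (hi - lo) / (v_max - v_min) for v in values]

# Hypothetical raw entropy values (bits) for one position group.
raw = [1.0, 2.0, 3.0]
print(min_max_scale(raw))  # [0.0, 50.0, 100.0]
```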


## *Statistical Validity*

- **Total sample size:** 14,981 plays
- **Largest position sample:** Cornerbacks (2,337,120 player-frames)
- **Smallest position sample:** Middle Linebackers (205,969 player-frames)

Our null hypothesis was that there would be no significant difference in entropy values between successful and unsuccessful defensive plays; in other words, that pre-snap defensive entropy had no relationship with defensive success. All findings achieved statistical significance $(p < 0.001)$ and results remained consistent across all nine weeks analyzed, so we *rejected* the null hypothesis. This suggests a negative relationship between entropy and success: statistically, higher pre-snap defensive unpredictability is associated with worse defensive outcomes. Put differently, more chaotic or unpredictable pre-snap alignments correlate with less successful defensive plays for most positions (the OLB position being the exception).

## *Key Initial Insights & Practical Implications*
The negative entropy differential between successful and unsuccessful defensive plays ($-2.35$ overall) suggests that disciplined, structured defensive positioning may be *more effective* than previously thought. This would challenge some traditional assumptions about pre-snap deception. <br/>

Below are some of the initial key insights we identified looking at defensive entropy as a whole and by position:

- **Outside Linebackers:** Uniquely benefit from higher entropy ($+1.71$), suggesting that unpredictable positioning for OLBs creates *advantages* in defending both run and pass plays.

- **Secondary Positions:** Both safety positions (FS: $-5.66$, SS: $-0.90$) and cornerbacks (CB: $-3.44$) are associated with *significantly better* outcomes under more structured positioning.

- **Interior Positions:** Defensive tackles (DT) show a strong preference for *structured positioning* ($-5.17$), indicating that disciplined interior line play is critical for defensive success.

- **Linebacker Corps:** Shows *varied results*, with OLBs benefiting from unpredictability while ILBs ($-1.69$) and MLBs ($-1.35$) perform better with more structured approaches.

The clear preference for structured positioning among most positions suggests teams should focus on *disciplined pre-snap alignments* rather than deceptive movements. The unique positive entropy result for OLBs suggests they should be given more freedom to vary their pre-snap positioning.

# **Additional Analysis**

## *Entropy Patterns Across Different Formations*

Additional analysis was performed on defensive entropy patterns across different formations and receiver alignments. The heatmaps visualize the difference in defensive movement entropy between successful and unsuccessful plays for each defensive position (CB, SS, FS, ILB, OLB, MLB, DT, DE). *Blue* cells indicate more predictable/effective defensive movement patterns, while *red* cells represent more variable/less predictable movements. The results of the analysis are shown in the heatmaps below:
<br/> <br/>

<img 
  src="https://github.com/bualimov/nfl_project/blob/baha/combined_entropy_heatmaps.png?raw=true" 
  alt="entropy difference by formation" 
  width="600" 
  style="display: block; margin: 0 auto; padding: 12px 20px; border: 1px solid black;"
/>

## *Key Insights*

Below are several insights our team identified when looking at position and alignment patterns using the DENT metric (***please note that the insights one could gain from the heatmap are not limited to what is shown below***):
  
- **DTs & DEs** show strong negative (blue) entropy differences in JUMBO formations, suggesting more predictable and effective alignments against heavy offensive sets.
  
- **CBs & SSs** display contrasting patterns - CBs are more effective (blue) in SHOTGUN formations while SSs show higher unpredictability (red) in JUMBO formations.
  
- **MLBs** demonstrate high positive entropy (red) in EMPTY formations (4x2), indicating more variable positioning when offenses spread out.
  
- **OLBs** show strong negative entropy (blue) in both JUMBO and WILDCAT formations, suggesting they maintain more disciplined alignments against run-heavy formations.
  
- **ILBs** are most predictable (blue) in SHOTGUN formations but show higher variability (red) in JUMBO sets, particularly against multiple-receiver alignments.

These insights could provide valuable guidance for defensive coaches, highlighting the importance of position-specific strategies and formation-alignment combinations in pre-snap movement.

## *Player Analysis*

The DENT metric allows us to assess specific positions' pre-snap positioning tendencies. For example, the graphs below compare the predictability of defensive tackles' (DT) and outside linebackers' (OLB) pre-snap positioning across the league, with lower entropy values indicating more consistent/predictable positioning and higher values indicating more variable positioning. Players are ordered by their PFF ranking for the 2022 season, and the mean entropy value is shown for each player.

<img 
  src="https://github.com/bualimov/nfl_project/blob/baha/dt_rankings_2022.png?raw=true" 
  alt="player analysis" 
  width="600" 
  style="display: block; margin: 0 auto; padding: 10px; border: 1px solid black; border-radius: 5px;"
/>

<br/>

<img 
  src="https://github.com/bualimov/nfl_project/blob/baha/olb_rankings_2022.png?raw=true" 
  alt="player analysis" 
  width="600" 
  style="display: block; margin: 0 auto; padding: 10px; border: 1px solid black; border-radius: 5px;"
/>


<br/>
Looking at the data, there are some interesting patterns: 
<br/> <br/>

- Aaron Donald, despite being ranked #1 by *PFF*, shows relatively moderate entropy ($48.8$), suggesting his effectiveness comes from consistent positioning rather than unpredictability. In contrast, Justin Madubuike shows the highest entropy value ($56.1$) among all defensive tackles, indicating he was likely used in more varied pre-snap positions.
   
- There appears to be a clear pattern among elite outside linebackers: the top-ranked OLBs consistently show high entropy values, with Nick Bosa ($55.7$), Haason Reddick ($53.1$), and Micah Parsons ($51.3$) all displaying significant pre-snap unpredictability. This is consistent with pre-snap unpredictability being associated with stronger OLB performance in 2022.

# **Going Forward**

## *Potential Applications:* 

The DENT metric provides immediate practical applications for NFL teams. Defensive coordinators can use these insights to develop position-specific pre-snap strategies, particularly focusing on structured positioning for most positions while allowing OLBs more freedom in various formations. The formation-specific heatmaps can inform defensive installation and practice planning, helping coaches design position-specific drills that emphasize either disciplined positioning (for CBs, Safeties, DTs) or controlled variability (for OLBs). Teams can also use this analysis to evaluate defensive players' pre-snap positioning tendencies and identify areas for improvement in their movement patterns.

## *Areas for Improvement:* 

Future research could enhance the DENT metric by incorporating additional factors such as offensive motion, quarterback tendencies, and down-and-distance situations. The analysis could be expanded to include post-snap outcomes and their relationship to pre-snap entropy, potentially revealing how initial positioning affects specific play types. Additionally, the metric could be refined to account for defensive scheme variations (e.g., zone vs. man coverage) and situational factors (e.g., red zone vs. open field). Machine learning techniques could also be applied to predict optimal entropy levels for each position based on offensive formations and game situations.

