# Disguised Intentions: How DBs Keep Offenses Guessing
Metric track | Author: [Miguel Duarte](https://github.com/miguelmendesduarte)

"_All warfare is based on deception. (...) When we are near, we must make the enemy believe we are far away; when far away, we must make him believe we are near._"

**- Sun Tzu, The Art of War**

---

# I. Introduction
In modern NFL defenses, **defensive backs** (DBs) are vital players in both coverage and pressure schemes. While their primary role is often in pass coverage, they also play a significant part in blitz packages. A well-executed DB blitz can disrupt the offense by confusing the quarterback and offensive line with unexpected pressure. Success in these situations relies on timing and the DB's ability to disguise their intentions. Disguising coverage is another essential skill for DBs. By showing one defensive look pre-snap and shifting to another post-snap, they can mislead quarterbacks and force poor decisions, leading to incompletions, sacks, or even turnovers.

To illustrate this dynamic, consider Jaire Alexander (#23), the Packers' cornerback, in a key play against the Vikings (Figure 1). With a clever disguise, Alexander blitzed and sacked Kirk Cousins for a safety, demonstrating how effective deception can impact the game.

<div style="display: flex; justify-content: center; align-items: center;">
    <img src="../assets/clips/jaire_alexander_sack.gif" width="640">
</div>

<figcaption style="text-align: center; font-size: 14px; color: gray;">Figure 1: Jaire Alexander sacks Kirk Cousins for a safety in Week 1 of the 2020 NFL season.</figcaption>

But how good was the disguise? Could the offense have predicted and adjusted their blocking to account for his blitz? The ability to disguise intentions can force the offense into a rushed decision or a missed block, which may not always be reflected in immediate statistics like sacks or turnovers, but can still significantly influence the outcome of a play. By quantifying these subtle influences, we can better understand how much of a DB's success comes from strategic deception, rather than pure physical skill.

This raises two key questions: **Can we quantitatively measure how effectively a DB disguises his intentions?** And more importantly, **how much does a successful disguise lead to tangible improvements in defensive outcomes?** In the following sections, we will explore these questions by analyzing data, applying a classification algorithm, creating a **disguise metric**, and examining the resulting insights.

Section II outlines the steps taken to clean and process the data, introduces the classification algorithm used, and explains the training process. In Section III, we present and analyze the results, addressing the key questions raised earlier. Finally, Section IV summarizes the main conclusions of this work and suggests potential directions for future research.

# II. Methodology
This study aims to quantitatively measure how effectively a DB disguises his intentions pre-snap and how this impacts defensive outcomes. To achieve this, a machine learning model is trained using pre-snap data to predict the likelihood of a blitz. The predicted blitz probabilities are then compared to the DB's post-snap actions (blitz or coverage) to assess the alignment between the predicted and actual outcomes.

## Data
The analysis focuses on cornerback behavior during pass plays, so all non-passing plays, QB kneels, and designed rollouts and runs are excluded. Wildcat formation plays are also removed, as they feature unconventional formations and player roles, introducing unnecessary variability. Plays with penalties are also excluded because penalties disrupt the natural flow of the play and can skew player actions.

Big Data Bowl 2025 focuses on pre-snap insights, so the analysis includes only pre-snap tracking data. All frames before the offense is set are excluded, as there is no need to analyze data from this period. To standardize the data, all plays are adjusted so that the offense faces right. The angles are rotated such that 0° always points to the right, with angles increasing counterclockwise. Speeds are also modified to reflect movement relative to the line of scrimmage (LOS): if a player is moving toward the LOS, their speed is assigned a negative value, while moving away from the LOS results in a positive speed. Lastly, the data is filtered to include only defensive backs (DBs) - specifically safeties and cornerbacks - eliminating all other positions from the analysis.
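
The standardization steps above can be sketched roughly as follows. This is a minimal illustration, not the exact preprocessing code: the function names are hypothetical, and it assumes the raw tracking data uses a 120 x 53.3 yard field, a `play_direction` of `"left"` or `"right"`, and raw angles measured clockwise from the positive y-axis.

```python
import math

FIELD_LENGTH = 120.0  # yards, including both end zones (assumed raw x range)
FIELD_WIDTH = 53.3    # yards (assumed raw y range)


def standardize(x, y, angle_deg, play_direction):
    """Return (x, y, angle) with the offense facing right.

    Assumes raw angles are measured clockwise from the +y axis and that
    play_direction is "left" or "right" (assumptions about the raw data).
    """
    if play_direction == "left":
        # Rotate the whole field by 180 degrees so the offense moves right.
        x = FIELD_LENGTH - x
        y = FIELD_WIDTH - y
        angle_deg = (angle_deg + 180.0) % 360.0
    # Convert to the convention used in the text: 0 degrees points right
    # and angles increase counterclockwise.
    angle_deg = (90.0 - angle_deg) % 360.0
    return x, y, angle_deg


def signed_speed(speed, x, dir_deg, los_x):
    """Negative when the player moves toward the LOS, positive otherwise.

    dir_deg is the standardized direction of motion (0 = right, CCW).
    """
    vx = speed * math.cos(math.radians(dir_deg))
    moving_toward_los = vx * (x - los_x) < 0
    return -speed if moving_toward_los else speed
```

For example, a defender 5 yards behind the LOS (on the defensive side) who moves straight toward the offense at 2 yd/s gets a signed speed of -2.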

To train and evaluate the model, the data is split into two sets: weeks 1 through 5 (inclusive) for training and weeks 6 through 9 for testing. This results in **5351 plays in the training set and 3707 in the testing set**. For the training data, only the 1.5 seconds (15 frames) immediately before the ball is snapped are used, focusing on the players' movements just before the snap. **All results come from the testing data**, so the model is never evaluated on the plays it was trained on.
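
A minimal sketch of this split, assuming each tracking row is a dictionary with hypothetical `week`, `frame`, and `snap_frame` keys, and assuming the snap frame itself is not part of the 15-frame window:

```python
TRAIN_WEEKS = range(1, 6)   # weeks 1 through 5
TEST_WEEKS = range(6, 10)   # weeks 6 through 9
FRAMES_BEFORE_SNAP = 15     # 1.5 seconds at 10 frames per second


def split_by_week(rows):
    """Split tracking rows into (train, test) lists by week number."""
    train = [r for r in rows if r["week"] in TRAIN_WEEKS]
    test = [r for r in rows if r["week"] in TEST_WEEKS]
    return train, test


def last_presnap_frames(rows, n=FRAMES_BEFORE_SNAP):
    """Keep only rows in the n frames immediately before the snap."""
    return [
        r for r in rows
        if r["snap_frame"] - n <= r["frame"] < r["snap_frame"]
    ]
```

Under these assumptions, the training rows would be passed through `last_presnap_frames` after the split, while the testing rows keep every frame from the moment the offense is set.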

## Features
The following features were used to train the model. Features marked with an asterisk (*) were excluded from the training of the final model (details in the [appendix](#appendix)). Both **movement-based features** and **contextual features** were used. Movement features capture the players' immediate pre-snap actions (e.g., speed, direction, distance), which help identify whether a player is in position for a blitz. Contextual features capture situational factors (e.g., down, yards to go, score differential) that help assess whether a blitz makes sense in a given scenario.

- **yardsToGo**
- **down**
- **position**
- **x, y**
- **s** *
- **a**
- **dis**
- **o**
- **dir** *
- **distance_to_qb**: The distance between the player and the quarterback.
- **orientation_to_qb**: The player's orientation relative to the quarterback.
- **direction_to_qb**: The angle of the player's motion relative to the quarterback. *
- **distance_to_line_of_scrimmage**: The distance between the player and the LOS.
- **yardsToEndzone**: The number of yards remaining to the end zone. *
- **scoreDifferential**: The difference in score between the two teams (negative if the defense is behind).
- **timeRemainingInSeconds**: The amount of time remaining in the game.
- **defenders_near_LOS**: The number of defenders within 2 yards in depth and 7 yards in width of the LOS, with the LOS centered around the quarterback's position.
- **TEs_on_right, TEs_on_left**: The number of tight ends on the right and left sides of the quarterback.
- **FBs_on_right, FBs_on_left**: The number of fullbacks on the right and left sides of the quarterback.
- **RBs_on_right, RBs_on_left**: The number of running backs on the right and left sides of the quarterback.
- **distance_to_closest_opponent**: The distance to the nearest opposing player. *
- **position_of_closest_opponent**: The position of the closest opposing player.
- **orientation_to_closest_opponent**: The player's orientation relative to the closest opponent.

The target variable for the model is **wasInitialPassRusher**, which indicates whether the DB was a pass rusher (blitz) for the play (1 = blitz, 0 = no blitz).
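
As an illustration, some of the geometric features listed above could be derived along these lines. The function names and signatures are hypothetical, the angles are assumed to follow the standardized convention (0° pointing right, increasing counterclockwise), and the 7-yard width is assumed to apply on each side of the quarterback, which is one possible reading of the definition.

```python
import math


def distance_to_qb(x, y, qb_x, qb_y):
    """Euclidean distance in yards between the player and the quarterback."""
    return math.hypot(qb_x - x, qb_y - y)


def orientation_to_qb(x, y, o_deg, qb_x, qb_y):
    """Signed angle in [-180, 180) between the player's facing direction
    and the bearing from the player to the quarterback.

    A value of 0 means the player is facing the quarterback directly.
    """
    bearing = math.degrees(math.atan2(qb_y - y, qb_x - x))
    return (o_deg - bearing + 180.0) % 360.0 - 180.0


def defenders_near_los(defenders, los_x, qb_y, depth=2.0, width=7.0):
    """Count defenders, given as (x, y) pairs, inside the box around the LOS.

    Assumption: the box spans `depth` yards either side of the LOS and
    `width` yards either side of the quarterback's y position.
    """
    return sum(
        1 for dx, dy in defenders
        if abs(dx - los_x) <= depth and abs(dy - qb_y) <= width
    )
```

The same wrapped-angle helper would also serve for `direction_to_qb` and `orientation_to_closest_opponent` by swapping in the direction of motion or the opponent's position.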

## Model
Extreme Gradient Boosting (**XGBoost**) was chosen for this task due to its strong performance in supervised classification problems with large datasets. The model was trained using the following hyperparameters, selected after testing 144 combinations:

- **max_depth**: 9
- **n_estimators**: 300
- **min_child_weight**: 3
- **learning_rate**: 0.1
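
For reference, the hyperparameters above map onto the XGBoost scikit-learn interface roughly as shown below. This is a sketch rather than the training script itself: `build_model` is a hypothetical helper, and the explicit `binary:logistic` objective is an assumption (it is the usual choice for binary classification).

```python
FINAL_PARAMS = {
    "max_depth": 9,
    "n_estimators": 300,
    "min_child_weight": 3,
    "learning_rate": 0.1,
}


def build_model(**overrides):
    """Create an XGBoost classifier with the hyperparameters listed above."""
    # Imported lazily so this module loads even without xgboost installed.
    from xgboost import XGBClassifier

    # Assumption: binary logistic objective for the blitz / no-blitz target.
    params = {**FINAL_PARAMS, "objective": "binary:logistic", **overrides}
    return XGBClassifier(**params)
```

A typical use would be `model = build_model()`, followed by `model.fit(X_train, y_train)` and `model.predict_proba(X_test)[:, 1]` to obtain the blitz probability for each frame, where `X_train`, `y_train`, and `X_test` are placeholder names for the prepared feature matrices and labels.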

Key performance metrics:
- **Accuracy**: 97.14%
- **Precision**: 33.29%
- **Recall**: 22.18%
- **Log Loss**: 0.098

Although precision and recall may appear low at first glance, these results should not be viewed as poor. The main goal of the model is to predict probabilities, not just class labels. Furthermore, achieving high precision is inherently difficult because DBs are trained to mislead the opponent, which makes their pre-snap behavior hard for any model to read. Consequently, **quantifying uncertainty is essential**: the model must be able to express how confident it is in each prediction. For this reason, **log loss** is the most appropriate metric, as it evaluates the accuracy of the model's probability estimates rather than its hard class labels.
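
To make the distinction between these metrics concrete, here is a small self-contained sketch of how log loss, precision, and recall are computed from labels and predicted probabilities. The 0.5 decision threshold is an assumption, and the numbers in the usage note are toy values, not model outputs.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true labels under p_pred."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def precision_recall(y_true, p_pred, threshold=0.5):
    """Precision and recall after thresholding the probabilities."""
    pairs = list(zip(y_true, p_pred))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    fn = sum(1 for y, p in pairs if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

With toy labels `[1, 1, 0, 0]` and probabilities `[0.9, 0.2, 0.6, 0.1]`, precision and recall are both 0.5. Log loss, in contrast, penalizes each prediction in proportion to how confidently wrong it is, which is the property the Disguise Score relies on.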

Further details on the hyperparameter tuning are provided in the [appendix](#appendix).

## Disguise Score
To evaluate how well a defensive back (DB) disguises his intentions, we calculate a **Disguise Score** for each frame in the pre-snap phase of the play. The Disguise Score is computed as follows:

1. **Calculate the disguise score for each frame**:  
   For each frame, we compute the disguise score as the absolute difference between the action taken by the DB (blitz = 1, coverage = 0) and the predicted probability of the DB blitzing.

    $$
    \text{Disguise Score (frame)} = \left| \text{Action} - \text{Predicted Blitz Probability} \right|
    $$

   - If the DB is predicted to blitz with a high probability, but the action is coverage (or vice versa), the disguise score will be higher, indicating a stronger disguise.

2. **Sum and average the disguise scores**:  
   After calculating the disguise score for each frame, we sum these scores across all frames in a given play. The total is then divided by the number of frames to yield the **Disguise Score** for the entire play:

   $$
   \text{Disguise Score (play)} = \frac{\sum \text{Disguise Scores (frames)}}{\text{Number of Frames}}
   $$

The **Disguise Score** reflects how effectively the DB hides his intentions in a play. A higher score indicates that the DB is more successful in "fooling" the model (and by extension, the offense), while a lower score suggests that the DB's actions are more predictable.
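
The two steps above translate directly into code. The sketch below uses hypothetical function names and takes the DB's action (1 for blitz, 0 for coverage) together with the list of predicted blitz probabilities, one per pre-snap frame.

```python
def frame_disguise_scores(action, blitz_probs):
    """Per-frame score: |action - predicted blitz probability|."""
    return [abs(action - p) for p in blitz_probs]


def play_disguise_score(action, blitz_probs):
    """Average of the per-frame disguise scores over the play."""
    scores = frame_disguise_scores(action, blitz_probs)
    return sum(scores) / len(scores)
```

For instance, a DB who blitzes (action = 1) while the model predicts blitz probabilities of 0.2 and 0.4 on two frames has frame scores of 0.8 and 0.6 and a play score of 0.7. A DB who drops into coverage against the same kind of low predictions would score low, since his action matched what the model expected.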

# III. Results
- Probability of Blitz -  Can we quantitatively measure how well a DB hides their intent?

<div style="display: flex; justify-content: center; align-items: center; gap: 20px;">
    <img src="../assets/animations/2022101300_291_cut.gif" height="420">
    <img src="../assets/animations/2022101300_blitz_probs.gif" height="420">
</div>

<figcaption style="text-align: center; font-size: 14px; color: gray;">Figure 2A: bla bla. Figure 2B: ble ble.</figcaption>

- Disguise Score

<figcaption style="text-align: center; font-size: 14px; color: gray;">Table 1: Top 10 DBs with the Highest Disguise Score (50+ Plays, 2022 NFL Season, Weeks 6-9).</figcaption>
<div style="display: flex; justify-content: center; align-items: center;">
    <img src="../reports/figures/top10_DBs_disguise_score.png" width="400">
</div>

- Which DBs in the league are the best at disguising their blitz?
<figcaption style="text-align: center; font-size: 14px; color: gray;">Table 2: DBs with 5+ Blitzes Ranked by Blitz Disguise Score (2022 NFL Season, Weeks 6-9).</figcaption>
<div style="display: flex; justify-content: center; align-items: center;">
    <img src="../reports/figures/DBs_blitz_disguise_score.png" width="640">
</div>

some text here...

<figcaption style="text-align: center; font-size: 14px; color: gray;">Table 3: Top 10 DBs with the Highest Disguise Score When Dropping into Coverage (50+ Plays, 2022 NFL Season, Weeks 6-9).</figcaption>
<div style="display: flex; justify-content: center; align-items: center;">
    <img src="../reports/figures/top10_DBs_simulating_blitz.png" width="400">
</div>

- Which teams implement the best DB disguises?

IMAGE

- Does a well-disguised blitz lead to more pressure on the QB?

IMAGE

- Does a more effective disguise by the DBs lead to a higher win probability added?

IMAGE

## Limitations
Safeties.

QB cadence.

Audibles, shifts.

Limited dataset.

Simplified representation of defensive behaviour.

Defensive schemes.

# IV. Conclusion

## Future Work
Analyze how early or late in the play the DB's true intentions are revealed, giving more weight to DBs who can maintain their disguise longer. The same analysis could also be used to evaluate QBs' hard counts.

Evaluate offensive lines on how well they play against disguises.

Opponent offensive tendencies: Analyze how a DB's disguises may vary based on the opponent's offensive strategy or tendencies in similar game situations.

---
Word count: ...

Figure/Table count: ...

All code is available [here](https://github.com)

# Acknowledgements
Special thanks to Michael Lopez, Thompson Bliss, Ally Blake, Paul Mooney, Addison Howard, and the NFL staff for their contributions to the NFL Big Data Bowl 2025. I also want to thank the NFL and Kaggle for providing the data and making this experience possible.

As a passionate football fan in Portugal, I'm grateful for the opportunity to engage with the game through this competition.

# Appendix

## A.
...

## B.
...