# Uncovering Tackle Opportunities and Missed Opportunities

Metric Track | Authors: 
[Matthew Chang](https://mpchang.github.io/), [Katherine Dai](https://github.com/katdai), [Daniel Jiang](https://danielrjiang.github.io), [Harvey Cheng](https://bolongcheng.com/) <br>


*"Football is two things. It's blocking and tackling... You block and tackle better than the team you're playing, you win." - Vince Lombardi*

---

# Introduction

Traditional metrics such as made and missed tackles offer only a surface-level understanding of a defender's tackling skill. Before attempting a tackle, a defender engages in a long series of steps: accurately predicting the ball carrier's path, strategically positioning himself, and forcing the ball carrier into a vulnerable position for a tackle or push out of bounds. The outcome, whether a made or missed tackle, marks the end of a complex process that unfolds throughout the play.

We present a new set of metrics to analyze how well defenders perform in the tackling process (see Figure 1 for a visual guide): 
- **Tackle Probability**: Probability that defender X tackles the ball carrier within T seconds (blue line in Figure 1). T is a tunable parameter, which is set to 1 second in this work. 
- **Tackle Opportunity**: An interval longer than 0.5 seconds during which defender X's tackle probability stays above 75% (areas above the dotted gray line in Figure 1). This represents a window of time over which the defender has a real opportunity to make a tackle. 
- **Missed Tackle Opportunity**: A tackle opportunity after which defender X's tackle probability falls below 75% and stays there for longer than 0.5 seconds (shaded red in Figure 1). This occurs when neither defender X nor any of their teammates makes a tackle during the tackle opportunity. This is a new class of defensive mistake not captured by current tackling metrics. 
- **Converted Tackle Opportunity**: When defender X is assigned a tackle in the tackle data. 
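
The threshold logic in these definitions can be sketched in a few lines. This is only an illustration, not the code from our repository: it assumes one tackle probability per tracking frame at 10 frames per second, measures duration as the number of consecutive frames divided by the frame rate, and the names `runs` and `find_opportunities` are hypothetical. It also labels an opportunity as missed purely from the probability trace, without consulting the tackle data.

```python
FRAME_RATE_HZ = 10     # assumption: one probability per frame, 10 frames per second
THRESHOLD = 0.75       # probability threshold from the definitions above
MIN_DURATION_S = 0.5   # minimum time spent above (or below) the threshold


def runs(flags):
    """Collapse a boolean series into (value, start, end_exclusive) runs."""
    out = []
    start = 0
    for i in range(1, len(flags) + 1):
        if i == len(flags) or flags[i] != flags[start]:
            out.append((flags[start], start, i))
            start = i
    return out


def find_opportunities(probs):
    """Return (start_frame, end_frame_exclusive, missed) for each opportunity."""
    min_frames = round(MIN_DURATION_S * FRAME_RATE_HZ)
    segments = runs([p > THRESHOLD for p in probs])
    found = []
    for k, (above, start, end) in enumerate(segments):
        # An opportunity is a run above the threshold lasting more than 0.5 s.
        if not above or end - start <= min_frames:
            continue
        # It counts as missed if the next run (below threshold) also lasts > 0.5 s.
        missed = False
        if k + 1 < len(segments):
            _, next_start, next_end = segments[k + 1]
            missed = next_end - next_start > min_frames
        found.append((start, end, missed))
    return found


# Hypothetical trace: 0.8 s above, 0.7 s below, then 0.6 s above until the play ends.
trace = [0.2] * 3 + [0.9] * 8 + [0.3] * 7 + [0.8] * 6
print(find_opportunities(trace))  # [(3, 11, True), (18, 24, False)]
```

An opportunity that is still open when the trace ends is not flagged as missed, which is the case where the play ends in a tackle.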

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/tackle_metrics_overview.jpg" width="800"/></center>
<center>Figure 1. Deriving tackle metrics from tackle probability.</center>
<br>

These metrics are designed to provide more insight into how defensive players engage *within* each play, leveraging player kinematics and unique spatial features to convey information about blocking and defensive containment. The metrics can be measured in real-time on an individual play or accumulated over multiple plays to track long-term performance. 

Section I presents the model used to predict tackle probability. In Section II, we run inference on several plays to illustrate tackle probability throughout a play. In Section III, we use the above metrics to assess the league's top tacklers over weeks 1-9 of the 2022 season. 

# I. Model

We trained an XGBoost binary classifier to predict whether a defender makes a tackle within the next 1 second. We used confirmed made and missed tackles from weeks 1-8 of the tackle data as training examples. For these examples, all input features were derived from the tracking data 1 second prior to the made or missed tackle. Since missed tackles were not tied to specific frames in the tracking data, we assumed each missed tackle happened at the frame where the separation between the ball carrier and the potential tackler was at a minimum. Any non-tackle could have been used as a negative training example, but we chose missed tackles so that the model would learn the most accurate threshold between made tackles and non-tackles. 
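
As a rough sketch of how a missed tackle could be placed in time and where its features would be sampled, the snippet below finds the frame of closest approach and steps back 1 second. The function names, the per-frame `(x, y)` input format, and the 10 frames-per-second rate are assumptions made for illustration.

```python
import math


def closest_approach_frame(carrier_xy, defender_xy):
    """Index of the frame with the smallest carrier-defender separation."""
    distances = [math.dist(c, d) for c, d in zip(carrier_xy, defender_xy)]
    return min(range(len(distances)), key=distances.__getitem__)


def feature_frame(event_frame, lookback_s=1.0, frame_rate_hz=10):
    """Frame used for feature extraction: lookback_s before the event frame."""
    # Clamp at 0 so an early event never yields a negative frame index.
    return max(0, event_frame - round(lookback_s * frame_rate_hz))


# Hypothetical positions (yards) over four frames.
carrier = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
defender = [(5.0, 0.0), (4.0, 0.0), (2.5, 0.0), (5.0, 0.0)]
missed_frame = closest_approach_frame(carrier, defender)  # frame 2
```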

### a. Feature Selection and Engineering

We selected nine features as model inputs:
1. Euclidean distance between tackler and ball carrier
2. Ball carrier speed
3. Relative X speed between tackler and ball carrier
4. |Relative Y speed| between tackler and ball carrier
5. Angle of attack
6. Forward <a href="https://en.wikipedia.org/wiki/Voronoi_diagram">Voronoi</a> area 
7. Team influence
8. Blocker influence
9. Pass or run
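
For concreteness, features 1-4 could be computed along the following lines. This is a sketch under assumptions: each player is described by a position and a velocity vector in field coordinates, relative speed is taken as defender minus ball carrier, and the `PlayerState` and `kinematic_features` names are hypothetical. Angle of attack is left out because its exact definition is not given in this section.

```python
import math
from dataclasses import dataclass


@dataclass
class PlayerState:
    x: float   # yards along the field
    y: float   # yards across the field
    vx: float  # yards per second along the field
    vy: float  # yards per second across the field


def kinematic_features(defender, carrier):
    """Features 1-4 for one defender / ball carrier pair at one frame."""
    return {
        "distance": math.hypot(defender.x - carrier.x, defender.y - carrier.y),
        "carrier_speed": math.hypot(carrier.vx, carrier.vy),
        # Sign convention (defender minus carrier) is an assumption.
        "rel_x_speed": defender.vx - carrier.vx,
        "abs_rel_y_speed": abs(defender.vy - carrier.vy),
    }


features = kinematic_features(
    PlayerState(x=3.0, y=4.0, vx=-1.0, vy=2.0),
    PlayerState(x=0.0, y=0.0, vx=3.0, vy=4.0),
)
```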

The motivation for including player kinematic features (features 1-5) is self-explanatory. The spatial features (features 6-8) were inspired by how humans evaluate the likelihood of a successful tackle. Humans assess the spatial distribution of players around the ball carrier, like whether there are blockers protecting the ball carrier or multiple defenders confining the ball carrier and cutting off escape paths. Football terminology often encapsulates these spatial configurations and intentions with phrases like "setting the edge", "contain rush", or "using the sideline as a defender". Figure 2 provides a graphical view of each of the following spatial features. 

**Forward Voronoi area** is the area of the <a href="https://en.wikipedia.org/wiki/Voronoi_diagram">Voronoi cell</a> belonging to the ball carrier, modified slightly. Only the area starting 5 yards behind the ball carrier and within the sidelines is considered. We truncate the area behind the ball carrier because he is motivated to make forward progress. This feature models congestion — the more players around the ball carrier, the smaller this area. The Voronoi diagram includes both defensive and offensive players. 
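
One way to compute such a truncated cell without a geometry library is to start from the allowed rectangle and clip it against the perpendicular bisector between the ball carrier and every other player. The sketch below takes that approach; it is not necessarily how our repository does it. It assumes the offense moves in the +x direction, a 120 by 53.3 yard coordinate system, and that the forward limit is the far end of the field. All function names are hypothetical.

```python
FIELD_LENGTH = 120.0  # yards, assumption about the coordinate system
FIELD_WIDTH = 53.3    # yards between the sidelines, assumption
BEHIND_YARDS = 5.0    # area kept behind the ball carrier, per the text


def clip_half_plane(polygon, a, b, c):
    """Keep the part of a convex polygon where a*x + b*y <= c."""
    clipped = []
    for i, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(i + 1) % len(polygon)]
        d1 = a * x1 + b * y1 - c
        d2 = a * x2 + b * y2 - c
        if d1 <= 0:
            clipped.append((x1, y1))
        if (d1 < 0 < d2) or (d2 < 0 < d1):
            # The edge crosses the boundary: add the crossing point.
            t = d1 / (d1 - d2)
            clipped.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return clipped


def polygon_area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(polygon):
        x2, y2 = polygon[(i + 1) % len(polygon)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def forward_voronoi_area(carrier, others):
    """Area of the carrier's Voronoi cell, truncated 5 yards behind him."""
    cx, cy = carrier
    x_min = max(0.0, cx - BEHIND_YARDS)
    cell = [(x_min, 0.0), (FIELD_LENGTH, 0.0),
            (FIELD_LENGTH, FIELD_WIDTH), (x_min, FIELD_WIDTH)]
    for ox, oy in others:
        # Points at least as close to the carrier as to (ox, oy) satisfy
        # 2*(o - c) . p <= |o|^2 - |c|^2.
        a, b = 2 * (ox - cx), 2 * (oy - cy)
        c = ox ** 2 + oy ** 2 - cx ** 2 - cy ** 2
        cell = clip_half_plane(cell, a, b, c)
        if not cell:
            return 0.0
    return polygon_area(cell)


# A single player 10 yards directly ahead cuts the cell off 5 yards downfield.
area = forward_voronoi_area((50.0, 20.0), [(60.0, 20.0)])
```

Because each clip removes area, adding players around the ball carrier can only shrink the result, which is the congestion behavior the feature is meant to capture.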

**Team influence** is based on the approach from <a href="https://static.capabiliaserver.com/frontend/clients/barca/wp_prod/wp-content/uploads/2018/05/Wide-Open-Spaces.pdf">Fernandez et al (2018)</a>. Each player generates a 2D Gaussian, which is translated and scaled based on their current location, speed, and direction. Team influence, $T(x, y)$, is calculated as

$$ T(x, y) = \sum_{n \in \mathcal{O} \setminus \{0\}} I_n(x, y) - \sum_{m \in \mathcal{D}} I_m(x, y) $$ 

where $I_n(x, y)$ is the Gaussian generated by player $n$, evaluated at $(x, y)$; $\mathcal{O}$ is the set of offensive players; $\mathcal{D}$ is the set of defensive players; and “0” denotes the ball carrier, who is excluded from the offensive sum. The model uses the team influence at the ball carrier's $\left(x, y \right)$ location. The team influence provides a sense of whether the ball carrier is surrounded by defenders or has some protection from blockers. 

**Blocker influence** is calculated similarly to team influence, with two differences. First, no defensive players are included in the influence calculation. Second, the model uses the Blocker influence computed at the location of a given *defender*, rather than the ball carrier. This feature conveys whether a defender has blockers nearby. 
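
The two influence features can be illustrated with a deliberately simplified kernel. The sketch below uses an isotropic Gaussian whose center is shifted along the player's velocity, which is a stand-in for the speed- and direction-dependent Gaussian of Fernandez et al. rather than a reproduction of it. The constants `SIGMA_YARDS` and `LEAD_TIME_S` and all function names are hypothetical.

```python
import math

SIGMA_YARDS = 4.0   # hypothetical spread of each player's influence
LEAD_TIME_S = 0.5   # hypothetical shift of the center along the velocity


def player_influence(player, x, y):
    """Simplified Gaussian influence of one player evaluated at (x, y)."""
    px, py, vx, vy = player
    mx, my = px + LEAD_TIME_S * vx, py + LEAD_TIME_S * vy
    d2 = (x - mx) ** 2 + (y - my) ** 2
    return math.exp(-d2 / (2 * SIGMA_YARDS ** 2))


def team_influence(blockers, defenders, x, y):
    """Offensive influence (ball carrier excluded) minus defensive influence."""
    offense = sum(player_influence(p, x, y) for p in blockers)
    defense = sum(player_influence(p, x, y) for p in defenders)
    return offense - defense


def blocker_influence(blockers, x, y):
    """Offensive influence only, typically evaluated at a defender's location."""
    return sum(player_influence(p, x, y) for p in blockers)


# Players are (x, y, vx, vy) tuples; the ball carrier is left out of `blockers`.
blockers = [(10.0, 10.0, 0.0, 0.0)]
defenders = [(12.0, 10.0, 0.0, 0.0)]
at_carrier = team_influence(blockers, defenders, 11.0, 10.0)
at_defender = blocker_influence(blockers, 12.0, 10.0)
```

With this sign convention, a negative team influence at the ball carrier's location means defenders dominate the space around him.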

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/spatial_features.jpg" width="1000"/></center>
<center>Figure 2. Spatial features. (Left) Team influence map, with Voronoi diagram overlaid. (Right) Blocker influence map of the same play. The offensive linemen make a front against the defensive line, and the eventual tackler has a lane of relatively low blocker influence to the ball carrier. </center>
<br>

### b. Model Performance

We built the training dataset from 8000 made tackles and 1583 missed tackles in the weeks 1-8 tracking data, excluding plays with penalties or with erroneous or missing tracking data. The test dataset contains 836 made tackles and 180 missed tackles drawn exclusively from week 9 to prevent leakage.
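
The week-based split can be sketched as follows, assuming each example is a dictionary carrying a `week` field (a hypothetical layout). Holding out an entire week keeps examples from the same game or play from landing on both sides of the split.

```python
def split_by_week(examples, test_week=9):
    """Train on all weeks before test_week, test on test_week only."""
    train = [e for e in examples if e["week"] < test_week]
    test = [e for e in examples if e["week"] == test_week]
    return train, test


# Hypothetical examples: label 1 = made tackle, 0 = missed tackle.
examples = [
    {"week": 1, "label": 1},
    {"week": 8, "label": 0},
    {"week": 9, "label": 1},
]
train, test = split_by_week(examples)
```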

We used five-fold cross-validation on the training dataset to experiment with different features and optimize model hyperparameters (further details in the appendix). After training, the XGBoost model was evaluated on the test set and the results are shown in Figure 3. Some residual overfitting can be observed, but the performance on the test set is still quite good with an AUC of 0.93. 

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/model_performance.jpg" width="1000"/></center>
<center>Figure 3. Model performance metrics. (Left) Confusion matrix from the test dataset. (Right) Receiver-Operating-Characteristic Curves. </center>
<br>

# II. Illustrating Tackle Probability During Plays

We used the model to run inference on entire plays, producing real-time tackle probabilities for each defender. The below examples highlight how to use tackle probability to identify tackle opportunities and missed opportunities.

Figure 4 shows a run where the ball carrier, Leonard Fournette, is first pursued by Aaron Donald. Although Donald is able to break through the double team, the blockers slow him down, which allows the ball carrier to escape. This prevents Donald's tackle probability from staying above the threshold for more than 0.5 seconds; therefore he does not register a tackle opportunity. Ernest Jones continues the pursuit. Despite his proximity to the ball carrier, the model does not give him a high tackle probability because of the open field ahead of Fournette and the presence of a potential blocker nearby. Jones's tackle probability rises, and becomes a tackle opportunity, when Jalen Ramsey meets and slows down the ball carrier. Both Ramsey and Jones earned a tackle assist on this play. 

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/play_with_tackle_prob_2022110609_271.gif" style="width: 1200px"/></center>
<center>Figure 4. Leonard Fournette left end to LA 47 for 6 yards. Tackled by Ernest Jones and Jalen Ramsey. Only three defenders highlighted for clarity. </center>
<br>

Figure 5 shows an 18-yard run by Kenyan Drake. After the snap, despite the proximity of the defensive line to Drake, the model does not assign them a high tackle probability because of the high blocker influence. As Drake breaks into the open field, four separate defenders (Alontae Taylor, Kaden Elliss, Malcolm Roach, and Zack Baun) have tackle opportunities that turn into missed opportunities before Drake is eventually tackled by Marcus Maye. It is noteworthy that on this play only Malcolm Roach was given a missed tackle in the tackling data, which does not tell the full story of how many defenders Drake eluded. Our new "missed tackle opportunity" metric, however, assigns a missed opportunity to all four players.

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/2022110700_2902_video.gif" style="width: 600px"/></center>
<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/play_with_tackle_prob_2022110700_2902.gif" style="width: 1200px"/></center>
<center>Figure 5. K. Drake left end to NO 45 for 18 yards. Missed tackle opportunities are highlighted with a red box on the play animation and a red X on the tackle probability plot. The final made tackle by Marcus Maye is highlighted by a green box on the animation. </center>
<br>

# III. Tackler Assessment

We ran inference on every play in the tracking data and tabulated the following metrics for each defensive player over the nine-week period: 

- **Tackle Opportunity Rate** = Tackle Opportunities / Total Active Plays
- **Missed Opportunity Rate** = Missed Tackle Opportunities / Tackle Opportunities
- **Tackle Conversion Rate** = Tackles Made or Assisted / Tackle Opportunities
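
These three rates follow directly from per-player counts. The sketch below assumes the counts have already been accumulated; the function name and the example numbers are hypothetical. It also makes explicit that tackles per play equals tackle opportunity rate times tackle conversion rate, since the tackle opportunity count cancels.

```python
def tackle_rates(active_plays, opportunities, missed, tackles):
    """Compute the three rates, plus tackles per play, from raw counts."""
    if active_plays <= 0 or opportunities <= 0:
        raise ValueError("need at least one active play and one opportunity")
    return {
        "opportunity_rate": opportunities / active_plays,
        "missed_rate": missed / opportunities,
        "conversion_rate": tackles / opportunities,  # tackles made or assisted
        "tackles_per_play": tackles / active_plays,
    }


# Hypothetical player: 300 active plays, 90 opportunities, 18 missed, 54 tackles.
rates = tackle_rates(active_plays=300, opportunities=90, missed=18, tackles=54)
product = rates["opportunity_rate"] * rates["conversion_rate"]
# product matches rates["tackles_per_play"] (0.18) up to floating-point rounding.
```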

Figure 6 shows these metrics for all defenders who registered more than 30 solo tackles in the tackling data. Looking at tackle opportunity rate (Figure 6a), it is clear that linebackers generate more tackle opportunities than safeties and cornerbacks. This matches intuition given the nature of their position; linebackers are often engaged in both run and pass coverage, whereas safeties mostly appear in pass coverage or on plays in which the ball carrier reaches the secondary. In the tackle conversion plot (Figure 6b), the reverse trend can be seen: safeties and cornerbacks have a higher tackle conversion rate than linebackers. This also matches intuition, as the secondary is often more involved in man-to-man coverage. Finally, the missed opportunities plot (Figure 6c) shows how often a defender missed a tackle or did not contribute enough pressure for a teammate to make a tackle; this metric is similar across position groups. 

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/tackle_opportunities_plot.jpg" width="1200px"/></center>
<center>Figure 6. Tackle metrics from weeks 1-9. Each data point represents a player. (a) Tackle Opportunities vs. active plays. (b) Tackles/assists made vs. Tackle Opportunities. (c) Missed Tackle Opportunities vs. Tackle Opportunities. </center>
<br>

Today we evaluate individual defenders using total tackle count, but we can now deconstruct this metric into its components — tackle opportunities and tackle conversion rate — for a much clearer picture. Table 1 shows that Jalen Pitre and Kevin Byard employ different tackling approaches, even though their tackles per play are similar. Pitre consistently creates more chances for himself (higher tackle opportunity rate), whereas Byard has fewer chances but executes tackles at an extremely high level (higher tackle conversion rate).

<table id="T_1ced8">
  <caption style="caption-side:bottom">Table 1: Top 10 secondary players, ranked by tackles per play</caption>
  <thead>
    <tr>
      <th class="index_name level0" >Player</th>
      <th class="index_name level1" >Position</th>
      <th id="T_1ced8_level0_col0" class="col_heading level0 col0" >Active Plays</th>
      <th id="T_1ced8_level0_col1" class="col_heading level0 col1" >Tackles per Play</th>
      <th id="T_1ced8_level0_col2" class="col_heading level0 col2" >Tackle Opportunity Rate</th>
      <th id="T_1ced8_level0_col3" class="col_heading level0 col3" >Tackle Conversion Rate</th>
      <th id="T_1ced8_level0_col4" class="col_heading level0 col4" >Missed Opportunity Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th id="T_1ced8_level0_row0" class="row_heading level0 row0" >Jonathan Owens</th>
      <th id="T_1ced8_level1_row0" class="row_heading level1 row0" >FS</th>
      <td id="T_1ced8_row0_col0" class="data row0 col0" >317</td>
      <td id="T_1ced8_row0_col1" style=" background: #008066; color: white;" class="data row0 col1" >0.20</td>
      <td id="T_1ced8_row0_col2" style=" background: #369b66; color: white;" class="data row0 col2" >0.33</td>
      <td id="T_1ced8_row0_col3" class="data row0 col3" >0.60</td>
      <td id="T_1ced8_row0_col4" class="data row0 col4" >0.11</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row1" class="row_heading level0 row1" >Derwin James</th>
      <th id="T_1ced8_level1_row1" class="row_heading level1 row1" >FS</th>
      <td id="T_1ced8_row1_col0" class="data row1 col0" >341</td>
      <td id="T_1ced8_row1_col1" style=" background: #0c8666; color: white;" class="data row1 col1" >0.19</td>
      <td id="T_1ced8_row1_col2" style=" background: #3b9d66; color: white;" class="data row1 col2" >0.33</td>
      <td id="T_1ced8_row1_col3" class="data row1 col3" >0.59</td>
      <td id="T_1ced8_row1_col4" class="data row1 col4" >0.16</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row2" class="row_heading level0 row2" >Damar Hamlin</th>
      <th id="T_1ced8_level1_row2" class="row_heading level1 row2" >SS</th>
      <td id="T_1ced8_row2_col0" class="data row2 col0" >256</td>
      <td id="T_1ced8_row2_col1" style=" background: #91c866; color: black;" class="data row2 col1" >0.17</td>
      <td id="T_1ced8_row2_col2" style=" background: #5dae66; color: white;" class="data row2 col2" >0.31</td>
      <td id="T_1ced8_row2_col3" class="data row2 col3" >0.56</td>
      <td id="T_1ced8_row2_col4" class="data row2 col4" >0.15</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row3" class="row_heading level0 row3" >DeShon Elliott</th>
      <th id="T_1ced8_level1_row3" class="row_heading level1 row3" >FS</th>
      <td id="T_1ced8_row3_col0" class="data row3 col0" >298</td>
      <td id="T_1ced8_row3_col1" style=" background: #aad466; color: black;" class="data row3 col1" >0.17</td>
      <td id="T_1ced8_row3_col2" style=" background: #a8d366; color: black;" class="data row3 col2" >0.27</td>
      <td id="T_1ced8_row3_col3" class="data row3 col3" >0.62</td>
      <td id="T_1ced8_row3_col4" class="data row3 col4" >0.17</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row4" class="row_heading level0 row4" >Julian Love</th>
      <th id="T_1ced8_level1_row4" class="row_heading level1 row4" >SS</th>
      <td id="T_1ced8_row4_col0" class="data row4 col0" >299</td>
      <td id="T_1ced8_row4_col1" style=" background: #add666; color: black;" class="data row4 col1" >0.17</td>
      <td id="T_1ced8_row4_col2" style=" background: #9cce66; color: black;" class="data row4 col2" >0.28</td>
      <td id="T_1ced8_row4_col3" class="data row4 col3" >0.60</td>
      <td id="T_1ced8_row4_col4" class="data row4 col4" >0.13</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row5" class="row_heading level0 row5" >Budda Baker</th>
      <th id="T_1ced8_level1_row5" class="row_heading level1 row5" >SS</th>
      <td id="T_1ced8_row5_col0" class="data row5 col0" >404</td>
      <td id="T_1ced8_row5_col1" style=" background: #c5e266; color: black;" class="data row5 col1" >0.16</td>
      <td id="T_1ced8_row5_col2" style=" background: #2f9766; color: white;" class="data row5 col2" >0.33</td>
      <td id="T_1ced8_row5_col3" class="data row5 col3" >0.49</td>
      <td id="T_1ced8_row5_col4" class="data row5 col4" >0.19</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row6" class="row_heading level0 row6" >Eddie Jackson</th>
      <th id="T_1ced8_level1_row6" class="row_heading level1 row6" >SS</th>
      <td id="T_1ced8_row6_col0" class="data row6 col0" >391</td>
      <td id="T_1ced8_row6_col1" style=" background: #d3e966; color: black;" class="data row6 col1" >0.16</td>
      <td id="T_1ced8_row6_col2" style=" background: #8ac466; color: black;" class="data row6 col2" >0.29</td>
      <td id="T_1ced8_row6_col3" class="data row6 col3" >0.56</td>
      <td id="T_1ced8_row6_col4" class="data row6 col4" >0.22</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row7" class="row_heading level0 row7" >L'Jarius Sneed</th>
      <th id="T_1ced8_level1_row7" class="row_heading level1 row7" >CB</th>
      <td id="T_1ced8_row7_col0" class="data row7 col0" >336</td>
      <td id="T_1ced8_row7_col1" style=" background: #e8f466; color: black;" class="data row7 col1" >0.16</td>
      <td id="T_1ced8_row7_col2" style=" background: #c2e066; color: black;" class="data row7 col2" >0.26</td>
      <td id="T_1ced8_row7_col3" class="data row7 col3" >0.61</td>
      <td id="T_1ced8_row7_col4" class="data row7 col4" >0.21</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row8" class="row_heading level0 row8" >Kevin Byard</th>
      <th id="T_1ced8_level1_row8" class="row_heading level1 row8" >FS</th>
      <td id="T_1ced8_row8_col0" class="data row8 col0" >337</td>
      <td id="T_1ced8_row8_col1" style=" background: #fdfe66; color: black;" class="data row8 col1" >0.15</td>
      <td id="T_1ced8_row8_col2" style=" background: #ffff66; color: black;" class="data row8 col2" >0.23</td>
      <td id="T_1ced8_row8_col3" class="data row8 col3" >0.68</td>
      <td id="T_1ced8_row8_col4" class="data row8 col4" >0.09</td>
    </tr>
    <tr>
      <th id="T_1ced8_level0_row9" class="row_heading level0 row9" >Jalen Pitre</th>
      <th id="T_1ced8_level1_row9" class="row_heading level1 row9" >FS</th>
      <td id="T_1ced8_row9_col0" class="data row9 col0" >338</td>
      <td id="T_1ced8_row9_col1" style=" background: #ffff66; color: black;" class="data row9 col1" >0.15</td>
      <td id="T_1ced8_row9_col2" style=" background: #008066; color: white;" class="data row9 col2" >0.36</td>
      <td id="T_1ced8_row9_col3" class="data row9 col3" >0.43</td>
      <td id="T_1ced8_row9_col4" class="data row9 col4" >0.35</td>
    </tr>
  </tbody>
</table>

<br>


<table id="T_dd17b">
  <caption style="caption-side:bottom">Table 2: Top 10 linebackers, ranked by tackles per play</caption>
  <thead>
    <tr>
      <th class="index_name level0" >Player</th>
      <th class="index_name level1" >Position</th>
      <th id="T_dd17b_level0_col0" class="col_heading level0 col0" >Active Plays</th>
      <th id="T_dd17b_level0_col1" class="col_heading level0 col1" >Tackles per Play</th>
      <th id="T_dd17b_level0_col2" class="col_heading level0 col2" >Tackle Opportunity Rate</th>
      <th id="T_dd17b_level0_col3" class="col_heading level0 col3" >Tackle Conversion Rate</th>
      <th id="T_dd17b_level0_col4" class="col_heading level0 col4" >Missed Opportunity Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th id="T_dd17b_level0_row0" class="row_heading level0 row0" >Alex Singleton</th>
      <th id="T_dd17b_level1_row0" class="row_heading level1 row0" >OLB</th>
      <td id="T_dd17b_row0_col0" class="data row0 col0" >206</td>
      <td id="T_dd17b_row0_col1" style=" background: #008066; color: white;" class="data row0 col1" >0.27</td>
      <td id="T_dd17b_row0_col2" style=" background: #008066; color: white;" class="data row0 col2" >0.48</td>
      <td id="T_dd17b_row0_col3" class="data row0 col3" >0.57</td>
      <td id="T_dd17b_row0_col4" class="data row0 col4" >0.21</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row1" class="row_heading level0 row1" >Cole Holcomb</th>
      <th id="T_dd17b_level1_row1" class="row_heading level1 row1" >ILB</th>
      <td id="T_dd17b_row1_col0" class="data row1 col0" >274</td>
      <td id="T_dd17b_row1_col1" style=" background: #7dbe66; color: white;" class="data row1 col1" >0.24</td>
      <td id="T_dd17b_row1_col2" style=" background: #6db666; color: white;" class="data row1 col2" >0.43</td>
      <td id="T_dd17b_row1_col3" class="data row1 col3" >0.56</td>
      <td id="T_dd17b_row1_col4" class="data row1 col4" >0.24</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row2" class="row_heading level0 row2" >Divine Deablo</th>
      <th id="T_dd17b_level1_row2" class="row_heading level1 row2" >OLB</th>
      <td id="T_dd17b_row2_col0" class="data row2 col0" >326</td>
      <td id="T_dd17b_row2_col1" style=" background: #cee666; color: black;" class="data row2 col1" >0.22</td>
      <td id="T_dd17b_row2_col2" style=" background: #e0f066; color: black;" class="data row2 col2" >0.38</td>
      <td id="T_dd17b_row2_col3" class="data row2 col3" >0.58</td>
      <td id="T_dd17b_row2_col4" class="data row2 col4" >0.21</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row3" class="row_heading level0 row3" >T.J. Edwards</th>
      <th id="T_dd17b_level1_row3" class="row_heading level1 row3" >ILB</th>
      <td id="T_dd17b_row3_col0" class="data row3 col0" >332</td>
      <td id="T_dd17b_row3_col1" style=" background: #deee66; color: black;" class="data row3 col1" >0.22</td>
      <td id="T_dd17b_row3_col2" style=" background: #ffff66; color: black;" class="data row3 col2" >0.37</td>
      <td id="T_dd17b_row3_col3" class="data row3 col3" >0.59</td>
      <td id="T_dd17b_row3_col4" class="data row3 col4" >0.23</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row4" class="row_heading level0 row4" >Tremaine Edmunds</th>
      <th id="T_dd17b_level1_row4" class="row_heading level1 row4" >ILB</th>
      <td id="T_dd17b_row4_col0" class="data row4 col0" >280</td>
      <td id="T_dd17b_row4_col1" style=" background: #e9f466; color: black;" class="data row4 col1" >0.21</td>
      <td id="T_dd17b_row4_col2" style=" background: #afd766; color: black;" class="data row4 col2" >0.40</td>
      <td id="T_dd17b_row4_col3" class="data row4 col3" >0.53</td>
      <td id="T_dd17b_row4_col4" class="data row4 col4" >0.17</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row5" class="row_heading level0 row5" >Jordan Hicks</th>
      <th id="T_dd17b_level1_row5" class="row_heading level1 row5" >ILB</th>
      <td id="T_dd17b_row5_col0" class="data row5 col0" >318</td>
      <td id="T_dd17b_row5_col1" style=" background: #eaf466; color: black;" class="data row5 col1" >0.21</td>
      <td id="T_dd17b_row5_col2" style=" background: #dfef66; color: black;" class="data row5 col2" >0.38</td>
      <td id="T_dd17b_row5_col3" class="data row5 col3" >0.56</td>
      <td id="T_dd17b_row5_col4" class="data row5 col4" >0.20</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row6" class="row_heading level0 row6" >Foyesade Oluokun</th>
      <th id="T_dd17b_level1_row6" class="row_heading level1 row6" >ILB</th>
      <td id="T_dd17b_row6_col0" class="data row6 col0" >400</td>
      <td id="T_dd17b_row6_col1" style=" background: #f0f866; color: black;" class="data row6 col1" >0.21</td>
      <td id="T_dd17b_row6_col2" style=" background: #f5fa66; color: black;" class="data row6 col2" >0.38</td>
      <td id="T_dd17b_row6_col3" class="data row6 col3" >0.57</td>
      <td id="T_dd17b_row6_col4" class="data row6 col4" >0.25</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row7" class="row_heading level0 row7" >Pete Werner</th>
      <th id="T_dd17b_level1_row7" class="row_heading level1 row7" >OLB</th>
      <td id="T_dd17b_row7_col0" class="data row7 col0" >322</td>
      <td id="T_dd17b_row7_col1" style=" background: #f5fa66; color: black;" class="data row7 col1" >0.21</td>
      <td id="T_dd17b_row7_col2" style=" background: #bede66; color: black;" class="data row7 col2" >0.40</td>
      <td id="T_dd17b_row7_col3" class="data row7 col3" >0.53</td>
      <td id="T_dd17b_row7_col4" class="data row7 col4" >0.27</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row8" class="row_heading level0 row8" >Bobby Okereke</th>
      <th id="T_dd17b_level1_row8" class="row_heading level1 row8" >ILB</th>
      <td id="T_dd17b_row8_col0" class="data row8 col0" >333</td>
      <td id="T_dd17b_row8_col1" style=" background: #f9fc66; color: black;" class="data row8 col1" >0.21</td>
      <td id="T_dd17b_row8_col2" style=" background: #70b866; color: white;" class="data row8 col2" >0.43</td>
      <td id="T_dd17b_row8_col3" class="data row8 col3" >0.49</td>
      <td id="T_dd17b_row8_col4" class="data row8 col4" >0.21</td>
    </tr>
    <tr>
      <th id="T_dd17b_level0_row9" class="row_heading level0 row9" >C.J. Mosley</th>
      <th id="T_dd17b_level1_row9" class="row_heading level1 row9" >ILB</th>
      <td id="T_dd17b_row9_col0" class="data row9 col0" >393</td>
      <td id="T_dd17b_row9_col1" style=" background: #ffff66; color: black;" class="data row9 col1" >0.21</td>
      <td id="T_dd17b_row9_col2" style=" background: #7bbd66; color: white;" class="data row9 col2" >0.42</td>
      <td id="T_dd17b_row9_col3" class="data row9 col3" >0.49</td>
      <td id="T_dd17b_row9_col4" class="data row9 col4" >0.28</td>
    </tr>
  </tbody>
</table>

One main drawback of total tackle count is that it does not penalize a player who generates a tackle opportunity but lets the ball carrier escape without attempting a tackle (see Kaden Elliss (#55) in Figure 5). Our third tackle metric, missed opportunity rate, captures this class of defensive mistake. We can now visualize how many players are generating opportunities but letting them slip away. We recommend using tackle opportunity rate in combination with missed opportunity rate to evaluate individual defender performance. Figure 7 highlights the elite and subpar tacklers in each position group.

<br>
<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/tackle_rate_rating.jpg" width="1000"/></center>
<center>Figure 7. Missed Tackle Opportunity Rate vs. Tackle Opportunity Rate for (left) the secondary and (right) linebackers. The vertical and horizontal lines run through the means of each axis.</center>

# Conclusions
Tackle opportunity, missed tackle opportunity, and tackle conversions are metrics that provide new insight into the tackling process. 
Total tackle count can be decomposed into tackle opportunities and tackle conversion rate, which provide a clearer picture of how defenders are earning their tackles. Missed tackle opportunities are an entirely new class of defensive mistake, which is not captured at all by current metrics. We use missed tackle opportunity rate in conjunction with tackle opportunity rate to evaluate tacklers. 

---

Word Count: 1994

All code is available at this **<a href="https://github.com/mpchang/tackle-probability-and-opportunity/">GitHub repo</a>**

# Acknowledgements
We would like to thank Michael Lopez, Thompson Bliss, and the NFL staff members from each of the teams that provided the data and made this competition possible. 



# Appendix

## A. Model Selection and Training Process

Three different model architectures were evaluated for performance using negative log loss, F1 score, and Area Under Curve (AUC) as evaluation criteria: a convolutional neural network, a multilayer perceptron (MLP) network, and an XGBoost model. Of these, the MLP and the XGBoost models performed the best and nearly identically. However, once out-of-bounds tackling data was added into the dataset, the XGBoost model performed the best. It appears that because out-of-bounds tackling is different in nature compared to middle-of-the-field tackling, the MLP model struggled to generalize compared to the XGBoost model. 

We evaluated different input features and hyperparameters with a random search under cross-validation, which yielded these optimum hyperparameters: 
- n_estimators (number of trees) = 250
- max_depth = 7
- eta (learning rate) = 0.1
- subsample = 0.75

In addition, to help compensate for the imbalanced dataset (8000 made tackles vs 1583 missed tackles), *scale_pos_weight* was set to 0.20. *reg_lambda* (L2 regularization) was set to 150 to combat overfitting, which was particularly detrimental to out-of-bounds tackle plays. 
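
Collected in one place, these settings could look like the dictionary below. It is a sketch rather than a copy of our training script: the helper name is hypothetical, `learning_rate` is the scikit-learn-style name for eta, the `binary:logistic` objective is an assumption consistent with a binary classifier, and the class weight is derived from the example counts on the assumption that made tackles are the positive class.

```python
def build_xgb_params(n_made, n_missed):
    """Hyperparameters from this appendix as a keyword dictionary."""
    return {
        "n_estimators": 250,    # number of trees
        "max_depth": 7,
        "learning_rate": 0.1,   # eta
        "subsample": 0.75,
        # Ratio of negative to positive examples, assuming made tackles
        # are the positive class: 1583 / 8000 is roughly 0.20.
        "scale_pos_weight": n_missed / n_made,
        "reg_lambda": 150,      # L2 regularization
        "objective": "binary:logistic",  # assumption, not stated in the text
    }


params = build_xgb_params(n_made=8000, n_missed=1583)
# With xgboost installed, these could be passed as xgboost.XGBClassifier(**params).
```

A weight below 1 down-weights the majority class, which is why the value is well under 1 here.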

We also experimented with relative weight, relative orientation, and defender age as features, but these did not contribute much to model performance and were dropped. 

## B. Model Interpretation

To understand how the model is using the input features to make predictions, we turn to SHAP (SHapley Additive exPlanations) values. SHAP values are a measure of how a feature's value affects the output value, and they can be measured on a single prediction as well as averaged over many predictions to produce a global measure of feature importance (<a href="https://proceedings.neurips.cc/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf">Lundberg et al, 2017</a>). 

The SHAP values of our features are summarized in Figure A1 for the entire dataset. Each training example adds a datapoint to each feature row. A more positive SHAP value means that feature made it more likely that a tackle would be made, and vice versa. Multiple data points with the same SHAP value stack vertically, providing an indication of density. 

The model appears to be learning feature dependencies that match human intuition. For example, Euclidean distance is the strongest predictor of a tackle: when the tackler is close to the ball carrier (a low Euclidean distance), they are more likely to make the tackle (a higher SHAP value). A low blocker influence means that defenders do not have to contend with blockers, resulting in a higher tackle probability (a higher SHAP value). Conversely, whether the play is a run or a pass has little influence on the tackle probability, which also makes sense: how the play started should not have much impact on the probability of a tackle in the last second. The ball carrier speed importance appears counterintuitive, but this is a result of how the model was trained, namely, to predict the probability of a tackle happening in the *next second*. There is a subset of plays where the defender is in front of the ball carrier, and the tackle takes place *in the next second* only because the ball carrier is fast enough to reach the defender. In general, we observe that as we increase the time to prediction beyond 1 second, the SHAP values of the speed features increase. 

<center><img src="https://raw.githubusercontent.com/mpchang/tackle-probability-and-opportunity/main/figures/model_interpretation.jpg" width="800"/></center>
<center>Fig. A1. SHAP values of every feature in our training set, in order from most important (top) to least important (bottom). </center>