# A Biomimetic Approach to the NFL Big Data Bowl: Charting the Future of NFL Secondaries with Swarm Intelligence and PASTA 🍝

## Introduction

--DeMeco picture here left pane, animation right pane--

**DeMeco Ryans**, recognized as the 2022 Assistant Coach of the Year for his role as the 49ers Defensive Coordinator, has championed the concept of swarming to the ball. This philosophy, which [continues to resonate](https://www.houstontexans.com/video/head-coach-demeco-ryans-we-talked-about-a-swarm-mentality-and-that-s-what-we-did) with Ryans in his current position as head coach of the Houston Texans, emphasizes the importance of players converging on a ball carrier with intensity.

**Guiding Question:** How can we quantify defensive secondaries' success in limiting yards after catch (YAC) via swarm tackling?

Leveraging advanced analytics, we introduce a new metric called **PASTA (Path Analysis via Swarm-Tackle Accuracy)** that aims to answer that question using swarm intelligence.

-- correlation of PASTA vs. EPA from YAC--

# Mathematical Representation of PSO in Football Context

PASTA is grounded in Particle Swarm Optimization (PSO)$^1$, an algorithm inspired by the coordinated behavior of natural swarms such as flocks of birds and schools of fish (this is what makes our approach biomimetic). PSO solves complex optimization problems by iteratively improving a population of candidate solutions, which makes it well suited to modeling coordinated group movement. In a football context, each player, akin to a member of a swarm, adjusts their position and strategy relative to others, guided by simple yet effective rules. The collective aim in this case is to minimize the distance to the ball carrier in as little time as possible, mirroring the foraging or hunting strategies of natural swarms.

PSO’s mathematical framework involves updating each 'particle's' (player's) velocity and position, considering both their **personal best moves** and the team's **collective best strategies**. The key equations are:

1. **Velocity Update Equation:**
   $$ V_{ij}(t+1) = w V_{ij}(t) + c_1 r_1 (P_{ij}(t) - X_{ij}(t)) + c_2 r_2 (P_{gj}(t) - X_{ij}(t))$$
   
This equation reflects how a player (particle) adjusts their movement speed (velocity) based on their current momentum (inertia), personal past successful maneuvers (cognitive component), and the team's overall best past maneuvers (social component). 

2. **Position Update Equation:**
   $$ X_{ij}(t+1) = X_{ij}(t) + V_{ij}(t+1) $$

This equation determines the player's new position on the field, based on their updated velocity.

Where:
- $ X_{ij}(t) $ represents the position of player $ i $ in dimension $ j $ (e.g., the X or Y axis on the field) at time $ t $.
- $ V_{ij}(t) $ denotes the velocity of player $ i $ in dimension $ j $ at time $ t $.
- $ w $ is the inertia weight, balancing exploration (searching new areas) and exploitation (refining current strategies).
- $ c_1 $ and $ c_2 $ are acceleration constants that regulate the importance of personal vs. collective experience.
- $ r_1 $ and $ r_2 $ are random factors, typically drawn uniformly from $ [0, 1] $, that introduce variability and unpredictability in movements.
- $ P_{ij}(t) $ is the personal best position achieved by player $ i $ in dimension $ j $ up to time $ t $.
- $ P_{gj}(t) $ is the best position achieved by any player in dimension $ j $ up to time $ t $.
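
To make the two update equations concrete, here is a minimal Python sketch of a single PSO step for a group of players. It is an illustration under stated assumptions rather than our production code: the function name `pso_step`, the array layout, and the default values of `w`, `c1`, and `c2` are all hypothetical, and the random factors are drawn independently for every player and dimension, which is one common PSO variant.

```python
import numpy as np

def pso_step(X, V, P_best, G_best, w=0.7, c1=1.5, c2=1.5, rng=None):
    """Apply one velocity and position update to every particle (player).

    X, V, P_best: arrays of shape (n_players, 2) holding x/y coordinates.
    G_best: array of shape (2,), the best position found by any player.
    w, c1, c2: illustrative defaults (assumed, not tuned values).
    """
    rng = np.random.default_rng() if rng is None else rng
    # r1, r2 ~ Uniform[0, 1), drawn per player and per dimension (assumption).
    r1 = rng.random(X.shape)
    r2 = rng.random(X.shape)
    # Velocity update: inertia + cognitive pull + social pull.
    V_new = w * V + c1 * r1 * (P_best - X) + c2 * r2 * (G_best - X)
    # Position update: move by the new velocity.
    X_new = X + V_new
    return X_new, V_new
```

Setting `c1 = c2 = 0` reduces the step to pure inertia, which is a quick way to sanity-check the implementation before adding the random terms back in.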

## Attacking the Ball: Adapting PSO to Football
### 1. Target Selection 
Let $ \vec{P}_i $ be the position vector of the $ i $-th defensive player, and $ \vec{T}_j $ be the position vector of the $ j $-th potential tackle point along the ball carrier's path. Define the minimum velocity requirement as $ \vec{V}_{\text{min}} = (V_{\text{min}_x}, V_{\text{min}_y}) $. The objective is to find the tackle point $ \vec{T}_{\text{best}} $ that minimizes the total score $ S $, subject to velocity and angle constraints.

The total score $ S $ for a tackle point $ \vec{T}_j $ is given by:

$$
S(\vec{T}_j) = \sum_{i=1}^{N} \left( w_d \cdot d(\vec{P}_i, \vec{T}_j) + w_{\theta} \cdot |\theta(\vec{P}_i, \vec{T}_j)| \right)
$$

where:
- $ d(\vec{P}_i, \vec{T}_j) = \left\| \vec{T}_j - \vec{P}_i \right\| $ is the Euclidean distance between defensive player $ i $ and tackle point $ j $.
- $ \theta(\vec{P}_i, \vec{T}_j) $ is the direction angle of the vector from defensive player $ i $ to tackle point $ j $, measured from the field's x-axis and calculated as $ \arctan2(T_{jy} - P_{iy}, T_{jx} - P_{ix}) $, with the constraint that $ -\theta_{\text{max}} \leq \theta \leq \theta_{\text{max}} $.
- $ w_d $ and $ w_{\theta} $ are weighting factors for distance and angle, respectively.

The velocity constraint for each defensive player-tackle point pair, with the index $ j $ also serving as the number of time steps available to reach $ \vec{T}_j $, is:

$$
|T_{jx} - P_{ix}| \leq V_{\text{min}_x} \cdot j \quad \text{and} \quad |T_{jy} - P_{iy}| \leq V_{\text{min}_y} \cdot j
$$

If this constraint is violated, the score for that defensive player-tackle point pair is set to infinity.

Finally, the best tackle point is chosen as:

$$
\vec{T}_{\text{best}} = \underset{\vec{T}_j}{\mathrm{argmin}}\, S(\vec{T}_j)
$$
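
The target-selection rule above can be sketched in a few lines of Python. This is a hedged illustration, not the exact implementation: the function name `best_tackle_point`, the default weights, and the 1-based step index are assumptions, and we additionally assume that a pair violating the angle bound is treated like a velocity violation (infinite score), which the text does not specify.

```python
import numpy as np

def best_tackle_point(defenders, tackle_points, v_min,
                      w_d=1.0, w_theta=1.0, theta_max=np.pi):
    """Return (index, score) of the tackle point with the lowest total score S.

    defenders: (N, 2) positions; tackle_points: (M, 2) positions ordered
    along the ball carrier's path; v_min: (2,) per-step reach in x and y.
    """
    defenders = np.asarray(defenders, dtype=float)
    tackle_points = np.asarray(tackle_points, dtype=float)
    v_min = np.asarray(v_min, dtype=float)
    scores = np.full(len(tackle_points), np.inf)
    for idx, T in enumerate(tackle_points):
        j = idx + 1  # 1-based step index (assumed convention)
        delta = T - defenders                         # (N, 2) offsets
        dist = np.linalg.norm(delta, axis=1)          # Euclidean distance
        theta = np.arctan2(delta[:, 1], delta[:, 0])  # direction angle
        reachable = np.all(np.abs(delta) <= v_min * j, axis=1)
        feasible = reachable & (np.abs(theta) <= theta_max)
        # Infeasible pairs get an infinite score, as described above.
        pair = np.where(feasible, w_d * dist + w_theta * np.abs(theta), np.inf)
        scores[idx] = pair.sum()
    best = int(np.argmin(scores))
    return best, float(scores[best])
```

Because one infeasible pair makes the whole sum infinite, a tackle point is only eligible when every defender can reach it. If no point is eligible, the function returns index 0 with an infinite score, so callers should check for that case.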


### 2. Particle Swarm Optimization
![image info](images/PSO_Basics.png)
### 3. Fréchet Distance

The Fréchet distance measures the similarity between two curves, which in our context are a defender's actual path and the PSO path. Let $ \mathcal{P} $ and $ \mathcal{Q} $ represent the PSO-generated optimal path and the actual path of a defensive back after the completion, respectively, parameterized by continuous variables. The Fréchet distance between $ \mathcal{P} $ and $ \mathcal{Q} $ is defined as the minimum "leash length" required to connect a point moving along $ \mathcal{P} $ and another point moving along $ \mathcal{Q} $, such that both points traverse their respective paths from start to finish without backtracking.

Formally, the Fréchet distance $ F(\mathcal{P}, \mathcal{Q}) $ can be defined as:

$$
F(\mathcal{P}, \mathcal{Q}) = \inf_{\alpha, \beta}\max_{t \in [0,1]} \left\| \mathcal{P}(\alpha(t)) - \mathcal{Q}(\beta(t)) \right\|
$$

Here, $ \alpha(t) $ and $ \beta(t) $ are continuous non-decreasing functions that map the interval $ [0,1] $ onto the parameter domains of $ \mathcal{P} $ and $ \mathcal{Q} $, respectively. For a given pair of mappings, the separation at time $ t $ is the Euclidean distance $ \left\| \mathcal{P}(\alpha(t)) - \mathcal{Q}(\beta(t)) \right\| $. The Fréchet distance is the infimum, over all possible mappings $ \alpha $ and $ \beta $, of the largest separation reached along the way.

In a football game, this metric can be useful for analyzing how closely a defensive player is able to mirror the movements of the PSO path, potentially revealing insights into defensive strategies and player effectiveness.
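
Since player paths are available only as sampled positions, a practical stand-in for the continuous definition is the discrete Fréchet distance, computed with the standard dynamic program over pairs of sample points. The sketch below is an illustration and an approximation of the continuous quantity; the function name `discrete_frechet` is hypothetical, and it is not necessarily the routine used to produce our results.

```python
import numpy as np

def discrete_frechet(P, Q):
    """Discrete Fréchet distance between two sampled paths.

    P: (n, 2) and Q: (m, 2) arrays of x/y positions in path order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    n, m = len(P), len(Q)
    # D[i, j] is the Euclidean distance between P[i] and Q[j].
    D = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    # ca[i, j] is the shortest leash that lets both walkers reach (i, j).
    ca = np.empty((n, m))
    ca[0, 0] = D[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], D[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], D[0, j])
    for i in range(1, n):
        for j in range(1, m):
            # Either walker (or both) advances one sample; neither goes back.
            prev = min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1])
            ca[i, j] = max(prev, D[i, j])
    return float(ca[-1, -1])
```

For example, two parallel straight paths one yard apart, such as `[(0, 0), (1, 0), (2, 0)]` and `[(0, 1), (1, 1), (2, 1)]`, have a discrete Fréchet distance of 1.0 yard.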


### 4. PASTA Calculation

We use the Fréchet distance in yards as the denominator to quantify the difference between the defensive back's actual path and the optimal path to the ball carrier (as determined by the PSO algorithm). The numerator is based on how many yards after catch (YAC) were given up versus expected$^2$.

The quotient of these two becomes PASTA. A higher PASTA value could indicate that the DB's close proximity to the optimal path resulted in YAC under expected.

$$ \text{Yards After Catch Under Expected (YUX)} = \text{Expected Yards After Catch} - \text{Actual Yards After Catch} $$

$$ \text{PASTA} = \frac{\text{YUX}}{\text{Fréchet distance per player}} $$

Using the Fréchet distance per player normalizes for the varying number of defensive backs across plays. Note: this metric quantifies the secondary's ability to swarm to the ball carrier **after** the ball is caught; it does not consider the DBs' positioning prior to this point.
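
As a worked illustration of the two formulas, the following Python sketch computes PASTA for a single play. The function name `pasta` and the example numbers are hypothetical, and we assume here that "Fréchet distance per player" means the mean Fréchet distance across the defensive backs on the play.

```python
def pasta(expected_yac, actual_yac, frechet_distances):
    """Compute PASTA for one play.

    expected_yac, actual_yac: yards after catch, expected and actual.
    frechet_distances: one Fréchet distance (in yards) per defensive back.
    """
    if not frechet_distances:
        raise ValueError("need at least one defensive back")
    # Yards After Catch Under Expected (YUX).
    yux = expected_yac - actual_yac
    # Mean distance per defensive back (assumed meaning of "per player").
    per_player = sum(frechet_distances) / len(frechet_distances)
    if per_player == 0:
        raise ValueError("Fréchet distance per player is zero")
    return yux / per_player

# Made-up play: 6 expected YAC, 4 allowed, three DBs off the optimal
# path by 2, 4, and 6 yards.
example = pasta(6.0, 4.0, [2.0, 4.0, 6.0])  # 2.0 / 4.0 = 0.5
```

In this made-up play the secondary allowed 2 yards fewer than expected while averaging 4 yards of deviation from the optimal path, giving a PASTA of 0.5.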

## Total Fréchet Distance vs. YAC of Teams Across All Plays and All Games


[diagram showing team totals correlation]

As you can see, DeMeco Ryans's 49ers are top tier at sticking to the optimal path. For more data exploration, check out our EDA notebook here [link needed]. This chart was the main inspiration for using Ben Baldwin's xYAC model to calculate a metric called PASTA.

## 2022 PASTA Rankings by Team and Player

[diagrams]

Captions with some further analysis

## Discussion and Applications
PASTA offers a wide range of applications for the NFL as a whole, individual teams, the media, and fans.

**Player Evaluation:**
- **Unexpected YAC Assignment:** A large PASTA value could indicate that the DB's close proximity to the optimal path produced YAC under expected, while a small or negative PASTA value could indicate suboptimal pursuit of the ball carrier.
- **Synergy in the Secondary:** Locating secondary units or player groupings that work better together to limit YAC.

**Coaching Applications:**
- **Teachable Moments:** Identifying plays with significant deviations from optimal paths and using them in film study. 
- **Reinforcement of Successful Strategies:** Highlighting plays where the secondary closely followed the optimal path, or where everyone hustled to recover after straying from the optimal path early.

**Broadcasting and Rankings:**
- **On-field Graphics:** Demonstrating defensive prowess by showing each defender's proximity to a "Biomimetic" or "AI" path in an on-field graphic.
- **Player and Defense Rankings:** Showcasing how talented players and teams align with optimal paths.

## Conclusion
PASTA, drawing inspiration from natural efficiency and tailored for football, is poised to revolutionize NFL defensive strategies. Its capacity to assess individual players and units, particularly in expansive coverage, positions it as a versatile and potent tool in strategic defense planning. This metric provides not just tactical insights for coaches but also engaging analytical content for fans and broadcasters.

# Citations
1. J. Kennedy and R. Eberhart, "Particle swarm optimization," Proceedings of ICNN'95 - International Conference on Neural Networks, Perth, WA, Australia, 1995, pp. 1942-1948 vol.4, doi: 10.1109/ICNN.1995.488968.
2. https://opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/