# A Biomimetic Approach to the NFL Big Data Bowl: Charting the Future of NFL Defense with Swarm Intelligence and PASTA 🍝

## Introduction
> #49ers defense has a "Swarm Player Of The Day" award for , as DC DeMeco Ryans says, "whoever attacking the ball, got turnovers or made special effort on certain plays."
> LB Fred Warner was leader in clubhouse through camp. "Fred had a lot, Bosa had a lot, Kerry Hyder, they top 3"
>
> -- [Cam Inman --> Original Tweet from 2022](https://x.com/CamInman/status/1564349536499220480?s=20)


**Guiding Question:** Do defenses that limit yards after catch (YAC) embody the principles of swarming to the ball when making a tackle? 

**DeMeco Ryans**, recognized as the 2022 Assistant Coach of the Year for his role as the 49ers Defensive Coordinator, has championed the concept of swarming to the ball. This philosophy, which [continues to resonate](https://www.houstontexans.com/video/head-coach-demeco-ryans-we-talked-about-a-swarm-mentality-and-that-s-what-we-did) with Ryans in his current position as head coach of the Houston Texans, emphasizes the importance of players converging on a ball carrier with intensity. 

In the tweet, DeMeco Ryans reveals a few principles of what it means to **SWARM**:

- Attacking the Ball
- Turnovers
- Special Effort

Leveraging advanced analytics, we introduce a new metric called **PASTA (Path Analysis via Swarm-Tracked Accuracy)** that aims to quantify how efficiently a defense collectively swarms to a common target. We will use DeMeco Ryans's aspects of SWARM to guide us on this journey.

We are confident our underlying methodology offers broad applicability. The flexibility of our framework allows for tailoring to the diverse needs of coaches, players, fans, and media. This adaptability forms one of the strengths of our approach, enabling stakeholders to modify the guiding questions to suit their unique perspectives and objectives.


## Attacking the Ball: Adapting PSO to Football
PASTA is grounded in Particle Swarm Optimization (PSO)[citation needed], an algorithm inspired by the coordinated behavior of natural swarms such as flocks of birds and schools of fish. PSO searches for good solutions to complex problems by iteratively improving a population of candidate solutions, each of which learns from its own history and from the rest of the swarm. In a football context, each player, akin to a member of a swarm, adjusts their position and movement relative to others, guided by simple yet effective rules. The collective aim in this case is to minimize the distance to the ball carrier efficiently while avoiding blockers (more on that later), mirroring the foraging or hunting strategies of natural swarms.

We initially dubbed our implementation "Adapted Particle Swarm Optimization with Time Weighting and Obstacle Avoidance," or APSOTWOA. Not catchy. PASTA is better.

**Mathematical Representation of PSO in Football Context:**

PSO’s mathematical framework involves updating each 'particle's' (player's) velocity and position, considering both their personal best moves and the team's collective best strategies. The key equations are:

1. **Velocity Update Equation:**
   $$ V_{ij}(t+1) = w V_{ij}(t) + c_1 r_1 (P_{ij}(t) - X_{ij}(t)) + c_2 r_2 (P_{gj}(t) - X_{ij}(t))$$

   > Every player in the swarm has a velocity capped at the top speed of the ball carrier. This is not completely realistic. A better approach would be to set a cap based on the player's combine metrics, max speed on that play, or some other player-specific measure. We will address this in future iterations.
   
This equation reflects how a player (particle) adjusts their movement speed (velocity) based on their current momentum (inertia), personal past successful maneuvers (cognitive component), and the team's overall best past maneuvers (social component). 

2. **Position Update Equation:**
   $$ X_{ij}(t+1) = X_{ij}(t) + V_{ij}(t+1) $$

This equation determines the player's new position on the field, based on their updated velocity.

Where:
- $ X_{ij}(t) $ represents the position of player $ i $ in aspect $ j $ (e.g., X or Y axis on the field) at time $ t $.
- $ V_{ij}(t) $ denotes the velocity of player $ i $ in aspect $ j $ at time $ t $.
- $ w $ is the inertia weight, balancing exploration (searching new areas) and exploitation (refining current strategies).
- $ c_1 $ and $ c_2 $ are acceleration constants that regulate the importance of personal vs. collective experience.
- $ r_1 $ and $ r_2 $ are random factors introducing variability and unpredictability in movements.
- $ P_{ij}(t) $ is the personal best position achieved by player $ i $ in aspect $ j $ until time $ t $.
- $ P_{gj}(t) $ is the best position achieved by any player in aspect $ j $ until time $ t $.

At every iteration (**the number of iterations matches the number of frames in a play**), the particles in the swarm are rewarded or penalized according to how well their personal best and global best positions satisfy the objective. In our use case, the objective is to minimize distance to a predetermined target along the ball carrier's path, but as we discuss in later sections, this objective function can be modified to meet a wide array of football-related demands on a week-to-week basis.
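To make the two update equations concrete, here is a minimal sketch of one PSO iteration with the speed cap described above. It is not our production code: the function names (`pso_step`, `update_bests`), the parameter values (`w=0.7`, `c1=1.5`, `c2=1.5`, `v_max=1.0` yards per frame), the starting positions, and the starting velocities are all placeholders chosen for illustration.

```python
import math
import random


def pso_step(X, V, P, G, w, c1, c2, v_max, rng):
    """Apply the velocity and position updates once to every player.

    X, V, P are lists of [x, y] pairs (position, velocity, personal best);
    G is the global best [x, y]. Returns the new (X, V).
    """
    new_X, new_V = [], []
    for x_i, v_i, p_i in zip(X, V, P):
        v_next = []
        for j in range(2):  # j = 0 is the x axis, j = 1 is the y axis
            r1, r2 = rng.random(), rng.random()
            v_next.append(
                w * v_i[j]                      # inertia
                + c1 * r1 * (p_i[j] - x_i[j])   # cognitive pull
                + c2 * r2 * (G[j] - x_i[j])     # social pull
            )
        # Cap speed at v_max (yards per frame), e.g. the ball carrier's top speed.
        speed = math.hypot(v_next[0], v_next[1])
        if speed > v_max:
            v_next = [c * v_max / speed for c in v_next]
        new_V.append(v_next)
        new_X.append([x_i[0] + v_next[0], x_i[1] + v_next[1]])
    return new_X, new_V


def update_bests(X, P, G, target):
    """Keep the positions closest to the target as personal and global bests."""
    P = [x if math.dist(x, target) < math.dist(p, target) else p
         for x, p in zip(X, P)]
    G = min(P + [G], key=lambda pos: math.dist(pos, target))
    return P, G


rng = random.Random(0)
target = [30.0, 26.0]                           # fixed common target (yards)
X = [[10.0, 20.0], [15.0, 30.0], [12.0, 25.0]]  # placeholder defender positions
V = [[0.5, 0.2], [0.4, -0.3], [0.6, 0.1]]       # placeholder starting velocities
P = [list(x) for x in X]
G = min(P, key=lambda pos: math.dist(pos, target))

for _ in range(25):                             # one iteration per frame
    X, V = pso_step(X, V, P, G, w=0.7, c1=1.5, c2=1.5, v_max=1.0, rng=rng)
    P, G = update_bests(X, P, G, target)
```

Because the global best is only replaced when a player reaches a position closer to the target, its distance to the target never increases from one frame to the next.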

### Additional Adaptations for Football Context
We adapt the typical PSO algorithm to the available data by adding the following elements:
1. **Target Selection:** Our algorithm first selects the best common target for the particles in the swarm to move toward.
> Target selection currently does not account for the time it takes a player to cover the distance to the target location. As it stands, our algorithm only performs realistically on plays where the target location is toward the end of the play, or where the player is able to cover the required distance in the given number of frame iterations. In short, the quality of the common target decays over time.

2. **Obstacle Avoidance:** Our algorithm treats other players on the field as obstacles so that optimal paths are not drawn through other players on the way to the target.
3. **Time-Weighting:** Our algorithm considers the future positions of obstacles and the ball carrier when calculating the optimal path, much as an actual defensive player anticipates where the other players on the field will be.
[Tims diagram here]
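One way to picture how obstacle avoidance and time-weighting can enter the objective is the sketch below. It is an illustration under stated assumptions, not our exact objective function: future positions are projected by straight-line extrapolation, obstacles add a linear penalty inside a fixed radius, and the names `project` and `objective` and the defaults `frames_ahead`, `radius`, and `penalty_weight` are all hypothetical.

```python
import math


def project(pos, vel, frames_ahead):
    """Linearly extrapolate a position `frames_ahead` frames into the future."""
    return (pos[0] + vel[0] * frames_ahead, pos[1] + vel[1] * frames_ahead)


def objective(candidate, target, obstacles, frames_ahead=5,
              radius=1.0, penalty_weight=10.0):
    """Cost of a candidate defender position; lower is better.

    `target` and each obstacle are (position, velocity) pairs with velocity
    in yards per frame. All default values are illustrative placeholders.
    """
    # Time-weighting (assumed form): aim at where the target will be.
    cost = math.dist(candidate, project(*target, frames_ahead))
    for obstacle in obstacles:
        gap = math.dist(candidate, project(*obstacle, frames_ahead))
        if gap < radius:
            # Obstacle avoidance (assumed form): the penalty grows as the
            # candidate moves deeper into the obstacle's radius.
            cost += penalty_weight * (radius - gap)
    return cost


ball_carrier = ((30.0, 26.0), (0.5, 0.0))   # placeholder position and velocity
blocker = ((24.0, 26.0), (0.2, 0.0))        # placeholder position and velocity

blocked = objective((25.0, 26.0), ball_carrier, [blocker])  # runs into the blocker
clear = objective((25.0, 29.0), ball_carrier, [blocker])    # steps around the blocker
```

In this toy setup the position that steps around the blocker scores a lower cost than the one that runs straight into it, even though it is slightly farther from the projected target.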

We process the tracking data frame by frame. For each frame in the dataset, the adapted PSO algorithm calculates a new, "swarm optimal" set of x and y coordinates. From this we can build our **swarm paths**.

For a more technical breakdown of the parameters of adaptation of PSO, please see APPENDIX A. Additionally, the full code is available here (github)


## Data Pipeline
For our analysis, we used the NFL tracking data from weeks 1-9 of the 2022 season provided by the competition. We process this data and feed it into our pipeline. Here is a link to a flow chart of our pipeline in its entirety [link to drawio or github]. To answer our guiding question, we only need plays with a completed pass, along with the path and speed of the ball carrier and of each FS, CB, and SS. Once we filter the data, the algorithm analyzes each play individually, frame by frame, and outputs the **swarm paths**.
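The filtering step can be sketched with plain Python records. The field names (`gameId`, `playId`, `nflId`, `frameId`, `ballCarrierId`, `passResult`, `position`, `x`, `y`, `s`) are modeled on the competition files but should be treated as assumptions here, as should the use of `"C"` to mark a completed pass; `select_rows` is a hypothetical helper, and the rows are made-up toy data.

```python
SECONDARY = {"FS", "SS", "CB"}


def select_rows(plays, players, tracking):
    """Keep tracking rows from completed passes for the ball carrier
    and for FS, SS, and CB defenders only."""
    # Map each completed-pass play to its ball carrier.
    completed = {
        (p["gameId"], p["playId"]): p["ballCarrierId"]
        for p in plays if p["passResult"] == "C"
    }
    position = {p["nflId"]: p["position"] for p in players}

    kept = []
    for row in tracking:
        key = (row["gameId"], row["playId"])
        if key not in completed:
            continue
        is_carrier = row["nflId"] == completed[key]
        if is_carrier or position.get(row["nflId"]) in SECONDARY:
            kept.append({**row, "isBallCarrier": is_carrier})

    # Order rows so each player's path can be read off frame by frame.
    kept.sort(key=lambda r: (r["gameId"], r["playId"], r["nflId"], r["frameId"]))
    return kept


plays = [
    {"gameId": 1, "playId": 10, "ballCarrierId": 101, "passResult": "C"},
    {"gameId": 1, "playId": 11, "ballCarrierId": 102, "passResult": None},
]
players = [
    {"nflId": 101, "position": "WR"},
    {"nflId": 102, "position": "RB"},
    {"nflId": 201, "position": "CB"},
    {"nflId": 301, "position": "DE"},
]
tracking = [
    {"gameId": 1, "playId": 10, "nflId": 201, "frameId": 1, "x": 44.0, "y": 22.0, "s": 4.3},
    {"gameId": 1, "playId": 10, "nflId": 101, "frameId": 1, "x": 40.0, "y": 20.0, "s": 5.1},
    {"gameId": 1, "playId": 10, "nflId": 301, "frameId": 1, "x": 38.0, "y": 25.0, "s": 2.0},
    {"gameId": 1, "playId": 11, "nflId": 201, "frameId": 1, "x": 50.0, "y": 30.0, "s": 3.7},
]

rows = select_rows(plays, players, tracking)
```

With this toy input, only the wide receiver and the cornerback from the completed pass survive the filter; the defensive end and the non-pass play are dropped.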

## Analysis of a Play 
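[animation of play here]

[Brief overview of the analysis of this particular play.]

We measure how far each defender strays from their swarm path with the Fréchet distance, which compares two curves while respecting the order of points along each. It is commonly described as the shortest leash that would let a person and a dog each walk their own path from start to finish without backtracking. Here, the two curves are a defender's actual path and that defender's swarm path.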
> Target selection does not update frame by frame. For simplicity's sake, we decided to keep it as a stationary target. We are investigating ways to implement PSO with a moving target via concepts from this paper[citation needed].

Because we calculate Fréchet distance frame by frame, we can analyze a player's deviation over different segments of the play: the beginning, the middle, specific frames, or whatever is needed for that week's coaching point or rankings discussion. Additionally, trimming the play into segments reduces the overall decay of our target selection. This flexibility gives it an advantage over a metric like yards after catch, which exists only at the play level. Since we want to evaluate our metric against an established metric (yards after catch) and model (xYAC from Ben Baldwin [citation needed], nflfastR), we need to aggregate our Fréchet distances to the play-player level. From there we aggregate to the team level and compare the two metrics, which conveniently are both measured in yards.
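For readers unfamiliar with the measure, the sketch below computes the discrete Fréchet distance between a defender's actual path and a swarm path using the standard dynamic-programming formulation. Treating the paths as sequences of per-frame (x, y) points and using the discrete variant are assumptions made for this illustration, and the two short paths are made-up numbers.

```python
import math


def discrete_frechet(path_a, path_b):
    """Discrete Fréchet distance between two sequences of (x, y) points."""
    if not path_a or not path_b:
        raise ValueError("both paths must contain at least one point")
    n, m = len(path_a), len(path_b)
    # ca[i][j] = Fréchet distance between path_a[:i + 1] and path_b[:j + 1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(path_a[i], path_b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[-1][-1]


actual = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]  # tracked path (yards)
swarm = [(0.0, 1.0), (1.0, 1.0), (2.0, 2.0), (3.0, 1.0)]   # swarm path (yards)

whole_play = discrete_frechet(actual, swarm)            # 2.0
first_half = discrete_frechet(actual[:2], swarm[:2])    # 1.0
```

Slicing the two paths to the same frame range before calling the function is how a segment-level deviation (beginning, middle, or specific frames) could be computed.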

## Total Fréchet Distance vs. YAC of Teams Across All Plays and All Games

[diagram showing team totals correlation]

As you can see, DeMeco Ryans's 49ers are top tier in terms of sticking to the optimal path. For more data exploration, check out our EDA notebook here [link needed]. This chart was the main inspiration for using Ben Baldwin's xYAC model to calculate a metric called PASTA.

### PASTA Calculation
$$ \text{Yards After Catch Under Expected (YUX) per play} = \text{Expected Yards After Catch (XYAC) per play} - \text{Actual Yards After Catch (AYAC) per play} $$

$$ \text{PASTA} = \frac{\text{YUX}}{\text{Fréchet distance per player}} $$

- **Optimal Path:** Calculated for each player based on the game environment.
- **Fréchet Distance:** Measures deviation from the optimal path in yards.
- **PASTA Index:** Quantifies how efficiently defenders limit yards after catch relative to how closely they follow the optimal swarm path. In our specific instance, the agents are 2-7 members of the secondary.
  
One feature of how the target is chosen is that it also recommends an optimal number of swarm members. This could be tuned to always include all FSs, SSs, and CBs, for example, but as of right now it chooses those that are close enough to each other and to the target to affect the target location. We control for this in our rankings section by taking the average Fréchet distance across all players on a given play.
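As a worked example of the two formulas, the sketch below computes PASTA for a single play, using the average Fréchet distance across the swarm members as the denominator. The function name `pasta` is hypothetical, the zero-distance guard is our own addition for the sketch, and every number is invented for illustration rather than taken from the data.

```python
from statistics import mean


def pasta(xyac, ayac, frechet_by_player):
    """PASTA for one play: YUX divided by the mean Fréchet distance (yards)."""
    yux = xyac - ayac  # yards after catch under expected
    avg_deviation = mean(frechet_by_player.values())
    if avg_deviation == 0:
        raise ValueError("average Fréchet distance is zero; PASTA is undefined")
    return yux / avg_deviation


# Made-up play: 6.0 expected YAC, 4.0 actual YAC, three swarm members.
deviations = {"FS": 2.0, "CB1": 1.0, "CB2": 3.0}
score = pasta(xyac=6.0, ayac=4.0, frechet_by_player=deviations)  # 1.0
```

A positive score means the defense allowed fewer yards after the catch than expected, and the score grows as the swarm members stay closer to their optimal paths.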

### 2022 PASTA Rankings by Team and Player

[diagrams]

[Captions with some further analysis]

## Discussion and Applications
PASTA offers broad applications for the NFL as a whole, individual teams, the media, and fans.

**Coaching Applications:**
- **Teachable Moments:** Identifying plays with significant deviations from optimal paths and using them in film study. 
- **Reinforcement of Successful Strategies:** Highlighting plays where the secondary closely followed the optimal path, or where everyone hustled back after straying from the optimal path early.

**Broadcasting and Rankings:**
- **On-field Graphics:** Demonstrating defensive prowess by showing proximity to a "Biomimetic" or "AI" path in an on-field graphic.
- **Player and Defense Rankings:** Showcasing how talented players and teams align with optimal paths.

## Future Considerations
- **Generalizing to Other Groups:** It would be interesting to analyze different position units, entire teams, or relevant combinations of players on different types of plays and different types of expected yards models. 
- **Covering Potential Receivers:** Exploring PASTA's application to broader defensive scenarios, such as secondaries converging on a receiver in anticipation of an incoming pass.
- **Hyperparameter Tuning:** Refining parameters to tailor to specific team strategies and in-house analytics using strategies like grid search. When we do this, we can revisit the random components and add them in without throwing off the entire algorithm.
- **Incorporating Other Models:** Considering integration with other distance minimization models like STRAIN for enhanced analysis and perhaps better path optimization.
  
## Limitations and Ongoing Development
While PASTA is promising, it has areas for growth:
- **Max Speed Limitation:** A constraint that may not fully mirror real-life scenarios but is still effective. Given more data and time, we could fine-tune this cap to individual player profiles, but we suspect this would do little to improve PASTA as a tool.
- **Target Selection:** We are focused on refining how targets are selected and pursued by the defense. Ideally, we would like to incorporate ideas from this paper [citation needed] that applies particle swarm optimization to moving targets, or to incorporate velocity in a more realistic manner. Both are complex problems to solve, but both are worth pursuing.

## Conclusion
PASTA, drawing inspiration from natural efficiency and tailored for football, is poised to revolutionize NFL defensive strategies. Its capacity to assess individual players and units, particularly in expansive coverage, positions it as a versatile and potent tool in strategic defense planning. This metric provides not just tactical insights for coaches but also engaging analytical content for fans and broadcasters.

# APPENDIX A - PARAMETERS
### Parameters in Adaptation Process
To illustrate the flexibility of our adapted PSO approach, here are the key parameters used in our implementation:

- **Play:** The algorithm optimizes the path for a single play

- **Objective Function:** The function that determines target selection, considers obstacle avoidance, and minimizes the distance to the ball carrier.
  
- **Ball Carrier Identification (ball_carrier_id):** Identifies the ball carrier at the moment they get a handoff or catch a pass.

- **Positional Group:** Can be adjusted for any group of players.

- **Max Iterations:** This parameter defines the maximum number of iterations or steps the algorithm takes in a single play optimization. It ensures that the algorithm converges within a reasonable time frame.

- **Threshold Stop:** The threshold stop parameter determines how close to the ball carrier the algorithm should stop optimizing. It sets the minimum acceptable distance between the defender and the ball carrier before the optimization process concludes.

- **Inertia Weight (`w`):** In PSO, the inertia weight determines how much of a particle's previous velocity carries over into its next velocity, relative to the pull toward the personal and global best positions.

- **Acceleration Constants (`c1` and `c2`):** These constants control the influence of personal and global best positions on each particle's movement. As with the inertia weight, we used the golden ratio for `c1` and `c2`. This choice was made to ensure that the particle's movement is influenced proportionally by both its personal best position and the global best position.
  - **Golden ratio:** We chose to use the golden ratio (`φ`) as the ratio for `w`, `c1`, and `c2` based on the findings from this paper [citation needed]. In short, this ratio strikes a balance between exploration and exploitation of the solution space, promoting convergence while preventing premature convergence.

- **Obstacle Avoidance Factor:** In football, players must navigate through obstacles, including opposing players and blockers. This parameter determines how effectively a defender can navigate through such obstacles. It is fine-tuned based on the player's position and the positions of potential obstacles.

- **Time Weighting:** To account for predicting future positions of players, a time weighting parameter is introduced. This parameter estimates the best intercept angle by factoring in the expected future positions of the ball carrier and other players.