## I. Introduction
Once the ball is in the air, all bets are off. It becomes a race to see who can get to the ball in time and put themselves in position to make a play. Quarterbacks and play callers who recognize which players can win that race are able to plan for or around them, and gain a better understanding of the area each player covers on the field.

As such, we propose the metric: Presence.

This metric tracks the ability of a player to make a play at any given moment while the ball is in the air, and can be used to define the area coverable by them. 



We believe it could be incredibly insightful to know which players have a better chance of putting themselves in those positions. 

A successful rep is not always defined by a catch, pass breakup, or interception. There is a middle ground where a player can do everything right and still get beaten.

We aim to identify players who put themselves in position to make a play, regardless of the true outcome. 


## II. Motivation
We wanted to explore the idea that a player can be best described by the space they take up. The game has shifted from highly physical receivers such as Calvin Johnson and Julio Jones to a game of speed, where players like Tyreek Hill, Davante Adams, Ja'Marr Chase, and Justin Jefferson dominate through route running and agility. A player's ability to reach a spot on the field that nobody else could reach in so short a time can be the difference between an incompletion and a touchdown. That extra step makes a difference, and we wanted to explore how teams can use that knowledge to exploit weaknesses and play to their receivers' strengths.

## III. Defining Presence
Different players possess very different skill sets, and the one we want to narrow in on is how much space on the field players can realistically control. 

You could put a hundred different players in the same situation, each needing to get to the ball. Depending on how difficult the play is, only some of them will be able to reach the ball and be in position to make a play.

These players have what we call a greater “presence” on the field. They may be able to cover an additional five yards on any given play, and that is what gives them an edge: they are more likely to make a play on the ball simply because they can get to it.
    
This report will identify players in various positions who are able to outperform their position group at being in position to make a play. 


We define success as the player being within a yard and a half of the ball's landing spot at the end of the play. Every frame of a play is labeled by that final outcome: a frame is a "success frame" if the player is within that yard and a half in the play's final frame, and a "failure frame" otherwise.
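
The labeling rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `(x, y)` position format, and treating "within" as less than or equal to 1.5 yards are our choices for the example, not details taken from the project code.

```python
import math

SUCCESS_RADIUS_YARDS = 1.5  # threshold from the definition above


def label_play_frames(frames, landing_spot, radius=SUCCESS_RADIUS_YARDS):
    """Label every frame of one player's rep by the rep's final-frame outcome.

    frames: list of (x, y) player positions in yards, ordered by time (hypothetical format).
    landing_spot: (x, y) position of the ball's landing spot in yards.
    Returns one boolean per frame: True for a success frame, False for a failure frame.
    """
    final_x, final_y = frames[-1]
    final_distance = math.hypot(final_x - landing_spot[0], final_y - landing_spot[1])
    reached = final_distance <= radius  # assumption: "within" includes exactly 1.5 yards
    # Every frame inherits the outcome of the final frame.
    return [reached] * len(frames)
```

Note that an early frame can be a success frame even when the player is still far from the ball, because the label describes where the player ends up, not where they are in that frame.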

For instance, on this play, Trey Walker III got a hand on the ball despite Watson catching it. We would still consider this a successful rep, because Watson had to make a spectacular play to come down with the ball. We are primarily looking at players' abilities to put themselves in position to make a play.

![](https://github.com/MPuram12/BigDataBowl2025/blob/main/Watson%20Catch.gif?raw=true)


## IV. Model
Features:
- t; time to ball landing (seconds)
- d; distance to ball (yards)
- v; velocity (mph)
- angle; angle between current path and optimal path to ball
- a; acceleration
- pID; player ID

Target: 
- reached_ball; player within 1.5 yards of ball in last frame
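
As a rough sketch of how the geometric features might be derived for a single frame, the example below computes t, d, v, and angle from a player's position and velocity vector. Everything about the input format is an assumption made for illustration: the 10 frames-per-second rate, the velocity given as a vector in yards per second, and the angle reported in degrees. Acceleration and pID are left out because they would be read directly from the data.

```python
import math

FRAMES_PER_SECOND = 10      # assumed tracking rate
YPS_TO_MPH = 3600 / 1760    # yards per second -> miles per hour


def frame_features(pos, vel, landing_spot, frames_remaining):
    """Compute t, d, v, and angle for one frame (hypothetical helper).

    pos, landing_spot: (x, y) in yards.
    vel: (vx, vy) velocity vector in yards per second.
    frames_remaining: number of frames until the ball lands.
    """
    dx = landing_spot[0] - pos[0]
    dy = landing_spot[1] - pos[1]
    d = math.hypot(dx, dy)
    speed_yps = math.hypot(vel[0], vel[1])

    # Angle between the current direction of travel and the straight line to the ball.
    if d == 0 or speed_yps == 0:
        angle = 0.0  # undefined direction; treated as "on path" in this sketch
    else:
        cos_angle = (vel[0] * dx + vel[1] * dy) / (speed_yps * d)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

    return {
        "t": frames_remaining / FRAMES_PER_SECOND,
        "d": d,
        "v": speed_yps * YPS_TO_MPH,
        "angle": angle,
    }
```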

We used an H2O deep learning model: (reached_ball ~ t + d + v + angle + a + pID).

We trained two versions of this model, one with pID and one without, so that we could isolate each player's effect by comparing the two sets of predictions.
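
The two-model setup could look roughly like the sketch below. The formula above is written in R-style notation; this example uses H2O's Python API instead, which is an assumption about tooling rather than a description of the actual project code. The column names come from the feature list, while the helper names, the default hyperparameters, and the fixed seed are hypothetical.

```python
BASE_FEATURES = ["t", "d", "v", "angle", "a"]
PLAYER_FEATURE = "pID"
TARGET = "reached_ball"


def feature_sets():
    """Return the predictor columns for the two model variants."""
    return {
        "with_pid": BASE_FEATURES + [PLAYER_FEATURE],
        "without_pid": list(BASE_FEATURES),
    }


def train_presence_models(frame):
    """Train one deep learning model per feature set.

    frame: an h2o.H2OFrame containing the columns above. The caller is
    assumed to have already started a cluster with h2o.init().
    """
    from h2o.estimators import H2ODeepLearningEstimator  # needs the h2o package

    # Treat the target and the player ID as categorical, not numeric.
    frame[TARGET] = frame[TARGET].asfactor()
    frame[PLAYER_FEATURE] = frame[PLAYER_FEATURE].asfactor()

    models = {}
    for name, features in feature_sets().items():
        model = H2ODeepLearningEstimator(seed=1)  # default settings; an assumption
        model.train(x=features, y=TARGET, training_frame=frame)
        models[name] = model
    return models
```

Keeping the two feature lists identical apart from pID means any difference between the two models' predictions on the same frame can be attributed to knowing who the player is.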

## V. Results
Performance:

With pID:
- 97.5% accuracy
    - 2% error predicting "no" (did not reach), 4% error predicting "yes" (reached).

Without pID:
- 95.5% accuracy
    - 4% error predicting "no" (did not reach), 8% error predicting "yes" (reached).


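Effect = prob(reach | pID) - prob(reach | no pID)

That is, the prediction of the model that knows the player minus the prediction of the model that does not, for the same frame.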

Week 5 2023, Bengals @ Cardinals, 13:47 remaining in 3rd Quarter

At the moment of the throw:

prob(reach | J Chase) = .951

prob(reach | Avg Player) = .264

Effect = .687
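
The arithmetic for this play can be checked with a one-line helper. The function name is ours; the two probabilities are the values reported above.

```python
def player_effect(p_with_pid, p_without_pid):
    """Effect = prob(reach | pID) - prob(reach | no pID)."""
    return p_with_pid - p_without_pid


# Values reported for Chase at the moment of the throw.
effect = player_effect(0.951, 0.264)
print(round(effect, 3))  # prints 0.687
```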

![](https://github.com/MPuram12/BigDataBowl2025/blob/main/chase%20play%20video.gif?raw=true)


On this play, our model immediately recognizes that Ja'Marr Chase is far more likely than an average player to make the play (95% as opposed to 26% when pID is not considered). It also recognizes, as soon as the ball is thrown, that none of the defenders have a chance at making the play.

![](https://github.com/MPuram12/BigDataBowl2025/blob/main/chase%20play.gif?raw=true)

We can further visualize this effect by comparing an elite player such as Chase with a player with one of the lower effects, Wan'Dale Robinson.

Chase's reachable area is deeper, wider, and overall much better in this setting (2 seconds of air time, moving at 15 mph).

![](https://github.com/MPuram12/BigDataBowl2025/blob/main/heatmaps.png?raw=true)
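
One way such a reachable-area map could be produced is to hold the air time and speed fixed and query the model at every landing spot on a grid around the player. The sketch below does that with a generic `predict_reach(t, d, v, angle)` callable. The callable's signature, the grid size, the 0.5 probability cutoff, and the omission of acceleration are all assumptions for illustration, and `toy_predictor` is a placeholder with made-up behavior, not the trained model.

```python
import math


def reach_grid(predict_reach, air_time=2.0, speed_mph=15.0, half_width=30, step=1.0):
    """Evaluate predict_reach at landing spots around a player.

    The player sits at the origin and is moving along the +x axis.
    Returns the grid coordinates and a 2D list of probabilities (rows are y values).
    """
    n = int(2 * half_width / step) + 1
    coords = [-half_width + i * step for i in range(n)]
    grid = []
    for y in coords:
        row = []
        for x in coords:
            d = math.hypot(x, y)
            angle = abs(math.degrees(math.atan2(y, x)))  # 0 means straight ahead
            row.append(predict_reach(air_time, d, speed_mph, angle))
        grid.append(row)
    return coords, grid


def reachable_area(grid, step=1.0, threshold=0.5):
    """Area in square yards where the reach probability meets the threshold."""
    cells = sum(1 for row in grid for p in row if p >= threshold)
    return cells * step * step


def toy_predictor(t, d, v, angle):
    """Placeholder: reachable if within a range that shrinks as the angle grows."""
    max_range = v * (1760 / 3600) * t * (1 - angle / 360)
    return 1.0 if d <= max_range else 0.0
```

Running `reach_grid` once with each player's pID fixed inside `predict_reach` and plotting the two grids side by side would give a comparison in the spirit of the heatmaps above.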



Additionally, we can look at the top players by total effect. Total effect, the sum of a player's effects across all frames, can be interpreted as the total value that player added, but it isn't the most intuitive stat. We therefore rank players by total effect and display their average effect. Below are the top ten cornerbacks, wide receivers, and safeties.

![](https://github.com/MPuram12/BigDataBowl2025/blob/main/leaderboards.png?raw=true)
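
The rank-by-total, display-by-average scheme can be sketched as follows. The input format (one `(player, effect)` pair per frame) and the function name are assumptions for the example, and splitting the leaderboard by position is left out for brevity.

```python
from collections import defaultdict


def effect_leaderboard(frame_effects, top_n=10):
    """Rank players by total effect and report their average effect.

    frame_effects: iterable of (player, effect) pairs, one per frame.
    Returns a list of (player, total_effect, average_effect), best first.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for player, effect in frame_effects:
        totals[player] += effect
        counts[player] += 1

    # Ranking uses the total, so players with many positive frames rise to the top.
    ranked = sorted(totals, key=totals.get, reverse=True)[:top_n]
    return [(p, totals[p], totals[p] / counts[p]) for p in ranked]
```

Because the ranking uses totals, a player with many moderately positive frames can place above a player with a higher average over only a few frames.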

## VI. Limitations
- Spotty data; some catches have player >3 yards away from landing spot.
- Doesn’t perform well when a player makes a sudden last-second movement, such as diving back towards the ball.
- No ball tracking
- Unclear when ball is caught on a given play
- Not much pre-existing info about whether pass was “contested” by defender, so had to use our own definition.


For instance, Reynolds was listed as more than 3 yards away from the ball on this catch. Our model considered him as not reaching the ball on this play and put his probability to reach at 0.


![](https://github.com/MPuram12/BigDataBowl2025/blob/main/reynolds%20catch.png?raw=true)

## VII. Next Steps
- Animate player areas
- Create Kaggle Report
- More visualization work
- Model improvements (if possible)
