# Targeted Receiver Data

This notebook demonstrates the Targeted Receiver data, which gives the targeted receiver for each `gameId` and `playId` in the data. `targetNflId` maps to `nflId` in the `players.csv` and `week[x].csv` files.

The `targetNflId` value is missing if the ball is thrown away or spiked, or if there is no clear target on the play.
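
As a small illustration of the missing-target case, the sketch below uses a made-up three-row table (not the real file, and the ids are invented) to show how plays without a target could be counted or dropped. It assumes the missing values are read in as `NA`, which is how `read_csv` treats empty fields by default.

```r
library(dplyr)

# Toy stand-in for targetedReceiver.csv; all values are made up for illustration
toy_targets <- tibble(
    gameId      = c(1, 1, 1),
    playId      = c(10, 20, 30),
    targetNflId = c(101, NA, 102)   # NA = throwaway, spike, or no clear target
)

# Count the plays with no recorded target
n_missing <- sum(is.na(toy_targets$targetNflId))

# Keep only the plays that have a target
toy_with_target <- toy_targets %>%
    filter(!is.na(targetNflId))
```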

## Loading Libraries

In [1]:
library(tidyverse)
library(repr)
library(tm)
library(ggrepel)


#turning off warnings
options(warn=-1)

#setting plot width and height
options(repr.plot.width=15, repr.plot.height = 10)

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

✔ ggplot2 3.3.2     ✔ purrr   0.3.4
✔ tibble  3.0.3     ✔ dplyr   1.0.2
✔ tidyr   1.1.2     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.5.0

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Loading required package: NLP


Attaching package: ‘NLP’


The following object is masked from ‘package:ggplot2’:

    annotate


The following object is masked from ‘package:httr’:

    content




## Reading Data

In [2]:
#includes play-by-play info on specific plays
df_plays <- read_csv("../input/nfl-big-data-bowl-2021/plays.csv",
                    col_types = cols())

#includes background info for players
df_players <- read_csv("../input/nfl-big-data-bowl-2021/players.csv",
                      col_types = cols())

#includes targeted receiver by play
df_targetedReceiver <- read_csv("../input/nfl-big-data-bowl-2021-bonus/targetedReceiver.csv",
                      col_types = cols())

#includes schedule info for games
df_games <- read_csv("../input/nfl-big-data-bowl-2021/games.csv",
                    col_types = cols())

In [3]:
df_plays <- inner_join(df_plays,
                      df_targetedReceiver,
                      by = c('playId', 'gameId'))
head(df_plays)

gameId,playId,playDescription,quarter,down,yardsToGo,possessionTeam,playType,yardlineSide,yardlineNumber,⋯,gameClock,absoluteYardlineNumber,penaltyCodes,penaltyJerseyNumbers,passResult,offensePlayResult,playResult,epa,isDefensivePI,targetNflId
<dbl>,<dbl>,<chr>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<chr>,<dbl>,⋯,<time>,<dbl>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<lgl>,<dbl>
2018090600,75,(15:00) M.Ryan pass short right to J.Jones pushed ob at ATL 30 for 10 yards (M.Jenkins).,1,1,15,ATL,play_type_pass,ATL,20,⋯,15:00:00,90,,,C,10,10,0.2618273,False,2495454
2018090600,146,"(13:10) M.Ryan pass incomplete short right to C.Ridley (J.Mills, J.Hicks).",1,1,10,ATL,play_type_pass,PHI,39,⋯,13:10:00,49,,,I,0,0,-0.3723598,False,2560854
2018090600,168,(13:05) (Shotgun) M.Ryan pass incomplete short left to D.Freeman.,1,2,10,ATL,play_type_pass,PHI,39,⋯,13:05:00,49,,,I,0,0,-0.7027787,False,2543583
2018090600,190,(13:01) (Shotgun) M.Ryan pass deep left to J.Jones to PHI 6 for 33 yards (R.Darby).,1,3,10,ATL,play_type_pass,PHI,39,⋯,13:01:00,49,,,C,33,33,3.04753,False,2495454
2018090600,256,(10:59) (Shotgun) M.Ryan pass incomplete short right to D.Freeman.,1,3,1,ATL,play_type_pass,PHI,1,⋯,10:59:00,11,,,I,0,0,-0.8422719,False,2543583
2018090600,320,(10:10) (Shotgun) N.Foles pass short left to N.Agholor to PHI 8 for 4 yards (R.Alford).,1,2,8,PHI,play_type_pass,PHI,4,⋯,10:10:00,14,,,C,4,4,-0.3440965,False,2552600


In [4]:
##Reading tracking data (needs to be done iteratively)

#weeks of NFL season
weeks <- seq(1, 17)

#blank dataframe to store tracking data
df_tracking <- data.frame()

#iterating through all weeks
for(w in weeks){
    
    #temporary dataframe used for reading the week for the given iteration
    df_tracking_temp <- read_csv(paste0("../input/nfl-big-data-bowl-2021/week",w,".csv"),
                                col_types = cols())
    
    #storing temporary dataframe in full season dataframe
    df_tracking <- bind_rows(df_tracking_temp, df_tracking)                            
    
}
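
The loop above grows `df_tracking` one week at a time. A possible alternative, sketched here under assumptions, reads every file with `purrr::map_dfr` and row-binds the results in one step. The two tiny CSV files are hypothetical stand-ins written to a temporary directory so that the sketch runs anywhere; in the notebook the paths would be the `week[x].csv` files.

```r
library(readr)
library(purrr)   # map_dfr() also needs dplyr installed for the row-binding

# Hypothetical stand-in files (not the real tracking data)
tmp <- tempdir()
for (w in 1:2) {
    write_csv(data.frame(week = w, frameId = 1:3),
              file.path(tmp, paste0("week", w, ".csv")))
}

# One path per week, read in order and stacked into a single data frame
paths <- file.path(tmp, paste0("week", 1:2, ".csv"))
df_all <- map_dfr(paths, read_csv, col_types = cols())
```

This keeps the weeks in ascending order and avoids copying the accumulated data frame on every iteration.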

In [5]:
#Standardizing tracking data so it's always in the direction of the offense rather than raw on-field coordinates.
df_tracking <- df_tracking %>%
                mutate(x = ifelse(playDirection == "left", 120-x, x),
                       y = ifelse(playDirection == "left", 160/3 - y, y))
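
To see what this standardization does, here is a toy check with two made-up frames. The field is 120 yards long including the end zones and 160/3 yards wide, so a left-moving play is mirrored through both axes. The two rows below describe the same spot relative to the offense and should end up with identical coordinates.

```r
library(dplyr)

# Toy frames: the same relative spot on a right-moving and a left-moving play
toy_tracking <- tibble(
    playDirection = c("right", "left"),
    x = c(30, 90),
    y = c(10, 160/3 - 10)
)

# Same transformation as in the notebook cell
toy_std <- toy_tracking %>%
    mutate(x = ifelse(playDirection == "left", 120 - x, x),
           y = ifelse(playDirection == "left", 160/3 - y, y))
```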

In [6]:
#merging games and plays data
df_merged <- inner_join(df_games,
                        df_plays,
                        by = c("gameId" = "gameId"))

#merging tracking data into the previously merged frame
df_merged <- inner_join(df_merged,
                        df_tracking,
                        by = c("gameId" = "gameId",
                               "playId" = "playId"))

In [7]:
colnames(df_merged)
unique(df_merged$position)

In [8]:
# create a df of only the players who can earn a target, stripped down to the necessary columns

plyrsWithTargets <- df_merged %>%
    select(gameId, playId, gameClock, defendersInTheBox, passResult,
           targetNflId, time, x, y, s, a, dis, event, epa, frameId,
           nflId, displayName, position, route) %>%
    # QBs are kept for now because their coordinates are needed later
    filter(position %in% c('WR', 'HB', 'QB', 'TE', 'FB', 'RB')) %>%
    # 1 if this player is the targeted receiver, 0 otherwise (NA when targetNflId is missing)
    mutate(targetedReceiver = ifelse(nflId == targetNflId, 1, 0))

head(plyrsWithTargets)

# save completion probability for later
# defenses that force the lowest target probabilities are likely the best at defending the pass
# combine that with how they have to defend, on average, to remove the effect of the team's pass rush

# a target probability/completion probability model is useful because it lets you
# compare QB and DB reads through the change in completion probability
# between the time the ball is released and the time it arrives
# i.e. a QB may be throwing players open or reading the defense well if completion probability rises between release and arrival
# the inverse holds for defenses:
# lowering completion probability while the ball is in the air is good

gameId,playId,gameClock,defendersInTheBox,passResult,targetNflId,time,x,y,s,a,dis,event,epa,frameId,nflId,displayName,position,route,targetedReceiver
<dbl>,<dbl>,<time>,<dbl>,<chr>,<dbl>,<dttm>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<chr>,<dbl>
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.27,26.663333,0.0,0.01,0.02,,0.2618273,1,310,Matt Ryan,QB,,0
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.65,9.173333,0.02,0.03,0.01,,0.2618273,1,2495454,Julio Jones,WR,HITCH,1
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,29.22,17.183333,0.0,0.0,0.0,,0.2618273,1,2533040,Mohamed Sanu,WR,HITCH,0
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,21.75,26.703333,0.01,0.01,0.0,,0.2618273,1,2543583,Devonta Freeman,RB,,0
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.71,31.503333,0.01,0.02,0.01,,0.2618273,1,2555415,Austin Hooper,TE,OUT,0
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,24.87,26.623333,0.01,0.01,0.01,,0.2618273,1,2559033,Ricky Ortiz,FB,FLAT,0
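
One detail of the `targetedReceiver` flag is worth noting: on plays where `targetNflId` is missing, the comparison `nflId == targetNflId` is `NA`, so the flag is `NA` rather than 0. Whether that is desirable depends on the analysis. The sketch below uses made-up ids and hypothetical column names (`flagRaw`, `flagSafe`) to contrast the notebook's expression with a variant that uses `dplyr::coalesce` to treat a missing target as "not targeted".

```r
library(dplyr)

# Toy rows; the ids are made up for illustration
toy <- tibble(
    nflId       = c(101, 102, 101),
    targetNflId = c(101, 101, NA)    # the last play has no recorded target
)

toy <- toy %>%
    mutate(
        # the notebook's expression: NA on the last row
        flagRaw  = ifelse(nflId == targetNflId, 1, 0),
        # assumption: a missing target counts as "not targeted" (0)
        flagSafe = as.integer(coalesce(nflId == targetNflId, FALSE))
    )
```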


In [9]:
# use positioning of QB to get receiver distances from QB at each frame to help determine target prob

qbPositioning <- plyrsWithTargets %>%
    filter(position == 'QB') %>%
    # keep the play/frame keys and rename the QB's coordinates
    select(gameId, playId, qbX = x, qbY = y, frameId)

head(qbPositioning)

gameId,playId,qbX,qbY,frameId
<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
2018090600,75,28.27,26.66333,1
2018090600,75,28.27,26.66333,2
2018090600,75,28.27,26.67333,3
2018090600,75,28.27,26.67333,4
2018090600,75,28.27,26.68333,5
2018090600,75,28.27,26.68333,6


IDEA: use a little geometry to calculate a receiver's angle from the QB.
Find some functions to calculate the angle between two points.

So, calculate the Euclidean distance between the QB at (x1, y1) and the receiver at (x2, y2), and the angle given by:

$$m = \frac{y_2 - y_1}{x_2 - x_1}$$ (where m is the slope)

$$\tan(\theta) = m$$

$$\theta = \arctan(m)$$
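
One limitation of the slope form is that the arctangent of a slope only covers angles between -90 and 90 degrees, so a receiver directly behind the QB gets the same angle as one directly ahead. A possible alternative, sketched here and not what the notebook uses, is base R's two-argument `atan2`, which takes the y and x differences separately and covers the full -180 to 180 degree range. The helper name `angle_from_qb` is hypothetical.

```r
# Hypothetical helper, not part of the notebook: direction from the QB to a
# receiver in degrees, using atan2 so that all four quadrants are distinguished
angle_from_qb <- function(x, y, qbX, qbY) {
    atan2(y - qbY, x - qbX) * 180 / pi
}

# Receiver 5 yards ahead of and 5 yards to one side of a QB at (20, 25)
ahead_diag <- angle_from_qb(25, 30, 20, 25)

# Receiver directly behind the QB: atan() of the slope would give 0 here,
# the same as for a receiver directly ahead
behind <- angle_from_qb(15, 25, 20, 25)

# Receiver level with the QB (x == qbX): the slope is infinite,
# but atan2 handles the two differences directly
level <- angle_from_qb(20, 30, 20, 25)
```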

In [10]:
plyrsWithTargets <- inner_join(plyrsWithTargets,
                              qbPositioning,
                              by=c('gameId', 'playId', 'frameId'))

In [11]:
plyrsWithTargets <- plyrsWithTargets %>%

filter(position != 'QB')

In [12]:
plyrsWithTargets <- plyrsWithTargets %>%

mutate(
    distanceFromQB = sqrt((x - qbX)^2 + (y - qbY)^2), # Euclidean distance from QB
    angleToQB = atan((y - qbY) / (x - qbX)) * 57.2958 # radians to degrees; 57.2958 is approximately 180/pi
)

In [13]:
head(plyrsWithTargets)

# angleToQB looks wonky at first because the players are lined up at the line of scrimmage

# how does this play into the risk-reward tradeoff between epa, distance from the QB and angle to the QB?
# i.e. what does it look like if we regress epa on angleToQB? Obviously, linear regression won't work unless
# we mathematically adjust to normalize the angle
# normalizing the angle affects any signal that QB handedness (mostly right-handed) carries for target %
# meaning QBs are probably less likely to target their weak side
# this is why I should just plot the relationship and see what happens

gameId,playId,gameClock,defendersInTheBox,passResult,targetNflId,time,x,y,s,⋯,frameId,nflId,displayName,position,route,targetedReceiver,qbX,qbY,distanceFromQB,angleToQB
<dbl>,<dbl>,<time>,<dbl>,<chr>,<dbl>,<dttm>,<dbl>,<dbl>,<dbl>,⋯,<dbl>,<dbl>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.65,9.173333,0.02,⋯,1,2495454,Julio Jones,WR,HITCH,1,28.27,26.66333,17.494128,-88.7553793
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,29.22,17.183333,0.0,⋯,1,2533040,Mohamed Sanu,WR,HITCH,0,28.27,26.66333,9.527481,-84.2774692
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,21.75,26.703333,0.01,⋯,1,2543583,Devonta Freeman,RB,,0,28.27,26.66333,6.520123,-0.3515036
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.71,31.503333,0.01,⋯,1,2555415,Austin Hooper,TE,OUT,0,28.27,26.66333,4.859959,84.8056014
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,24.87,26.623333,0.01,⋯,1,2559033,Ricky Ortiz,FB,FLAT,0,28.27,26.66333,3.400235,0.6740371
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.63,9.163333,0.03,⋯,2,2495454,Julio Jones,WR,HITCH,1,28.27,26.66333,17.503702,-88.8215419


In [14]:
plyrsWithTargets %>%
filter(gameId == 2018090600 & playId == 75 & displayName == 'Austin Hooper')

gameId,playId,gameClock,defendersInTheBox,passResult,targetNflId,time,x,y,s,⋯,frameId,nflId,displayName,position,route,targetedReceiver,qbX,qbY,distanceFromQB,angleToQB
<dbl>,<dbl>,<time>,<dbl>,<chr>,<dbl>,<dttm>,<dbl>,<dbl>,<dbl>,⋯,<dbl>,<dbl>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.71,31.50333,0.01,⋯,1,2555415,Austin Hooper,TE,OUT,0,28.27,26.66333,4.859959,84.8056
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.7,31.49333,0.01,⋯,2,2555415,Austin Hooper,TE,OUT,0,28.27,26.66333,4.849103,84.91258
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.71,31.49333,0.01,⋯,3,2555415,Austin Hooper,TE,OUT,0,28.27,26.67333,4.840041,84.78417
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:14,28.71,31.49333,0.01,⋯,4,2555415,Austin Hooper,TE,OUT,0,28.27,26.67333,4.840041,84.78417
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:15,28.72,31.49333,0.01,⋯,5,2555415,Austin Hooper,TE,OUT,0,28.27,26.68333,4.831004,84.65528
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:15,28.72,31.48333,0.01,⋯,6,2555415,Austin Hooper,TE,OUT,0,28.27,26.68333,4.821048,84.64421
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:15,28.71,31.49333,0.01,⋯,7,2555415,Austin Hooper,TE,OUT,0,28.27,26.67333,4.840041,84.78417
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:15,28.71,31.49333,0.01,⋯,8,2555415,Austin Hooper,TE,OUT,0,28.27,26.67333,4.840041,84.78417
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:15,28.71,31.49333,0.01,⋯,9,2555415,Austin Hooper,TE,OUT,0,28.27,26.66333,4.85,84.79491
2018090600,75,15:00:00,7,C,2495454,2018-09-07 01:07:15,28.7,31.48333,0.0,⋯,10,2555415,Austin Hooper,TE,OUT,0,28.27,26.66333,4.839142,84.90208


In [15]:
# plyrsWithTargets %>%

# ggplot(aes(x=angleToQB, y=epa)) +
# geom_point() +
# geom_smooth()

In [None]:
write.csv(plyrsWithTargets, 'plyrsWithTargets.csv')