# NFL tracking: wrangling, Voronoi, and sonars

This is a Python port of the R notebook [NFL tracking: wrangling, Voronoi, and sonars](https://www.kaggle.com/statsbymichaellopez/nfl-tracking-wrangling-voronoi-and-sonars). Thanks to the original author for sharing it.

# Dealing with player tracking data

One initial but tricky issue when working with NFL player tracking data is that sometimes players are moving left to right down the field, and other times those same players are moving right to left. In this walk-through, I do the following:

1. Standardize player tracking coordinates so that offensive units are moving in the same direction throughout the entirety of the game, while also sharing tweaks that should be made regarding team names.

2. Create a sample Voronoi diagram using tracking coordinates. 

3. Standardize the direction variable for the ball carrier, and provide video to show how that standardization manifests itself within a play.

4. Create angle charts for player movement at a snapshot of time within a play. 

5. Create sonar maps for player direction.

## 1. Standardizing coordinates

One recommended technique for working with this data is to standardize play directions, so that the offensive team is always moving in the same direction. This idea is particularly important for this year's Big Data Bowl event on Kaggle, where many participants will look to engineer football-specific features. 

To start off, let's read in the data, creating a dummy variable for plays moving right to left (`ToLeft`), as well as an indicator for whether or not the player is the ball carrier (`IsBallCarrier`). 

Next, I add in a few tweaks to account for differences in team abbreviations between the `PossessionTeam`/`FieldPosition` variables and the `HomeTeamAbbr`/`VisitorTeamAbbr` variables.
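
As a side note, the same flags and abbreviation fixes can be computed column-wise rather than row by row, which tends to be faster on a large file. The sketch below runs on a made-up toy frame (the rows are illustrative assumptions, not real data); the column names and the four abbreviation pairs are the ones used in this notebook.

```python
import pandas as pd

# Toy rows standing in for train.csv (values are made up for illustration).
toy = pd.DataFrame({
    "PlayDirection": ["left", "right", "left"],
    "NflId": [1, 2, 3],
    "NflIdRusher": [1, 9, 9],
    "HomeTeamAbbr": ["ARI", "NE", "BAL"],
    "VisitorTeamAbbr": ["HOU", "CLE", "KC"],
})

# Boolean flags computed on whole columns instead of with a row-wise lambda.
toy["ToLeft"] = toy["PlayDirection"].eq("left")
toy["IsBallCarrier"] = toy["NflId"].eq(toy["NflIdRusher"])

# Map the four abbreviations that differ; values not in the dict are left unchanged.
team_abbr_map = {"ARI": "ARZ", "BAL": "BLT", "CLE": "CLV", "HOU": "HST"}
for col in ["HomeTeamAbbr", "VisitorTeamAbbr"]:
    toy[col] = toy[col].replace(team_abbr_map)

print(toy)
```

The result should match what the row-wise `apply` calls produce; only the style differs.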

In [None]:
import sys,os

import pandas as pd, numpy as np, matplotlib.pyplot as plt

%matplotlib inline

import warnings
warnings.filterwarnings("ignore")

In [None]:
train  = pd.read_csv('/kaggle/input/nfl-big-data-bowl-2020/train.csv')
train['ToLeft'] = train.PlayDirection.apply(lambda play_direction:play_direction=='left')
train['IsBallCarrier'] = train[['NflId','NflIdRusher']].apply(lambda row:row.NflId==row.NflIdRusher, axis=1)

team_abbr_map = {
    'ARI':'ARZ',
    'BAL':'BLT',
    'CLE':'CLV',
    'HOU':'HST'
}
train.VisitorTeamAbbr = train.VisitorTeamAbbr.apply(lambda vta:team_abbr_map[vta] if vta in team_abbr_map.keys() else vta)
train.HomeTeamAbbr = train.HomeTeamAbbr.apply(lambda vta:team_abbr_map[vta] if vta in team_abbr_map.keys() else vta)

Let's see what six plays look like. Below, I take three plays in each direction

In [None]:
# Sample at the play level; DataFrame.append was removed in pandas 2.0, so use pd.concat
plays = train[['PlayId','ToLeft']].drop_duplicates()
sample_plays = pd.concat([plays[plays.ToLeft].sample(3), plays[~plays.ToLeft].sample(3)])
sample_chart_v1 = sample_plays.merge(train, how='inner')
sample_chart_v1

In [None]:
plt.figure(figsize=(30, 15))
plt.suptitle('Sample plays')
i=1
for gp,chance in sample_chart_v1.groupby('PlayId'):
    play_id = gp
    rusher = chance[chance.NflId==chance.NflIdRusher]
    home = chance[chance.Team=='home']
    away = chance[chance.Team=='away']
    #yard_line_left = offense.YardLine.iloc[0]+10 # add a 10-yard offset to yard_line; the 10 accounts for the left end zone
    #yard_line_right = offense.YardLine.iloc[0]+2*(50-offense.YardLine.iloc[0])+10
    #yard_line = yard_line_left if np.abs(yard_line_left-rusher.X.iloc[0])<=(yard_line_right-rusher.X.iloc[0]) else yard_line_right
    
    plt.subplot(3,2,i)
    i+=1
    plt.xlim(0,120)# 0~120 already includes the end zones; the field of play is only 100 yards, and yard lines also run 0~100
    plt.ylim(-10,63)
    plt.scatter(list(home.X),list(home.Y),marker='o',c='red',s=55,alpha=0.5,label='HOME')
    plt.scatter(list(away.X),list(away.Y),marker='o',c='green',s=55,alpha=0.5,label='AWAY')
    plt.scatter(list(rusher.X),list(rusher.Y),marker='o',c='black',s=30,label='RUSHER')
    
    for line in range(10,130,10):
        plt.plot([line,line],[-100,100],c='black',linewidth=1,linestyle=':')
    
    #plt.plot([yard_line,yard_line],[-100,100],c='orange',linewidth=1.5)
    plt.plot([10,10],[-100,100],c='green',linewidth=3) # left goal line
    plt.plot([110,110],[-100,100],c='green',linewidth=3) # right goal line
    
    plt.title(play_id)
    plt.legend()

plt.show()

It's really hard to tell which team is on offense! Even though the ball carrier is highlighted in black, the inconsistency from one play to the next is less than ideal. Sometimes the away team is on offense, other times the home team is on offense. And they're both potentially moving left or moving right. 

Our ultimate goal will be to ensure that the offensive team (`PossessionTeam`) is moving left to right, even if in the raw data, the offense is moving right to left. 

The following set of code will get us there.

In [None]:
train_1 = train.copy()
train_1['TeamOnOffense'] = train_1[['PossessionTeam','HomeTeamAbbr']].apply(lambda row:'home' if row.PossessionTeam==row.HomeTeamAbbr else 'away', axis=1)
train_1['IsOnOffense'] = train_1[['TeamOnOffense','Team']].apply(lambda row:row.TeamOnOffense==row.Team , axis=1)
train_1['YardsFromOwnGoal'] = train_1[['FieldPosition','PossessionTeam','YardLine']].apply(lambda row:row.YardLine if row.FieldPosition==row.PossessionTeam else 50+(50-row.YardLine), axis=1)
train_1['YardsFromOwnGoal'] = train_1[['YardsFromOwnGoal','YardLine']].apply(lambda row:50 if row.YardLine==50 else row.YardsFromOwnGoal, axis=1)
train_1['X_std'] = train_1[['ToLeft','X']].apply(lambda row:120-row.X-10 if row.ToLeft else row.X-10, axis=1)
train_1['Y_std'] = train_1[['ToLeft','Y']].apply(lambda row:160/3-row.Y if row.ToLeft else row.Y, axis=1)
train_1[['Team','FieldPosition','PossessionTeam','TeamOnOffense','IsOnOffense','YardLine','YardsFromOwnGoal','X','X_std','Y','Y_std']].sample(10)

In [None]:
sample_chart_v2 = sample_chart_v1.merge(train_1, how='inner')
sample_chart_v2

Taking the above code, we'll plot the standardized `X` and `Y` variables, similar to what we did above. We also created the `YardsFromOwnGoal` variable -- in football language, this is the line of scrimmage. More importantly, we've ensured the `YardsFromOwnGoal`, unlike `YardLine`, treats each side of the field differently. For example, if `YardLine == 25`, that could mean the possession team has 75 yards to go for a touchdown, or it could mean it has 25 yards to go for a touchdown. 
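
To make the `YardLine == 25` example concrete, here is a minimal vectorized sketch of the same standardization on two made-up rows. The helper name `standardize_xy` and the toy values are assumptions for illustration; the formulas mirror the ones in the cell above. Note that `100 - YardLine` already yields 50 at midfield, so no separate `YardLine == 50` case is needed in this form.

```python
import numpy as np
import pandas as pd

def standardize_xy(df):
    """Return a copy with YardsFromOwnGoal, X_std and Y_std added (hypothetical helper)."""
    out = df.copy()
    # The ball is on the offense's own side when FieldPosition names the possession team.
    own_side = out["FieldPosition"].eq(out["PossessionTeam"])
    out["YardsFromOwnGoal"] = np.where(own_side, out["YardLine"], 100 - out["YardLine"])
    # Flip left-moving plays, then drop the 10-yard end zone offset.
    out["X_std"] = np.where(out["ToLeft"], 120 - out["X"], out["X"]) - 10
    # The field is 160/3 yards wide, so mirror Y across that width on flipped plays.
    out["Y_std"] = np.where(out["ToLeft"], 160 / 3 - out["Y"], out["Y"])
    return out

# Two made-up rows: same YardLine, but on opposite sides of midfield.
toy = pd.DataFrame({
    "FieldPosition": ["BUF", "NYJ"],
    "PossessionTeam": ["BUF", "BUF"],
    "YardLine": [25, 25],
    "ToLeft": [False, True],
    "X": [35.0, 35.0],
    "Y": [20.0, 20.0],
})
result = standardize_xy(toy)
print(result[["YardsFromOwnGoal", "X_std", "Y_std"]])
```

The first row has 75 yards to go (`YardsFromOwnGoal == 25`), the second has 25 yards to go (`YardsFromOwnGoal == 75`), even though both report `YardLine == 25`.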

Anyways, here's a set of plots where each team is moving left to right. 

In [None]:
plt.figure(figsize=(30, 15))
plt.suptitle("Sample plays, standardized, Offense moving left to right")
plt.xlabel("Distance from offensive team's own end zone")
plt.ylabel("Y coordinate")

i=1
for gp,chance in sample_chart_v2.groupby('PlayId'):
    play_id = gp
    rusher = chance[chance.NflId==chance.NflIdRusher].iloc[0]
    offense = chance[chance.IsOnOffense]
    defense = chance[~chance.IsOnOffense]
    
    plt.subplot(3,2,i)
    i+=1
    plt.xlim(-10,110) # X_std is measured from the offense's own goal line, so the end zones span -10~0 and 100~110
    plt.ylim(-10,63)
    
    plt.scatter(offense.X_std,offense.Y_std,marker='o',c='red',s=55,alpha=0.5,label='OFFENSE')
    plt.scatter(defense.X_std,defense.Y_std,marker='o',c='green',s=55,alpha=0.5,label='DEFENSE')
    plt.scatter([rusher.X_std],[rusher.Y_std],marker='o',c='black',s=30,label='RUSHER')
    
    for line in range(10,100,10):
        plt.plot([line,line],[-100,100],c='silver',linewidth=0.8,linestyle='-')
    
    plt.plot([rusher.YardsFromOwnGoal,rusher.YardsFromOwnGoal],[-100,100],c='black',linewidth=1.5,linestyle=':')
    plt.plot([0,0],[-100,100],c='black',linewidth=2) # own goal line
    plt.plot([100,100],[-100,100],c='black',linewidth=2) # opponent goal line
    
    plt.title(play_id)
    plt.legend()

plt.show()

In the above plot, the red team is always on offense, and the green team is always on defense. The red team is also always moving left to right. The dotted line corresponds to where the play started -- if `Yards == 0`, that means that the ball carrier was tackled on or about the dotted line. If `Yards > 0`, the ball carrier ended the play to the right of the dotted line, and if `Yards < 0`, the ball carrier ended the play to the left of the dotted line.

## 2. Player space and Voronoi areas

One critical element of the contest will be deriving football-specific features that correspond to the amount of space that the ball carrier has. The more space around him (in particular, the more space in front of him as he moves from left to right), the easier it is for him to gain yards. 

As one way to think about space, Voronoi diagrams partition each location on the field to the nearest player. It's a start as one thinks about where a running back could move.
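
Before drawing the diagram, it may help to see the definition in its simplest form: assign every small patch of field to its nearest player and count the patches. The sketch below does that on a 1-yard grid with three made-up player positions (all values are assumptions for illustration). It approximates Voronoi areas inside a bounding box and is not the polygon construction used in the plotting code.

```python
import numpy as np

# Made-up player positions (x, y) in yards; index 0 plays the role of the ball carrier.
players = np.array([[10.0, 10.0], [30.0, 10.0], [20.0, 30.0]])

# Sample a 40 x 40 yard box on a 1-yard grid of cell centres.
xs = np.arange(0.5, 40.0, 1.0)
ys = np.arange(0.5, 40.0, 1.0)
gx, gy = np.meshgrid(xs, ys)
cells = np.column_stack([gx.ravel(), gy.ravel()])

# Distance from every cell to every player, then the index of the closest player.
dists = np.linalg.norm(cells[:, None, :] - players[None, :, :], axis=2)
owner = dists.argmin(axis=1)

# Each cell covers 1 square yard, so the counts approximate Voronoi areas inside the box.
area = np.bincount(owner, minlength=len(players))
print(dict(enumerate(area)))
```

A count like `area[0]` is one candidate for a "space around the ball carrier" feature, with the caveat that it depends on the box chosen.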

In [None]:
def voronoi_finite_polygons_2d(vor, radius=None):
    """
    Reconstruct infinite voronoi regions in a 2D diagram to finite
    regions.

    Parameters
    ----------
    vor : Voronoi
        Input diagram
    radius : float, optional
        Distance to 'points at infinity'.

    Returns
    -------
    regions : list of tuples
        Indices of vertices in each revised Voronoi regions.
    vertices : list of tuples
        Coordinates for revised Voronoi vertices. Same as coordinates
        of input vertices, with 'points at infinity' appended to the
        end.

    """

    if vor.points.shape[1] != 2:
        raise ValueError("Requires 2D input")

    new_regions = []
    new_vertices = vor.vertices.tolist()

    center = vor.points.mean(axis=0)
    if radius is None:
        radius = np.ptp(vor.points)  # ndarray.ptp() was removed in NumPy 2.0

    # Construct a map containing all ridges for a given point
    all_ridges = {}
    for (p1, p2), (v1, v2) in zip(vor.ridge_points, vor.ridge_vertices):
        all_ridges.setdefault(p1, []).append((p2, v1, v2))
        all_ridges.setdefault(p2, []).append((p1, v1, v2))

    # Reconstruct infinite regions
    for p1, region in enumerate(vor.point_region):
        vertices = vor.regions[region]

        if all(v >= 0 for v in vertices):
            # finite region
            new_regions.append(vertices)
            continue

        # reconstruct a non-finite region
        ridges = all_ridges[p1]
        new_region = [v for v in vertices if v >= 0]

        for p2, v1, v2 in ridges:
            if v2 < 0:
                v1, v2 = v2, v1
            if v1 >= 0:
                # finite ridge: already in the region
                continue

            # Compute the missing endpoint of an infinite ridge

            t = vor.points[p2] - vor.points[p1] # tangent
            t /= np.linalg.norm(t)
            n = np.array([-t[1], t[0]])  # normal

            midpoint = vor.points[[p1, p2]].mean(axis=0)
            direction = np.sign(np.dot(midpoint - center, n)) * n
            far_point = vor.vertices[v2] + direction * radius

            new_region.append(len(new_vertices))
            new_vertices.append(far_point.tolist())

        # sort region counterclockwise
        vs = np.asarray([new_vertices[v] for v in new_region])
        c = vs.mean(axis=0)
        angles = np.arctan2(vs[:,1] - c[1], vs[:,0] - c[0])
        new_region = np.array(new_region)[np.argsort(angles)]

        # finish
        new_regions.append(new_region.tolist())

    return new_regions, np.asarray(new_vertices)

In [None]:
from scipy.spatial import Voronoi

plt.figure(figsize=(12, 8))
plt.suptitle("Sample plays, standardized, Offense moving left to right")
plt.xlabel("Distance from offensive team's own end zone")
plt.ylabel("Y coordinate")

sample_20171120000963 = train_1[train_1.PlayId==20171120000963].copy()
for gp,chance in sample_20171120000963.groupby('PlayId'):
    play_id = gp
    rusher = chance[chance.NflId==chance.NflIdRusher].iloc[0]
    offense = chance[chance.IsOnOffense]
    defense = chance[~chance.IsOnOffense]
    
    plt.subplot(1,1,1)
    i+=1
    
    x_min, x_max = chance.X_std.min()-2, chance.X_std.max()+2
    y_min, y_max = chance.Y_std.min()-2, chance.Y_std.max()+2
    #plt.xlim(8,50) # specific to this play
    plt.xlim(x_min,x_max)
    #plt.ylim(5,40) # specific to this play
    plt.ylim(y_min,y_max)
    #plt.plot([x_min,x_min,x_max,x_max,x_min],[y_min,y_max,y_max,y_min,y_min],c='black',linewidth=1.5)
    
    vor = Voronoi(np.array([[row.X_std,row.Y_std] for index, row in chance.iterrows()]))
    regions, vertices = voronoi_finite_polygons_2d(vor)
    for region in regions:
        polygon = vertices[region]
        plt.plot(*zip(*polygon),c='black',alpha=0.8)
    
    plt.scatter(offense.X_std,offense.Y_std,marker='o',c='green',s=55,alpha=0.5,label='OFFENSE')
    plt.scatter(defense.X_std,defense.Y_std,marker='o',c='red',s=55,alpha=0.5,label='DEFENSE')
    plt.scatter([rusher.X_std],[rusher.Y_std],marker='o',c='black',s=30,label='RUSHER')
    
    for line in range(10,100,10):
        plt.plot([line,line],[-100,100],c='silver',linewidth=0.8,linestyle='-')
    
    plt.plot([rusher.YardsFromOwnGoal,rusher.YardsFromOwnGoal],[-100,100],c='black',linewidth=1.5,linestyle=':')
    plt.plot([0,0],[-100,100],c='black',linewidth=2) # own goal line on the X_std scale
    plt.plot([100,100],[-100,100],c='black',linewidth=2) # opponent goal line on the X_std scale
    
    plt.title(play_id)
    plt.legend()

plt.show()

Certainly, the area around the ball carrier (and perhaps the area around his teammates, the players in green) will be related to the Yards gained. 

Want a VIP hint? Consider a spatial diagram that weights the area on the field by player speed. See [http://www.lukebornn.com/papers/fernandez_ssac_2018.pdf](http://www.lukebornn.com/papers/fernandez_ssac_2018.pdf) for one example in soccer, and last year's winning Big Data Bowl entry ([link](https://operations.nfl.com/media/3670/big-data-bowl-sfu.pdf)). 

Alternatively, the Voronoi diagram below removes all the offensive players except the ball carrier. This allows the focus to be on the running back, as his teammates shouldn't exactly be counting against him. 

In [None]:
# plt.figure(figsize=(12, 8))
# plt.suptitle("Sample plays, standardized, Offense moving left to right")
# plt.xlabel("Distance from offensive team's own end zone")
# plt.ylabel("Y coordinate")

for gp,chance in sample_20171120000963.groupby('PlayId'):
    play_id = gp
    rusher = chance[chance.NflId==chance.NflIdRusher].iloc[0]
    offense = chance[chance.IsOnOffense]
    defense = chance[~chance.IsOnOffense]
    
#     plt.subplot(1,1,1)
    i+=1
    x_min, x_max = chance.X_std.min()-2, chance.X_std.max()+2
    y_min, y_max = chance.Y_std.min()-2, chance.Y_std.max()+2
#     plt.xlim(x_min,x_max)
#     plt.ylim(y_min,y_max)
    
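    # DataFrame.append was removed in pandas 2.0, so stack the defenders and the rusher explicitly
    points = np.vstack([defense[['X_std','Y_std']].to_numpy(), [[rusher.X_std, rusher.Y_std]]])
    vor = Voronoi(points)
    from scipy.spatial import voronoi_plot_2d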
    fig = voronoi_plot_2d(vor,show_vertices=False,point_size=0.1,linewidth=2)
    fig.set_size_inches(12,8)
    #vor = Voronoi(np.array([[row.X_std,row.Y_std] for index, row in offense.iterrows()]))
    #vor = Voronoi(np.array([[row.X_std,row.Y_std] for index, row in chance.iterrows()]))
#     regions, vertices = voronoi_finite_polygons_2d(vor)
#     for region in regions:
#         polygon = vertices[region]
#         plt.plot(*zip(*polygon),c='black',alpha=0.8)
    
    #plt.scatter(offense.X_std,offense.Y_std,marker='o',c='green',s=55,alpha=0.5,label='OFFENSE')
    plt.scatter(defense.X_std,defense.Y_std,marker='o',c='red',s=55,alpha=0.5,label='DEFENSE')
    plt.scatter([rusher.X_std],[rusher.Y_std],marker='o',c='black',s=30,label='RUSHER')
    
    for line in range(10,100,10):
        plt.plot([line,line],[-100,100],c='silver',linewidth=0.8,linestyle='-')
    
    plt.plot([rusher.YardsFromOwnGoal,rusher.YardsFromOwnGoal],[-100,100],c='black',linewidth=1.5,linestyle=':')
    plt.plot([0,0],[-100,100],c='black',linewidth=2) # own goal line on the X_std scale
    plt.plot([100,100],[-100,100],c='black',linewidth=2) # opponent goal line on the X_std scale
    
#     plt.title(play_id)
#     plt.legend()

plt.show()

## 3. Player direction

Next, our goal is to make use of the `Dir` (direction) variable. The direction in which a player is moving is based on the change in his `X` and `Y` coordinates over time, and as shown in the [schema](https://www.kaggle.com/c/nfl-big-data-bowl-2020/data), this is relative to the side in which his team is moving and the movement of each play. 

There are two primary reasons for standardizing `Dir`.

First, if we want to use player direction on its own as a variable, **a statistical model is going to have a difficult time untangling the raw data feed.** For example, 135 degrees and 315 degrees could be the same direction, or they could be different, depending on the `PlayDirection` variable, which indicates where the entire offensive unit is facing. 

Second, standardizing directions is needed if we are to replicate the soccer paper ([link](http://www.lukebornn.com/papers/fernandez_ssac_2018.pdf)) referenced above. For example, identifying the **fraction of space in front of the ball carrier that the offense owns (or that the ball carrier owns)** would assuredly be a variable that would lead to more accurate model projections. This is potentially easier to parse out when direction is standardized.

I'm going to standardize `Dir` as follows:

- 0 degrees: the offensive player is moving completely to his left  
- 90 degrees: the offensive player is moving straight ahead, towards opponent end zone
- 180 degrees: the offensive player is moving completely to his right
- 270 degrees: the offensive player is moving backwards, towards his own team's end zone (this is generally bad)

First, I filter to only include the ball carrier (`IsBallCarrier`, as defined above). I later standardize direction for all players, but I'm starting with the ball carrier because his movement is likely most relevant to predictions.  

Next, I make a histogram, faceted by `PlayDirection` to highlight the funny behavior of ball carrier directions. 

In [None]:
df_bc = train_1[train_1.IsBallCarrier][['DisplayName','PossessionTeam','PlayId','Dir','ToLeft','PlayDirection','IsOnOffense','X_std','Y_std','YardsFromOwnGoal','Down','Distance','Yards']].copy()

In [None]:
import seaborn as sns
plt.figure(figsize=(10, 5))
g = sns.FacetGrid(df_bc, col="ToLeft", height=4, aspect=.5)
g = g.map(plt.hist, "Dir")
plt.show()

On plays to the left, angles are clustered between 180 and 360, with a handful of angles around 0 to 15 degrees. This can be tricky --  5 degrees should be treated like 365 degrees, even though it doesn't show up that way on the chart. Similarly, among players moving to the right (`PlayDirection == "right"`), there are a handful of player angles around 350 degrees, with most players facing between 0 and 180 degrees. 

The first step in my standardization is to clean up the above histograms. There are assuredly ways to do this whole thing in one line of code, but I'm erring on the side of being cautious. 
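
For readers curious about the one-line route, here is one possible sketch using modular arithmetic: rotate left-moving plays by 180 degrees, then wrap the result into the interval [-90, 270). The function name `standardize_dir` is an assumption for illustration. It should agree with the cautious two-step version in this notebook for `Dir` values in [0, 360), except at exact boundary values such as `Dir == 270` on a right-moving play, where the two results differ by a full turn of 360 degrees and so describe the same direction.

```python
import numpy as np

def standardize_dir(direction, to_left):
    """Rotate left-moving plays by 180 degrees and wrap the result into [-90, 270)."""
    direction = np.asarray(direction, dtype=float)
    to_left = np.asarray(to_left, dtype=bool)
    rotated = direction - 180.0 * to_left
    # Shift by 90 so the modulo wraps at -90 instead of 0, then shift back.
    return (rotated + 90.0) % 360.0 - 90.0

# Made-up inputs: two left-moving plays, then two right-moving plays.
print(standardize_dir([327.14, 5.0, 350.0, 45.0], [True, True, False, False]))
```

The four inputs come out at roughly 147.14, 185, -10 and 45 degrees.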

In [None]:
df_bc['Dir_std_1'] = df_bc[['ToLeft','Dir']].apply(lambda row:row.Dir+360 if (row.ToLeft and row.Dir<90) else row.Dir, axis=1)
df_bc['Dir_std_1'] = df_bc[['ToLeft','Dir','Dir_std_1']].apply(lambda row:row.Dir-360 if ((not row.ToLeft) and row.Dir>270) else row.Dir_std_1, axis=1)

In [None]:
plt.figure(figsize=(10, 5))
g = sns.FacetGrid(df_bc, col="ToLeft", height=4, aspect=.5)
g = g.map(plt.hist, "Dir_std_1")
plt.show()

The plots above certainly look more symmetric.

To complete the standardization, I subtract 180 from each standardized direction (`Dir_std_1`) among plays where the offense was moving to the left. Thus, all angles in `Dir_std_2` will correspond to the offense moving right. 

In [None]:
df_bc['Dir_std_2'] = df_bc[['ToLeft','Dir_std_1']].apply(lambda row:row.Dir_std_1-180 if row.ToLeft else row.Dir_std_1, axis=1)

In [None]:
plt.figure(figsize=(10, 5))
g = sns.FacetGrid(df_bc, col="ToLeft", height=4, aspect=.5)
g = g.map(plt.hist, "Dir_std_2")
plt.show()

The above distributions certainly look similar within each facet.

As examples, let's take a look at two plays. First, here's a 14-yard run from Mike Tolbert (`PlayId == 20170910001102`). In the data, we see:

In [None]:
df_bc[df_bc.PlayId==20170910001102][['PlayId','DisplayName','Dir','ToLeft','PlayDirection','X_std','Y_std','YardsFromOwnGoal','Dir_std_1','Dir_std_2','Yards']]

On screen, we see the following play:

![](https://media.giphy.com/media/ckZnw8Nm0bwbRm99fl/giphy.gif)

At the moment of the handoff (when a ball carrier gets the ball), Tolbert is moving as much to the left as he is moving forward. In our code, a few aspects of the play line up:

- `ToLeft == False` and `PlayDirection == "right"`: the offense is moving right. Note that this is dependent on the angle of the camera at each home stadium, but here, it matches up
- `X_std == 73.38` and `YardsFromOwnGoal == 78`. The offense started the play 78 yards from its own goal line, but Tolbert got the ball a few yards behind that mark. Were Tolbert to have been tackled right where he took the handoff, it would've been a loss of 4 or 5 yards (e.g., `Yards == -4`). Instead, he moved 14 yards past the line of scrimmage. In this example, the maximum yards gained for Tolbert would be (100-`YardsFromOwnGoal` = 22). 
- The standardized direction does not change from the original direction. 0 degrees would represent Tolbert moving directly to his left, 90 degrees would be Tolbert moving straight ahead, and 180 degrees would be Tolbert moving directly to his right. If Tolbert was at 270 degrees on this standardized variable, he'd be running the wrong way!

Here's a second play.

In [None]:
df_bc[df_bc.PlayId==20170910000081][['PlayId','DisplayName','Dir','ToLeft','PlayDirection','X_std','Y_std','YardsFromOwnGoal','Dir_std_1','Dir_std_2','Yards']]

In this play, Tolbert's teammate LeSean McCoy takes the ball and moves to his right.

![](https://media.giphy.com/media/mBwPxbGLGNjpLSpenx/giphy.gif)

Even though the same team has the ball in the same game as the play above, the offense is now moving right to left (`PlayDirection == "left"`). 

Next, despite the fact that the play occurred in nearly the same spot on the field as the Tolbert carry above, the standardized `X_std == 19.35`, indicating that McCoy received the ball 19 yards from his own goal. In this example, the maximum yards gained for McCoy would be (100-`YardsFromOwnGoal` = 75). 

Finally, McCoy's `Dir == 327.14`. This could be confusing to think about -- but when standardized, `Dir_std_2 == 147.14`, which means he's moving to his right while also moving upfield. 

Altogether, hopefully this clarifies some of how the standardized coordinates, yardline, and player direction (`Dir`) work on a football field. 

## 4. Angle charts using player direction

To show how the standardization of `Dir` would work for all players on a given play, let's expand across the larger data set of all players on the Mike Tolbert running play above. Each player will be shown with an arrow that corresponds to how fast he was moving, and in which direction, at the time of handoff. *Note*: speed (`S`, in yards per second) and direction (`Dir_std_2`, my standardized direction) are used to create each vector. NumPy's trigonometric functions take radians instead of degrees (hence the multiplication by `pi/180`), and I subtract each angle from 90 (`90-Dir_std_2`) to ensure that the angles match the direction in which teams are moving on my figure. 

Roughly, the end of the arrow corresponds to where players are expected to end up a second after the handoff. 
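
As a quick sanity check on the `90 - Dir_std_2` convention, the sketch below projects a single made-up player one second ahead for three standardized directions. The helper name `arrow_end` and the position and speed values are assumptions for illustration; the trigonometry is the same as in the cell that follows.

```python
import numpy as np

def arrow_end(x_std, y_std, speed, dir_std):
    """Project a position one second ahead from speed (yards/s) and standardized direction (degrees)."""
    theta = np.deg2rad(90.0 - dir_std)  # 90 degrees (straight ahead) maps to the +x axis
    return x_std + speed * np.cos(theta), y_std + speed * np.sin(theta)

# Made-up player at (50, 25) moving 4 yards per second.
for label, d in [("to his left", 0.0), ("straight ahead", 90.0), ("to his right", 180.0)]:
    x_end, y_end = arrow_end(50.0, 25.0, 4.0, d)
    print(f"{label:>14}: ({x_end:.1f}, {y_end:.1f})")
```

Moving "to his left" raises `Y_std`, moving "straight ahead" raises `X_std`, and moving "to his right" lowers `Y_std`, which matches an offense drawn moving left to right.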


In [None]:
train_2 = train_1.copy()
train_2['Dir_std_1'] = train_2[['ToLeft','Dir']].apply(lambda row:row.Dir+360 if (row.ToLeft and row.Dir<90) else row.Dir, axis=1)
train_2['Dir_std_1'] = train_2[['ToLeft','Dir','Dir_std_1']].apply(lambda row:row.Dir-360 if ((not row.ToLeft) and row.Dir>270) else row.Dir_std_1, axis=1)
train_2['Dir_std_2'] = train_2[['ToLeft','Dir_std_1']].apply(lambda row:row.Dir_std_1-180 if row.ToLeft else row.Dir_std_1, axis=1)
train_2['X_std_end'] = train_2[['Dir_std_2','X_std','S']].apply(lambda row:row.S*np.cos((90-row.Dir_std_2)*np.pi/180.)+row.X_std, axis=1)
train_2['Y_std_end'] = train_2[['Dir_std_2','Y_std','S']].apply(lambda row:row.S*np.sin((90-row.Dir_std_2)*np.pi/180.)+row.Y_std, axis=1)

sample_20170910001102 = train_2[train_2.PlayId==20170910001102].copy()
sample_20170910001102

In [None]:
plt.figure(figsize=(12, 8))
plt.suptitle("Playid:20170910001102")
plt.xlabel("Distance from offensive team's own end zone")
plt.ylabel("Y coordinate")

for gp,chance in sample_20170910001102.groupby('PlayId'):
    play_id = gp
    rusher = chance[chance.NflId==chance.NflIdRusher].iloc[0]
    offense = chance[chance.IsOnOffense]
    defense = chance[~chance.IsOnOffense]
    
    plt.subplot(1,1,1)
    i+=1
    
    x_min, x_max = chance.X_std.min()-5, chance.X_std.max()+5
    y_min, y_max = chance.Y_std.min()-5, chance.Y_std.max()+5
    plt.xlim(x_min,x_max)
    plt.ylim(y_min,y_max)
    
    plt.scatter(offense.X_std,offense.Y_std,marker='o',c='green',s=55,alpha=0.5,label='OFFENSE')
    plt.scatter(defense.X_std,defense.Y_std,marker='o',c='red',s=55,alpha=0.5,label='DEFENSE')
    plt.scatter([rusher.X_std],[rusher.Y_std],marker='o',c='black',s=30,label='RUSHER')
    
    for idx, row in chance.iterrows():
        _color='black' if row.IsBallCarrier else('green' if row.IsOnOffense else 'red')
        plt.arrow(row.X_std,row.Y_std,row.X_std_end-row.X_std,row.Y_std_end-row.Y_std,width=0.05,head_width=0.3,ec=_color,fc=_color)
    
    for line in range(10,100,10):
        plt.plot([line,line],[-100,100],c='silver',linewidth=0.8,linestyle='-')
    
    plt.plot([rusher.YardsFromOwnGoal,rusher.YardsFromOwnGoal],[-100,100],c='black',linewidth=1.5,linestyle=':')
    plt.plot([0,0],[-100,100],c='black',linewidth=2) # own goal line on the X_std scale
    plt.plot([100,100],[-100,100],c='black',linewidth=2) # opponent goal line on the X_std scale
    
    plt.title(play_id)
    plt.legend()

plt.show()

If you review the Tolbert handoff above, each player's movement at handoff should look similar to what's depicted above. 

## 5. Running back sonars

Soccer analyst Eliot McKinley has a series of plots ([Ex 1](https://twitter.com/etmckinley/status/1046389278153068545), [Ex 2, with code](https://github.com/etmckinley/PassSonar)) that I'll shamelessly replicate with our ball carrier data. The idea of the sonar is to highlight the directions where ball carriers tend to move, while also providing some context regarding the success of those movements.

As a relatively basic definition of play success, I'll use `IsSuccess`, defined below, which corresponds to whether or not the ball carrier gained half of the required yardage (or more) needed for a first down (`Yards >= Distance/2`) on any first or second down rush, and whether or not a first down was gained (`Yards >= Distance`) for any third or fourth down rush. 
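
The success rule can be checked by hand on a few made-up rows. The sketch below is a vectorized form of the same rule; the toy values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rushes, one per down.
toy = pd.DataFrame({
    "Down": [1, 2, 3, 4],
    "Distance": [10, 6, 2, 1],
    "Yards": [5, 2, 1, 1],
})

# Early downs need half the distance; third and fourth downs need all of it.
early_down = toy["Down"].isin([1, 2])
needed = np.where(early_down, toy["Distance"] / 2, toy["Distance"])
toy["IsSuccess"] = toy["Yards"] >= needed
print(toy)
```

Here the first-down run (5 of 10 yards) and the fourth-down run (1 of 1 yard) count as successes, while the other two fall short.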

In [None]:
df_bc['IsSuccess'] = df_bc[['Down','Distance','Yards']].apply(lambda row: row.Yards>=(row.Distance/2) if row.Down in [1,2] else row.Yards>=row.Distance, axis=1)

Next, building on Eliot's code, I'll make one sonar for each team, representing where each team tends to run (and how well they do when they run in each direction). 
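
A sonar usually encodes two things per angle bin: how often the ball carrier moves that way, and how well those runs go. The sketch below shows one way to build both numbers with a single `groupby` on made-up rows (the values are assumptions for illustration). The bar charts in this notebook plot only the success rate, so the count column `N` here is an extra.

```python
import numpy as np
import pandas as pd

# Made-up ball carrier rows for two teams.
toy = pd.DataFrame({
    "PossessionTeam": ["BUF", "BUF", "BUF", "NYJ", "NYJ"],
    "Dir_std_2": [88.0, 95.0, 140.0, 40.0, 44.0],
    "IsSuccess": [True, False, True, False, False],
})

# Snap each direction to the nearest multiple of 15 degrees.
toy["AngleRound"] = (toy["Dir_std_2"] / 15).round() * 15

# One row per team and angle bin: number of rushes and share that were successful.
sonar = (
    toy.groupby(["PossessionTeam", "AngleRound"])["IsSuccess"]
       .agg(N="count", SuccessRate="mean")
       .reset_index()
)
print(sonar)
```

A table shaped like `sonar` could feed a polar bar chart, with `N` as bar length and `SuccessRate` as color.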

In [None]:
df_bc['AngleRound'] = df_bc.Dir_std_2.apply(lambda ds2:np.round(ds2/15)*15)

In [None]:
plt.figure(figsize=(30, 70))
# plt.suptitle("Team senor(FAKE)")
# plt.xlabel("Team")
# plt.ylabel("Success rate")
plt.subplots_adjust(wspace=0.5, hspace=0.5)
i=1
for idx,row in df_bc[['PossessionTeam','AngleRound','IsSuccess']].groupby(['PossessionTeam']):
    plt.subplot(9,4,i)
    i+=1
    row.groupby('AngleRound').IsSuccess.mean().plot.bar()
    plt.title(row.PossessionTeam.iloc[0])
    plt.legend()
plt.show()

Teams that tend to run the ball up the middle include Baltimore, Tennessee, Denver, and Dallas, while Seattle, Kansas City, and the LA Rams tend to have ball carriers moving more towards the sideline. Cleveland tended to have more run plays with the ball carrier moving right, while Minnesota was more left-side heavy. 

If this notebook helped you a little, please upvote it so that more Kagglers can find it. Thanks! :)