# Identifying Coverage Scheme Among Defensive Backs
In this notebook, I implement a method for identifying the coverage scheme (man or zone), based on the paper *"Unsupervised Methods for Identifying Pass Coverage Among Defensive Backs with NFL Player Tracking Data"*. The main contribution of the paper is a set of features, derived from the tracking data, that distinguish between "man" and "zone" coverage of defensive backs using unsupervised learning techniques.

The method can be summarized as follows:
1. **Feature Generation**: Define a set of features from the tracking data that distinguish between "man" and "zone" coverage.
2. **Clustering**: Use mixture models to create clusters corresponding to each group, allowing us to provide probabilistic assignments to each coverage type (or cluster).

Our focus is on the coverage scheme of cornerbacks. Generally, cornerbacks are more adept at providing close coverage on wide receivers and defending passes. The position requires speed, agility, and the ability to track a receiver (in man coverage) or occupy a space and read the quarterback (in zone coverage).

PLEASE UPVOTE if you like this kernel. It will keep me motivated to post more kernels :).

In [None]:
import pandas as pd
import numpy as np
import os
import seaborn as sns
from ipywidgets import interact, fixed
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import MinMaxScaler
from sklearn.cluster import KMeans

%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib import animation
from matplotlib.animation import FFMpegWriter
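# Use the fully qualified option name; the short form 'max_columns' is ambiguous in recent pandas versions
pd.set_option('display.max_columns', 100)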

import dateutil
from math import radians
from IPython.display import Video

import warnings
warnings.filterwarnings('ignore')

# 1.0. Dataset Preparation

I use week 1, week 2, and week 5 tracking data. More data can be added if needed.

In [None]:
plays = pd.read_csv('../input/nfl-big-data-bowl-2021/plays.csv')
week1 = pd.read_csv('../input/nfl-big-data-bowl-2021/week1.csv')
week2 = pd.read_csv('../input/nfl-big-data-bowl-2021/week2.csv')
week5 = pd.read_csv('../input/nfl-big-data-bowl-2021/week5.csv')
week = pd.concat([week1, week2, week5], ignore_index=True)

In the paper, the features are estimated at different points throughout the play that correspond to on-field events. These time periods are:
* Before snap
* Between snap and ball thrown
* After ball thrown

This kernel uses the same approach, so we adjust the dataset by rewriting the `event` variable so that every frame is labeled with the period it belongs to.

In [None]:
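# Relabel `event` so that every frame carries the phase of the play it belongs to:
# 'ball_snap' (frames up to and including the snap), 'between_snap', or 'after_thrown'.
eventIdx = week.columns.get_loc('event')
frameIdx = week.columns.get_loc('frameId')
weekArray = np.array(week)
previousEvent = 'ball_snap'
for i, instance in enumerate(weekArray):
    event = instance[eventIdx]
    frameId = instance[frameIdx]
    if (previousEvent == 'ball_snap' and event != 'ball_snap') or frameId == 1:
        # Pre-snap frame; frameId == 1 also resets the state at the start of a new play
        weekArray[i][eventIdx] = 'ball_snap'
        previousEvent = 'ball_snap'
    elif (event == 'ball_snap'):
        # The snap frame keeps its label; the frames that follow are 'between_snap'
        previousEvent = 'between_snap'
    elif (previousEvent == 'between_snap' and event != 'pass_forward'):
        weekArray[i][eventIdx] = 'between_snap'
        previousEvent = 'between_snap'
    elif (event == 'pass_forward'):
        weekArray[i][eventIdx] = 'after_thrown'
        previousEvent = 'after_thrown'
    elif (previousEvent == 'after_thrown' and frameId != 1):
        weekArray[i][eventIdx] = 'after_thrown'
        previousEvent = 'after_thrown'

# Write the relabeled events back to the original dataframe
weekMod = pd.DataFrame(weekArray, columns=week.columns)
week['event'] = weekMod['event']
weekMod = week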
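
The same phase labeling can also be expressed per play from the frame numbers of the snap and the pass, without carrying state from row to row. The following is a minimal sketch on a toy play, not the code used in this kernel. It assumes the raw `event` values `ball_snap` and `pass_forward` and an ordered `frameId`; the helper `label_phases` and the phase name `before_snap` are hypothetical (the loop above labels pre-snap frames `ball_snap` instead).

```python
import pandas as pd

def label_phases(play: pd.DataFrame) -> pd.Series:
    """Label each row of a single play as before_snap, between_snap or after_thrown."""
    snap_frames = play.loc[play['event'] == 'ball_snap', 'frameId']
    pass_frames = play.loc[play['event'] == 'pass_forward', 'frameId']
    phase = pd.Series('before_snap', index=play.index)
    if not snap_frames.empty:
        # Frames strictly after the snap frame
        phase.loc[play['frameId'] > snap_frames.min()] = 'between_snap'
    if not pass_frames.empty:
        # The pass frame itself and everything after it
        phase.loc[play['frameId'] >= pass_frames.min()] = 'after_thrown'
    return phase

# Toy play with six frames (made-up data)
toy = pd.DataFrame({
    'gameId': 1, 'playId': 1,
    'frameId': [1, 2, 3, 4, 5, 6],
    'event': ['None', 'ball_snap', 'None', 'None', 'pass_forward', 'None'],
})

# Label each play separately, then align the results back on the row index
phases = [label_phases(play) for _, play in toy.groupby(['gameId', 'playId'])]
toy['phase'] = pd.concat(phases)
print(toy[['frameId', 'event', 'phase']])
```

Plays without a `pass_forward` event (sacks, for example) simply stay in `between_snap` after the snap, which matches the behavior of the loop.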

# 2.0. Features Generation
In the paper, there are 11 features that can be estimated at different points throughout the play that characterize the type of coverage scheme:
- `varX`: Variance in the x coordinate
- `varY`: Variance in the y coordinate
- `varS`: Variance in the speed
- `offVar`: Variance in the distance from the nearest offensive player at every frame
- `defVar`: Variance in the distance from the nearest defensive player at every frame
- `offMean`: Mean distance from the nearest offensive player at every frame
- `defMean`: Mean distance from the nearest defensive player at every frame
- `offDirVar`: Variance in the difference in degrees of the direction of motion between the player and the nearest offensive player
- `offDirMean`: Mean difference in degrees of the direction of motion between the player and the nearest offensive player
- `ratMean`: Mean ratio of the distance to the nearest offensive player and the distance from the nearest offensive player to the nearest defensive player
- `ratVar`: Variance of the ratio of the distance to the nearest offensive player and the distance from the nearest offensive player to the nearest defensive player

These features are specific to defensive backs, but I generate them for all players (defensive and offensive) so they can be reused in further analysis. This requires a bit of generalization. For instance, `offVar` is essentially the variance in the distance from the nearest opponent player, and `defVar` is the variance in the distance from the nearest team mate. I therefore rename some of the features:
- `varX`: Variance in the x coordinate
- `varY`: Variance in the y coordinate
- `varS`: Variance in the speed
- `oppVar`: Variance in the distance from the nearest opponent player at every frame
- `mateVar`: Variance in the distance from the nearest team mate at every frame
- `oppMean`: Mean distance from the nearest opponent player at every frame
- `mateMean`: Mean distance from the nearest team mate at every frame
- `oppDirVar`: Variance in the difference in degrees of the direction of motion between the player and the nearest opponent player
- `oppDirMean`: Mean difference in degrees of the direction of motion between the player and the nearest opponent player
- `ratMean`: Mean ratio of the distance to the nearest opponent player and the distance from the nearest opponent player to the nearest team mate
- `ratVar`: Variance of the ratio of the distance to the nearest opponent player and the distance from the nearest opponent player to the nearest team mate

## 2.1. Variance in the (x,y) coordinate and speed

In [None]:
varX = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['x'].agg(['var']).reset_index().rename(columns={"var": "varX"})
varY = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['y'].agg(['var']).reset_index().rename(columns={"var": "varY"})
varS = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['s'].agg(['var']).reset_index().rename(columns={"var": "varS"})
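
As an illustration, the three variances can also be computed in a single pass with pandas named aggregation. This is a sketch on a made-up toy frame rather than a replacement for the cell above; the `toy` data and the `variances` name are assumptions for the example.

```python
import pandas as pd

# Two players with two frames each in the same play phase (made-up values)
toy = pd.DataFrame({
    'gameId': [1] * 4, 'playId': [1] * 4, 'event': ['between_snap'] * 4,
    'nflId': [10, 10, 20, 20],
    'x': [0.0, 2.0, 5.0, 5.0],
    'y': [1.0, 1.0, 3.0, 7.0],
    's': [0.0, 4.0, 2.0, 2.0],
})

keys = ['gameId', 'playId', 'event', 'nflId']
variances = (toy.groupby(keys)
                .agg(varX=('x', 'var'), varY=('y', 'var'), varS=('s', 'var'))
                .reset_index())
print(variances)
```

Note that pandas computes the sample variance (`ddof=1`), so a player with only one frame in a phase gets `NaN`, which is later removed by `dropna()` before clustering.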

## 2.2. Mean and variance of the distance from the nearest opponent player

In [None]:
groupedWeek = weekMod.groupby(['gameId', 'playId', 'frameId'])
playerXY = {}
for name, group in groupedWeek:
    playerXY[name] = []
    for row in group.iterrows():
        data = [row[1]['nflId'], row[1]['team'], row[1]['x'], row[1]['y'], row[1]['dir']]
        playerXY[name].append(data)

features = list(weekMod.columns)
weekArray = np.array(weekMod)
minOppDist = []
for player in weekArray:
    if player[features.index('team')] != 'football':
        opponentPositions = playerXY[(player[features.index('gameId')], player[features.index('playId')], player[features.index('frameId')])]
        distances = []
        directions = []
        opponents = []
        xs = []
        ys = []
        for oppPos in opponentPositions: 
            if player[features.index('team')] != oppPos[1] and player[features.index('team')] != 'football' and oppPos[1] != 'football':
                dx = (player[features.index('x')] - oppPos[2])**2
                dy = (player[features.index('y')] - oppPos[3])**2
                dist = np.sqrt(dx+dy)
                distances.append(dist)
                directions.append(oppPos[4])
                opponents.append(oppPos[0])
                xs.append(oppPos[2])
                ys.append(oppPos[3])
        minDist = min(distances)
        closestOpponent = opponents[np.argmin(distances)]
        opponentDir = directions[np.argmin(distances)]
        opponentX = xs[np.argmin(distances)]
        opponentY = ys[np.argmin(distances)]
        summary = [player[features.index('gameId')], player[features.index('playId')], player[features.index('frameId')], player[features.index('nflId')], minDist, closestOpponent, opponentDir, opponentX, opponentY]
        minOppDist.append(summary)
        
minOppDist = pd.DataFrame(minOppDist, columns=['gameId', 'playId', 'frameId', 'nflId', 'oppMinDist', 'closestOpp(nflId)', 'oppDir', 'oppX', 'oppY'])
weekMod = pd.merge(weekMod, minOppDist, how='left', on=['gameId', 'frameId', 'playId', 'nflId'])
oppVar = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['oppMinDist'].agg(['var']).reset_index().rename(columns={"var": "oppVar"})
oppMean = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['oppMinDist'].agg(['mean']).reset_index().rename(columns={"mean": "oppMean"})
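
The nested loops above are easy to follow but slow on full weeks of tracking data. As a sketch of an alternative, the nearest opponent for every player in one frame can be found with a pairwise distance matrix in NumPy. The helper `nearest_opponent` and the toy frame are hypothetical and only illustrate the idea; applying it per `(gameId, playId, frameId)` group is left out.

```python
import numpy as np
import pandas as pd

def nearest_opponent(frame: pd.DataFrame) -> pd.DataFrame:
    """For one frame, return each player's closest opponent and the distance to him."""
    players = frame[frame['team'] != 'football'].reset_index(drop=True)
    xy = players[['x', 'y']].to_numpy(dtype=float)
    # Pairwise Euclidean distances between all players in the frame
    dist = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    # Mask out team mates (and the player himself) so only opponents remain
    teams = players['team'].to_numpy()
    dist[teams[:, None] == teams[None, :]] = np.inf
    idx = dist.argmin(axis=1)
    return pd.DataFrame({
        'nflId': players['nflId'],
        'oppMinDist': dist[np.arange(len(players)), idx],
        'closestOpp': players['nflId'].to_numpy()[idx],
    })

# Toy frame: two home players, two away players and the football (made-up values)
toy_frame = pd.DataFrame({
    'nflId': [1, 2, 3, 4, np.nan],
    'team': ['home', 'home', 'away', 'away', 'football'],
    'x': [0.0, 10.0, 3.0, 10.0, 5.0],
    'y': [0.0, 0.0, 4.0, 1.0, 5.0],
})
result = nearest_opponent(toy_frame)
print(result)
```

If a frame contains only one team, every masked distance is infinite, so `oppMinDist` comes back as `inf`; such frames would need to be filtered out before aggregating.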

## 2.3. Mean and variance of the distance from the nearest team mate

In [None]:
features = list(weekMod.columns)
weekArray = np.array(weekMod)
minMateDist = []
for player in weekArray:
    if player[features.index('team')] != 'football':
        matePositions = playerXY[(player[features.index('gameId')], player[features.index('playId')], player[features.index('frameId')])]
        distances = []
        mates = []
        xs = []
        ys = []
        for matePos in matePositions: 
            if player[features.index('team')] == matePos[1] and player[features.index('nflId')] != matePos[0] and player[features.index('team')] != 'football' and matePos[1] != 'football':
                dx = (player[features.index('x')] - matePos[2])**2
                dy = (player[features.index('y')] - matePos[3])**2
                dist = np.sqrt(dx+dy)
                distances.append(dist)
                mates.append(matePos[0])
                # Store the team mate's coordinates (not the opponent's)
                xs.append(matePos[2])
                ys.append(matePos[3])
        minDist = min(distances)
        closestMate = mates[np.argmin(distances)]
        mateX = xs[np.argmin(distances)]
        mateY = ys[np.argmin(distances)]
        summary = [player[features.index('gameId')], player[features.index('playId')], player[features.index('frameId')], player[features.index('nflId')], minDist, closestMate, mateX, mateY]
        minMateDist.append(summary)
        
minMateDist = pd.DataFrame(minMateDist, columns=['gameId', 'playId', 'frameId', 'nflId', 'mateMinDist', 'closestMate(nflId)', 'mateX', 'mateY'])
weekMod = pd.merge(weekMod, minMateDist, how='left', on=['gameId', 'frameId', 'playId', 'nflId'])
mateVar = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['mateMinDist'].agg(['var']).reset_index().rename(columns={"var": "mateVar"})
mateMean = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['mateMinDist'].agg(['mean']).reset_index().rename(columns={"mean": "mateMean"})

## 2.4. Mean and variance of the difference in degrees of the direction of motion between the player and the nearest opponent player

In [None]:
diffDir = np.absolute(weekMod['dir'] - weekMod['oppDir'])
weekMod['diffDir'] = diffDir
oppDirVar = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['diffDir'].agg(['var']).reset_index().rename(columns={"var": "oppDirVar"})
oppDirMean = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['diffDir'].agg(['mean']).reset_index().rename(columns={"mean": "oppDirMean"})
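
One caveat: `dir` is an angle between 0 and 360 degrees, so a plain absolute difference treats headings of 350° and 10° as 340° apart even though the players are moving in almost the same direction. Whether the paper wraps the difference is not something I can confirm here, so the following is only a sketch of a wrapped alternative; the helper name `angular_difference` is hypothetical.

```python
import numpy as np

def angular_difference(a, b):
    """Smallest absolute difference between two headings in degrees, in [0, 180]."""
    diff = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) % 360
    return np.minimum(diff, 360 - diff)

print(angular_difference(350, 10))                    # 20.0
print(angular_difference([90, 10, 0], [270, 10, 45])) # [180.   0.  45.]
```

It works element-wise on arrays and pandas Series, so it could be applied to the `dir` and `oppDir` columns directly.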

## 2.5. Mean and variance of the ratio of the distance to the nearest opponent player and the distance from the nearest opponent player to the nearest team mate

In [None]:
ratio = weekMod['oppMinDist'] / np.sqrt((weekMod['oppX'] - weekMod['mateX'])**2 + (weekMod['oppY'] - weekMod['mateY'])**2)
weekMod['oppMateDistRatio'] = ratio
oppMateDistRatioMean = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['oppMateDistRatio'].agg(['mean']).reset_index().rename(columns={"mean": "meanOppMateDistRatio"})
oppMateDistRatioVar = weekMod.groupby(['gameId', 'playId', 'event', 'nflId'])['oppMateDistRatio'].agg(['var']).reset_index().rename(columns={"var": "varOppMateDistRatio"})
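
To make the ratio concrete, here is a small worked example on made-up coordinates. It also shows one way to avoid a division by zero when the nearest opponent and the nearest team mate happen to share the same position; replacing a zero denominator with `NaN` is my assumption, not something taken from the paper.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'oppMinDist': [2.0, 3.0],           # distance from the player to his nearest opponent
    'oppX': [10.0, 4.0], 'oppY': [5.0, 4.0],
    'mateX': [13.0, 4.0], 'mateY': [9.0, 4.0],
})

# Distance from the nearest opponent to the player's nearest team mate
opp_to_mate = np.hypot(toy['oppX'] - toy['mateX'], toy['oppY'] - toy['mateY'])
# A zero denominator becomes NaN instead of producing an infinite ratio
toy['oppMateDistRatio'] = toy['oppMinDist'] / opp_to_mate.replace(0, np.nan)
print(toy['oppMateDistRatio'])
```

In the first row the opponent is 5 yards from the team mate, so the ratio is 2 / 5 = 0.4; the second row is undefined and would be dropped before clustering.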

Then, we combine all of the generated features into our main dataframe `weekMod`.

In [None]:
features = [varX, varY, varS, oppVar, oppMean, mateVar, mateMean, oppDirVar, oppDirMean, oppMateDistRatioMean, oppMateDistRatioVar]
for feature in features:
    weekMod = pd.merge(weekMod, feature, how='left', on=['gameId', 'event', 'playId', 'nflId'])

# 2.0. Clustering
We use a Gaussian mixture model, a type of clustering algorithm that fits a mixture of probability density functions to a dataset, where each density is representative of a single group or cluster. There are two key benefits to using mixture models:
1. Mixture models yield “soft” cluster assignments (i.e. probabilistic labels for each cluster), allowing us to quantify how certain we are when assigning man coverage or zone coverage labels.
2. Mixture models are density-based, statistical models that estimate an empirical probability distribution from real data.

Therefore, we can use our mixture model to make man vs. zone predictions for each defensive back on each play in real-time throughout the course of a football game.

Due to the high degree of collinearity in the feature space, we use an unconstrained “VVV” covariance structure (`covariance_type='full'` in scikit-learn) when fitting the model.

In [None]:
# Set the train dataset
X = weekMod.loc[weekMod['position'] == 'CB'][weekMod.columns[30:]].dropna()
xTrain = X.drop_duplicates()

# Scale the data
scaler = MinMaxScaler()
scaler.fit(xTrain)
xTrainScaled = scaler.transform(xTrain)

# Set and train the Gaussian mixture model
gmm = GaussianMixture(n_components=2, covariance_type='full', random_state=42)
gmm.fit(xTrainScaled)

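# Make class prediction and probability estimation on the scaled features,
# because the model was fit on scaled data
xScaled = scaler.transform(X)
pred = gmm.predict(xScaled)
prob = gmm.predict_proba(xScaled)

# Join the class prediction and probability estimation into our main dataframe
# Note: the order of the mixture components is arbitrary; treating component 0 as "zone"
# in the visualization is an assumption that should be checked against the cluster means
X['cluster'] = pred
X['cluster_prob'] = prob[:,0]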
weekFin = weekMod.join(X[['cluster', 'cluster_prob']])
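
To illustrate what the “soft” assignments look like, here is a minimal sketch on synthetic two-dimensional data (not real tracking features). It applies the same scaler at fit time and at prediction time and reads the cluster label off the highest posterior probability. All names and values in it are made up for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
# Two synthetic, well-separated groups of points
group_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
group_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(200, 2))
data = np.vstack([group_a, group_b])

# Fit the scaler and the mixture model on the same (scaled) data
scaler = MinMaxScaler().fit(data)
model = GaussianMixture(n_components=2, covariance_type='full', random_state=42)
model.fit(scaler.transform(data))

# New points must go through the same scaler before prediction
new_points = np.array([[0.1, -0.2], [4.8, 5.3]])
proba = model.predict_proba(scaler.transform(new_points))
labels = proba.argmax(axis=1)
print(proba.round(3))
print(labels)
```

Each row of `proba` sums to 1, and which column corresponds to which group depends on the fit, so the mapping from component index to "man" or "zone" has to be decided by inspecting the fitted components.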

# 3.0. Visualization
We visualize the coverage scheme of cornerbacks to see the results of our coverage scheme prediction. The visualization is based on:
* a kernel created by Rob Mulla (@robikscube, see the kernel [here](https://www.kaggle.com/robikscube/nfl-big-data-bowl-plotting-player-position))
* a kernel created by me (see the kernel [here](https://www.kaggle.com/ar2017/nfl-big-data-bowl-2021-animating-players-movement))

In [None]:
def create_football_field(linenumbers=True,
                          endzones=True,
                          highlight_line=False,
                          highlight_line_number=55,
                          highlight_first_down_line=False,
                          yards_to_go=10,
                          highlighted_name='Line of Scrimmage',
                          fifty_is_los=False,
                          figsize=(12, 6.33)):
    """
    Function that plots the football field for viewing plays.
    Allows for showing or hiding endzones.
    """
    rect = patches.Rectangle((0, 0), 120, 53.3, linewidth=0.1,
                             edgecolor='r', facecolor='darkgreen', zorder=0)

    fig, ax = plt.subplots(1, figsize=figsize)
    ax.add_patch(rect)

    plt.plot([10, 10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 70, 70, 80,
              80, 90, 90, 100, 100, 110, 110, 120, 0, 0, 120, 120],
             [0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3,
              53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 53.3, 0, 0, 53.3],
             color='white')
    if fifty_is_los:
        plt.plot([60, 60], [0, 53.3], color='gold')
        plt.text(62, 50, '<- Player Yardline at Snap', color='gold')
    # Endzones
    if endzones:
        ez1 = patches.Rectangle((0, 0), 10, 53.3,
                                linewidth=0.1,
                                edgecolor='r',
                                facecolor='blue',
                                alpha=0.2,
                                zorder=0)
        # Rectangle takes (xy, width, height): the end zone is 10 yards wide
        ez2 = patches.Rectangle((110, 0), 10, 53.3,
                                linewidth=0.1,
                                edgecolor='r',
                                facecolor='blue',
                                alpha=0.2,
                                zorder=0)
        ax.add_patch(ez1)
        ax.add_patch(ez2)
    plt.xlim(0, 120)
    plt.ylim(-5, 58.3)
    plt.axis('off')
    if linenumbers:
        for x in range(20, 110, 10):
            numb = x
            if x > 50:
                numb = 120 - x
            plt.text(x, 5, str(numb - 10),
                     horizontalalignment='center',
                     fontsize=20,  # fontname='Arial',
                     color='white')
            plt.text(x - 0.95, 53.3 - 5, str(numb - 10),
                     horizontalalignment='center',
                     fontsize=20,  # fontname='Arial',
                     color='white', rotation=180)
    if endzones:
        hash_range = range(11, 110)
    else:
        hash_range = range(1, 120)

    for x in hash_range:
        ax.plot([x, x], [0.4, 0.7], color='white')
        ax.plot([x, x], [53.0, 52.5], color='white')
        ax.plot([x, x], [22.91, 23.57], color='white')
        ax.plot([x, x], [29.73, 30.39], color='white')

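    # Line of scrimmage in plot coordinates (offset by the 10-yard end zone);
    # computed up front because the first-down line depends on it as well
    hl = highlight_line_number + 10
    if highlight_line:
        plt.plot([hl, hl], [0, 53.3], color='yellow')
        #plt.text(hl + 2, 50, '<- {}'.format(highlighted_name),
        #         color='yellow')
        
    if highlight_first_down_line:
        fl = hl + yards_to_go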
        plt.plot([fl, fl], [0, 53.3], color='yellow')
        #plt.text(fl + 2, 50, '<- {}'.format(highlighted_name),
        #         color='yellow')
    return fig, ax

In [None]:
def calculate_dx_dy_arrow(x, y, angle, speed, multiplier):
    # Angles are in degrees, measured clockwise from the positive y-axis,
    # so the x component uses sin and the y component uses cos.
    # This single formula is valid for every quadrant.
    angle = np.radians(angle)
    dx = np.sin(angle) * multiplier * speed
    dy = np.cos(angle) * multiplier * speed
    return dx, dy
    
        
def animate_player_movement(weekData, playId, gameId):
    playData = pd.read_csv('../input/nfl-big-data-bowl-2021/plays.csv')
    
    playHome = weekData.query('gameId==' + str(gameId) + ' and playId==' + str(playId) + ' and team == "home"')
    playAway = weekData.query('gameId==' + str(gameId) + ' and playId==' + str(playId) + ' and team == "away"')
    playFootball = weekData.query('gameId==' + str(gameId) + ' and playId==' + str(playId) + ' and team == "football"')
    
    playHome['time'] = playHome['time'].apply(lambda x: dateutil.parser.parse(x).timestamp()).rank(method='dense')
    playAway['time'] = playAway['time'].apply(lambda x: dateutil.parser.parse(x).timestamp()).rank(method='dense')
    playFootball['time'] = playFootball['time'].apply(lambda x: dateutil.parser.parse(x).timestamp()).rank(method='dense')
    
    maxTime = int(playAway['time'].unique().max())
    minTime = int(playAway['time'].unique().min())
    
    yardlineNumber = playData.query('gameId==' + str(gameId) + ' and playId==' + str(playId))['yardlineNumber'].item()
    yardsToGo = playData.query('gameId==' + str(gameId) + ' and playId==' + str(playId))['yardsToGo'].item()
    absoluteYardlineNumber = playData.query('gameId==' + str(gameId) + ' and playId==' + str(playId))['absoluteYardlineNumber'].item() - 10
    playDir = playHome.sample(1)['playDirection'].item()
    
    if (absoluteYardlineNumber > 50):
        yardlineNumber = 100 - yardlineNumber
    if (absoluteYardlineNumber <= 50):
        yardlineNumber = yardlineNumber
        
    if (playDir == 'left'):
        yardsToGo = -yardsToGo
    else:
        yardsToGo = yardsToGo
    
    fig, ax = create_football_field(highlight_line=True, highlight_line_number=yardlineNumber, highlight_first_down_line=True, yards_to_go=yardsToGo)
    playDesc = playData.query('gameId==' + str(gameId) + ' and playId==' + str(playId))['playDescription'].item()
    plt.title(f'Game # {gameId} Play # {playId} \n {playDesc}')
    
    def update_animation(time):
        patch = []
        
        homeX = playHome.query('time == ' + str(time))['x']
        homeY = playHome.query('time == ' + str(time))['y']
        homeNum = playHome.query('time == ' + str(time))['jerseyNumber']
        homeOrient = playHome.query('time == ' + str(time))['o']
        homeDir = playHome.query('time == ' + str(time))['dir']
        homeSpeed = playHome.query('time == ' + str(time))['s']
        homePosition = playHome.query('time == ' + str(time))['position']
        homeCluster = playHome.query('time == ' + str(time))['cluster']
        homeClusterProb = playHome.query('time == ' + str(time))['cluster_prob']
        patch.extend(plt.plot(homeX, homeY, 'o',c='gold', ms=20, mec='white', zorder=3))
        
        # Home players' jersey number 
        for x, y, num in zip(homeX, homeY, homeNum):
            patch.append(plt.text(x, y, int(num), va='center', ha='center', color='black', size='medium'))
            
        # Home players' orientation
        for x, y, orient in zip(homeX, homeY, homeOrient):
            dx, dy = calculate_dx_dy_arrow(x, y, orient, 1, 1)
            patch.append(plt.arrow(x, y, dx, dy, color='gold', width=0.5, shape='full'))
            
        # Home players' direction
        for x, y, direction, speed in zip(homeX, homeY, homeDir, homeSpeed):
            dx, dy = calculate_dx_dy_arrow(x, y, direction, speed, 1)
            patch.append(plt.arrow(x, y, dx, dy, color='black', width=0.25, shape='full'))
        
        # CB coverage scheme
        for x, y, pos, cluster, prob in zip(homeX, homeY, homePosition, homeCluster, homeClusterProb):
            if pos == 'CB':
                patch.append(plt.text(x, y-6, "P(zone)={}\nP(man)={}".format(prob, 1-prob), va='bottom', ha='center', color='black', size='small',zorder=2, bbox=dict(facecolor='gold', edgecolor='white', pad=2.0)))
        
        awayX = playAway.query('time == ' + str(time))['x']
        awayY = playAway.query('time == ' + str(time))['y']
        awayNum = playAway.query('time == ' + str(time))['jerseyNumber']
        awayOrient = playAway.query('time == ' + str(time))['o']
        awayDir = playAway.query('time == ' + str(time))['dir']
        awaySpeed = playAway.query('time == ' + str(time))['s']
        awayPosition = playAway.query('time == ' + str(time))['position']
        awayCluster = playAway.query('time == ' + str(time))['cluster']
        awayClusterProb = playAway.query('time == ' + str(time))['cluster_prob']
        patch.extend(plt.plot(awayX, awayY, 'o',c='orangered', ms=20, mec='white', zorder=3))
        
        # Away players' jersey number 
        for x, y, num in zip(awayX, awayY, awayNum):
            patch.append(plt.text(x, y, int(num), va='center', ha='center', color='white', size='medium'))
            
        # Away players' orientation
        for x, y, orient in zip(awayX, awayY, awayOrient):
            dx, dy = calculate_dx_dy_arrow(x, y, orient, 1, 1)
            patch.append(plt.arrow(x, y, dx, dy, color='orangered', width=0.5, shape='full'))
        
        # Away players' direction
        for x, y, direction, speed in zip(awayX, awayY, awayDir, awaySpeed):
            dx, dy = calculate_dx_dy_arrow(x, y, direction, speed, 1)
            patch.append(plt.arrow(x, y, dx, dy, color='black', width=0.25, shape='full'))
        
        # CB coverage scheme
        for x, y, pos, cluster, prob in zip(awayX, awayY, awayPosition, awayCluster, awayClusterProb):
            if pos == 'CB':
                patch.append(plt.text(x, y-6, "P(zone)={}\nP(man)={}".format(prob, 1-prob), va='bottom', ha='center', color='white', size='small', zorder=2, bbox=dict(facecolor='orangered', edgecolor='white', pad=2.0)))
        
        # Football location
        footballX = playFootball.query('time == ' + str(time))['x']
        footballY = playFootball.query('time == ' + str(time))['y']
        patch.extend(plt.plot(footballX, footballY, 'o', c='black', ms=10, mec='white', zorder=3, data=playFootball.query('time == ' + str(time))['team']))
        
        
        return patch
    
    ims = [[]]
    for time in np.arange(minTime, maxTime+1):
        patch = update_animation(time)
        ims.append(patch)
        
    anim = animation.ArtistAnimation(fig, ims, repeat=False)
    
    return anim

Let's visualize one of the plays.

In [None]:
anim = animate_player_movement(weekFin, 75, 2018090600)
writer = FFMpegWriter(fps=10)
anim.save('animation_notrail.mp4', writer=writer)
Video("animation_notrail.mp4")

# 4.0. Opportunities for Improvement
The paper uses the NFL 2019 Big Data Bowl dataset, which does not have player orientation information. The paper identifies two features involving the players’ orientations on the field that may help distinguish man vs. zone coverage:
1. Orientation of the defensive back relative to the line of scrimmage at different points throughout the play.
2. Orientation of the cornerback relative to the corresponding offensive player at different points throughout the play.

I will update this kernel to include these two features, because the 2021 dataset has information about players' orientation.
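
As a rough sketch of what the second feature might look like, the snippet below measures how far a defender's orientation is from pointing straight at a given opponent. This is only one possible formulation, not the paper's definition. It assumes the orientation `o` is in degrees, measured clockwise from the positive y-axis (the same convention the arrow-drawing code in this kernel relies on); the helper name `facing_offset` is hypothetical.

```python
import numpy as np

def facing_offset(x, y, o, opp_x, opp_y):
    """Angle in [0, 180] between a player's orientation and the bearing to an opponent.

    0 means the player faces the opponent directly, 180 means he faces away.
    """
    # Bearing to the opponent, clockwise from the positive y-axis
    bearing = np.degrees(np.arctan2(opp_x - x, opp_y - y)) % 360
    diff = np.abs(o - bearing) % 360
    return np.minimum(diff, 360 - diff)

# Defender at the origin facing the positive x direction (o = 90)
print(facing_offset(0.0, 0.0, 90.0, 5.0, 0.0))   # opponent straight ahead
print(facing_offset(0.0, 0.0, 90.0, 0.0, 5.0))   # opponent to the side
print(facing_offset(0.0, 0.0, 90.0, -5.0, 0.0))  # opponent directly behind
```

The mean and variance of this offset per play phase could then be added to the feature set in the same way as the direction-of-motion features.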


The framework can be used to analyze the coverage types of safeties and, when appropriate, linebackers. However, a complete analysis of these positions would require the design of new features specific to the safety position and the patterns of motion of safeties in relation to their teammates and opponents.