# Python Field Control Model
Here is an implementation, put together quickly, of the paper "*Wide Open Spaces: A statistical technique for measuring space creation in professional soccer*", adapted for NFL data. It has not yet been fully error checked, but it should work on most of the data, barring outliers. I have made a few changes to the model, which I will go over in more detail later, but the core is exactly the same. I will cover how this model is built mathematically in the near future, so that you can modify it to better reflect American football rather than soccer.

Markdown of this notebook and further explanations will be added over the next week.

# Setup
Before we get into the model, let's first load in all of the necessary libraries, and the dataset that we will be using.

## Library Imports

In [None]:
# Magic
# %matplotlib inline

# Utility Libraries
from datetime import datetime
import pytz

# HTML 
from IPython.display import HTML

# Computation Libraries
import numpy as np
import pandas as pd
import scipy.stats as stats
from scipy.spatial.distance import pdist, squareform

# Plotting libraries
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import animation, rc
from matplotlib.patches import Rectangle, Arrow

# Graph Libraries
import networkx as nx

## Dataset imports
Usual dataset imports using *read_csv()*. I am only loading the week 1 data for this notebook, as it is an example.

In [None]:
path_shared = '/kaggle/input/nfl-big-data-bowl-2021/{}'

games_df = pd.read_csv(path_shared.format('games.csv'))
plays_df = pd.read_csv(path_shared.format('plays.csv'))
players_df = pd.read_csv(path_shared.format('players.csv'))
week_df = pd.read_csv(path_shared.format('week1.csv'))

I will perform the field control model for a single play. Let's select all of the positional data for a single play in a single game. You can change **game_idx** and **play_idx** to see whatever play you wish. Let's use *df.head()* to show the first few rows of the dataset that we will be making a control model from.

There are a few things to note here. All of the positional units are in yards. The velocity is in $\frac{yards}{s}$, and the acceleration is in $\frac{yards}{s^2}$. At each time point, we have a row for every entity of interest in that particular play. Note that not all 22 players on the field are shown, only the players that are relevant to the pass play. **o** stands for the orientation of the player (i.e. which way their trunk is facing) and **dir** is the movement direction of the player. These two values will rarely be the same, so it is important to keep track of both of them in our model. Also note that these angles are in degrees, measured clockwise from due north. The usual mathematical convention measures angles counter-clockwise from due east, so we must be careful to either remap these angles or change the model formulation accordingly.

Also, note that the pitch is 120 yards long by 53.3 yards wide. This will be useful for plotting the pitch in the correct aspect ratio.
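The angle convention described above can be sketched as a small conversion helper. This is a minimal illustration, assuming north corresponds to the positive y axis of the field; the names `nfl_angle_to_math_rad` and `velocity_vector` are made up for this example and are not part of the dataset or the classes below.

```python
import numpy as np

def nfl_angle_to_math_rad(angle_deg):
    """Convert degrees measured clockwise from north into
    radians measured counter-clockwise from east."""
    return np.deg2rad((90.0 - angle_deg) % 360.0)

def velocity_vector(speed, dir_deg):
    """Build an (x, y) velocity vector from a speed and an NFL-style direction."""
    theta = nfl_angle_to_math_rad(dir_deg)
    return speed * np.array([np.cos(theta), np.sin(theta)])

# A player moving at 5 yards/s with dir = 90 is heading along the positive x axis
print(velocity_vector(5.0, 90.0))
```

The same remapping applies to both **o** and **dir**, since they share the same convention.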

In [None]:
# Select the game and play that you wish to see in week 1

game_idx = 5
play_idx = -15

unique_game_ids = week_df.gameId.unique()
unique_play_ids = week_df[week_df.gameId == unique_game_ids[game_idx]].playId.unique()

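# Play IDs are only unique within a game, so filter on both the game and the play
play_df = week_df[(week_df.gameId == unique_game_ids[game_idx]) & (week_df.playId == unique_play_ids[play_idx])].sort_values(by = 'time')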
play_df.head()

# Animation
## Animation Class

First let us make an animation that will show all of the entities moving across the pitch over time through the play. We are in luck that we have timestamps for each of the timepoints when the measurement was taken. This allows us to make a very precise estimate as to what the sampling rate is, and thus we can set an accurate framerate for the animation to simulate the play at "real time". This is calculated on line 26 as `self._mean_interval_ms`. 
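As a rough sketch of the sampling-rate estimate, the snippet below parses a few timestamps and averages the gaps between consecutive frames. The three timestamps are hypothetical values written in the same format as the tracking data, and the variable names are only for illustration.

```python
from datetime import datetime
import numpy as np

date_format = "%Y-%m-%dT%H:%M:%S.%fZ"

# Hypothetical timestamps, 100 ms apart
times = [
    "2018-09-07T01:07:14.600Z",
    "2018-09-07T01:07:14.700Z",
    "2018-09-07T01:07:14.800Z",
]

parsed = [datetime.strptime(t, date_format) for t in times]

# total_seconds() covers the whole gap, including any whole seconds
intervals_ms = [(b - a).total_seconds() * 1000 for a, b in zip(parsed[:-1], parsed[1:])]
mean_interval_ms = float(np.mean(intervals_ms))

print(mean_interval_ms)
```

Passing this mean interval to the animation as its frame interval is what makes the playback run at roughly real time.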

In this animation we plot:
- Player positions
- Line of scrimmage
- Ball
- Velocity vectors of the players
- Orientation of players
- Player last name
- Player position
- Player number

On small screens this may be too much information, but this just shows how you can do it. Feel free to remove as you wish. I will not go into this animation class in too much detail as it is not the focus of this notebook. Feel free to go through it, it is mostly just using matplotlib and basic math to show what we want.

In [None]:
class AnimatePlay:
    def __init__(self, play_df, plot_size_len) -> None:
        """Initializes the datasets used to animate the play.

        Parameters
        ----------
        play_df : DataFrame
            Dataframe corresponding to the play information for the play that requires
            animation. This data will come from the weeks dataframe and contains position
            and velocity information for each of the players and the football.

        Returns
        -------
        None
        """
        self._MAX_FIELD_Y = 53.3
        self._MAX_FIELD_X = 120
        self._MAX_FIELD_PLAYERS = 22
        
        self._CPLT = sns.color_palette("husl", 2)
        self._frame_data = play_df
        self._times = sorted(play_df.time.unique())
        self._stream = self.data_stream()
        
        self._date_format = "%Y-%m-%dT%H:%M:%S.%fZ" 
        self._mean_interval_ms = np.mean([delta.total_seconds()*1000 for delta in np.diff(np.array([pytz.timezone('US/Eastern').localize(datetime.strptime(date_string, self._date_format)) for date_string in self._times]))])
        
        self._fig = plt.figure(figsize = (plot_size_len, plot_size_len*(self._MAX_FIELD_Y/self._MAX_FIELD_X)))

        self._ax_field = plt.gca()
        
        self._ax_home = self._ax_field.twinx()
        self._ax_away = self._ax_field.twinx()
        self._ax_jersey = self._ax_field.twinx()

        self.ani = animation.FuncAnimation(self._fig, self.update, frames=len(self._times), interval = self._mean_interval_ms, 
                                          init_func=self.setup_plot, blit=False)
        
        plt.close()
       
    @staticmethod
    def set_axis_plots(ax, max_x, max_y) -> None:
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)

        ax.set_xlim([0, max_x])
        ax.set_ylim([0, max_y])
        
    @staticmethod
    def convert_orientation(x):
        return (-x + 90)%360
    
    @staticmethod
    def polar_to_z(r, theta):
        return r * np.exp( 1j * theta)
    
    @staticmethod
    def deg_to_rad(deg):
        return deg*np.pi/180
        
    def data_stream(self):
        for time in self._times:
            yield self._frame_data[self._frame_data.time == time]
    
    def setup_plot(self): 
        self.set_axis_plots(self._ax_field, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        
#         ball_snap_df = self._frame_data[(self._frame_data.event == 'ball_snap') & (self._frame_data.team == 'football')]
#         self._ax_field.axvline(ball_snap_df.x.to_numpy()[0], color = 'k', linestyle = '--')
        
        self.set_axis_plots(self._ax_home, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        self.set_axis_plots(self._ax_away, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        self.set_axis_plots(self._ax_jersey, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        
        for idx in range(10,120,10):
            self._ax_field.axvline(idx, color = 'k', linestyle = '-', alpha = 0.05)
            
        self._scat_field = self._ax_field.scatter([], [], s = 100, color = 'black')
        self._scat_home = self._ax_home.scatter([], [], s = 500, color = self._CPLT[0], edgecolors = 'k')
        self._scat_away = self._ax_away.scatter([], [], s = 500, color = self._CPLT[1], edgecolors = 'k')
        
        self._scat_jersey_list = []
        self._scat_number_list = []
        self._scat_name_list = []
        self._a_dir_list = []
        self._a_or_list = []
        for _ in range(self._MAX_FIELD_PLAYERS):
            self._scat_jersey_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'white'))
            self._scat_number_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
            self._scat_name_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
            
            self._a_dir_list.append(self._ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))
            self._a_or_list.append(self._ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))
            
        return (self._scat_field, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list)
        
    def update(self, anim_frame):
        pos_df = next(self._stream)
        
        for label in pos_df.team.unique():
            label_data = pos_df[pos_df.team == label]

            if label == 'football':
                self._scat_field.set_offsets(np.hstack([label_data.x, label_data.y]))
            elif label == 'home':
                self._scat_home.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'away':
                self._scat_away.set_offsets(np.vstack([label_data.x, label_data.y]).T)

        jersey_df = pos_df[pos_df.jerseyNumber.notnull()]
        
        for (index, row) in jersey_df.reset_index().iterrows():
            self._scat_jersey_list[index].set_position((row.x, row.y))
            self._scat_jersey_list[index].set_text(row.position)
            self._scat_number_list[index].set_position((row.x, row.y+1.9))
            self._scat_number_list[index].set_text(int(row.jerseyNumber))
            self._scat_name_list[index].set_position((row.x, row.y-1.9))
            self._scat_name_list[index].set_text(row.displayName.split()[-1])
            
            player_orientation_rad = self.deg_to_rad(self.convert_orientation(row.o))
            player_direction_rad = self.deg_to_rad(self.convert_orientation(row.dir))
            player_speed = row.s
            
            player_vel = np.array([np.real(self.polar_to_z(player_speed, player_direction_rad)), np.imag(self.polar_to_z(player_speed, player_direction_rad))])
            player_orient = np.array([np.real(self.polar_to_z(2, player_orientation_rad)), np.imag(self.polar_to_z(2, player_orientation_rad))])
            
            self._a_dir_list[index].remove()
            self._a_dir_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))
            
            self._a_or_list[index].remove()
            self._a_or_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))
                
        
        return (self._scat_field, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list)

## Animation Result
We can now use this class to create an animation of the entire play. Just instantiate the class, then convert to JS.

In [None]:
animated_play = AnimatePlay(play_df, 20)
HTML(animated_play.ani.to_jshtml())

# Pitch Control Model
## Background
The pitch control model that we will be implementing consists of two main components.

1. Individual player influence
2. Team Pitch Control

Team pitch control falls into place relatively easily once we can model individual player influence, so most of our efforts will be concentrated on point 1.

## Distance From Ball
Let's think about how much influence any one player can have given their individual distance from the ball. If we imagine two players standing still and facing each other, 5 yards apart, and one throws the football, how far can the other possibly move and still catch the football? Without accounting for other variables that may influence this, like individual player acceleration, ball speed, etc., we can say maybe 2 yards to either side, and maybe 1 yard forward and backward. This gives us a 'region of influence' for the receiving player. In a very simple case, we can model a region like this as a Gaussian distribution, with the mean centered at the player and a covariance describing this region. This is highly subjective and depends on modelling assumptions, but we can model it as follows. Note that when we create the covariance matrix, I am assuming no covariance between the axes. The variance occurs only along the individual axes, meaning the covariance matrix is diagonal.

Red refers to the receiver, and black refers to the thrower. We are currently only drawing the influence of the receiver.

In [None]:
# Player positions
player_throw = np.array([0, 0])
player_catch = np.array([5, 0])

# Setup required parameters of multivariate normal
sigma = np.array([[1**2, 0], [0, 2**2]])
mu = player_catch

# Setup Random Variable with underlying distribution of multivariate normal
rv = stats.multivariate_normal(mu, sigma)

# Create grid of points to pass into multivariate normal
x, y = np.mgrid[-5:10:.01, -5:5:.01]
pos = np.dstack((x, y))

# Set up the figure, with a twin axis for the influence contours
fig = plt.figure(figsize = (10, 10))
ax = plt.gca()
ax_influence = ax.twinx()

# Plot the figure
ax_influence.contourf(x, y, rv.pdf(pos), alpha = 0.3, cmap = 'Reds')

ax.scatter(player_throw[0], player_throw[1], s = 300, color = 'k', label = 'thrower')
ax.scatter(player_catch[0], player_catch[1], s = 300, color = 'r', label = 'receiver')

ax.set_xlim([-5, 10])
ax.set_ylim([-5, 5])

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax_influence.yaxis.set_visible(False)

fig.show()

How about the person throwing the ball? How much influence will they have at the moment they throw the ball? Again, this depends on many different factors, but let's keep it simple for now. At this moment in time, they could probably influence about 1 yard in any direction. Note that we are not looking at the influence gained from throwing the ball, but rather at the physical influence of any player as a function of distance from the ball. If we think of black as a wide receiver who has just caught the ball, all he can really do is run. We are assuming he has some influence on the space 1 yard in any direction around him. Let's plot this on our graph as well.

In [None]:
# Player positions
player_throw = np.array([0, 0])
player_catch = np.array([5, 0])

# Setup required parameters of multivariate normal
sigma = np.array([[1**2, 0], [0, 2**2]])
mu = player_catch

sigma_throw = np.array([[1**2, 0], [0, 1**2]])
mu_throw = player_throw

# Setup Random Variable with underlying distribution of multivariate normal
rv = stats.multivariate_normal(mu, sigma)
rv_throw = stats.multivariate_normal(mu_throw, sigma_throw)

# Create grid of points to pass into multivariate normal
x, y = np.mgrid[-5:10:.01, -5:5:.01]
pos = np.dstack((x, y))

# Set up the figure, with a twin axis for the influence contours
fig = plt.figure(figsize = (10, 10))
ax = plt.gca()
ax_influence = ax.twinx()

# Plot the figure
ax_influence.contourf(x, y, rv.pdf(pos), alpha = 0.3, cmap = 'Reds')
ax_influence.contourf(x, y, rv_throw.pdf(pos), alpha = 0.3, cmap = 'Greys')


ax.scatter(player_throw[0], player_throw[1], s = 300, color = 'k', label = 'thrower')
ax.scatter(player_catch[0], player_catch[1], s = 300, color = 'r', label = 'receiver')

ax.set_xlim([-5, 10])
ax.set_ylim([-5, 5])

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax_influence.yaxis.set_visible(False)

fig.show()

So now we can see the phenomenon that we want to model. The further away you are from the ball, the wider your region of influence will be, but it will also be weaker at any one point. This is because if you are far away from the ball, it will take time for the ball to reach you, and in this time you can move further. Influence is a very abstract concept, so feel free to use whatever assumptions you like, but this is the assumption we will build on: the further you are from the ball, the larger your radius of influence, up to a certain distance. For soccer this makes a lot of sense as the game is continuous; for American football, however, we may have to change this. Once a receiver catches the ball, another receiver on the other end of the field will have no influence on the play. The current model does not account for this, which is a weakness.
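To put a number on "wider but weaker", here is a small sketch comparing the peak densities of the two Gaussians used in the plots above. The peak of a 2D Gaussian is $\frac{1}{2\pi\sqrt{\det\Sigma}}$, so a larger covariance spreads the same total mass more thinly. The variable names are only for this illustration.

```python
import numpy as np
import scipy.stats as stats

# Receiver: wider region (variance 1 forward/backward, 4 side to side)
rv_catch = stats.multivariate_normal([5, 0], [[1**2, 0], [0, 2**2]])
# Thrower: tighter region (variance 1 in both directions)
rv_throw = stats.multivariate_normal([0, 0], [[1**2, 0], [0, 1**2]])

# Density at each player's own position, which is the peak of each Gaussian
peak_catch = rv_catch.pdf([5, 0])
peak_throw = rv_throw.pdf([0, 0])

print(peak_catch, peak_throw)
```

The wider receiver region peaks at half the height of the thrower's region, because its covariance determinant is four times larger.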

### Distance from the ball influence function
The function for distance from the ball vs influence is set relatively arbitrarily for now. I have done something similar to the function that is provided in the paper, but this would need to be tuned using subject matter experts from the NFL. The current function we will use is

$f(x) = 4 + \frac{6}{18^2}x^2 \quad \textrm{if } x \le 18$

$f(x) = 10 \quad \textrm{otherwise}$

Note that after 18 yards, the radius of influence is constant.
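For reference, the same piecewise function can also be written directly on arrays by capping the distance at 18 yards before squaring. This is only an alternative sketch; the name `radius_influence_array` and its keyword parameters are hypothetical and are not used elsewhere in this notebook.

```python
import numpy as np

def radius_influence_array(distance, min_radius=4.0, max_radius=10.0, cutoff=18.0):
    """Influence radius as a function of distance from the ball.

    Grows quadratically from min_radius to max_radius over [0, cutoff],
    then stays constant at max_radius.
    """
    distance = np.asarray(distance, dtype=float)
    if np.any(distance < 0):
        raise ValueError("distance from the ball must be non-negative")
    capped = np.minimum(distance, cutoff)
    return min_radius + (max_radius - min_radius) * (capped / cutoff) ** 2

# Radius at the ball, halfway to the cutoff, at the cutoff, and beyond it
print(radius_influence_array([0, 9, 18, 30]))
```

Exposing the minimum radius, maximum radius and cutoff as parameters makes it easier to tune the function, which the text above suggests will be needed.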

In [None]:
x = np.arange(0, 30, 0.1)

@np.vectorize
def radius_influence(x):
    assert x >= 0

    if x <= 18:
        return 4 + (6/(18**2))*(x**2)
    else:
        return 10
    
plt.plot(x, radius_influence(x), 'k')
plt.xlabel('Distance From Ball')
plt.ylabel('Influence Radius')
plt.title('Influence Radius as a function of distance from ball')

plt.show()

## Modelling how velocity affects region of influence
When a player is running at a high velocity in a specific direction, their region of influence should change to reflect the fact that it would be difficult for them to influence anything that is not in line with, or nearly in line with, the direction they are running. However, they can also influence a much larger region in the direction that they are running, because they are already moving at a high velocity. This means we need to create a region of influence that is elongated in the direction that the player is running, as a function of speed. When they are stationary, the variances in either direction should be the same, but as they start running, the difference between the variances should increase. For now, we will just focus on what occurs when the player runs along the basis axes. We will worry about direction later in this notebook.

Apart from the covariance, the speed should also affect the mean of our region of influence. Someone who is running at full speed should have less influence on the ground behind them, and more on the ground in front of them. We can model this by shifting the mean $\mathbf{\mu}$ in the movement direction as a function of velocity.

Let's plot this out, starting with the low-hanging fruit: the mean shift. I will draw an arrow to signify the velocity of the player. We will shift the mean according to the following function

$\mathbf{\mu^*} = \mathbf{\mu} + 0.5\mathbf{v}$

A player at the position $(0, 0)$ with a velocity vector $(6, 0)$, and with no changes to the covariance matrix yet, will have a region of influence as follows

In [None]:
# Player positions
player_throw = np.array([0, 0])

sigma_throw = np.array([[1**2, 0], [0, 1**2]])
mu_throw = player_throw
vel_throw = np.array([6, 0])

mu_throw = mu_throw + 0.5*vel_throw

rv_throw = stats.multivariate_normal(mu_throw, sigma_throw)

# Create grid of points to pass into multivariate normal
x, y = np.mgrid[-10:10:.01, -10:10:.01]
pos = np.dstack((x, y))

# Set up the figure, with a twin axis for the influence contours
fig = plt.figure(figsize = (10, 10))
ax = plt.gca()
ax_influence = ax.twinx()

# Plot the figure
ax_influence.contourf(x, y, rv_throw.pdf(pos), alpha = 0.3, cmap = 'Greys')

ax.scatter(player_throw[0], player_throw[1], s = 300, color = 'k', label = 'thrower')
ax.arrow(0, 0, 6, 0, head_width = 0.2, color = 'k')

ax.set_xlim([-10, 10])
ax.set_ylim([-10, 10])

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax_influence.yaxis.set_visible(False)

fig.show()

Now for the covariance, we need to stretch the covariance matrix along the axis of movement, and shrink it in the orthogonal direction, as a function of velocity and distance from the ball. Since we know our covariance matrix is diagonal, this is easy, as we only need to worry about the individual variances. For now, let's take the distance to the ball to be 0. The bigger the distance, the wider the covariance should be in either direction.

The initial covariance matrix is built from our distance-from-ball function $f_d(\cdot)$, where $f_d(t)$ denotes its value for the player's current distance from the ball.

$
\Sigma=
  \begin{bmatrix}
    f_d(t) & 0 \\
    0 & f_d(t)
  \end{bmatrix}
$

We will increase this in the direction of movement (in this case, the $x$ direction). To weight by speed, we take the maximum possible speed and build a weighting function from it. The fastest recorded speed at the time of writing is 23.1 mph, set by Raheem Mostert of the 49ers, which corresponds to approximately 11.3 yards per second. We don't want this weighting to be linear. It is easier to change direction at lower speeds than at higher speeds, so we want the weighting to grow faster than linearly (here, quadratically). If a player reaches maximum speed, we want the variance in the direction of movement to be $f_d(t)$. If the player is at 0 speed, we want it to be $\frac{f_d(t)}{2}$. This seems to have been set relatively arbitrarily by the authors, but if anyone has a reason as to why they did it this way, please let me know. In either case, you can tune the hyperparameters to ensure you are getting the correct radius of influence.

To meet the requirements of the speed weighting function, we can use $f_s(s) = \frac{s^2}{11.3^2}$. Squaring provides the effect we require. You can compare this with a linear weighting in the graph below.

In [None]:
# Speeds from 0 up to the assumed maximum of 11.3 yards per second
x = np.arange(0, 11.4, 0.1)

plt.plot(x, x/11.3, 'k', label = 'linear')
plt.plot(x, x**2/11.3**2, 'r', label = 'quadratic')

plt.legend()

plt.show()

The covariance can now just be written as a simple weighting

$
\Sigma=
  \begin{bmatrix}
    \frac{f_d(t) + f_d(t) \cdot f_s(s)}{2} & 0 \\
    0 & \frac{f_d(t) - f_d(t) \cdot f_s(s)}{2}
  \end{bmatrix}
$

Plotting this, we get

In [None]:
def speed_weighting(s):
    return (s**2)/(11.3**2)

player_speed = 6

# Player positions
player_throw = np.array([0, 0])

# Variances along and across the direction of movement (distance from the ball taken as 0)
r0 = radius_influence(0)
sigma_throw = np.array([[(r0 + r0*speed_weighting(player_speed))/2, 0], [0, (r0 - r0*speed_weighting(player_speed))/2]])
mu_throw = player_throw
vel_throw = np.array([player_speed, 0])

mu_throw = mu_throw + 0.5*vel_throw

rv_throw = stats.multivariate_normal(mu_throw, sigma_throw)

# Create grid of points to pass into multivariate normal
x, y = np.mgrid[-10:10:.01, -10:10:.01]
pos = np.dstack((x, y))

# Set up the figure, with a twin axis for the influence contours
fig = plt.figure(figsize = (10, 10))
ax = plt.gca()
ax_influence = ax.twinx()

# Plot the figure
ax_influence.contourf(x, y, rv_throw.pdf(pos), alpha = 0.3, cmap = 'Greys')

ax.scatter(player_throw[0], player_throw[1], s = 300, color = 'k', label = 'thrower')
ax.arrow(0, 0, player_speed, 0, head_width = 0.2, color = 'k')

ax.set_xlim([-10, 10])
ax.set_ylim([-10, 10])

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax_influence.yaxis.set_visible(False)

fig.show()

Note how the covariance extends based on the speed of the player. This is exactly the behaviour we wanted.
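To make the weighting concrete, here is a small worked sketch of the two variances at a few speeds, using a radius of 4 (the value of the distance function at the ball). The helper `axis_variances` is a name made up for this example.

```python
MAX_SPEED = 11.3  # yards per second, the maximum speed assumed in the text

def speed_weighting(s):
    return s**2 / MAX_SPEED**2

def axis_variances(radius, speed):
    """Return (variance along movement, variance across movement)."""
    w = speed_weighting(speed)
    return (radius + radius * w) / 2, (radius - radius * w) / 2

for speed in (0.0, 6.0, MAX_SPEED):
    along, across = axis_variances(4.0, speed)
    print(f"speed={speed:5.2f}  along={along:.3f}  across={across:.3f}")
```

Two things are worth noticing. The two variances always sum to the radius, so speed only redistributes the spread between the axes. Also, at exactly the maximum speed the across variance reaches zero, which makes the covariance matrix singular, so it may be worth clipping speeds slightly below the maximum before building the Gaussian (this is a suggestion, not something the notebook currently does).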

## Modelling how direction affects region of influence
While this works if the player is moving in the x or y direction, we need to reassess how we create the covariance matrix to work for a player travelling in any direction. To do this, we can use the spectral theorem knowing that our covariance matrix is symmetric and real.

>**Spectral Theorem**:
Given that $A \in \mathbb{R}^{n\times n}$ is a real, symmetric matrix, it can be factorized into $A=V S V^\top$, where the columns of $V$ contain orthonormal eigenvectors and the diagonal of $S$ contains the corresponding eigenvalues.

The spectral theorem is used as a basis for PCA, and the eigenvectors encode the orthogonal directions of highest variance. I will not go into the proof here, but for a very good tutorial on PCA, please see this [paper](https://arxiv.org/pdf/1404.1100.pdf) by Jonathon Shlens. Now, knowing what we know, we can work on reverse engineering a valid covariance matrix.

First let's assume that we measure our direction from the x axis, moving counter-clockwise. By this I mean we start at the vector $(1, 0)$ and rotate counter-clockwise. Be aware that the data given by the NFL starts at the vector $(0, 1)$ and rotates clockwise. We will attempt to recreate $V$ first. As the columns of $V$ correspond to the orthogonal directions of greatest variance, we can just think of $V$ as an orthonormal rotation matrix.

To better illustrate this, let us do an example. Assume that the direction is $\pi/4$ radians, or 45 degrees. We want to move our vector $(1, 0)$ to the corresponding point at 45 degrees on the unit circle. Using basic trigonometry, this is $(\cos(\theta), \sin(\theta))$. Similarly, we want to move the vector $(0, 1)$ to the corresponding point at 135 degrees. Again using basic trigonometry, this corresponds to $(-\sin(\theta), \cos(\theta))$. Both of these vectors have unit norm (they are on the unit circle) and are orthogonal (they are perpendicular), so they are valid for use as orthonormal eigenvectors. We can therefore create our matrix as

$
V=
  \begin{bmatrix}
    \cos(\theta) & -\sin(\theta) \\
    \sin(\theta) & \cos(\theta)
  \end{bmatrix}
$
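A quick numerical check of this matrix, written as a small sketch with a hypothetical helper name `rotation_matrix`: its columns should have unit length and be perpendicular, which is the same as saying $V^\top V$ is the identity.

```python
import numpy as np

def rotation_matrix(theta):
    """Counter-clockwise rotation by theta radians, measured from the x axis."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

theta = np.pi / 4
V = rotation_matrix(theta)

# Orthonormal columns: V^T V should be the 2x2 identity (up to rounding)
print(V.T @ V)

# The basis vector (1, 0) is carried to (cos(theta), sin(theta))
print(V @ np.array([1.0, 0.0]))
```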

We will use the same eigenvalues for our matrix that we did last time. This will stretch and rotate the principal components in the corresponding direction. As a reminder, this matrix is

$
S=
  \begin{bmatrix}
    \frac{f_d(t) + f_d(t) \cdot f_s(s)}{2} & 0 \\
    0 & \frac{f_d(t) - f_d(t) \cdot f_s(s)}{2}
  \end{bmatrix}
$

Using the spectral theorem, we can now write out our new covariance matrix as

$\Sigma = VSV^\top$

To show this in an example, I will first plot the rotation of the bases, and then the rotated multivariate gaussian.

In [None]:
fig = plt.figure(figsize = (10, 10))
ax = plt.gca()

theta = np.pi/4

ax.arrow(0,0,1,0, color = 'k', head_width = 0.05)
ax.arrow(0,0,0,1, color = 'k', head_width = 0.05)
ax.arrow(0,0,np.cos(theta),np.sin(theta), color = 'r', head_width = 0.05)
ax.arrow(0,0,-np.sin(theta),np.cos(theta), color = 'r', head_width = 0.05)

ax.set_xlim([-1.5, 1.5])
ax.set_ylim([-1.5, 1.5])

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)

fig.show()


In [None]:
def speed_weighting(s):
    return (s**2)/(11.3**2)

player_speed = 6
influence_rad = np.pi/4

# Player positions
player_throw = np.array([0, 0])

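# Variances along and across the direction of movement (distance from the ball taken as 0)
r0 = radius_influence(0)
sigma_throw = np.array([[(r0 + r0*speed_weighting(player_speed))/2, 0], [0, (r0 - r0*speed_weighting(player_speed))/2]])
mu_throw = player_throw
# Speed 6 at 45 degrees, i.e. (sqrt(18), sqrt(18))
vel_throw = player_speed*np.array([np.cos(influence_rad), np.sin(influence_rad)])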

R = np.array([[np.cos(influence_rad), -np.sin(influence_rad)],[np.sin(influence_rad), np.cos(influence_rad)]])

sigma_rotated = R@sigma_throw@R.T

mu_throw = mu_throw + 0.5*vel_throw

# rv_throw = stats.multivariate_normal(mu_throw, sigma_throw)
rv_throw = stats.multivariate_normal(mu_throw, sigma_rotated)

# Create grid of points to pass into multivariate normal
x, y = np.mgrid[-10:10:.01, -10:10:.01]
pos = np.dstack((x, y))

# Set up the figure, with a twin axis for the influence contours
fig = plt.figure(figsize = (10, 10))
ax = plt.gca()
ax_influence = ax.twinx()

# Plot the figure
ax_influence.contourf(x, y, rv_throw.pdf(pos), alpha = 0.3, cmap = 'Reds')

ax.scatter(player_throw[0], player_throw[1], s = 300, color = 'r', label = 'thrower')
ax.arrow(0, 0, vel_throw[0], vel_throw[1], head_width = 0.2, color = 'r')

ax.set_xlim([-10, 10])
ax.set_ylim([-10, 10])

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)
ax_influence.yaxis.set_visible(False)

fig.show()

Note how using the above formulation, we can now easily rotate the regions of influence as we like!
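As a sanity check on the construction, we can run the decomposition in reverse: build $\Sigma = VSV^\top$ and confirm that its eigenvalues are the variances we put in and that its major axis points along the chosen direction. This is a minimal sketch; the variances 2.5 and 1.5 are arbitrary illustrative values and `rotated_covariance` is a hypothetical helper name.

```python
import numpy as np

def rotated_covariance(var_along, var_across, direction_rad):
    """Covariance with variance var_along in the given direction
    and var_across in the perpendicular direction."""
    V = np.array([[np.cos(direction_rad), -np.sin(direction_rad)],
                  [np.sin(direction_rad),  np.cos(direction_rad)]])
    S = np.diag([var_along, var_across])
    return V @ S @ V.T

direction = np.pi / 4
Sigma = rotated_covariance(2.5, 1.5, direction)

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(Sigma)
major_axis = eigvecs[:, -1]  # eigenvector of the largest eigenvalue

print(Sigma)
print(eigvals)
print(major_axis)
```

The sign of an eigenvector is arbitrary, so the major axis may come back pointing either along the direction of travel or exactly opposite to it. Both describe the same ellipse.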

That is it! That brings to a close the section on regions of player influence. We can now create such a region of influence for every single player in the team, and then move on to the pitch control section.

## Team Pitch Control
Let's now imagine that you have computed individual regions of influence for each of the players on the field.
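As a rough sketch of how team-level control can be assembled from individual influences: rescale each player's Gaussian so that it peaks at 1, add up the influences for each team, and pass the difference through a logistic function so the result lies between 0 and 1. This is one possible formulation under those stated assumptions, and it is not necessarily what the class below does. The helper names, the peak normalisation and the default steepness `k` are all illustrative choices.

```python
import numpy as np
import scipy.stats as stats

def sigmoid(x, k=1.0):
    return 1 / (1 + np.exp(-k * x))

def peak_normalised_influence(points, mu, sigma):
    """Gaussian influence rescaled so that its maximum value is 1."""
    rv = stats.multivariate_normal(mu, sigma)
    return rv.pdf(points) / rv.pdf(mu)

def team_control(points, home, away, k=1.0):
    """Values above 0.5 favour the home team, values below 0.5 the away team."""
    home_total = sum(peak_normalised_influence(points, mu, s) for mu, s in home)
    away_total = sum(peak_normalised_influence(points, mu, s) for mu, s in away)
    return sigmoid(home_total - away_total, k)

# One hypothetical player per team, 20 yards apart, with identical covariances
sigma = 4.0 * np.eye(2)
home = [(np.array([40.0, 25.0]), sigma)]
away = [(np.array([60.0, 25.0]), sigma)]

# Evaluate at the home player, the midpoint, and the away player
points = np.array([[40.0, 25.0], [50.0, 25.0], [60.0, 25.0]])
control = team_control(points, home, away)

print(control)
```

In this symmetric setup the midpoint is contested equally, and each player controls the ground they are standing on. In practice `points` would be the full grid over the field, and each `(mu, sigma)` pair would come from the mean shift and rotated covariance built in the previous sections.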

In [None]:
class AnimatePlayPitchControl(AnimatePlay):
    def __init__(self, play_df, plot_size_len, show_control = True) -> None:
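        """Initializes the datasets used to animate the play with pitch control.

        Parameters
        ----------
        play_df : DataFrame
            Dataframe corresponding to the play information for the play that requires
            animation. This data will come from the weeks dataframe and contains position
            and velocity information for each of the players and the football.
        plot_size_len : float
            Width of the figure in inches; the height follows the field's aspect ratio.
        show_control : bool
            When True, a single team pitch control contour is set up. Otherwise one
            influence contour is set up per player.

        Returns
        -------
        None
        """
        super().__init__(play_df, plot_size_len)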
        self._MAX_PLAYER_SPEED = 11.3
        self._X, self._Y, self._pos = self.generate_data_grid()
        
        self._ax_football = self._ax_field.twinx()
        
        self._show_control = show_control
        plt.close()
    
    @staticmethod
    @np.vectorize
    def radius_influence(x):
        assert x >= 0

        if x <= 18:
            return 4 + (6/(18**2))*(x**2)
        else:
            return 10
        
    def generate_data_grid(self, N = 120):
        # Our 2-dimensional distribution will be over variables X and Y
        X = np.linspace(0, self._MAX_FIELD_X, N)
        Y = np.linspace(0, self._MAX_FIELD_Y, N)
        X, Y = np.meshgrid(X, Y)

        # # Mean vector and covariance matrix
        # mu = np.array([0., 1.])
        # Sigma = np.array([[ 1. , -0.5], [-0.5,  1.5]])

        # Pack X and Y into a single 3-dimensional array
        pos = np.empty(X.shape + (2,))
        pos[:, :, 0] = X
        pos[:, :, 1] = Y
        
        return X, Y, pos
    
    @staticmethod
    def sigmoid(x, k):
        return 1 / (1 + np.exp(-k*x))

    @staticmethod
    def weighted_angle(x1, x2, w):
        def normalize(v):
            # Scale to unit Euclidean length so that w blends directions rather than magnitudes
            norm=np.linalg.norm(v)
            if norm==0:
                norm=np.finfo(float).eps
            return v/norm

        norm_weighted = w*normalize(x1) + (1-w)*normalize(x2)

        return np.arctan2(norm_weighted[1], norm_weighted[0]) % (2*np.pi)
        
    @staticmethod
    def multivariate_gaussian(pos, mu, Sigma):
        """Return the multivariate Gaussian distribution on array pos.

        pos is an array constructed by packing the meshed arrays of variables
        x_1, x_2, x_3, ..., x_k into its _last_ dimension.

        """

        n = mu.shape[0]
        Sigma_det = np.linalg.det(Sigma)
        Sigma_inv = np.linalg.inv(Sigma)
        N = np.sqrt((2*np.pi)**n * Sigma_det)
        # This einsum call calculates (x-mu)T.Sigma-1.(x-mu) in a vectorized
        # way across all the input variables.
        fac = np.einsum('...k,kl,...l->...', pos-mu, Sigma_inv, pos-mu)

        return np.exp(-fac / 2) / N
        
    def generate_sigma(self, influence_rad, player_speed, distance_from_football):
        R = np.array([[np.cos(influence_rad), -np.sin(influence_rad)],[np.sin(influence_rad), np.cos(influence_rad)]])

        speed_ratio = (player_speed**2)/(self._MAX_PLAYER_SPEED**2)
        radius = self.radius_influence(distance_from_football)

        # Scaling matrix: stretched along the direction of movement, shrunk across it
        S = np.array([[radius + (radius*speed_ratio), 0], 
        [0, radius - (radius*speed_ratio)]])
        
        # S is diagonal, so the elementwise square S**2 is the same as S@S
        return R@(S**2)@R.T
    
    def generate_mu(self, player_position, player_vel):
        return player_position + 0.5*player_vel
    
    def setup_plot(self): 
        self.set_axis_plots(self._ax_field, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        
        ball_snap_df = self._frame_data[(self._frame_data.event == 'ball_snap') & (self._frame_data.team == 'football')]
        self._ax_field.axvline(ball_snap_df.x.to_numpy()[0], color = 'k', linestyle = '--')
        
        self.set_axis_plots(self._ax_home, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        self.set_axis_plots(self._ax_away, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        self.set_axis_plots(self._ax_jersey, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        self.set_axis_plots(self._ax_football, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        
        for idx in range(10,120,10):
            self._ax_field.axvline(idx, color = 'k', linestyle = '-', alpha = 0.05)
            
        self._scat_football = self._ax_football.scatter([], [], s = 100, color = 'black')
        self._scat_home = self._ax_home.scatter([], [], s = 500, color = self._CPLT[0], edgecolors = 'k')
        self._scat_away = self._ax_away.scatter([], [], s = 500, color = self._CPLT[1], edgecolors = 'k')
        
        self._scat_jersey_list = []
        self._scat_number_list = []
        self._scat_name_list = []
        self._a_dir_list = []
        self._a_or_list = []
        self._inf_contours_list = []
        for _ in range(self._MAX_FIELD_PLAYERS):
            self._scat_jersey_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'white'))
            self._scat_number_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
            self._scat_name_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
            
            self._a_dir_list.append(self._ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))
            self._a_or_list.append(self._ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))
            
            if not self._show_control:
                self._inf_contours_list.append(self._ax_field.contourf([0, 0], [0, 0], [[0,0],[0,0]]))
        
        if self._show_control:
            self._pitch_control_contour = self._ax_field.contourf([0, 0], [0, 0], [[0,0],[0,0]])
            
        return (self._scat_football, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list)
        
    def update(self, anim_frame):
        pos_df = next(self._stream)
        
        for label in pos_df.team.unique():
            label_data = pos_df[pos_df.team == label]

            if label == 'home':
                self._scat_home.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'away':
                self._scat_away.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'football':
                self._scat_football.set_offsets(np.hstack([label_data.x, label_data.y]))

        jersey_df = pos_df[pos_df.jerseyNumber.notnull()]
        
        inf_home_team = 0
        inf_away_team = 0
        
        for (index, row) in pos_df[pos_df.jerseyNumber.notnull()].reset_index().iterrows():
            self._scat_jersey_list[index].set_position((row.x, row.y))
            self._scat_jersey_list[index].set_text(row.position)
            self._scat_number_list[index].set_position((row.x, row.y+1.9))
            self._scat_number_list[index].set_text(int(row.jerseyNumber))
            self._scat_name_list[index].set_position((row.x, row.y-1.9))
            self._scat_name_list[index].set_text(row.displayName.split()[-1])
            
            player_orientation_rad = self.deg_to_rad(self.convert_orientation(row.o))
            player_direction_rad = self.deg_to_rad(self.convert_orientation(row.dir))
            player_speed = row.s
            player_position = np.array([row.x, row.y])
            player_acc = row.a
            
            speed_w = player_speed/self._MAX_PLAYER_SPEED
            
            player_vel = np.array([np.real(self.polar_to_z(player_speed, player_direction_rad)), np.imag(self.polar_to_z(player_speed, player_direction_rad))])
            player_orient = np.array([np.real(self.polar_to_z(2, player_orientation_rad)), np.imag(self.polar_to_z(2, player_orientation_rad))])
            
            influence_rad = self.weighted_angle(player_vel, player_orient, speed_w)
            
            distance_from_football = np.sqrt((pos_df[pos_df.displayName == 'Football'].x - player_position[0])**2 + ((pos_df[pos_df.displayName == 'Football'].y - player_position[1]))**2).to_numpy()[0]
            
            self._a_dir_list[index].remove()
            self._a_dir_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))
            
            self._a_or_list[index].remove()
            self._a_or_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))
            
            sigma = self.generate_sigma(influence_rad, player_speed, distance_from_football)
            mu = self.generate_mu(player_position, player_vel)
            
            Z = self.multivariate_gaussian(self._pos, mu, sigma)
            Z_coarse = np.where(Z > 0.001, Z, np.nan)
            
            if not self._show_control:
                for cont_info in self._inf_contours_list[index].collections:
                    cont_info.remove()
            
            if row.team == 'home':
                if self._show_control:
                    inf_home_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Reds', levels = 10, alpha = 0.1)
            elif row.team == 'away':
                if self._show_control:
                    inf_away_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Greens', levels = 10, alpha = 0.1)
        
        if self._show_control:
            for cont_info in self._pitch_control_contour.collections:
                    cont_info.remove()
                
            self._pitch_control_contour = self._ax_field.contourf(self._X, self._Y, self.sigmoid(inf_away_team/len(pos_df[pos_df.team=='away']) - inf_home_team/len(pos_df[pos_df.team=='home']),k = 1000), levels = 50, cmap='PiYG', vmin = 0.45, vmax = 0.55, alpha = 0.7)
            
#             self._fig.colorbar(self._pitch_control_contour, extend='min', shrink=0.9, ax=self._ax_field)
        
        return (self._scat_football, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list)

In [None]:
animated_play = AnimatePlayPitchControl(play_df, 20, show_control=False)
HTML(animated_play.ani.to_jshtml())
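
As a quick sanity check on the influence model used in the animation above, the sketch below rebuilds the covariance and mean for a single player outside the class. It mirrors `generate_sigma` and `generate_mu`, but takes the influence radius as a plain argument. The function name `influence_covariance` and the player values (speed, heading, position, radius) are made up for illustration.

```python
import numpy as np

MAX_PLAYER_SPEED = 11.3  # same constant as in the notebook


def influence_covariance(theta, speed, radius):
    """Rotate a diagonal scaling matrix so its long axis points along theta."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    ratio = speed**2 / MAX_PLAYER_SPEED**2
    # Stretch along the heading, shrink across it
    S = np.diag([radius * (1 + ratio), radius * (1 - ratio)])
    return R @ S @ S @ R.T


# Hypothetical player: speed 5, heading along +y, influence radius 4
theta = np.pi / 2
speed = 5.0
sigma = influence_covariance(theta, speed, radius=4.0)

position = np.array([50.0, 26.0])
velocity = speed * np.array([np.cos(theta), np.sin(theta)])
mu = position + 0.5 * velocity  # centre shifted along the velocity

# eigh returns eigenvalues in ascending order for a symmetric matrix
eigvals, eigvecs = np.linalg.eigh(sigma)
major_axis = eigvecs[:, np.argmax(eigvals)]
print(np.sqrt(eigvals))  # spread along the minor and major axes
print(major_axis)        # aligned with the heading, up to sign
```

The faster the player moves, the more the ellipse stretches along the heading and narrows across it, which is the behaviour visible in the contours of the animation.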

# Assigning Coverage
My current goal as we continue this notebook is to figure out how best to assign coverages to each of the defenders. With a naive approach, this should not be too difficult: we can assign single coverage using some distance metric and be done. However, I want to account for a few other factors in my model.

1. I will assume that each player has a primary and secondary cover target
2. I want to take into account where the quarterback is looking
3. I want to take into account which receiver is a bigger "threat". This will be achieved by using receiver target probabilities as a proxy
4. I want to take into account where the defender is looking

All of this can be done by creating some sort of weighted function matrix that takes all of these assumptions into account. However, I have always wanted to get a little into graph theory, so instead we will use a max-flow min-cost algorithm on a bipartite graph of defenders and receivers. The result will not be a matching, because we allow for double coverage. I will then create a primary and secondary target for each of the defenders with importance probabilities, and we can plot them in our animations to see if they make sense. Note that the coverages we get should be identical to those from the weighted function matrix. The weight on each graph edge will represent the function that takes into account all of the assumptions listed above.
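
To make the graph idea concrete, here is a minimal sketch of a min-cost flow assignment with `networkx`. It is not the final model: the player labels and integer edge costs are hypothetical stand-ins for the weighting function described above, and it only assigns a primary target. Each defender sends one unit of flow, and each receiver can absorb up to two units, which is what allows double coverage.

```python
import networkx as nx

# Hypothetical integer costs (for example, scaled distances) from defenders to receivers
costs = {
    ("CB1", "WR1"): 2, ("CB1", "WR2"): 9, ("CB1", "TE"): 7,
    ("CB2", "WR1"): 8, ("CB2", "WR2"): 3, ("CB2", "TE"): 6,
    ("FS",  "WR1"): 4, ("FS",  "WR2"): 5, ("FS",  "TE"): 6,
    ("LB",  "WR1"): 9, ("LB",  "WR2"): 8, ("LB",  "TE"): 2,
}
defenders = sorted({d for d, _ in costs})
receivers = sorted({r for _, r in costs})

G = nx.DiGraph()
# Negative demand means the node supplies flow, positive demand means it absorbs flow
G.add_node("source", demand=-len(defenders))
G.add_node("sink", demand=len(defenders))
for d in defenders:
    G.add_edge("source", d, capacity=1, weight=0)  # one primary target per defender
for r in receivers:
    G.add_edge(r, "sink", capacity=2, weight=0)    # at most double coverage
for (d, r), cost in costs.items():
    G.add_edge(d, r, capacity=1, weight=cost)

flow = nx.min_cost_flow(G)
coverage = {d: r for d in defenders for r, f in flow[d].items() if f > 0}
print(coverage)
```

Integer costs are used because the `networkx` documentation warns that floating-point weights can cause problems in its flow algorithms; real-valued scores can be scaled and rounded first.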

The first step is to create receiver probabilities using the influence function we created above.

## Receiver Probabilities
As we already have influence areas for each of the players, creating the receiver probabilities should be reasonably straightforward. There are two variables we need to account for:

1. How "free" is a specific receiver?
2. Is the quarterback looking to throw their way?

There are two things to note. First, these probabilities only really matter up until the moment the ball is released. After that, the quarterback will be looking where the ball went, and the probabilities will be faulty. Second, we would need to add a distance-from-ball metric and a distance-from-quarterback metric to reweight the probabilities. Once the ball has landed in a receiver's hands, we know with 100% certainty who the target was. We can expand on these points later.

With these two values, we can construct a probability of being thrown to for each of the players on the field that adheres to the three axioms of probability. How free a specific player is will depend on his influence Gaussian and on that of the entire defending team. We will create a basic score during each frame of the play by subtracting the influence Gaussian of the defensive team from that of the offensive player. We can zero out all negative values, and the sum of the remaining values will be the "freedom score" of that player.
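
The scoring rule can be sketched on a toy example before applying it to real tracking data. The code below uses isotropic, unnormalized Gaussians as stand-ins for the influence functions, and the helper `toy_influence`, the player positions, and the spread are all made up for illustration.

```python
import numpy as np

# Toy grid covering the field
xs, ys = np.meshgrid(np.linspace(0, 120, 120), np.linspace(0, 53.3, 120))


def toy_influence(cx, cy, spread=5.0):
    """Isotropic bump centred on (cx, cy); a stand-in for the real influence."""
    return np.exp(-((xs - cx)**2 + (ys - cy)**2) / (2 * spread**2))


# Hypothetical positions: two receivers and two defenders
receivers = {"open_wr": (80.0, 40.0), "covered_wr": (70.0, 10.0)}
defenders = [(70.0, 11.0), (60.0, 25.0)]

defence = sum(toy_influence(x, y) for x, y in defenders)

# Freedom score: offensive influence left over after removing the defence
freedom = {name: np.clip(toy_influence(x, y) - defence, 0, None).sum()
           for name, (x, y) in receivers.items()}

# Normalise the scores so they sum to one across the receivers
total = sum(freedom.values())
probs = {name: score / total for name, score in freedom.items()}
print(probs)
```

Because every score is non-negative and the scores are divided by their total, the results behave like probabilities. If every receiver were fully covered the total would be zero, so a real implementation would need to guard against that division.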

Below I show an implemented example of the probabilities for each of the players during a certain frame of the play that we have been observing. Note the red probability scores on top of the players. These probabilities will add up to one across the receiving players.

In [None]:
frame_df = play_df[play_df.frameId == 30]

off_team = frame_df[frame_df.position == 'QB']['team'].values[0]
def_team = 'home' if off_team == 'away' else 'away'

@np.vectorize
def radius_influence(x):
    assert x >= 0

    if x <= 18:
        return 4 + (6/(18**2))*(x**2)
    else:
        return 10

def generate_data_grid(N = 120):
    # Our 2-dimensional distribution will be over variables X and Y
    X = np.linspace(0, _MAX_FIELD_X, N)
    Y = np.linspace(0, _MAX_FIELD_Y, N)
    X, Y = np.meshgrid(X, Y)

    # # Mean vector and covariance matrix
    # mu = np.array([0., 1.])
    # Sigma = np.array([[ 1. , -0.5], [-0.5,  1.5]])

    # Pack X and Y into a single 3-dimensional array
    pos = np.empty(X.shape + (2,))
    pos[:, :, 0] = X
    pos[:, :, 1] = Y

    return X, Y, pos

def sigmoid(x, k):
    return 1 / (1 + np.exp(-k*x))

def weighted_angle(x1, x2, w):
    def normalize(v):
        norm=np.linalg.norm(v, ord=1)
        if norm==0:
            norm=np.finfo(v.dtype).eps
        return v/norm

    norm_weighted = w*normalize(x1) + (1-w)*normalize(x2)

    return np.arctan2(norm_weighted[1], norm_weighted[0]) % (2*np.pi)

def multivariate_gaussian(pos, mu, Sigma):
    """Return the multivariate Gaussian distribution on array pos.

    pos is an array constructed by packing the meshed arrays of variables
    x_1, x_2, x_3, ..., x_k into its _last_ dimension.

    """

    n = mu.shape[0]
    Sigma_det = np.linalg.det(Sigma)
    Sigma_inv = np.linalg.inv(Sigma)
    N = np.sqrt((2*np.pi)**n * Sigma_det)
    # This einsum call calculates (x-mu)T.Sigma-1.(x-mu) in a vectorized
    # way across all the input variables.
    fac = np.einsum('...k,kl,...l->...', pos-mu, Sigma_inv, pos-mu)

    return np.exp(-fac / 2) / N

def generate_sigma(influence_rad, player_speed, distance_from_football):
    R = np.array([[np.cos(influence_rad), -np.sin(influence_rad)],[np.sin(influence_rad), np.cos(influence_rad)]])

    speed_ratio = (player_speed**2)/(_MAX_PLAYER_SPEED**2)

    S = np.array([[radius_influence(distance_from_football) + (radius_influence(distance_from_football)*speed_ratio), 0], 
    [0, radius_influence(distance_from_football) - (radius_influence(distance_from_football)*speed_ratio)]])

    return R@(S**2)@R.T

def generate_mu(player_position, player_vel):
    return player_position + 0.5*player_vel

def set_axis_plots(ax, max_x, max_y) -> None:
    ax.xaxis.set_visible(False)
    ax.yaxis.set_visible(False)

    ax.set_xlim([0, max_x])
    ax.set_ylim([0, max_y])

def convert_orientation(x):
    return (-x + 90)%360

def polar_to_z(r, theta):
    return r * np.exp( 1j * theta)

def deg_to_rad(deg):
    return deg*np.pi/180

plot_size_len = 20
_show_control = False

_MAX_PLAYER_SPEED = 11.3
_MAX_FIELD_Y = 53.3
_MAX_FIELD_X = 120
_MAX_FIELD_PLAYERS = 22

_CPLT = sns.color_palette("husl", 2)
_frame_data = play_df
_times = sorted(play_df.time.unique())
# _stream = data_stream()

_date_format = "%Y-%m-%dT%H:%M:%S.%fZ" 
_mean_interval_ms = np.mean([delta.total_seconds()*1000 for delta in np.diff(np.array([pytz.timezone('US/Eastern').localize(datetime.strptime(date_string, _date_format)) for date_string in _times]))])

_fig = plt.figure(figsize = (plot_size_len, plot_size_len*(_MAX_FIELD_Y/_MAX_FIELD_X)))

_ax_field = plt.gca()

_ax_home = _ax_field.twinx()
_ax_away = _ax_field.twinx()
_ax_jersey = _ax_field.twinx()

_X, _Y, _pos = generate_data_grid()
_ax_football = _ax_field.twinx()

set_axis_plots(_ax_field, _MAX_FIELD_X, _MAX_FIELD_Y)
        
ball_snap_df = _frame_data[(_frame_data.event == 'ball_snap') & (_frame_data.team == 'football')]
_ax_field.axvline(ball_snap_df.x.to_numpy()[0], color = 'k', linestyle = '--')

set_axis_plots(_ax_home, _MAX_FIELD_X, _MAX_FIELD_Y)
set_axis_plots(_ax_away, _MAX_FIELD_X, _MAX_FIELD_Y)
set_axis_plots(_ax_jersey, _MAX_FIELD_X, _MAX_FIELD_Y)
set_axis_plots(_ax_football, _MAX_FIELD_X, _MAX_FIELD_Y)

for idx in range(10,120,10):
    _ax_field.axvline(idx, color = 'k', linestyle = '-', alpha = 0.05)
    
_scat_football = _ax_football.scatter([], [], s = 100, color = 'black')
_scat_home = _ax_home.scatter([], [], s = 500, color = _CPLT[0], edgecolors = 'k')
_scat_away = _ax_away.scatter([], [], s = 500, color = _CPLT[1], edgecolors = 'k')

_scat_jersey_list = []
_scat_number_list = []
_scat_name_list = []
_scat_prob_list = []
_a_dir_list = []
_a_or_list = []
_inf_contours_list = []
for _ in range(_MAX_FIELD_PLAYERS):
    _scat_jersey_list.append(_ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'white'))
    _scat_number_list.append(_ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
    _scat_name_list.append(_ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
    _scat_prob_list.append(_ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'red'))

    _a_dir_list.append(_ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))
    _a_or_list.append(_ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))

    if not _show_control:
        _inf_contours_list.append(_ax_field.contourf([0, 0], [0, 0], [[0,0],[0,0]]))
        
pos_df = frame_df
        
for label in pos_df.team.unique():
    label_data = pos_df[pos_df.team == label]

    if label == 'home':
        _scat_home.set_offsets(np.vstack([label_data.x, label_data.y]).T)
    elif label == 'away':
        _scat_away.set_offsets(np.vstack([label_data.x, label_data.y]).T)
    elif label == 'football':
        _scat_football.set_offsets(np.hstack([label_data.x, label_data.y]))

jersey_df = pos_df[pos_df.jerseyNumber.notnull()]

inf_home_team = 0
inf_away_team = 0

off_team = pos_df[pos_df.position == 'QB']['team'].values[0]
def_team = 'home' if off_team == 'away' else 'away'

Z_def = np.zeros((_pos.shape[0], _pos.shape[1]))

Z_off = []
for (index, row) in pos_df[pos_df.jerseyNumber.notnull()].sort_values(by = 'displayName').reset_index().iterrows():
    _scat_jersey_list[index].set_position((row.x, row.y))
    _scat_jersey_list[index].set_text(row.position)
    _scat_number_list[index].set_position((row.x, row.y+1.9))
    _scat_number_list[index].set_text(int(row.jerseyNumber))
    _scat_name_list[index].set_position((row.x, row.y-1.9))
    _scat_name_list[index].set_text(row.displayName.split()[-1])

    player_orientation_rad = deg_to_rad(convert_orientation(row.o))
    player_direction_rad = deg_to_rad(convert_orientation(row.dir))
    player_speed = row.s
    player_position = np.array([row.x, row.y])
    player_acc = row.a

    speed_w = player_speed/_MAX_PLAYER_SPEED

    player_vel = np.array([np.real(polar_to_z(player_speed, player_direction_rad)), np.imag(polar_to_z(player_speed, player_direction_rad))])
    player_orient = np.array([np.real(polar_to_z(2, player_orientation_rad)), np.imag(polar_to_z(2, player_orientation_rad))])

    influence_rad = weighted_angle(player_vel, player_orient, speed_w)

    distance_from_football = np.sqrt((pos_df[pos_df.displayName == 'Football'].x - player_position[0])**2 + ((pos_df[pos_df.displayName == 'Football'].y - player_position[1]))**2).to_numpy()[0]

    _a_dir_list[index].remove()
    _a_dir_list[index] = _ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))

    _a_or_list[index].remove()
    _a_or_list[index] = _ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))

    sigma = generate_sigma(influence_rad, player_speed, distance_from_football)
    mu = generate_mu(player_position, player_vel)

    Z = multivariate_gaussian(_pos, mu, sigma)
    Z_coarse = np.where(Z > 0.001, Z, np.nan)
    
    if row.team == def_team:
        Z_def += np.where(Z > 0.001, Z, 0)
        
    if row.team == off_team and row.position != 'QB':
        Z_off.append(np.where(Z > 0.001, Z, 0))
#         print(row.displayName)
        
    if row.team == off_team and row.displayName == 'A.J. Green':
        Z_test = np.where(Z > 0.001, Z, 0)
#         print(row.displayName)

    if not _show_control:
        for cont_info in _inf_contours_list[index].collections:
            cont_info.remove()

    if row.team == 'home':
        if _show_control:
            inf_home_team += Z
        else:
            _inf_contours_list[index] = _ax_field.contourf(_X, _Y, Z_coarse, cmap='Reds', levels = 10, alpha = 0.1)
    elif row.team == 'away':
        if _show_control:
            inf_away_team += Z
        else:
            _inf_contours_list[index] = _ax_field.contourf(_X, _Y, Z_coarse, cmap='Greens', levels = 10, alpha = 0.1)

tot_off_freedom = list(np.clip(np.array(Z_off) - Z_def, 0, None))
tot_area = np.sum(tot_off_freedom)
# print(tot_area)
# for (index, row) in pos_df[pos_df.jerseyNumber.notnull()].reset_index().iterrows():
#     if row.team == off_team and row.position != 'QB':
#         tot_area += np.sum(np.clip(np.where(Z > 0.001, Z, 0) - Z_def, 0, None))
        
for (index, row) in pos_df[pos_df.jerseyNumber.notnull()].sort_values(by = 'displayName').reset_index().iterrows():
    if row.team == off_team and row.position != 'QB':
        _scat_prob_list[index].set_position((row.x, row.y+3.9))
        x = np.sum(tot_off_freedom.pop(0))
#         print(x, " ", row.displayName)
        
        _scat_prob_list[index].set_text(f"{x/tot_area:.2f}")

To take a small peek into what is going on, I will show you the underlying free area that is used to calculate the freedom score. This area is found by subtracting the combined influence functions of the defensive players from the receiver's own influence function and zeroing out the negative values. Each offensive player has such an area, and from these we calculate the probabilities shown in the plot above.

In [None]:
plt.figure(figsize = (20, 10))
plt.imshow(np.flip(Z_test, axis = 0), aspect = 'auto')
plt.imshow(np.clip(np.flip(Z_test - Z_def, axis = 0), 0, None), aspect = 'auto', cmap = 'Greens')

ax = plt.gca()

ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)

plt.title("Free Area for A.J. Green on Frame 30")

plt.show()

# print(np.sum(np.clip(np.flip(Z_test - Z_def, axis = 0), 0, None)))

In [None]:
class AnimatePlayPitchControlProb(AnimatePlayPitchControl):
    def __init__(self, play_df, plot_size_len, show_control = False) -> None:
        super().__init__(play_df, plot_size_len, show_control)
    
    def setup_plot(self):
        ax_list = list(super().setup_plot())
        
        self._scat_prob_list = []
        
        temp_player_df = self._frame_data[self._frame_data.event == 'ball_snap']
        
        self._off_team = temp_player_df[temp_player_df.position == 'QB']['team'].values[0]
        self._def_team = 'home' if self._off_team == 'away' else 'away'
        
        for (index, row) in temp_player_df[(temp_player_df.team == self._off_team) & (temp_player_df.position != 'QB')].sort_values(by = 'displayName').reset_index().iterrows():
            self._scat_prob_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'red'))
        
        return (*ax_list, *self._scat_prob_list)
    
    def update(self, anim_frame):
        pos_df = next(self._stream)
        
        for label in pos_df.team.unique():
            label_data = pos_df[pos_df.team == label]

            if label == 'home':
                self._scat_home.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'away':
                self._scat_away.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'football':
                self._scat_football.set_offsets(np.hstack([label_data.x, label_data.y]))

        jersey_df = pos_df[pos_df.jerseyNumber.notnull()]
        
        inf_home_team = 0
        inf_away_team = 0

        # Use the instance grid rather than the notebook-level global _pos
        Z_def = np.zeros((self._pos.shape[0], self._pos.shape[1]))

        Z_off = []
        
        players_df = pos_df[pos_df.jerseyNumber.notnull()].sort_values(by = 'displayName').reset_index()
        
        for (index, row) in players_df.iterrows():
            self._scat_jersey_list[index].set_position((row.x, row.y))
            self._scat_jersey_list[index].set_text(row.position)
            self._scat_number_list[index].set_position((row.x, row.y+1.9))
            self._scat_number_list[index].set_text(int(row.jerseyNumber))
            self._scat_name_list[index].set_position((row.x, row.y-1.9))
            self._scat_name_list[index].set_text(row.displayName.split()[-1])
            
            player_orientation_rad = self.deg_to_rad(self.convert_orientation(row.o))
            player_direction_rad = self.deg_to_rad(self.convert_orientation(row.dir))
            player_speed = row.s
            player_position = np.array([row.x, row.y])
            player_acc = row.a
            
            speed_w = player_speed/self._MAX_PLAYER_SPEED
            
            player_vel = np.array([np.real(self.polar_to_z(player_speed, player_direction_rad)), np.imag(self.polar_to_z(player_speed, player_direction_rad))])
            player_orient = np.array([np.real(self.polar_to_z(2, player_orientation_rad)), np.imag(self.polar_to_z(2, player_orientation_rad))])
            
            influence_rad = self.weighted_angle(player_vel, player_orient, speed_w)
            
            distance_from_football = np.sqrt((pos_df[pos_df.displayName == 'Football'].x - player_position[0])**2 + ((pos_df[pos_df.displayName == 'Football'].y - player_position[1]))**2).to_numpy()[0]
            
            self._a_dir_list[index].remove()
            self._a_dir_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))
            
            self._a_or_list[index].remove()
            self._a_or_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))
            
            sigma = self.generate_sigma(influence_rad, player_speed, distance_from_football)
            mu = self.generate_mu(player_position, player_vel)
            
            Z = self.multivariate_gaussian(self._pos, mu, sigma)
            Z_coarse = np.where(Z > 0.001, Z, np.nan)
            
            if row.team == self._def_team:
                Z_def += np.where(Z > 0.001, Z, 0)

            if row.team == self._off_team and row.position != 'QB':
                Z_off.append(np.where(Z > 0.001, Z, 0))
            
            if not self._show_control:
                for cont_info in self._inf_contours_list[index].collections:
                    cont_info.remove()
            
            if row.team == 'home':
                if self._show_control:
                    inf_home_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Reds', levels = 10, alpha = 0.1)
            elif row.team == 'away':
                if self._show_control:
                    inf_away_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Greens', levels = 10, alpha = 0.1)
        
        if self._show_control:
            for cont_info in self._pitch_control_contour.collections:
                    cont_info.remove()
                
            self._pitch_control_contour = self._ax_field.contourf(self._X, self._Y, self.sigmoid(inf_away_team/len(pos_df[pos_df.team=='away']) - inf_home_team/len(pos_df[pos_df.team=='home']),k = 1000), levels = 50, cmap='PiYG', vmin = 0.45, vmax = 0.55, alpha = 0.7)
            
        tot_off_freedom = list(np.clip(np.array(Z_off) - Z_def, 0, None))
        tot_area = np.sum(tot_off_freedom)
        
        for (index, row) in players_df[(players_df.team == self._off_team) & (players_df.position != 'QB')].sort_values(by = 'displayName').reset_index().iterrows():
            self._scat_prob_list[index].set_position((row.x, row.y+3.9))
            self._scat_prob_list[index].set_text(f"{np.sum(tot_off_freedom.pop(0))/tot_area:.2f}")
            
        return (self._scat_football, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list, *self._scat_prob_list)

In [None]:
animated_play = AnimatePlayPitchControlProb(play_df, 20, show_control=True)
HTML(animated_play.ani.to_jshtml())

Telling whether the quarterback is looking their way is slightly trickier. Unfortunately, the tracking device used to obtain this data only records the orientation of each player's trunk. However, I think that in general the quarterback will be looking over their shoulder, around 30 degrees from the orientation of their trunk. I will leave out left-handed QBs for now, but will need to account for them at some point. We will use this as a proxy for the viewing direction of the quarterback, and assign a radial Gaussian to give ourselves a little margin for error. Then we can multiply this radial Gaussian by the freedom scores of each of the offensive players, and recalculate their probabilities. To choose the Gaussian variance, I will try to mimic human vision, which has a field of view of about 120 degrees. I don't want to give too much weighting to where the quarterback is looking.

The only left-handed QB I currently know of is Tua, with Kellen Moore having retired in 2017. If we check all of the QBs in the players list, they are all right-handed, so we can forget about left-handed QBs.
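
Here is a minimal sketch of how such a viewing-angle weight could be computed for one receiver. The helper names `wrap_angle` and `view_weight` and the example positions are hypothetical; the 0.7 radian scale and the 30 degree offset are the values used in this notebook. The angular offset is wrapped into [-pi, pi) so that, for example, 350 degrees and 10 degrees are treated as 20 degrees apart rather than 340.

```python
import numpy as np
from scipy.stats import norm

GAUSS_SCALE_RAD = 0.7   # same scale as the weighting curve in this notebook
VIEW_OFFSET_DEG = 30.0  # assumed offset between trunk and gaze


def wrap_angle(angle):
    """Map any angle in radians onto the interval [-pi, pi)."""
    return (angle + np.pi) % (2 * np.pi) - np.pi


def view_weight(qb_xy, qb_trunk_rad, receiver_xy):
    """Weight in (0, 1] that peaks when the receiver sits on the gaze line."""
    gaze = qb_trunk_rad + np.deg2rad(VIEW_OFFSET_DEG)
    dx, dy = np.asarray(receiver_xy) - np.asarray(qb_xy)
    offset = wrap_angle(np.arctan2(dy, dx) - gaze)
    peak = norm.pdf(0, loc=0, scale=GAUSS_SCALE_RAD)
    return norm.pdf(offset, loc=0, scale=GAUSS_SCALE_RAD) / peak


# Hypothetical QB at (30, 26) with the trunk at 350 degrees, so the gaze is at 20 degrees
qb = (30.0, 26.0)
trunk = np.deg2rad(350.0)
on_gaze = (30.0 + 10 * np.cos(np.deg2rad(20)), 26.0 + 10 * np.sin(np.deg2rad(20)))
behind = (20.0, 26.0)

print(view_weight(qb, trunk, on_gaze))  # close to 1
print(view_weight(qb, trunk, behind))   # much smaller
```

Multiplying each receiver's freedom score by a weight like this and renormalizing would give the reweighted probabilities described above.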

In [None]:
from scipy.stats import norm

x = np.linspace(-np.pi, np.pi, num = 500)

y_gauss = norm.pdf(x, loc = 0, scale = 0.7)

plt.figure(figsize = (10, 5))

plt.plot(x*180/np.pi, y_gauss/np.max(y_gauss), 'r')

plt.xlabel('Degrees')
plt.ylabel('Weighting')

plt.title("QB Viewing Angle Weighting")

plt.show()

In [None]:
plt.figure(figsize = (20, 10))

ax = plt.gca()
ax.plot([0, 1], [0, 1])

In [None]:
class AnimatePlayPitchControlProbQB(AnimatePlayPitchControlProb):
    def __init__(self, play_df, plot_size_len, show_control = False) -> None:
        super().__init__(play_df, plot_size_len, show_control)
        
        self._GAUSS_SCALE_RAD = 0.7
        self._VIEW_ANGLE = 30
        self._VIEW_WEIGHT_MAX = np.max(norm.pdf(np.linspace(-np.pi, np.pi, num = 500), loc = 0, scale = self._GAUSS_SCALE_RAD))
        self._QB_VIEW_WEIGHTING = 0.7
        
    def setup_plot(self):
        ax_list = list(super().setup_plot())
        
        self._qb_target = self._ax_field.plot([], [], color = 'grey', linewidth = 3)[0]
        self._qb_target_secondary = self._ax_field.plot([], [], color = 'grey', linestyle = '--')[0]
        
        return (*ax_list, self._qb_target, self._qb_target_secondary)
    
    @staticmethod
    def sigmoid(x, a1 = 0.5, a2 = 15):
        return 1/(1 + np.exp(-a1*(x - a2)))
    
    def update(self, anim_frame):
        pos_df = next(self._stream)
        
        for label in pos_df.team.unique():
            label_data = pos_df[pos_df.team == label]

            if label == 'home':
                self._scat_home.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'away':
                self._scat_away.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'football':
                self._scat_football.set_offsets(np.hstack([label_data.x, label_data.y]))

        jersey_df = pos_df[pos_df.jerseyNumber.notnull()]
        
        inf_home_team = 0
        inf_away_team = 0

        # Use the instance grid rather than the notebook-level global _pos
        Z_def = np.zeros((self._pos.shape[0], self._pos.shape[1]))

        Z_off = []
        
        players_df = pos_df[pos_df.jerseyNumber.notnull()].sort_values(by = 'displayName').reset_index()
        
        for (index, row) in players_df.iterrows():
            self._scat_jersey_list[index].set_position((row.x, row.y))
            self._scat_jersey_list[index].set_text(row.position)
            self._scat_number_list[index].set_position((row.x, row.y+1.9))
            self._scat_number_list[index].set_text(int(row.jerseyNumber))
            self._scat_name_list[index].set_position((row.x, row.y-1.9))
            self._scat_name_list[index].set_text(row.displayName.split()[-1])
            
            player_orientation_rad = self.deg_to_rad(self.convert_orientation(row.o))
            player_direction_rad = self.deg_to_rad(self.convert_orientation(row.dir))
            player_speed = row.s
            player_position = np.array([row.x, row.y])
            player_acc = row.a
            
            speed_w = player_speed/self._MAX_PLAYER_SPEED
            
            player_vel = np.array([np.real(self.polar_to_z(player_speed, player_direction_rad)), np.imag(self.polar_to_z(player_speed, player_direction_rad))])
            player_orient = np.array([np.real(self.polar_to_z(2, player_orientation_rad)), np.imag(self.polar_to_z(2, player_orientation_rad))])
            
            influence_rad = self.weighted_angle(player_vel, player_orient, speed_w)
            
            distance_from_football = np.sqrt((pos_df[pos_df.displayName == 'Football'].x - player_position[0])**2 + ((pos_df[pos_df.displayName == 'Football'].y - player_position[1]))**2).to_numpy()[0]
            
            self._a_dir_list[index].remove()
            self._a_dir_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))
            
            self._a_or_list[index].remove()
            self._a_or_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))
            
            sigma = self.generate_sigma(influence_rad, player_speed, distance_from_football)
            mu = self.generate_mu(player_position, player_vel)
            
            Z = self.multivariate_gaussian(self._pos, mu, sigma)
            Z_coarse = np.where(Z > 0.001, Z, np.nan)
            
            if row.team == self._def_team:
                Z_def += np.where(Z > 0.001, Z, 0)

            if row.team == self._off_team and row.position != 'QB':
                Z_off.append(np.where(Z > 0.001, Z, 0))
                
            if row.position == 'QB':
                qb_position = np.array([row.x, row.y])
                qb_orientation = player_orientation_rad
                qb_view_orientation = player_orientation_rad + (self.deg_to_rad(self._VIEW_ANGLE))
                
#                 print(qb_orientation*180/np.pi)
                
            if not self._show_control:
                for cont_info in self._inf_contours_list[index].collections:
                    cont_info.remove()
            
            if row.team == 'home':
                if self._show_control:
                    inf_home_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Reds', levels = 10, alpha = 0.1)
            elif row.team == 'away':
                if self._show_control:
                    inf_away_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Greens', levels = 10, alpha = 0.1)
        
        if self._show_control:
            for cont_info in self._pitch_control_contour.collections:
                    cont_info.remove()
                
            self._pitch_control_contour = self._ax_field.contourf(self._X, self._Y, self.sigmoid(inf_away_team/len(pos_df[pos_df.team=='away']) - inf_home_team/len(pos_df[pos_df.team=='home']),k = 1000), levels = 50, cmap='PiYG', vmin = 0.45, vmax = 0.55, alpha = 0.7)
            
        tot_off_freedom = list(np.clip(np.array(Z_off) - Z_def, 0, None))
        
        for (index, row) in players_df[(players_df.team == self._off_team) & (players_df.position != 'QB')].sort_values(by = 'displayName').reset_index().iterrows():
            diff_vector = np.array([row.x, row.y]) - qb_position
            angle_from_qb = self.deg_to_rad(np.arctan2(diff_vector[-1], diff_vector[0])*180/np.pi)%(2*np.pi)
            
#             if row.displayName == 'A.J. Green':
#                 print(angle_from_qb*180/np.pi)
#                 print(self.convert_orientation(np.arctan2(diff_vector[-1], diff_vector[0])*180/np.pi))
            
            weighting = stats.norm.pdf(angle_from_qb, loc = qb_view_orientation, scale = self._GAUSS_SCALE_RAD)/self._VIEW_WEIGHT_MAX
        
            weighting_distance = self.sigmoid(np.linalg.norm(np.array([row.x, row.y]) - qb_position))
            
            tot_off_freedom[index] = weighting_distance*((1 - self._QB_VIEW_WEIGHTING)*tot_off_freedom[index] + self._QB_VIEW_WEIGHTING*tot_off_freedom[index]*weighting)
        
        tot_area = np.sum(tot_off_freedom)
        
        prob_list = []
        plot_list = []
        for (index, row) in players_df[(players_df.team == self._off_team) & (players_df.position != 'QB')].sort_values(by = 'displayName').reset_index().iterrows():
            self._scat_prob_list[index].set_position((row.x, row.y+3.9))
            cur_prob = np.sum(tot_off_freedom.pop(0))/tot_area
            prob_list.append(cur_prob)
            plot_list.append(([row.x, qb_position[0]], [row.y, qb_position[-1]]))
            self._scat_prob_list[index].set_text(f"{cur_prob:.2f}")
            
        # Rank the receivers once; the two most likely targets get the target lines
        ranked = np.argsort(prob_list)
        self._qb_target.set_data(*plot_list[ranked[-1]])
        self._qb_target_secondary.set_data(*plot_list[ranked[-2]])
            
        return (self._scat_football, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list, *self._scat_prob_list, self._qb_target, self._qb_target_secondary)
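
One caveat with the view weighting in `update`: `angle_from_qb` is reduced modulo 2π, but it is compared with `qb_view_orientation` as a plain number, so two nearly identical directions on either side of the 0/2π boundary are treated as almost a full turn apart. The sketch below shows one possible way to measure the gap on the circle instead. It is not part of the classes in this notebook: `wrapped_diff` and `view_weight` are hypothetical helpers, the Gaussian is assumed to be normalised to a peak of 1 (roughly what dividing by `_VIEW_WEIGHT_MAX` does), and the default scale of 0.7 is only an illustrative value.

```python
import math

def wrapped_diff(a, b):
    """Smallest signed difference a - b between two angles (radians), in [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))

def view_weight(angle_from_qb, qb_view_orientation, scale=0.7):
    """Peak-normalised Gaussian of the wrapped angular gap (hypothetical helper)."""
    d = wrapped_diff(angle_from_qb, qb_view_orientation)
    return math.exp(-0.5 * (d / scale) ** 2)

# Two directions 0.2 rad apart that straddle the 0/2*pi boundary
near_zero = 0.1
near_two_pi = 2 * math.pi - 0.1
print(wrapped_diff(near_zero, near_two_pi))
print(view_weight(near_zero, near_two_pi))
```

With the wrapped difference, a receiver just across the boundary from the quarterback's view direction still gets a weight close to 1 instead of close to 0.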

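We don't want the likelihood for players who are close to the quarterback to be high, so each receiver's score is scaled by a sigmoid of their distance from the quarterback. The curve below uses the same parameters as the `sigmoid` method in the next class.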

In [None]:
a1 = 0.5
a2 = 15

x = np.linspace(-10, 40, 100) 
sigmoid = lambda x, a1, a2: 1/(1 + np.exp(-a1*(x - a2)))

plt.plot(x, sigmoid(x, a1, a2))
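
To get a feel for the numbers, here is a small sketch that evaluates the same curve at a few distances. `distance_weight` is a hypothetical name used only for illustration, and I am assuming the distances are in yards, as in the tracking data.

```python
import math

def distance_weight(dist, a1=0.5, a2=15):
    """Logistic weight on a receiver's distance from the QB (same form as the plot above)."""
    return 1 / (1 + math.exp(-a1 * (dist - a2)))

for d in (5, 15, 25):
    print(d, round(distance_weight(d), 3))
```

With `a1 = 0.5` and `a2 = 15`, a receiver 5 units from the quarterback keeps under 1% of their score, one at 15 keeps exactly half, and one at 25 keeps over 99%.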

Note that all of this only works on conventional pass plays. On trick plays the probabilities will not make sense, because the visualizations assume that the quarterback is the one throwing the ball.
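
A cheap guard is to check that the offense has exactly one quarterback on the field before trusting the probabilities. This is only a sketch and is not wired into the classes here: `has_single_qb` is a hypothetical helper, and it assumes a single-frame dataframe with the `team` and `position` columns used throughout this notebook.

```python
import pandas as pd

def has_single_qb(frame_df, off_team):
    """Return True when exactly one player on the offensive team is listed as QB."""
    offense = frame_df[frame_df.team == off_team]
    return int((offense.position == 'QB').sum()) == 1

# Made-up frame: one QB and one WR at home, one CB away, plus the football row
frame = pd.DataFrame({
    'team': ['home', 'home', 'away', 'football'],
    'position': ['QB', 'WR', 'CB', None],
})

print(has_single_qb(frame, 'home'))
print(has_single_qb(frame, 'away'))
```

This does not catch every trick play (a halfback pass still has a quarterback on the field), but it does flag frames where `update` would find no quarterback at all.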

In [None]:
class AnimatePlayPitchControlFullAnnotation(AnimatePlayPitchControlProbQB):
    def __init__(self, play_df, plot_size_len, show_control = False) -> None:
        super().__init__(play_df, plot_size_len, show_control)
        
        self._OFFENSE_POS = ['OL','OG','LG','RG','C' ,'OT','LT','RT','TE','WR','QB','HB','RB','TB','FB']
        self._SPECIAL_POS = ['P', 'K', 'LS', 'H']
        self._DEFENSE_POS = ['SS','FS', 'CB','DB','S','SAF','DE','DT','NT','DL','ILB','OLB','MLB','LB']
        
        self._def_player_num = len(self._frame_data[self._frame_data.position.isin(self._DEFENSE_POS)].nflId.unique())
        self._MAX_COVERAGE = 2 # Max Offensive players a defender can cover
        self._MAX_TARGETED = self._def_player_num # Max number of defensive players that can target any one offensive player
        
#         self._Cov_G = None # Placeholder for graph we will use to assign coverages
        
        self._MIN_DIST_FOR_DOUBLE = 10
        
        self._Cov_G = nx.DiGraph()
        
#         print(self._frame_data.frameId.unique()[:3])
        
#         self._frame_data = self._frame_data[self._frame_data.frameId.isin(self._frame_data.frameId.unique()[3:])]
        
#         self._GAUSS_SCALE_RAD = 0.7
#         self._VIEW_ANGLE = 30
#         self._VIEW_WEIGHT_MAX = np.max(norm.pdf(np.linspace(-np.pi, np.pi, num = 500), loc = 0, scale = self._GAUSS_SCALE_RAD))
#         self._QB_VIEW_WEIGHTING = 0.7
        
    def setup_plot(self):
        ax_list = list(super().setup_plot())
        
#         self._qb_target = self._ax_field.plot([], [], color = 'grey', linewidth = 3)[0]
#         self._qb_target_secondary = self._ax_field.plot([], [], color = 'grey', linestyle = '--')[0]
        
        self._def_coverage_list = []
        for _ in range(self._def_player_num):
            for _ in range(self._MAX_COVERAGE):
                self._def_coverage_list.append(self._ax_field.plot([], [], color = 'k', linestyle = '-', linewidth = 0.5)[0])
                
        
        return (*ax_list, *self._def_coverage_list)
    
    @staticmethod
    def sigmoid(x, a1 = 0.5, a2 = 15):
        return 1/(1 + np.exp(-a1*(x - a2)))
    
    def update(self, anim_frame):
        pos_df = next(self._stream)
        
        for label in pos_df.team.unique():
            label_data = pos_df[pos_df.team == label]

            if label == 'home':
                self._scat_home.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'away':
                self._scat_away.set_offsets(np.vstack([label_data.x, label_data.y]).T)
            elif label == 'football':
                self._scat_football.set_offsets(np.hstack([label_data.x, label_data.y]))

        jersey_df = pos_df[pos_df.jerseyNumber.notnull()]
        
        inf_home_team = 0
        inf_away_team = 0

        Z_def = np.zeros((self._pos.shape[0], self._pos.shape[1]))

        Z_off = []
        
        players_df = pos_df[pos_df.jerseyNumber.notnull()].sort_values(by = 'displayName').reset_index()
        
        for (index, row) in players_df.iterrows():
            self._scat_jersey_list[index].set_position((row.x, row.y))
            self._scat_jersey_list[index].set_text(row.position)
            self._scat_number_list[index].set_position((row.x, row.y+1.9))
            self._scat_number_list[index].set_text(int(row.jerseyNumber))
            self._scat_name_list[index].set_position((row.x, row.y-1.9))
            self._scat_name_list[index].set_text(row.displayName.split()[-1])
            
            player_orientation_rad = self.deg_to_rad(self.convert_orientation(row.o))
            player_direction_rad = self.deg_to_rad(self.convert_orientation(row.dir))
            player_speed = row.s
            player_position = np.array([row.x, row.y])
            player_acc = row.a
            
            speed_w = player_speed/self._MAX_PLAYER_SPEED
            
            player_vel = np.array([np.real(self.polar_to_z(player_speed, player_direction_rad)), np.imag(self.polar_to_z(player_speed, player_direction_rad))])
            player_orient = np.array([np.real(self.polar_to_z(2, player_orientation_rad)), np.imag(self.polar_to_z(2, player_orientation_rad))])
            
            influence_rad = self.weighted_angle(player_vel, player_orient, speed_w)
            
            distance_from_football = np.sqrt((pos_df[pos_df.displayName == 'Football'].x - player_position[0])**2 + ((pos_df[pos_df.displayName == 'Football'].y - player_position[1]))**2).to_numpy()[0]
            
            self._a_dir_list[index].remove()
            self._a_dir_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))
            
            self._a_or_list[index].remove()
            self._a_or_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))
            
            sigma = self.generate_sigma(influence_rad, player_speed, distance_from_football)
            mu = self.generate_mu(player_position, player_vel)
            
            Z = self.multivariate_gaussian(self._pos, mu, sigma)
            Z_coarse = np.where(Z > 0.001, Z, np.nan)
            
            if row.team == self._def_team:
                Z_def += np.where(Z > 0.001, Z, 0)

            if row.team == self._off_team and row.position != 'QB':
                Z_off.append(np.where(Z > 0.001, Z, 0))
                
            if row.position == 'QB':
                qb_position = np.array([row.x, row.y])
                qb_orientation = player_orientation_rad
                qb_view_orientation = player_orientation_rad + (self.deg_to_rad(self._VIEW_ANGLE))
                
#                 print(qb_orientation*180/np.pi)
                
            if not self._show_control:
                for cont_info in self._inf_contours_list[index].collections:
                    cont_info.remove()
            
            if row.team == 'home':
                if self._show_control:
                    inf_home_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Reds', levels = 10, alpha = 0.1)
            elif row.team == 'away':
                if self._show_control:
                    inf_away_team += Z
                else:
                    self._inf_contours_list[index] = self._ax_field.contourf(self._X, self._Y, Z_coarse, cmap='Greens', levels = 10, alpha = 0.1)
        
        if self._show_control:
            for cont_info in self._pitch_control_contour.collections:
                    cont_info.remove()
                
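            # The sigmoid override in this class takes (a1, a2) and has no k argument.
            # Assuming the parent sigmoid is 1/(1 + exp(-k*x)), a1 = 1000 and a2 = 0 reproduce k = 1000.
            control_diff = inf_away_team/len(pos_df[pos_df.team=='away']) - inf_home_team/len(pos_df[pos_df.team=='home'])
            self._pitch_control_contour = self._ax_field.contourf(self._X, self._Y, self.sigmoid(control_diff, a1 = 1000, a2 = 0), levels = 50, cmap='PiYG', vmin = 0.45, vmax = 0.55, alpha = 0.7)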
            
        tot_off_freedom = list(np.clip(np.array(Z_off) - Z_def, 0, None))
        
        for (index, row) in players_df[(players_df.team == self._off_team) & (players_df.position != 'QB')].sort_values(by = 'displayName').reset_index().iterrows():
            diff_vector = np.array([row.x, row.y]) - qb_position
            angle_from_qb = self.deg_to_rad(np.arctan2(diff_vector[-1], diff_vector[0])*180/np.pi)%(2*np.pi)
            
#             if row.displayName == 'A.J. Green':
#                 print(angle_from_qb*180/np.pi)
#                 print(self.convert_orientation(np.arctan2(diff_vector[-1], diff_vector[0])*180/np.pi))
            
            weighting = stats.norm.pdf(angle_from_qb, loc = qb_view_orientation, scale = self._GAUSS_SCALE_RAD)/self._VIEW_WEIGHT_MAX
        
            weighting_distance = self.sigmoid(np.linalg.norm(np.array([row.x, row.y]) - qb_position))
            
            tot_off_freedom[index] = weighting_distance*((1 - self._QB_VIEW_WEIGHTING)*tot_off_freedom[index] + self._QB_VIEW_WEIGHTING*tot_off_freedom[index]*weighting)
        
        tot_area = np.sum(tot_off_freedom)
        
        prob_list = []
        plot_list = []
        for (index, row) in players_df[(players_df.team == self._off_team) & (players_df.position != 'QB')].sort_values(by = 'displayName').reset_index().iterrows():
            self._scat_prob_list[index].set_position((row.x, row.y+3.9))
            cur_prob = np.sum(tot_off_freedom.pop(0))/tot_area
            prob_list.append(cur_prob)
            plot_list.append(([row.x, qb_position[0]], [row.y, qb_position[-1]]))
            self._scat_prob_list[index].set_text(f"{cur_prob:.2f}")
            
        # Rank the receivers once; the two most likely targets get the target lines
        ranked = np.argsort(prob_list)
        self._qb_target.set_data(*plot_list[ranked[-1]])
        self._qb_target_secondary.set_data(*plot_list[ranked[-2]])
        
        # Defense Coverages
        
        pos_df = players_df[['x', 'y']]
    
        dist = pdist(pos_df, 'euclidean')
        dist_df = pd.DataFrame(np.round(squareform(dist)*1000).astype(int))
        
        distance_df = dist_df.loc[players_df.position.isin(self._DEFENSE_POS).to_numpy(), players_df.position.isin(self._OFFENSE_POS).to_numpy()]
        
#         if self._Cov_G is None:
        if True:
            self._offense_nodes = players_df.nflId.to_numpy()[list(distance_df.columns)]
            self._defense_nodes = players_df.nflId.to_numpy()[list(distance_df.index.values)]

            self._Cov_G.clear()
            self._Cov_G.add_nodes_from(self._defense_nodes, bipartite = 0)
            self._Cov_G.add_nodes_from(self._offense_nodes, bipartite = 1)
            self._Cov_G.add_nodes_from(['s','e'])

            self._end_edge_list = [(off_id, 'e', 1, self._MAX_TARGETED) for off_id in self._offense_nodes]
        
#         print("****")
#         print(self._Cov_G.edges())
            
#         self._Cov_G.remove_edges_from(self._Cov_G.edges())
            
        start_edge_list = [('s', def_id, 1, coverage) for def_id, coverage in zip(self._defense_nodes, [self._MAX_COVERAGE if x >= np.round(self._MIN_DIST_FOR_DOUBLE*1000) else 1 for x in np.min(distance_df.to_numpy(), axis = 1)])]
        
        player_id_arr = players_df.nflId.to_numpy()

        player_edge_list = []
        for (idx_def, row_def) in distance_df.iterrows():
            for (idx_off, euc_dist) in row_def.items():
                player_edge_list.append((player_id_arr[idx_def], player_id_arr[idx_off], euc_dist, 1))
                
#         print(player_edge_list)

        edge_list = start_edge_list + self._end_edge_list + player_edge_list

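        # Add every edge with its cost (weight) and capacity
        for (node_from, node_to, weight, capacity) in edge_list:
            self._Cov_G.add_edge(node_from, node_to, weight = weight, capacity = capacity)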
        
#         print(self._Cov_G.edges())

        flow_dict = nx.max_flow_min_cost(self._Cov_G, "s", "e")
        
#         print(flow_dict)
        
        list_idx = 0
        for def_id in flow_dict.keys():
            if def_id not in self._defense_nodes:
                continue
            def_pos = (players_df[players_df.nflId == def_id].x, players_df[players_df.nflId == def_id].y)
            coverage_count = 0
            for off_id in np.array(list(flow_dict[def_id].keys()))[np.array(list(flow_dict[def_id].values()))>0]:
                off_pos = (players_df[players_df.nflId == off_id].x, players_df[players_df.nflId == off_id].y)
                
#                 print(self._def_coverage_list)
#                 print(list_idx)

#                 print([def_pos[0].values[0], off_pos[0].values[0]])
#                 print([def_pos[1], off_pos[1]])
                
                
                self._def_coverage_list[list_idx].set_data([def_pos[0].values[0], off_pos[0].values[0]], [def_pos[1].values[0], off_pos[1].values[0]])
                
                coverage_count += 1
                list_idx += 1
                
            while coverage_count < self._MAX_COVERAGE:
                coverage_count += 1
                self._def_coverage_list[list_idx].set_data([], [])
                list_idx += 1
                
#         print("Here")
#         print(self._def_coverage_list)
        
        return (self._scat_football, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list, *self._scat_prob_list, self._qb_target, self._qb_target_secondary, *self._def_coverage_list)
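
The defensive coverage step at the end of `update` is a min-cost max-flow problem: a source `s` feeds each defender, each defender has an edge to every offensive player weighted by distance, and every offensive player drains into a sink `e`. Here is a minimal stand-alone sketch of that idea on made-up data. The player labels and distances are hypothetical, and each defender is capped at one assignment to keep the example small.

```python
import networkx as nx

# Hypothetical integer distances from each defender to each offensive player
distances = {
    ('D1', 'O1'): 2000, ('D1', 'O2'): 9000,
    ('D2', 'O1'): 3000, ('D2', 'O2'): 2500,
}
defenders = ['D1', 'D2']
offense = ['O1', 'O2']

G = nx.DiGraph()
for d in defenders:
    # Source edge: capacity = how many players this defender may cover
    G.add_edge('s', d, weight=1, capacity=1)
for o in offense:
    # Sink edge: capacity = how many defenders may target this player
    G.add_edge(o, 'e', weight=1, capacity=len(defenders))
for (d, o), dist in distances.items():
    # Defender -> offense edge: cost is the distance, at most one unit of flow
    G.add_edge(d, o, weight=dist, capacity=1)

flow = nx.max_flow_min_cost(G, 's', 'e')

# Keep only the defender -> offense edges that actually carry flow
coverage = {d: [o for o, f in flow[d].items() if f > 0] for d in defenders}
print(coverage)
```

Integer distances matter here: the networkx documentation warns that `max_flow_min_cost` is not guaranteed to work with floating point weights, which fits with the class multiplying distances by 1000 and rounding before building the graph.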

In [None]:
game_idx = 0
play_idx = 8

unique_game_ids = week_df.gameId.unique()
unique_play_ids = week_df[week_df.gameId == unique_game_ids[game_idx]].playId.unique()

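# playId is only unique within a game, so filter on gameId as well
play_df = week_df[(week_df.gameId == unique_game_ids[game_idx]) & (week_df.playId == unique_play_ids[play_idx])].sort_values(by = 'time')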
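
As far as I can tell from the data description, `playId` is only unique within a game, so selecting a play needs `gameId` too. A toy illustration with made-up rows (all values below are hypothetical):

```python
import pandas as pd

toy_df = pd.DataFrame({
    'gameId': [1, 1, 2, 2],
    'playId': [75, 90, 75, 120],
    'displayName': ['A', 'B', 'C', 'D'],
})

# Filtering on playId alone mixes rows from two different games
by_play_only = toy_df[toy_df.playId == 75]

# Filtering on both keys isolates the single play we want
by_game_and_play = toy_df[(toy_df.gameId == 1) & (toy_df.playId == 75)]

print(len(by_play_only), len(by_game_and_play))
```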

In [None]:
play_df.frameId.isin(play_df.frameId.unique()[:3])

In [None]:
# animated_play = AnimatePlayPitchControlProbQB(play_df, 20, show_control=False)
# HTML(animated_play.ani.to_jshtml())

animated_play = AnimatePlayPitchControlFullAnnotation(play_df, 20, show_control=False)
HTML(animated_play.ani.to_jshtml())

In [None]:
import os

os.getcwd()

In [None]:
animated_play = AnimatePlayPitchControlFullAnnotation(play_df, 20, show_control=False)
animated_play.ani.save('/kaggle/working/animation.gif', writer='imagemagick')

In [None]:
from IPython.display import HTML, Image

Image(url='animation.gif')

In [None]:
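import os
from IPython.display import FileLink

# FileLink takes a path relative to the current directory, so move to where the GIF was saved
os.chdir('/kaggle/working')
FileLink('animation.gif')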

In [None]:
# Position groups, defined before they are first used
OFFENSE_POS = ['OL','OG','LG','RG','C' ,'OT','LT','RT','TE','WR','QB','HB','RB','TB','FB']
SPECIAL_POS = ['P', 'K', 'LS', 'H']
DEFENSE_POS = ['SS','FS', 'CB','DB','S','SAF','DE','DT','NT','DL','ILB','OLB','MLB','LB']

# Number of distinct offensive players tracked on this play
len(play_df[play_df.position.isin(OFFENSE_POS)].nflId.unique())

In [None]:
unique_frames = play_df.frameId.unique()
num_offense = play_df[play_df.frameId == unique_frames[0]].position.isin(OFFENSE_POS).sum()
num_frame_entities = len(play_df[play_df.frameId == unique_frames[0]])

num_offense

In [None]:
from scipy.spatial.distance import pdist, squareform

unique_frames = play_df.frameId.unique()
num_offense = play_df[play_df.frameId == unique_frames[0]].position.isin(OFFENSE_POS).sum()

closest_id = np.array([])
closest_dist = np.array([])

for idx, frame in enumerate(play_df.frameId.unique()):
    
    if idx > 1:
        break
    
    frame_df = play_df[play_df.frameId == frame]
    
    player_df = frame_df[frame_df.nflId.notnull()].sort_values(by = 'displayName')
    pos_df = player_df[['x', 'y']]
    
    dist = pdist(pos_df, 'euclidean')
    dist_df = pd.DataFrame(np.round(squareform(dist)*1000).astype(int))
    
#     closest_id = np.hstack([closest_id, player_df[player_df.position.isin(OFFENSE_POS)].nflId.to_numpy()[np.argmin(dist_df.loc[:, player_df.position.isin(OFFENSE_POS).to_numpy()].to_numpy(), axis = 1)], np.nan])
#     closest_dist = np.hstack([closest_dist, np.min(dist_df.loc[:, player_df.position.isin(OFFENSE_POS).to_numpy()].to_numpy(), axis = 1), np.nan])

In [None]:
player_df

In [None]:
player_df[player_df.position.isin(OFFENSE_POS)].nflId.to_numpy()
player_df[player_df.position.isin(DEFENSE_POS)].nflId.to_numpy()

In [None]:
distance_df = dist_df.loc[player_df.position.isin(DEFENSE_POS).to_numpy(), player_df.position.isin(OFFENSE_POS).to_numpy()]
distance_df
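
The same slicing can be seen on a tiny made-up example. The coordinates and the defense/offense masks below are hypothetical; the point is only to show how `squareform` gives the full pairwise matrix and how boolean masks pick out the defender-by-offense block.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three made-up players: one defender at the origin and two offensive players
xy = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
is_defense = np.array([True, False, False])
is_offense = ~is_defense

# Full symmetric distance matrix, scaled by 1000 and rounded to integers
full = np.round(squareform(pdist(xy, 'euclidean')) * 1000).astype(int)

# Rows = defenders, columns = offensive players
def_to_off = full[np.ix_(is_defense, is_offense)]
print(def_to_off)
```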

In [None]:
player_df.nflId.to_numpy()

In [None]:
def find_name_from_id(search_id):
    """Look up a player's display name in players_df from their nflId."""
    return players_df[players_df.nflId == search_id].displayName.values[0]



player_edge_list

In [None]:
min_dist_for_double = 10


In [None]:
offense_nodes = player_df.nflId.to_numpy()[list(distance_df.columns)]
defense_nodes = player_df.nflId.to_numpy()[list(distance_df.index.values)]

Cov_G = nx.DiGraph()
Cov_G.add_nodes_from(defense_nodes, bipartite = 0)
Cov_G.add_nodes_from(offense_nodes, bipartite = 1)
Cov_G.add_nodes_from(['s','e'])


end_edge_list = [(off_id, 'e', 1, 2) for off_id in offense_nodes]

# distance_df holds distances multiplied by 1000, so scale the threshold to match
start_edge_list = [('s', def_id, 1, coverage) for def_id, coverage in zip(defense_nodes, [2 if x >= min_dist_for_double*1000 else 1 for x in np.min(distance_df.to_numpy(), axis = 1)])]

player_id_arr = player_df.nflId.to_numpy()

player_edge_list = []
for (idx_def, row_def) in distance_df.iterrows():
    for (idx_off, euc_dist) in row_def.items():
        player_edge_list.append((player_id_arr[idx_def], player_id_arr[idx_off], euc_dist, 1))
        
edge_list = start_edge_list + end_edge_list + player_edge_list

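# Add every edge with its cost (weight) and capacity
for (node_from, node_to, weight, capacity) in edge_list:
    Cov_G.add_edge(node_from, node_to, weight = weight, capacity = capacity)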

flow_dict = nx.max_flow_min_cost(Cov_G, "s", "e")

In [None]:
flow_dict

In [None]:
np.array(list(flow_dict[2495613].keys()))

list(flow_dict[2495613].values())

In [None]:
np.array(list(flow_dict[2495613].keys()))[np.array(list(flow_dict[2495613].values()))>0]

In [None]:
player_df[player_df.nflId == 2507828.]

In [None]:
find_name_from_id(player_id_arr[0])

In [None]:
B = nx.Graph()
B.add_nodes_from([1,2,3,4], bipartite=0) # Add the node attribute "bipartite"
B.add_nodes_from(['abc','bcd','cef'], bipartite=1)
B.add_nodes_from(['s','e'])

myEdges = [
    ('s', 1, 1, 2, 1),
    ('s', 2, 1, 2, 1),
    ('s', 3, 1, 2, 1),
    ('s', 4, 1, 2, 1),
    
    (1,'abc', 1, 1, 1), 
    (1,'bcd', 1, 1, 2), 
    (2,'bcd', 5, 1, 1), 
    (2,'cef', 6, 1, 2), 
    (3,'cef', 4, 1, 3), 
    (4,'abc', 7, 1, 3),
    
    ('abc', 'e', 1, 2, 1),
    ('cef', 'e', 1, 2, 1),
    ('bcd', 'e', 1, 2, 1)
]

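# Add every edge with its weight, capacity and length attributes
for (node_from, node_to, weight, capacity, length) in myEdges:
    B.add_edge(node_from, node_to, weight = weight, capacity = capacity, length = length)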

B.edges()

In [None]:
B.remove_edges_from(B.edges())

B.edges()

In [None]:
import networkx as nx

play_df.loc[play_df.frameId == 1]

In [None]:
# def __init__(self, play_df, plot_size_len, show_control = True) -> None:
#     super().__init__(play_df, plot_size_len)
#     """Initializes the datasets used to animate the play.

#     Parameters
#     ----------
#     play_df : DataFrame
#         Dataframe corresponding to the play information for the play that requires
#         animation. This data will come from the weeks dataframe and contains position
#         and velocity information for each of the players and the football.

#     Returns
#     -------
#     None
#     """
#     self._MAX_PLAYER_SPEED = 11.3
#     self._X, self._Y, self._pos = self.generate_data_grid()

#     self._ax_football = self._ax_field.twinx()

#     self._show_control = show_control
#     plt.close()



# def data_stream():
#     for time in _times:
#         yield _frame_data[self._frame_data.time == time]

In [None]:

        

# if _show_control:
#     _pitch_control_contour = _ax_field.contourf([0, 0], [0, 0], [[0,0],[0,0]])

In [None]:
plt.figure(figsize = (20, 10))
plt.imshow(np.flip(Z_test, axis = 0), aspect = 'auto')
plt.imshow(np.clip(np.flip(Z_test - Z_def, axis = 0), 0, None), aspect = 'auto')

print(np.sum(np.clip(np.flip(Z_test - Z_def, axis = 0), 0, None)))

In [None]:
self.set_axis_plots(self._ax_field, self._MAX_FIELD_X, self._MAX_FIELD_Y)
        
ball_snap_df = self._frame_data[(self._frame_data.event == 'ball_snap') & (self._frame_data.team == 'football')]
self._ax_field.axvline(ball_snap_df.x.to_numpy()[0], color = 'k', linestyle = '--')

self.set_axis_plots(self._ax_home, self._MAX_FIELD_X, self._MAX_FIELD_Y)
self.set_axis_plots(self._ax_away, self._MAX_FIELD_X, self._MAX_FIELD_Y)
self.set_axis_plots(self._ax_jersey, self._MAX_FIELD_X, self._MAX_FIELD_Y)
self.set_axis_plots(self._ax_football, self._MAX_FIELD_X, self._MAX_FIELD_Y)

for idx in range(10,120,10):
    self._ax_field.axvline(idx, color = 'k', linestyle = '-', alpha = 0.05)

self._scat_football = self._ax_football.scatter([], [], s = 100, color = 'black')
self._scat_home = self._ax_home.scatter([], [], s = 500, color = self._CPLT[0], edgecolors = 'k')
self._scat_away = self._ax_away.scatter([], [], s = 500, color = self._CPLT[1], edgecolors = 'k')

self._scat_jersey_list = []
self._scat_number_list = []
self._scat_name_list = []
self._a_dir_list = []
self._a_or_list = []
self._inf_contours_list = []
for _ in range(self._MAX_FIELD_PLAYERS):
    self._scat_jersey_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'white'))
    self._scat_number_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))
    self._scat_name_list.append(self._ax_jersey.text(0, 0, '', horizontalalignment = 'center', verticalalignment = 'center', c = 'black'))

    self._a_dir_list.append(self._ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))
    self._a_or_list.append(self._ax_field.add_patch(Arrow(0, 0, 0, 0, color = 'k')))

    if not self._show_control:
        self._inf_contours_list.append(self._ax_field.contourf([0, 0], [0, 0], [[0,0],[0,0]]))

if self._show_control:
    self._pitch_control_contour = self._ax_field.contourf([0, 0], [0, 0], [[0,0],[0,0]])

return (self._scat_football, self._scat_home, self._scat_away, *self._scat_jersey_list, *self._scat_number_list, *self._scat_name_list)

for (index, row) in pos_df[frame_df.jerseyNumber.notnull()].reset_index().iterrows():
    self._scat_jersey_list[index].set_position((row.x, row.y))
    self._scat_jersey_list[index].set_text(row.position)
    self._scat_number_list[index].set_position((row.x, row.y+1.9))
    self._scat_number_list[index].set_text(int(row.jerseyNumber))
    self._scat_name_list[index].set_position((row.x, row.y-1.9))
    self._scat_name_list[index].set_text(row.displayName.split()[-1])

    player_orientation_rad = self.deg_to_rad(self.convert_orientation(row.o))
    player_direction_rad = self.deg_to_rad(self.convert_orientation(row.dir))
    player_speed = row.s
    player_position = np.array([row.x, row.y])
    player_acc = row.a

    speed_w = player_speed/self._MAX_PLAYER_SPEED

    player_vel = np.array([np.real(self.polar_to_z(player_speed, player_direction_rad)), np.imag(self.polar_to_z(player_speed, player_direction_rad))])
    player_orient = np.array([np.real(self.polar_to_z(2, player_orientation_rad)), np.imag(self.polar_to_z(2, player_orientation_rad))])

    influence_rad = self.weighted_angle(player_vel, player_orient, speed_w)

    distance_from_football = np.sqrt((pos_df[pos_df.displayName == 'Football'].x - player_position[0])**2 + ((pos_df[pos_df.displayName == 'Football'].y - player_position[1]))**2).to_numpy()[0]

    self._a_dir_list[index].remove()
    self._a_dir_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_vel[0], player_vel[1], color = 'k'))

    self._a_or_list[index].remove()
    self._a_or_list[index] = self._ax_field.add_patch(Arrow(row.x, row.y, player_orient[0], player_orient[1], color = 'grey', width = 2))

    sigma = self.generate_sigma(influence_rad, player_speed, distance_from_football)
    mu = self.generate_mu(player_position, player_vel)

    Z = self.multivariate_gaussian(self._pos, mu, sigma)