# Welcome to the Sports Analytics + Visualization Notebook

# 1. Analyzing the NFL games

The first dataset we will look at contains information about the NFL games. Datasets of this kind are helpful for understanding how a sport's season was, or will be, played out.


This dataset contains the following information:

- gameId: Game identifier, unique (numeric)

- gameDate: Game Date (time, mm/dd/yyyy)

- gameTimeEastern: Start time of game (time, HH:MM:SS, EST)

- homeTeamAbbr: Home team three-letter code (text)

- visitorTeamAbbr: Visiting team three-letter code (text)

- week: Week of game (numeric)

**Let us start by importing the necessary libraries.**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import datetime

%matplotlib inline

**Next, importing the CSV file called `games.csv` which contains information about the games.**

In [None]:
# Reading in the CSV file as a DataFrame 
games_df = pd.read_csv('../input/beginners-sports-analytics-nfl-dataset/games.csv')

In [None]:
# Looking at the first five rows
games_df.head()

**Let us look at the shape of the DataFrame to determine how many games were played out in the 2018 NFL season.**

In [None]:
# Viewing the shape of the DataFrame
games_df.shape

**Before we begin our analysis, let us convert the date and time columns to Pandas datetime values.** 

**This will help to standardize such data across the multiple datasets that we work with, and it will also let us use ready-made functions.**

In [None]:
# Converting to datetime.date values
games_df['gameDate'] = pd.to_datetime(games_df['gameDate']).dt.date

# Converting to datetime.time values
games_df['gameTimeEastern'] = pd.to_datetime(games_df['gameTimeEastern']).dt.time

# Looking at the first five rows
games_df.head()
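
As a complement to the conversion above, here is a minimal sketch on a small made-up frame that passes an explicit `format` to `pd.to_datetime`. The formats are assumptions taken from the column descriptions (mm/dd/yyyy and HH:MM:SS), and the sample rows are illustrative rather than read from `games.csv`.

```python
import datetime

import pandas as pd

# Illustrative rows only; the real values come from games.csv
sample_games = pd.DataFrame({
    'gameDate': ['09/06/2018', '09/09/2018'],
    'gameTimeEastern': ['20:20:00', '13:00:00'],
})

# An explicit format avoids any day/month guessing during parsing
sample_games['gameDate'] = pd.to_datetime(
    sample_games['gameDate'], format='%m/%d/%Y').dt.date

# The same idea for the time-of-day column
sample_games['gameTimeEastern'] = pd.to_datetime(
    sample_games['gameTimeEastern'], format='%H:%M:%S').dt.time

print(sample_games)
```

If a value does not match the stated format, `pd.to_datetime` raises an error instead of silently guessing, which can make data problems easier to spot.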

**Now, let us understand how the games are distributed with respect to the date, time, day, and week.**

**Starting the analysis by looking at the distribution of games in relation to the game dates.**

In [None]:
# Checking the frequency of games in relation to game dates
games_df['gameDate'].value_counts().reset_index()

**There were a total of 50 different game dates.**

In [None]:
# Checking the frequency of games in relation to game dates
date_dist = games_df['gameDate'].value_counts().reset_index()

# Renaming the columns
date_dist.columns = ['date', 'frequency']

# Looking at the first five rows
date_dist.head()

**Next, sorting the data based on the date and setting the index as the date.**

In [None]:
# Sorting the DataFrame based on the date values
sorted_date_dist = date_dist.sort_values('date').set_index('date')

# Looking at the first five rows
sorted_date_dist.head()

**Let us plot the distribution using a bar plot.**

In [None]:
# Plotting a bar plot
sorted_date_dist.plot(kind='bar', figsize=(20,4))

**We can do the same analysis for the time, day and week as well. So, let us convert our code to a Python function.**

In [None]:
def find_dist(df, col_name):
    
    # Checking the frequency of games in relation to the column values
    dist = df[col_name].value_counts().reset_index()
    
    # Renaming the columns
    dist.columns = [col_name, 'frequency']
        
    # Sorting the DataFrame based on the column values
    sorted_dist = dist.sort_values(col_name, ascending=True).set_index(col_name)

    # Plotting a bar plot
    sorted_dist.plot(kind='bar', figsize=(20,4))

    # Return a boolean indicating the function was successfully executed
    return True

# Visualizing the frequency distribution of games in relation to the date
find_dist(games_df, 'gameDate')
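
Returning `True` gives the caller nothing to inspect. One possible variation, sketched here under the hypothetical name `compute_dist`, returns the sorted frequency table so that it can be checked or plotted separately. The sample frame is made up for illustration.

```python
import pandas as pd


def compute_dist(df, col_name):
    """Return the frequency of each value in `col_name`, sorted by value."""
    # value_counts() counts occurrences; sort_index() orders by the values
    dist = df[col_name].value_counts().sort_index()
    # Name the column and the index explicitly so the labels do not
    # depend on the pandas version in use
    return dist.rename('frequency').rename_axis(col_name).to_frame()


# Made-up data for illustration
toy_games = pd.DataFrame({'week': [1, 1, 2, 3, 3, 3]})
week_dist = compute_dist(toy_games, 'week')
print(week_dist)

# The table could then be plotted with, for example:
# week_dist.plot(kind='bar', figsize=(20, 4))
```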

**Let us visualize the frequency distribution of games in relation to time and week number.**

In [None]:
# Looking at the first five rows
games_df.head()

In [None]:
# Visualizing frequency distribution of games in relation to the time
find_dist(games_df, 'gameTimeEastern')

In [None]:
# Visualizing frequency distribution of games in relation to the week
find_dist(games_df, 'week')

**Finally, let us look at how the games are distributed in relation to the game days. For this, we will have to convert each date to the day of the week it falls on.**

In [None]:
# Looking at the first five rows
games_df.head()

In [None]:
# Creating a column containing the day of the week information extracted from the date
games_df['gameDay'] = games_df['gameDate'].apply(lambda x: x.strftime('%A'))

# Looking at the first five rows
games_df.head()

**Visualizing the game distribution in relation to the game day.**

In [None]:
# Visualizing frequency distribution of games in relation to the day of the week
find_dist(games_df, 'gameDay')
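
Because `find_dist` sorts by the column values, weekday names come out in alphabetical order. A possible workaround, sketched on made-up dates, is to store the weekday as an ordered categorical so that the counts follow calendar order. The name `WEEKDAYS` is introduced here only for illustration.

```python
import pandas as pd

WEEKDAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday']

# Made-up game dates for illustration
toy = pd.DataFrame({'gameDate': pd.to_datetime(
    ['2018-09-06', '2018-09-09', '2018-09-09', '2018-09-10'])})

# day_name() gives the weekday; the ordered categorical fixes the sort order
toy['gameDay'] = pd.Categorical(
    toy['gameDate'].dt.day_name(), categories=WEEKDAYS, ordered=True)

# sort_index() now follows Monday..Sunday rather than the alphabet
day_counts = toy['gameDay'].value_counts().sort_index()
print(day_counts)
```

Weekdays with no games still appear in the result with a count of zero, which keeps the x-axis of a bar plot consistent.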

# 2. Knowing the NFL players

The second dataset we will look at contains information about the NFL players. Datasets of this kind are helpful for understanding the physical attributes of players and how player statistics are distributed across team positions.

This dataset contains the following information:

- nflId: Player identification number, unique across players (numeric)

- height: Player height (text)

- weight: Player weight (numeric)

- birthDate: Date of birth (YYYY-MM-DD)

- collegeName: Player college (text)

- position: Player position (text)

- displayName: Player name (text)

**Let us start by importing the necessary libraries.**

In [None]:
import seaborn as sns
import datetime

**Next, importing the CSV file called `players.csv`, which contains information about the NFL players.**

In [None]:
# Reading in the CSV file as a DataFrame 
players_df = pd.read_csv('../input/beginners-sports-analytics-nfl-dataset/players.csv')

In [None]:
# Looking at the first five rows
players_df.head()

**Let us also view the shape of the DataFrame to know how many players are present in the dataset.**

In [None]:
# Viewing the shape of the DataFrame
players_df.shape

**Before we begin, let us convert the date columns to Pandas datetime values.**

In [None]:
# Converting to datetime.date values
players_df['birthDate'] = pd.to_datetime(players_df['birthDate']).dt.date

# Extracting the year
players_df['birthYear'] = pd.to_datetime(players_df['birthDate']).dt.year

# Looking at the first five rows
players_df.head()

**Let us start our analysis by finding the age distribution of the NFL players. For this, we will have to find the age of the players with respect to the year 2018.**

In [None]:
# Finding the age of the players
players_df['age'] = 2018 - players_df['birthYear']

# Looking at the first five rows
players_df.head()
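
Subtracting birth years ignores whether a player's birthday has already passed in the reference year. If a more exact figure matters, one option is to compare against a full reference date. The date used below is an arbitrary assumption for illustration, as are the sample rows and the helper name `age_on`.

```python
import datetime

import pandas as pd


def age_on(birth_date, ref_date):
    """Age in completed years on ref_date."""
    # Subtract one year if the birthday has not yet occurred by ref_date
    had_birthday = (ref_date.month, ref_date.day) >= (birth_date.month, birth_date.day)
    return ref_date.year - birth_date.year - (0 if had_birthday else 1)


# Made-up birth dates; ref_date is an assumed cut-off, not taken from the dataset
toy_players = pd.DataFrame({'birthDate': [datetime.date(1990, 3, 15),
                                          datetime.date(1990, 11, 30)]})
ref_date = datetime.date(2018, 9, 1)

toy_players['age'] = toy_players['birthDate'].apply(age_on, ref_date=ref_date)
print(toy_players)
```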

**Since we already have the function from the previous section, we can reuse it to find the age distribution of the players easily.**

In [None]:
def find_dist(df, col_name):
    
    # Checking the frequency of each value in the column
    dist = df[col_name].value_counts().reset_index()
    
    # Renaming the columns
    dist.columns = [col_name, 'frequency']
        
    # Sorting the DataFrame based on the column values
    sorted_dist = dist.sort_values(col_name, ascending=True).set_index(col_name)

    # Plotting a bar plot
    sorted_dist.plot(kind='bar', figsize=(20,4))

    # Return a boolean indicating the function was successfully executed
    return True

In [None]:
# Visualizing frequency distribution of players in relation to their age
find_dist(players_df, 'age')

**Next, let us also see how the players are distributed amongst different team positions.**

In [None]:
# Looking at the first five rows
players_df.head()

In [None]:
# Visualizing frequency distribution of players in relation to their positions
find_dist(players_df, 'position')

**Now, let us look at the age distribution of players in the CB (Cornerback) and WR (Wide Receiver) positions.**

**For this, we can select the data points for each position and then find their age distribution.**

In [None]:
# Selecting position = CB
players_df.query('position == "CB"')

In [None]:
# Visualizing frequency distribution of players in relation to the CB position
find_dist(players_df.query('position == "CB"'), 'age')

In [None]:
# Visualizing frequency distribution of players in relation to the WR position
find_dist(players_df.query('position == "WR"'), 'age')
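
Instead of filtering one position at a time, the same comparison can also be summarised in a single step with `groupby`. This is a small sketch on made-up rows; the values are illustrative only.

```python
import pandas as pd

# Made-up rows for illustration
toy_players = pd.DataFrame({
    'position': ['CB', 'CB', 'WR', 'WR', 'WR', 'QB'],
    'age': [24, 26, 25, 27, 29, 31],
})

# Keep only the positions of interest, then summarise age per position
subset = toy_players[toy_players['position'].isin(['CB', 'WR'])]
age_summary = subset.groupby('position')['age'].agg(['count', 'mean', 'min', 'max'])
print(age_summary)
```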

**Now, let us look at the actual height and weight distribution of the players. However, there is some inconsistency in the data in the height column.**

In [None]:
# Looking at the first twenty rows
players_df.head(20)

**Let us fix it by converting all datapoints to inches.**

In [None]:
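# Fixing the inconsistency by converting all data to inches
# (splitting on '-' so that two-digit inch values such as 6-10 are read in full)
players_df['height'] = players_df['height'].apply(
    lambda x: int(x.split('-')[0]) * 12 + int(x.split('-')[1]) if '-' in x else int(x))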

# Looking at the first twenty rows
players_df.head(20)
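
The conversion can also be written as a small named function, which makes the two-digit inch case (for example `6-10`) easy to check. This is a sketch; the helper name `height_to_inches` is introduced here for illustration.

```python
def height_to_inches(height):
    """Convert 'feet-inches' text such as '6-2' to inches; pass plain inches through."""
    text = str(height)
    if '-' in text:
        # Split on the hyphen so that double-digit inches are read in full
        feet, inches = text.split('-')
        return int(feet) * 12 + int(inches)
    return int(text)


# A few hand-written inputs covering both formats
for raw in ['6-2', '5-11', '6-10', '72']:
    print(raw, '->', height_to_inches(raw))
```

It could then be applied with `players_df['height'].apply(height_to_inches)`.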

**Now, instead of looking at the height and weight distribution of players separately, let us look at them together by making a joint plot.**

In [None]:
# Extracting the height values
players_df['height'].values

In [None]:
# Assigning the height and weight values
height = players_df['height'].values
weight = players_df['weight'].values

In [None]:
# Plotting a joint plot (x and y are passed as keyword arguments,
# which recent seaborn versions require)
sns.jointplot(x=weight, y=height)
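
A joint plot shows the relationship visually; a single number can complement it. As a rough sketch on made-up values (assumed here to be inches and pounds), the Pearson correlation between the two columns can be computed with pandas:

```python
import pandas as pd

# Made-up measurements for illustration (assumed units: inches and pounds)
toy_players = pd.DataFrame({
    'height': [70, 72, 74, 76],
    'weight': [180, 200, 230, 250],
})

# Pearson correlation: values near +1 indicate a strong increasing linear relationship
r = toy_players['height'].corr(toy_players['weight'])
print(round(r, 3))
```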

# 3. Understanding the NFL plays

The third dataset we will look at contains information about the plays in different NFL games. Datasets of this kind are helpful for understanding how the players actually play the games.

The dataset contains the following items:

- gameId: Game identifier, unique (numeric)

- playId: Play identifier, not unique across games (numeric)

- playDescription: Description of play (text)

- quarter: Game quarter (numeric)

- down: Down (numeric)

- yardsToGo: Distance needed for a first down (numeric)

- possessionTeam: Team on offense (text)

- playType: Outcome of dropback: sack or pass (text)

- yardlineSide: 3-letter team code corresponding to line-of-scrimmage (text)

- yardlineNumber: Yard line at line-of-scrimmage (numeric)

- offenseFormation: Formation used by possession team (text)

- personnelO: Personnel used by offensive team (text)

- defendersInTheBox: Number of defenders in close proximity to line-of-scrimmage (numeric)

- numberOfPassRushers: Number of pass rushers (numeric)

- personnelD: Personnel used by defensive team (text)

- typeDropback: Dropback categorization of quarterback (text)

- preSnapHomeScore: Home score prior to the play (numeric)

- preSnapVisitorScore: Visiting team score prior to the play (numeric)

- gameClock: Time on clock of play (MM:SS)

- absoluteYardlineNumber: Distance from end zone for possession team (numeric)

- penaltyCodes: NFL categorization of the penalties that occurred on the play. For purposes of this contest, the most important penalties are Defensive Pass Interference (DPI), Offensive Pass Interference (OPI), Illegal Contact (ICT), and Defensive Holding (DH). Multiple penalties on a play are separated by a ; (text)

- penaltyJerseyNumber: Jersey number and team code of the player committing each penalty. Multiple penalties on a play are separated by a ; (text)

- passResult: Outcome of the passing play (C: Complete pass, I: Incomplete pass, S: Quarterback sack, IN: Intercepted pass, text)

- offensePlayResult: Yards gained by the offense, excluding penalty yardage (numeric)

- playResult: Net yards gained by the offense, including penalty yardage (numeric)

- epa: Expected points added on the play, relative to the offensive team. Expected points is a metric that estimates the average of every next scoring outcome given the play's down, distance, yardline, and time remaining (numeric)

- isDefensivePI: An indicator variable for whether or not a DPI penalty occurred on a given play (TRUE/FALSE)

**You may have already noticed that this dataset is very specific to NFL American football and contains many terms you may not be familiar with.**

**Therefore, we will load and inspect the dataset but will not analyze it, since you need to be a football guru to understand all these terms. However, this is a perfect opportunity for you to practice performing a complete analysis on your own.**

In [None]:
# Reading in the CSV file as a DataFrame 
plays_df = pd.read_csv('../input/beginners-sports-analytics-nfl-dataset/plays.csv')

In [None]:
# Looking at the first five rows
plays_df.head()

In [None]:
plays_df.shape
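
As one possible starting point for your own analysis: the column descriptions say that multiple penalties on a play are separated by `;`. The sketch below, on made-up rows, splits that text and counts each penalty code. The `playId` values and codes shown are illustrative only, and `penaltyCode` is a column name introduced here.

```python
import pandas as pd

# Made-up rows for illustration; real values come from plays.csv
toy_plays = pd.DataFrame({
    'playId': [75, 146, 168, 190],
    'penaltyCodes': [None, 'DPI', 'DH;OPI', 'DPI'],
})

# Drop plays without penalties, split on ';' into lists,
# then give each penalty its own row
penalties = (toy_plays
             .dropna(subset=['penaltyCodes'])
             .assign(penaltyCode=lambda d: d['penaltyCodes'].str.split(';'))
             .explode('penaltyCode'))

# Count how often each code occurs
penalty_counts = penalties['penaltyCode'].value_counts()
print(penalty_counts)
```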

# 4. Visualizing the American Football Field

Now, we will be visualizing the American football field using Matplotlib. This section takes inspiration from the work of the Kaggle Grandmaster, [Rob Mulla](https://www.kaggle.com/robikscube/nfl-big-data-bowl-plotting-player-position/notebook), and we will be following in his footsteps with a few small changes of our own.

The main objective is to learn how to create advanced plots using Matplotlib in order to build up confidence in creating any kind of plot that we want. 

The field that we will be creating will try to mimic the following football field image:


In [None]:
import matplotlib.patches as patches

In [None]:
# Create a rectangle defined via an anchor point *xy* and its *width* and *height*
rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

# Creating a subplot to plot our field on
fig, ax = plt.subplots(1, figsize=(12, 6.33))

# Adding the rectangle to the plot
ax.add_patch(rect)

**Let us add a line plot to create some lines on the field by using the plot() method.**

In [None]:
# Create a rectangle defined via an anchor point *xy* and its *width* and *height*
rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

# Creating a subplot to plot our field on
fig, ax = plt.subplots(1, figsize=(12, 6.33))

# Adding the rectangle to the plot
ax.add_patch(rect)

# Plotting a line plot for marking the field lines
plt.plot([10, 10, 20, 20],
         [0, 53.3, 53.3, 0],
         color='white', zorder=0)

**Now let us create all the lines on the field.**

In [None]:
# Create a rectangle defined via an anchor point *xy* and its *width* and *height*
rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

# Creating a subplot to plot our field on
fig, ax = plt.subplots(1, figsize=(12, 6.33))

# Adding the rectangle to the plot
ax.add_patch(rect)

# Plotting a line plot for marking the field lines
plt.plot([10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 70, 70, 80,
          80, 90, 90, 100, 100, 110, 110, 120, 0, 0, 120, 120],
         [0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 
          0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 53.3, 0, 0, 53.3],
         color='white', zorder = 0)
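
The two long coordinate lists trace one continuous zig-zag: up one yard line, across to the next, down that one, and so on, finishing with the field border. As a sketch, the same lists can be generated instead of typed by hand. The helper name `field_line_coords` and its parameters are introduced here for illustration.

```python
def field_line_coords(length=120, width=53.3, step=10):
    """Build the x/y lists for the zig-zag of yard lines plus the border."""
    xs, ys = [], []
    for i, x in enumerate(range(step, length, step)):
        # Each yard line contributes two points with the same x
        xs += [x, x]
        # Alternate direction so consecutive lines join along the sidelines
        ys += [0, width] if i % 2 == 0 else [width, 0]
    # Finish by tracing the outer border. This tail assumes the last yard
    # line ended on the top sideline, as it does for the default arguments.
    xs += [length, 0, 0, length, length]
    ys += [width, width, 0, 0, width]
    return xs, ys


xs, ys = field_line_coords()
print(xs)
print(ys)
```

With the default arguments this reproduces the hard-coded lists used in the cell above, so it could be passed straight to `plt.plot(xs, ys, color='white', zorder=0)`.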

**Now, let us add the endzones onto the plot.**

In [None]:
# Create a rectangle defined via an anchor point *xy* and its *width* and *height*
rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

# Creating a subplot to plot our field on
fig, ax = plt.subplots(1, figsize=(12, 6.33))

# Adding the rectangle to the plot
ax.add_patch(rect)

# Plotting a line plot for marking the field lines
plt.plot([10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 70, 70, 80,
          80, 90, 90, 100, 100, 110, 110, 120, 0, 0, 120, 120],
         [0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 
          0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 53.3, 0, 0, 53.3],
         color='white', zorder = 0)

# Creating the left end-zone
left_end_zone = patches.Rectangle((0, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

# Creating the right end-zone (10 yards wide, starting at x = 110)
right_end_zone = patches.Rectangle((110, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

# Adding the patches to the subplot
ax.add_patch(left_end_zone)
ax.add_patch(right_end_zone)

# Setting the limits of x-axis from 0 to 120
plt.xlim(0, 120)

# Setting the limits of y-axis from -5 to 58.3
plt.ylim(-5, 58.3)

# Removing the axis values from the plot
plt.axis('off')

**It is time for us to plot the numbers on the field.**

In [None]:
# Create a rectangle defined via an anchor point *xy* and its *width* and *height*
rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

# Creating a subplot to plot our field on
fig, ax = plt.subplots(1, figsize=(12, 6.33))

# Adding the rectangle to the plot
ax.add_patch(rect)

# Plotting a line plot for marking the field lines
plt.plot([10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 70, 70, 80,
          80, 90, 90, 100, 100, 110, 110, 120, 0, 0, 120, 120],
         [0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 
          0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 53.3, 0, 0, 53.3],
         color='white', zorder = 0)

# Creating the left end-zone
left_end_zone = patches.Rectangle((0, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

# Creating the right end-zone (10 yards wide, starting at x = 110)
right_end_zone = patches.Rectangle((110, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

# Adding the patches to the subplot
ax.add_patch(left_end_zone)
ax.add_patch(right_end_zone)

# Setting the limits of x-axis from 0 to 120
plt.xlim(0, 120)

# Setting the limits of y-axis from -5 to 58.3
plt.ylim(-5, 58.3)

# Removing the axis values from the plot
# plt.axis('off')

# Plotting the numbers starting from x = 20 and ending at x = 100
# with a step of 10
for x in range(20, 110, 10):

    # Initializing another variable named 'number'
    number = x

    # If x exceeds 50, subtract it from 120
    if x > 50:
        number = 120 - x

    # Plotting the text at the bottom
    plt.text(x, 5, str(number - 10),
             horizontalalignment='center',
             fontsize=20,
             color='white')

    # Plotting the text at the top
    plt.text(x - 0.95, 53.3 - 5, str(number - 10),
             horizontalalignment='center',
             fontsize=20,
             color='white',
             rotation=180)
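
The labelling rule in the loop above (mirror `x` around midfield, then subtract the 10-yard end zone) can be checked on its own without drawing anything. A minimal sketch, where the helper name `yard_label` is hypothetical:

```python
def yard_label(x):
    """Yard-line number shown at field coordinate x (10-yard end zones assumed)."""
    # Distance to the nearer goal line: x - 10 on the left half, 110 - x on the right
    return min(x, 120 - x) - 10


# Same x positions as the plotting loop
labels = [yard_label(x) for x in range(20, 110, 10)]
print(labels)
```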

**Let us finally create the ground markings and complete the plot.**

In [None]:
# Create a rectangle defined via an anchor point *xy* and its *width* and *height*
rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

# Creating a subplot to plot our field on
fig, ax = plt.subplots(1, figsize=(12, 6.33))

# Adding the rectangle to the plot
ax.add_patch(rect)

# Plotting a line plot for marking the field lines
plt.plot([10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 70, 70, 80,
          80, 90, 90, 100, 100, 110, 110, 120, 0, 0, 120, 120],
         [0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 
          0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 53.3, 0, 0, 53.3],
         color='white', zorder = 0)

# Creating the left end-zone
left_end_zone = patches.Rectangle((0, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

# Creating the right end-zone (10 yards wide, starting at x = 110)
right_end_zone = patches.Rectangle((110, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

# Adding the patches to the subplot
ax.add_patch(left_end_zone)
ax.add_patch(right_end_zone)

# Setting the limits of x-axis from 0 to 120
plt.xlim(0, 120)

# Setting the limits of y-axis from -5 to 58.3
plt.ylim(-5, 58.3)

# Removing the axis values from the plot
plt.axis('off')

# Plotting the numbers starting from x = 20 and ending at x = 100
# with a step of 10
for x in range(20, 110, 10):

    # Initializing another variable named 'number'
    number = x

    # If x exceeds 50, subtract it from 120
    if x > 50:
        number = 120 - x

    # Plotting the text at the bottom
    plt.text(x, 5, str(number - 10),
             horizontalalignment='center',
             fontsize=20,
             color='white')

    # Plotting the text at the top
    plt.text(x - 0.95, 53.3 - 5, str(number - 10),
             horizontalalignment='center',
             fontsize=20,
             color='white',
             rotation=180)

# Making ground markings
for x in range(11, 110):
    ax.plot([x, x], [0.4, 0.7], color='white', zorder=0)
    ax.plot([x, x], [53.0, 52.5], color='white', zorder=0)
    ax.plot([x, x], [22.91, 23.57], color='white', zorder=0)
    ax.plot([x, x], [29.73, 30.39], color='white', zorder=0)

**Wrapping the entire code in a function for easy plotting.**

In [None]:
def create_football_field():
    
    # Create a rectangle defined via an anchor point *xy* and its *width* and *height*
    rect = patches.Rectangle((0, 0), 120, 53.3, facecolor='darkgreen', zorder=0)

    # Creating a subplot to plot our field on
    fig, ax = plt.subplots(1, figsize=(12, 6.33))

    # Adding the rectangle to the plot
    ax.add_patch(rect)

    # Plotting a line plot for marking the field lines
    plt.plot([10, 10, 20, 20, 30, 30, 40, 40, 50, 50, 60, 60, 70, 70, 80,
              80, 90, 90, 100, 100, 110, 110, 120, 0, 0, 120, 120],
             [0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 
              0, 0, 53.3, 53.3, 0, 0, 53.3, 53.3, 53.3, 0, 0, 53.3],
             color='white', zorder = 0)

    # Creating the left end-zone
    left_end_zone = patches.Rectangle((0, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

    # Creating the right end-zone (10 yards wide, starting at x = 110)
    right_end_zone = patches.Rectangle((110, 0), 10, 53.3, facecolor='blue', alpha=0.2, zorder=0)

    # Adding the patches to the subplot
    ax.add_patch(left_end_zone)
    ax.add_patch(right_end_zone)

    # Setting the limits of x-axis from 0 to 120
    plt.xlim(0, 120)

    # Setting the limits of y-axis from -5 to 58.3
    plt.ylim(-5, 58.3)

    # Removing the axis values from the plot
    plt.axis('off')

    # Plotting the numbers starting from x = 20 and ending at x = 100
    # with a step of 10
    for x in range(20, 110, 10):

        # Initializing another variable named 'number'
        number = x

        # If x exceeds 50, subtract it from 120
        if x > 50:
            number = 120 - x

        # Plotting the text at the bottom
        plt.text(x, 5, str(number - 10),
                 horizontalalignment='center',
                 fontsize=20,
                 color='white')

        # Plotting the text at the top
        plt.text(x - 0.95, 53.3 - 5, str(number - 10),
                 horizontalalignment='center',
                 fontsize=20,
                 color='white',
                 rotation=180)

    # Making ground markings
    for x in range(11, 110):
        ax.plot([x, x], [0.4, 0.7], color='white', zorder=0)
        ax.plot([x, x], [53.0, 52.5], color='white', zorder=0)
        ax.plot([x, x], [22.91, 23.57], color='white', zorder=0)
        ax.plot([x, x], [29.73, 30.39], color='white', zorder=0)
    
    # Returning the figure and axis
    return fig, ax

In [None]:
# Calling the plotting function
fig, ax = create_football_field()

# Plotting the figure
plt.show()

# 5. Adding Players onto the Field

The fourth dataset we will look at contains the tracking information of the players. Datasets of this kind are helpful for breaking down each player's gameplay at an individual level.

We will be visualizing the players on the field that we had built in the previous section.

This dataset contains the following information:

- time: Time stamp of play (time, yyyy-mm-dd, hh:mm:ss)

- x: Player position along the long axis of the field, 0 - 120 yards. See Figure 1 below. (numeric)

- y: Player position along the short axis of the field, 0 - 53.3 yards. See Figure 1 below. (numeric)

- s: Speed in yards/second (numeric)

- a: Acceleration in yards/second^2 (numeric)

- dis: Distance traveled from prior time point, in yards (numeric)

- o: Player orientation (deg), 0 - 360 degrees (numeric)

- dir: Angle of player motion (deg), 0 - 360 degrees (numeric)

- event: Tagged play details, including moment of ball snap, pass release, pass catch, tackle, etc (text)

- nflId: Player identification number, unique across players (numeric)

- displayName: Player name (text)

- jerseyNumber: Jersey number of player (numeric)

- position: Player position group (text)

- team: Team (away or home) of corresponding player (text)

- frameId: Frame identifier for each play, starting at 1 (numeric)

- gameId: Game identifier, unique (numeric)

- playId: Play identifier, not unique across games (numeric)

- playDirection: Direction that the offense is moving (text, left or right)

- route: Route run by offensive player (text)

In [None]:
# Reading the data as a Pandas DataFrame
df = pd.read_csv('../input/beginners-sports-analytics-nfl-dataset/week_data.csv')

In [None]:
# Looking at the first five rows of the DataFrame 
df.head()

In [None]:
# Looking at the shape of the DataFrame
df.shape

**Since the time is read in as text, let us convert it to time values.**

In [None]:
# Converting to Time values
df['time'] = pd.to_datetime(df['time']).dt.time

# Looking at the first five rows of the DataFrame 
df.head()

**We want to analyze each game as time passes, so let us sort the values in ascending order.**

In [None]:
# Sorting the values of the DataFrame by time in an ascending order
df = df.sort_values(by='time', ascending=True).reset_index(drop=True)

# Looking at the first five rows of the DataFrame 
df.head()
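
Note that `.dt.time` keeps only the time of day, so sorting on it alone could interleave frames from games played on different dates. If that matters, one alternative is to sort by game, play, and frame, relying on the `frameId` column described in the list above. The rows below are made up for illustration.

```python
import pandas as pd

# Made-up tracking rows for illustration
toy_tracking = pd.DataFrame({
    'gameId':  [2, 1, 1, 2, 1],
    'playId':  [10, 20, 10, 10, 10],
    'frameId': [2, 1, 2, 1, 1],
})

# Order frames within each play, and plays within each game
ordered = (toy_tracking
           .sort_values(['gameId', 'playId', 'frameId'])
           .reset_index(drop=True))
print(ordered)
```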

**Let us select a specific `gameId` and `playId` to visualize the player positions within a specific game and play.**

In [None]:
# Selecting the data for the given game and play based on their Id
sel_df = df.query('gameId == 2018111900 and playId == 5577')

# Looking at the shape of the DataFrame
print(f'The shape of the DataFrame is: {sel_df.shape}')

# Looking at the DataFrame
sel_df

**Now, let us separate out the teams as well as the football in the data for plotting.**

In [None]:
# Selecting the home and away team
home_team = sel_df.query('team == "home"')
away_team = sel_df.query('team == "away"')

# Selecting the football
football = sel_df.query('team == "football"')

In [None]:
# Creating the football field
fig, ax = create_football_field()

# Plotting the home team
home_team.plot(x='x', y='y', kind='scatter', ax=ax, color='blue', s=20, zorder=2)

# Plotting the away team
away_team.plot(x='x', y='y', kind='scatter', ax=ax, color='orange', s=20, zorder=2)

# Plotting the football
football.plot(x='x', y='y', kind='scatter', ax=ax, color='brown', s=20, zorder=2)

# Displaying the plot
plt.show()

**We can also visualize a specific event by just selecting the event.**

In [None]:
sel_df['event'].unique()

**Plotting the data for the `ball_snap` event, that is, when the quarterback first receives the football.**

In [None]:
# Creating the football field
fig, ax = create_football_field()

# Plotting the home team
home_team.query('event == "ball_snap"').plot(x='x', y='y', kind='scatter', ax=ax, color='blue', s=20, zorder=2)

# Plotting the away team
away_team.query('event == "ball_snap"').plot(x='x', y='y', kind='scatter', ax=ax, color='orange', s=20, zorder=2)

# Plotting the football
football.query('event == "ball_snap"').plot(x='x', y='y', kind='scatter', ax=ax, color='brown', s=20, zorder=2)

# Displaying the plot
plt.show()
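
A play contains many frames per player, so plotting a team without a filter draws every frame at once, while filtering on an event keeps a single moment. That pattern can be wrapped in a small helper. The sketch below uses made-up rows, the name `positions_at_event` is hypothetical, and the second event label is only a placeholder.

```python
import pandas as pd


def positions_at_event(play_df, event):
    """Return a dict of x/y positions at `event`, keyed by team label."""
    at_event = play_df[play_df['event'] == event]
    # One small DataFrame per team label ('home', 'away', 'football')
    return {team: rows[['x', 'y']].reset_index(drop=True)
            for team, rows in at_event.groupby('team')}


# Made-up tracking rows for illustration
toy_play = pd.DataFrame({
    'team':  ['home', 'away', 'football', 'home', 'away', 'football'],
    'event': ['ball_snap', 'ball_snap', 'ball_snap', 'tackle', 'tackle', 'tackle'],
    'x':     [35.0, 37.0, 36.0, 34.5, 37.5, 33.0],
    'y':     [26.0, 27.0, 26.5, 26.2, 27.1, 26.4],
})

snap = positions_at_event(toy_play, 'ball_snap')
print(snap['football'])
```

Each entry of the returned dictionary could then be drawn on the field with the same `.plot(x='x', y='y', kind='scatter', ax=ax, ...)` call used above.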

**In this way, we can visualize any game, play and event on the football field.**

**With this, the awesome notebook on Sports Analytics + Visualization comes to an end.**

**I hope you have learnt a lot and were as excited as I was while preparing this notebook!**

**You can consider following me on various platforms and connecting with me as well. I would love to collaborate on a project or Kaggle competition together.**

**My GitHub: https://github.com/aryashah2k**

**My LinkedIn: https://www.linkedin.com/in/arya--shah/**

**My Twitter: https://twitter.com/aryashah2k**

**Have a great day!**