In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)


# NFL Special Teams Analysis: EDA

Despite not being as glamorous as Offense and Defense, Special Teams is still a crucial component of any NFL game. 6-Time Super Bowl-winning Head Coach Bill Belichick often mentions 'covering kicks' as a key ability of a winning team, and plenty of great teams have had their seasons cut short due to spotty kick coverage. Back in 2010, the San Diego Chargers had the #1 Offense AND the #1 Defense, yet failed to even make the postseason due to poor Special Teams play. With this in mind, I want to explore which teams are currently the best and worst at covering kickoffs and punts, and see what I can glean from that information!

In [None]:
# loading in the plays
play_data = pd.read_csv("/kaggle/input/nfl-big-data-bowl-2022/plays.csv")

# getting just kickoffs
kickoffs = play_data.loc[play_data['specialTeamsPlayType'] == 'Kickoff']

# getting just punts
punts = play_data.loc[play_data['specialTeamsPlayType'] == 'Punt']

# getting just field goals
fgs = play_data.loc[play_data['specialTeamsPlayType'] == 'Field Goal']

punts

Since Football is highly situational, we should also separate these plays further: an Onside Kick and a Regular Kickoff, for example, have very different intended outcomes. For most kickoffs, both teams are usually content to accept a "touchback", where the ball is kicked into the endzone and the return team chooses not to return it. A touchback places the ball at the 25-yard line, which we can use as the cutoff for "good" and "bad" coverage.

One way we can separate kickoffs is based on Kick Length. Kickers are not perfect, but they are professionals. If a kick travels less than 50 yards in the NFL, it's fairly safe to assume that it was not intended to be a 'full' kickoff, since a kick that short would leave the return team only a very short distance to cover.

Once we have the data separated in this way, we can start sorting each dataframe by team. This will allow us to analyze each team's performance separately, before comparing side-by-side to see which Special Teams units perform the best (or worst).
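
To make the touchback cutoff concrete, here is a minimal sketch. It assumes the kickoff is taken from the kicking team's own 35-yard line and that `playResult` holds net yards from the kicking team's perspective. Under those assumptions a touchback to the 25 nets 100 - 35 - 25 = 40 yards. The helper `classify_kickoff` is hypothetical and only illustrates the cutoff.

```python
KICKOFF_SPOT = 35    # assumption: kickoff from the kicking team's own 35
TOUCHBACK_SPOT = 25  # return team starts at its own 25 after a touchback

# Net yards that leave the return team exactly at its own 25
TOUCHBACK_NET = 100 - KICKOFF_SPOT - TOUCHBACK_SPOT


def classify_kickoff(play_result):
    """Label a full kickoff by its net yards (hypothetical helper)."""
    if play_result > TOUCHBACK_NET:
        return "good coverage"  # return team stopped short of its own 25
    if play_result < TOUCHBACK_NET:
        return "bad coverage"   # return team got past its own 25
    return "same as touchback"


for net in (45, 40, 28):
    print(net, classify_kickoff(net))
```

A kick moved by a penalty would start from a different spot, so the 40-yard threshold is only as good as the 35-yard-line assumption.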

In [None]:
# keep only 'full' kickoffs (50+ yards), dropping short kicks such as onside kicks
full_kicks = kickoffs.loc[kickoffs['kickLength'] >= 50]

teams = kickoffs.possessionTeam.unique()

# Assuming a kickoff from the 35, a kick that ends at the return team's 25
# nets 40 yards (100 - 35 - 25), so 40 is the touchback-equivalent cutoff.
rows = []
for team in teams:
    team_kicks = full_kicks.loc[full_kicks['possessionTeam'] == team]
    total_kicks = team_kicks.shape[0]
    if total_kicks == 0:
        continue  # avoid dividing by zero for a team with no full kickoffs
    touchbackPercentage = team_kicks.loc[team_kicks['specialTeamsResult'] == 'Touchback'].shape[0] * 100 / total_kicks
    goodCoveragePercentage = team_kicks.loc[team_kicks['playResult'] > 40].shape[0] * 100 / total_kicks
    badCoveragePercentage = team_kicks.loc[team_kicks['playResult'] < 40].shape[0] * 100 / total_kicks
    rows.append({'Team': team, 'Touchback %': touchbackPercentage, 'Good Coverage %': goodCoveragePercentage, 'Bad Coverage %': badCoveragePercentage})

# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
teamData = pd.DataFrame(rows)

# sort by teams with the lowest percentage of bad coverage
teamData = teamData.sort_values(by=['Bad Coverage %'])
teamData


Looking at the data, some initial trends are apparent. Keeping in mind that this data stretches back to the 2018 season, one observation from the Top 10 teams by lowest "Bad Coverage" percentage is that the vast majority are viewed as either perennial contenders or teams that may have overachieved in recent years. For example, our two most recent Super Bowl winners, Kansas City and Tampa Bay, are both in the Top 10, alongside teams like Washington, New Orleans, and Miami that have recently made playoff appearances without significant star power.

Unfortunately for Coach Belichick, our champion of good kick coverage, his team has struggled recently with covering kicks, coinciding with his worst season in over a decade. There are, however, several teams with seemingly poor kickoff units, including Tennessee, Green Bay, and San Francisco. These teams also 

In [None]:
# punts taken from the opponent's side of the field
pinning_punts = punts.loc[punts['possessionTeam'] != punts['yardlineSide']]

teams = kickoffs.possessionTeam.unique()

rows = []
for team in teams:
    team_punts = pinning_punts.loc[pinning_punts['possessionTeam'] == team]
    total_punts = team_punts.shape[0]
    if total_punts == 0:
        continue  # avoid dividing by zero for a team with no such punts
    # yard line where the opponent takes over, measured from its own goal line
    end_yardline = team_punts['yardlineNumber'] - team_punts['playResult']
    pinnedPercentage = (end_yardline < 20).sum() * 100 / total_punts
    touchbackPercentage = (team_punts['specialTeamsResult'] == 'Touchback').sum() * 100 / total_punts
    disasterPercentage = (end_yardline > 20).sum() * 100 / total_punts
    rows.append({'Team': team, 'Pinned %': pinnedPercentage, 'Touchback %': touchbackPercentage, 'Disaster %': disasterPercentage})

# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
teamPuntData = pd.DataFrame(rows)

# sort by teams with the highest percentage of pinning teams back
teamPuntData = teamPuntData.sort_values(by=['Pinned %'], ascending=False)
teamPuntData

Out of all punts that occurred on the opponent's side of the field, we can classify each as:
1: Pinned (opponent takes over inside its own 20)
2: Touchback
3: Disaster (worse than a touchback)
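
The three categories above can be sketched as a row-level rule. The example below runs on made-up rows, not the real data, and assumes the `plays.csv` column names `yardlineNumber`, `playResult`, and `specialTeamsResult`. The helper `classify_punt` and the "At the 20" label are hypothetical additions for illustration; result labels other than 'Touchback' are placeholders.

```python
import pandas as pd

# Toy rows for punts from the opponent's side of the field (illustrative only)
toy_punts = pd.DataFrame({
    "yardlineNumber": [45, 40, 38, 42],
    "playResult": [35, 20, 10, 22],
    "specialTeamsResult": ["Downed", "Touchback", "Return", "Fair Catch"],
})


def classify_punt(row):
    """Label one punt as Pinned, Touchback, or Disaster (hypothetical helper)."""
    if row["specialTeamsResult"] == "Touchback":
        return "Touchback"
    # Yard line where the opponent takes over, measured from its own goal line
    end_yardline = row["yardlineNumber"] - row["playResult"]
    if end_yardline < 20:
        return "Pinned"
    if end_yardline > 20:
        return "Disaster"
    return "At the 20"  # not a touchback, but no better or worse than one


toy_punts["outcome"] = toy_punts.apply(classify_punt, axis=1)
print(toy_punts)
```

Checking the touchback label first keeps the three categories from overlapping, since a touchback always leaves the ball at the 20.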

In [None]:
teams = kickoffs.possessionTeam.unique()

rows = []
for team in teams:
    team_fgs = fgs.loc[fgs['possessionTeam'] == team]
    total_fgs = team_fgs.shape[0]
    if total_fgs == 0:
        continue  # avoid dividing by zero for a team with no field goal attempts
    made_fgs = team_fgs.loc[team_fgs['specialTeamsResult'] == 'Kick Attempt Good']
    fgPct = made_fgs.shape[0] * 100 / total_fgs
    # filter made_fgs with a mask built from made_fgs itself, not from team_fgs
    fiftyPlusMakes = made_fgs.loc[made_fgs['kickLength'] >= 50].shape[0]
    rows.append({'Team': team, 'Field Goal %': fgPct, '50+ Yard Makes': fiftyPlusMakes})

# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows
teamFGData = pd.DataFrame(rows)

# sort by teams with the highest percentage of made field goals
teamFGData = teamFGData.sort_values(by=['Field Goal %'], ascending=False)
teamFGData

Finally, we have each team ranked by field goal conversion percentage, along with the number of made field goals from 50 yards or longer.
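
As an alternative to looping over teams, the same two columns can be computed with a single `groupby`. This is only a sketch on made-up rows with placeholder team codes; it assumes the `plays.csv` column names used in this notebook, and the 'Kick Attempt No Good' label is an assumption about how misses are recorded.

```python
import pandas as pd

# Toy field goal attempts (illustrative only, not real results)
toy_fgs = pd.DataFrame({
    "possessionTeam": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "specialTeamsResult": [
        "Kick Attempt Good", "Kick Attempt No Good", "Kick Attempt Good",
        "Kick Attempt Good", "Kick Attempt Good",
    ],
    "kickLength": [52, 48, 33, 55, 51],
})

# Boolean flags: was the kick made, and was it a make from 50+ yards
made = toy_fgs["specialTeamsResult"] == "Kick Attempt Good"
long_make = made & (toy_fgs["kickLength"] >= 50)

summary = (
    toy_fgs.assign(made=made, long_make=long_make)
    .groupby("possessionTeam")
    .agg(**{
        "Field Goal %": ("made", "mean"),         # share of attempts made
        "50+ Yard Makes": ("long_make", "sum"),   # count of long makes
    })
)
summary["Field Goal %"] *= 100  # convert the share to a percentage
summary = summary.sort_values("Field Goal %", ascending=False)
print(summary)
```

Grouping only covers teams that appear in the data, so a team with no attempts never triggers a division by zero.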