<h1>Does an offense's running play time relate in any way to the number of sacks it allows?</h1>

Measuring an offensive line's (OL's) strength with a stat is difficult because, unfortunately, the OL is usually judged by the absence of events like sacks. Other stats, like blocks, aren't official statistics, so it is difficult to find good sources for them.

Sacks are a clear measure of an OL's strength. The sack data used here was cleaned up from this <a href="https://www.kaggle.com/zynicide/nfl-football-player-stats">source</a> and output to a CSV.

What else could be a possible measure of an OL's strength? One possible measure is the amount of time the OL gives the quarterback to attempt to make a play. 

I had trouble finding data that timed each play from the moment the ball was snapped to the moment the player was tackled, went out of bounds, or scored a touchdown. The data I worked with only kept track of time per the official rules. So, for example, a running play that took 4 seconds from snap to tackle could be recorded as longer than 4 seconds if the player was tackled in bounds and the team tried to run down the clock.
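
To make the clock issue concrete, here is a minimal sketch of how a clock-based play duration might be derived. It assumes the duration is the difference between the game clock at one snap and at the next; the function name, the variable names, and the clock values are made up for illustration and do not come from the dataset used below.

```python
# Hypothetical game-clock readings (seconds left in the quarter) at the
# snap of three consecutive plays. These values are invented for illustration.
clock_at_snap = [900, 862, 855]


def clock_based_durations(clock_readings):
    """Return clock-based durations: clock at this snap minus clock at the next.

    The last play has no following snap, so it gets no duration.
    """
    return [
        current - following
        for current, following in zip(clock_readings, clock_readings[1:])
    ]


durations = clock_based_durations(clock_at_snap)
print(durations)
```

In this toy example the first play is charged 38 seconds even if the run itself lasted only a few seconds, because the clock kept running until the next snap.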

 

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd


play_duration_full_path = "./play_duration_inc_pass.csv"
games_full_path = "./sack_totals.csv"

play_duration_df = pd.read_csv(play_duration_full_path)
games_df = pd.read_csv(games_full_path)

# we only have data available for play duration past 2008 so
# subset the data appropriately
games_df = games_df[games_df["year"] >= 2009]

result_df = pd.merge(games_df, play_duration_df, left_on=['team', 'year'], right_on=['posteam', 'Year'])

sacks_result_df = result_df['defense_sacks']
duration_result_df = result_df["PlayDuration"]

print("Correlation Coefficient: " + str(sacks_result_df.corr(duration_result_df)))

_ = plt.plot(sacks_result_df, duration_result_df, linestyle='none', marker='.')
_ = plt.margins(0.02)
_ = plt.xlabel("Sacks Allowed Per Season")
_ = plt.ylabel("Play Duration (2nd Incomplete Pass) \n Average Per Season (s)")
_ = plt.title("Season Team Sacks vs. Play Duration")
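# Fit a least-squares line: a is the slope, b is the intercept
a, b = np.polyfit(sacks_result_df, duration_result_df, 1)
# Evaluate the fitted line at two x values so it can be drawn over the scatter
x = np.array([0, 100])
y = a * x + b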

_ = plt.plot(x, y)

plt.show()
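
For reference, here is a rough sketch of what the two numbers in the cell above represent: the Pearson correlation coefficient (the default for pandas `Series.corr`) and the slope and intercept of a degree-1 least-squares fit (what `np.polyfit(..., 1)` returns). The helper names and the toy values are assumptions made up for illustration; they are not taken from the sack or play duration data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def least_squares_line(xs, ys):
    """Slope and intercept of the line minimizing squared vertical error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up toy values that lie exactly on a line, used only to check the helpers.
toy_sacks = [20, 30, 40, 50]
toy_duration = [6.0, 5.5, 5.0, 4.5]

r = pearson_r(toy_sacks, toy_duration)
slope, intercept = least_squares_line(toy_sacks, toy_duration)
print(r, slope, intercept)
```

The slope and the correlation coefficient always share a sign, but the correlation is unitless while the slope is in seconds per sack, so a steep-looking line can still correspond to a weak correlation.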


Looks like play