# NFL 4th Down Win Probability Analysis
**Author:** Neil Dorsey

This notebook evaluates NFL coaching decisions on 4th down using simulated data. It compares each actual decision with the model-driven optimal decision, measured by the difference in win probability (WP).

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

## Load the Data

In [None]:
data = pd.read_csv('nfl_4th_down_decision_data.csv')
data.head()

## Data Preparation

In [None]:
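# Yard marker as an integer; expand=False makes str.extract return a Series
data['yard_number'] = data['yard_line'].str.extract(r'(\d+)', expand=False).astype(int)
# Side of the field: 'Own' if the label contains 'OWN', otherwise 'Opp'
data['yard_side'] = data['yard_line'].str.contains('OWN').map({True: 'Own', False: 'Opp'})
# Integer code per decision category (codes follow sorted category order, not a ranking)
data['decision_code'] = data['decision'].astype('category').cat.codes
# Time left in Q4 as decimal minutes, parsed from 'MM:SS'
data['time_min'] = data['time_left_q4'].str.split(':').apply(lambda x: int(x[0]) + int(x[1])/60)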
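
As a quick check of the parsing steps above, here is a minimal sketch on two made-up rows. The input formats (`'OWN 35'`, `'OPP 40'`, `'MM:SS'`) are assumptions about the CSV, and `yards_to_goal` is a hypothetical extra column, not part of the original notebook. It is included because `yard_number` alone cannot distinguish a team's own 35 from the opponent's 35.

```python
import pandas as pd

# Hypothetical rows; the 'OWN 35' / 'OPP 40' and 'MM:SS' formats are assumptions.
sample = pd.DataFrame({
    'yard_line': ['OWN 35', 'OPP 40'],
    'time_left_q4': ['02:30', '10:15'],
})

sample['yard_number'] = sample['yard_line'].str.extract(r'(\d+)', expand=False).astype(int)
sample['yard_side'] = sample['yard_line'].str.contains('OWN').map({True: 'Own', False: 'Opp'})

# Vectorized alternative to the row-wise lambda: split once, then combine.
parts = sample['time_left_q4'].str.split(':', expand=True).astype(int)
sample['time_min'] = parts[0] + parts[1] / 60

# Hypothetical feature: distance to the opponent's goal line on a 100-yard field.
# On the opponent's side the yard marker is already that distance.
is_own = sample['yard_side'] == 'Own'
sample['yards_to_goal'] = sample['yard_number'].where(~is_own, 100 - sample['yard_number'])

print(sample)
```

If the real file uses different labels (for example lowercase `own`), the `str.contains('OWN')` test would need adjusting.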

## Model Building: Predicting Win Probability

In [None]:
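# Features: yards to go, yard marker, and the coded decision
X = data[['to_go', 'yard_number', 'decision_code']]
# Target: win probability under the optimal decision
y = data['wp_optimal']
# Hold out 20% of plays for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print('MSE:', mean_squared_error(y_test, y_pred))
# Coefficients, in the same order as the columns of X
model.coef_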
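
One possible refinement, sketched here on synthetic data: `decision_code` feeds an arbitrary integer ordering of the decisions into a linear model, whereas one-hot encoding gives each decision its own coefficient. The decision labels other than 'Go for it', the toy target, and all `toy`-prefixed names are assumptions made for illustration only. They do not describe the notebook's dataset or its results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the real data; 'Punt' and 'Field goal' are assumed labels.
toy = pd.DataFrame({
    'to_go': rng.integers(1, 11, size=n),
    'yard_number': rng.integers(1, 51, size=n),
    'decision': rng.choice(['Go for it', 'Punt', 'Field goal'], size=n),
})
# Made-up target: WP falls slightly with yards to go, plus noise.
toy['wp_optimal'] = (0.5 - 0.01 * toy['to_go'] + rng.normal(0, 0.05, size=n)).clip(0, 1)

# One-hot encode the decision; drop_first avoids a redundant column alongside the intercept.
X_toy = pd.get_dummies(
    toy[['to_go', 'yard_number', 'decision']],
    columns=['decision'], drop_first=True, dtype=float,
)
y_toy = toy['wp_optimal']

X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.2, random_state=42)
toy_model = LinearRegression().fit(X_tr, y_tr)

# RMSE is in the same units as WP, which makes it easier to read than MSE.
rmse = np.sqrt(mean_squared_error(y_te, toy_model.predict(X_te)))
coefs = pd.Series(toy_model.coef_, index=X_toy.columns)
print('RMSE:', rmse)
print(coefs)
```

Labelling the coefficients with the column names, as `coefs` does, also makes the bare `model.coef_` array easier to interpret.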

## Decision Evaluation

In [None]:
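# WP given up by the actual decision relative to the optimal one
data['wp_delta'] = (data['wp_optimal'] - data['wp_actual']).round(3)
# Flag plays that gave up more than 0.05 WP where the coach did not go for it
data['missed_opportunity'] = (data['wp_delta'] > 0.05) & (data['decision'] != 'Go for it')
data[['team', 'coach', 'decision', 'wp_actual', 'wp_optimal', 'wp_delta', 'missed_opportunity']].head()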
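
The per-play flags can also be rolled up per coach. The following is a minimal sketch on four made-up plays; the coach names and WP values are invented for illustration, and the `summary` table is not part of the original notebook.

```python
import pandas as pd

# Made-up plays; names and numbers are illustrative only.
plays = pd.DataFrame({
    'coach': ['Coach A', 'Coach A', 'Coach B', 'Coach B'],
    'decision': ['Punt', 'Go for it', 'Field goal', 'Punt'],
    'wp_actual': [0.40, 0.55, 0.30, 0.50],
    'wp_optimal': [0.48, 0.55, 0.37, 0.60],
})

# Same definitions as in the cell above.
plays['wp_delta'] = (plays['wp_optimal'] - plays['wp_actual']).round(3)
plays['missed_opportunity'] = (plays['wp_delta'] > 0.05) & (plays['decision'] != 'Go for it')

# Per-coach totals: number of plays, flagged plays, and average WP given up.
summary = (
    plays.groupby('coach')
    .agg(
        n_plays=('wp_delta', 'size'),
        missed=('missed_opportunity', 'sum'),
        avg_wp_delta=('wp_delta', 'mean'),
    )
    .sort_values('missed', ascending=False)
)
print(summary)
```

On the real data the same aggregation would use `data` in place of `plays`.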

## Insights

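- Coaches left win probability on the table: plays flagged as `missed_opportunity` gave up more than 0.05 WP relative to the optimal decision.
- The comparison suggests that going for it more often would improve WP, at least in this simulated data.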
- Future models can include more features like score, field goal range, or weather.
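
As one way the extra features mentioned above might be derived, here is a minimal sketch. The columns `team_score`, `opp_score`, and `yards_to_goal` are hypothetical and may not exist in the dataset. The 17-yard offset (10 yards of end zone plus roughly 7 yards for the snap and hold) and the 55-yard range cutoff are rough assumptions, not values taken from the notebook.

```python
import pandas as pd

# Hypothetical inputs; none of these columns are confirmed to exist in the CSV.
situations = pd.DataFrame({
    'team_score': [14, 21, 10],
    'opp_score': [17, 21, 3],
    'yards_to_goal': [20, 38, 60],
})

# Score differential from the point of view of the team with the ball.
situations['score_diff'] = situations['team_score'] - situations['opp_score']

# Approximate kick distance: line of scrimmage to goal line, plus about 17 yards.
situations['fg_distance'] = situations['yards_to_goal'] + 17

# Assumed cutoff for a makeable kick; tune to the kicker or league of interest.
MAX_FG_DISTANCE = 55
situations['in_fg_range'] = situations['fg_distance'] <= MAX_FG_DISTANCE

print(situations)
```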

## Conclusion
This notebook provides an analytical framework for assessing 4th-down coaching decisions by comparing the win probability of the actual decision with that of the optimal one.