# 🏈 NFL QB Rating Predictor (2023)


This project uses 2023 NFL quarterback stats to build a machine learning model that predicts QB passer rating based on performance data.

- 📊 **Data:** [Pro-Football-Reference](https://www.pro-football-reference.com/years/2023/passing.htm)
- ⚙️ **Model:** Linear Regression
- 🎯 **Goal:** Predict passer rating using common passing stats
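
For context, the `Rate` value this project predicts is the official NFL passer rating, which is itself computed from completions, attempts, yards, touchdowns, and interceptions. The sketch below shows that formula. The function name `passer_rating` and its argument names are illustrative and are not part of this project's code.

```python
def passer_rating(cmp: int, att: int, yds: int, td: int, ints: int) -> float:
    """NFL passer rating from raw passing totals (att must be > 0)."""

    def clamp(x: float) -> float:
        # Each component is limited to the range [0, 2.375].
        return max(0.0, min(x, 2.375))

    a = clamp((cmp / att - 0.3) * 5)      # completion percentage component
    b = clamp((yds / att - 3) * 0.25)     # yards per attempt component
    c = clamp(td / att * 20)              # touchdown rate component
    d = clamp(2.375 - ints / att * 25)    # interception rate component
    return (a + b + c + d) / 6 * 100


# Example stat line: 20 of 30 for 250 yards, 2 TD, 1 INT
print(round(passer_rating(cmp=20, att=30, yds=250, td=2, ints=1), 1))  # → 100.7
```

Before clamping, each component is linear in a per-attempt rate, which may help explain why a linear model on `Cmp%`, `Y/A`, and the counting stats can approximate the rating closely.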

# 📦 Imports
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# 📥 Load NFL 2023 Passing Data
url = "https://www.pro-football-reference.com/years/2023/passing.htm"
nfl = pd.read_html(url)[0]
# Drop the header rows that are repeated inside the table body
nfl = nfl[nfl['Player'] != 'Player'].reset_index(drop=True)

# 🔧 Convert numeric columns
cols = ['Yds', 'TD', 'Int', 'Rate', 'Cmp%', 'Y/A', 'Sk%', 'G', 'Att']
for col in cols:
    nfl[col] = pd.to_numeric(nfl[col], errors='coerce')

# 🧹 Filter: Only QBs with at least 100 attempts
nfl_qbs = nfl[nfl['Att'] >= 100].copy()

# 🎯 Define Features & Target
model_features = ['TD', 'Int', 'Yds', 'Cmp%', 'Y/A', 'Sk%', 'G', 'Att']
features = nfl_qbs[model_features + ['Rate']].dropna()
X = features[model_features]
y = features['Rate']

## 🤖 Model Building

We use scikit-learn's `LinearRegression` to predict QB passer rating from the following features:

- `TD` – Touchdowns
- `Int` – Interceptions
- `Yds` – Passing Yards
- `Cmp%` – Completion Percentage
- `Y/A` – Yards per Attempt
- `Sk%` – Sack Percentage
- `G` – Games Played
- `Att` – Pass Attempts

We split the data 80/20 into training and test sets so that the model is evaluated on quarterbacks it did not see during fitting.
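
Because the number of quarterbacks with 100 or more attempts in a season is small (likely a few dozen), a single 80/20 split leaves only a handful of test rows, and the score can shift with the split. One common complement is k-fold cross-validation. The sketch below uses synthetic data as a stand-in for the real `X` and `y`; the names `X_demo` and `y_demo` and the choice of 5 folds are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the real feature matrix and target (assumption):
# 40 rows, 8 features, target is a noisy linear combination of the features.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(40, 8))
y_demo = X_demo @ rng.normal(size=8) + rng.normal(scale=0.5, size=40)

# 5-fold CV: every row is used for testing exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LinearRegression(), X_demo, y_demo, cv=cv, scoring="r2")

print("R² per fold:", np.round(scores, 3))
print("Mean R²:", scores.mean())
```

To try this on the project data, `X_demo` and `y_demo` would be replaced with the `X` and `y` built from the passing table.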

# 🤖 Train/Test Split + Model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# 📈 Evaluate
print("R² Score:", r2_score(y_test, y_pred))
print("MSE:", mean_squared_error(y_test, y_pred))

# 🧪 Compare
comparison = pd.DataFrame({'Actual': y_test.values, 'Predicted': y_pred})
comparison.head()

# 📊 Plot Predictions
plt.figure(figsize=(8,5))
sns.scatterplot(x=y_test, y=y_pred)
plt.xlabel('Actual Passer Rating')
plt.ylabel('Predicted Passer Rating')
plt.title('Actual vs Predicted QB Rating (2023)')
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'r--')
plt.show()


## 📈 Model Results & Evaluation

On the held-out test set, the model performs well:

- **R² Score:** ~0.93. The model explains about 93% of the variance in test-set passer ratings.
- **MSE:** ~7.2. This corresponds to a root-mean-square error (RMSE) of about 2.7 rating points (√7.2 ≈ 2.68).
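
MSE is measured in squared rating points, so its square root gives an error in the same units as the rating. The sketch below does that arithmetic for the reported value. It then uses a small made-up list of residuals (purely illustrative, not taken from this dataset) to show that RMSE and mean absolute error (MAE) are different quantities.

```python
import math

mse = 7.2  # approximate MSE reported above
rmse = math.sqrt(mse)
print(f"RMSE ≈ {rmse:.2f} rating points")  # RMSE ≈ 2.68 rating points

# Hypothetical residuals (actual - predicted), for illustration only.
residuals = [1.0, -2.0, 0.5, -4.0, 3.0]
demo_rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
demo_mae = sum(abs(r) for r in residuals) / len(residuals)

# RMSE weights large misses more heavily, so it is never smaller than MAE.
print(f"demo RMSE = {demo_rmse:.2f}, demo MAE = {demo_mae:.2f}")
```

The "typical miss" in the everyday sense is closer to MAE, which scikit-learn exposes as `mean_absolute_error` in `sklearn.metrics`.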

The scatterplot of actual versus predicted ratings shows how closely the predictions track the actual passer ratings. The red dashed line marks perfect predictions (predicted = actual).
