This notebook tests the trained model on unseen 2022 NFL season data.

**Goals:**
- Load new season stats.
- Apply the same feature engineering used in training.
- Predict Super Bowl win probability using the trained model.
- Rank teams by predicted probability to identify the likeliest contenders.

**Tools:**
- pandas, joblib, scikit-learn, and the trained Random Forest model (`rf_model.pkl`)
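
One way to keep the feature engineering identical between training and this notebook is to wrap it in a single function that both can call. The sketch below is only an illustration: `add_rate_features` is a hypothetical helper, the column names are taken from the code cell that follows, and the toy rows are made-up numbers rather than real NFL stats. It also maps infinite ratios (a non-zero numerator divided by zero) to 0, which is an assumption that goes slightly beyond the plain `fillna(0)` used in the notebook.

```python
import numpy as np
import pandas as pd

# Derived feature -> (numerator column, denominator column)
RATIOS = {
    'points_per_game': ('points', 'g'),
    'yards_per_play': ('total_yards', 'plays_offense'),
    'completion_rate': ('pass_cmp', 'pass_att'),
    'rush_avg': ('rush_yds', 'rush_att'),
}

def add_rate_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the four derived rate features added."""
    out = df.copy()
    for name, (num, den) in RATIOS.items():
        ratio = out[num] / out[den]
        # x/0 yields inf and 0/0 yields NaN; map both to 0 (assumption)
        out[name] = ratio.replace([np.inf, -np.inf], np.nan).fillna(0)
    return out

# Toy rows with made-up numbers; the second row exercises division by zero
toy = pd.DataFrame({
    'points': [340, 0], 'g': [17, 0],
    'total_yards': [5000, 100], 'plays_offense': [1000, 0],
    'pass_cmp': [300, 0], 'pass_att': [500, 0],
    'rush_yds': [2000, 0], 'rush_att': [400, 0],
})
result = add_rate_features(toy)
print(result[list(RATIOS)])
```

Returning a copy leaves the raw frame untouched, so the same raw data can be re-processed if the feature definitions change.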

In [2]:
import pandas as pd
import joblib

# Load the trained Random Forest; scikit-learn must be installed to unpickle it
rf_model = joblib.load('../models/rf_model.pkl')

# Load the 2022 season stats (absolute, machine-specific path)
df_2022 = pd.read_csv('C:/Users/vishw/OneDrive/Documents/NFL 2025 Winner/data/raw/2022_data.csv')

# Feature engineering (same as before)
df_2022['points_per_game'] = df_2022['points'] / df_2022['g']
df_2022['yards_per_play'] = df_2022['total_yards'] / df_2022['plays_offense']
df_2022['completion_rate'] = df_2022['pass_cmp'] / df_2022['pass_att']
df_2022['rush_avg'] = df_2022['rush_yds'] / df_2022['rush_att']
# Replace missing values (for example from 0/0 divisions) with 0
df_2022.fillna(0, inplace=True)

# Feature set used in training (names and order must match the trained model)
features = [
    'points_diff', 'score_pct', 'turnover_pct',
    'pass_td', 'rush_td', 'penalties',
    'points_per_game', 'yards_per_play', 'completion_rate', 'rush_avg'
]

X_2022 = df_2022[features]

# Predict using the trained Random Forest model; column 1 is the positive class
df_2022['win_probability'] = rf_model.predict_proba(X_2022)[:, 1]

# Show the top 10 teams by predicted probability
df_2022[['team', 'win_probability']].sort_values(by='win_probability', ascending=False).head(10)

| | team | win_probability |
|---|---|---|
| 12 | Kansas City Chiefs | 0.61 |
| 20 | Minnesota Vikings | 0.04 |
| 0 | Buffalo Bills | 0.03 |
| 21 | Detroit Lions | 0.02 |
| 28 | San Francisco 49ers | 0.02 |
| 22 | Green Bay Packers | 0.01 |
| 8 | Jacksonville Jaguars | 0.01 |
| 1 | Miami Dolphins | 0.01 |
| 7 | Cleveland Browns | 0.0 |
| 6 | Pittsburgh Steelers | 0.0 |
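
The classifier scores each team independently, so the `win_probability` values need not sum to 1 across the league, even though only one team wins the Super Bowl. One possible post-processing step, shown here as a sketch and not as part of the original pipeline, is to rescale the scores into shares that sum to 1. `normalize_win_probabilities` and the `win_share` column are hypothetical names, and the example uses only the first three rows of the table above, so the resulting shares are illustrative rather than league-wide.

```python
import pandas as pd

def normalize_win_probabilities(df: pd.DataFrame,
                                col: str = 'win_probability') -> pd.DataFrame:
    """Return a copy of df with a 'win_share' column that sums to 1."""
    out = df.copy()
    total = out[col].sum()
    if total > 0:
        out['win_share'] = out[col] / total
    else:
        # Fall back to a uniform share if every score is 0
        out['win_share'] = 1.0 / len(out)
    return out.sort_values('win_share', ascending=False)

# First three rows of the output table; the real frame would hold all teams
top = pd.DataFrame({
    'team': ['Kansas City Chiefs', 'Minnesota Vikings', 'Buffalo Bills'],
    'win_probability': [0.61, 0.04, 0.03],
})
ranked = normalize_win_probabilities(top)
print(ranked)
```

Rescaling preserves the ranking and only changes how the numbers are read: as relative shares of the title chance, not as independent per-team probabilities.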
