# TNF Prediction: Baltimore Ravens at Miami Dolphins (Oct 30, 2025)

This notebook builds a transparent ensemble using market odds, injuries, weather, **EPA/play features**, and simple ML models.

## Game
- **Matchup**: Baltimore Ravens @ Miami Dolphins  
- **Kick**: 8:15 PM ET (5:15 PM PT) on Oct 30, 2025  
- **Venue**: Hard Rock Stadium, Miami Gardens, FL

## Sources (plug-in points)
- **Odds**: consensus line (e.g., BAL −7.5; total ~51.5; ML BAL −420 / MIA +330).  
- **Injuries**: team status as of this morning.  
- **EPA/play**: SumerSports or RBSDM team tables (season-to-date).  
- **Weather**: Miami Gardens hourly forecast.

> We treat the market as a strong prior, layer injuries/home/conditions, then add EPA feature nudges and compare **logistic** vs **gradient-boosted** models.  If you drop in historical nflfastR game data, Section 8 will retrain on real games.


In [ ]:
from datetime import datetime
inputs = {
    'timestamp_local': '2025-10-30T21:58:32',
    'spread_favorite': -7.5,        # Ravens -7.5
    'total': 51.5,
    'moneyline_favorite': -420,     # Ravens -420
    'moneyline_underdog': 330,      # Dolphins +330
    'injuries_ravens': {'out': 0, 'doubtful': 0, 'questionable': 0},
    'injuries_dolphins': {'out': 1, 'doubtful': 1, 'questionable': 4},
    'weather': {'precip': False,'temp_f': 80,'breezy': True},
    'home_field': 'Dolphins'
}
inputs

## 1) Market‑implied probability (vig‑free)

In [ ]:
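def implied_prob(odds):
    """Convert American moneyline odds to the raw (vig-inclusive) implied probability."""
    return (-odds)/((-odds)+100) if odds<0 else 100/(odds+100)
p_fav = implied_prob(inputs['moneyline_favorite'])
p_dog = implied_prob(inputs['moneyline_underdog'])
vig = p_fav + p_dog - 1   # overround: how far the two raw probabilities exceed 1
# Dividing by (1 + vig) == (p_fav + p_dog) rescales the pair so it sums to 1
p_fav_fair = p_fav/(1+vig)
p_dog_fair = p_dog/(1+vig)
p_fav, p_dog, vig, p_fav_fair, p_dog_fair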
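
As a worked check of the arithmetic above, the sketch below repeats the de-vig step for the example moneylines (−420 / +330) using only the standard library.  The helper name `american_to_prob` is illustrative and simply mirrors `implied_prob`; it is not part of the notebook.

```python
def american_to_prob(odds: float) -> float:
    """Raw implied probability from American odds (illustrative helper)."""
    if odds < 0:
        return -odds / (-odds + 100)
    return 100 / (odds + 100)

raw_fav = american_to_prob(-420)   # 420 / 520
raw_dog = american_to_prob(330)    # 100 / 430
overround = raw_fav + raw_dog      # exceeds 1 by the bookmaker's margin

# Proportional normalization: divide each raw probability by the overround.
fair_fav = raw_fav / overround
fair_dog = raw_dog / overround

print(f"raw:  {raw_fav:.4f} + {raw_dog:.4f} = {overround:.4f}")
print(f"fair: fav {fair_fav:.4f}, dog {fair_dog:.4f}")
```

With these example odds the overround is about 4%, and the favorite's fair probability comes out near 0.776.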

## 2) Spread→win sanity check

In [ ]:
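import math
def spread_to_prob(spread,k=0.18):
    """Logistic map from a point spread to a win probability; k sets the steepness."""
    return 1/(1+math.exp(-k*spread))
# Pass the spread as a positive number so the result is the favorite's win probability
p_spread = spread_to_prob(abs(inputs['spread_favorite']))
p_spread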
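
One way to gauge the logistic mapping is to compare it with a normal model of the final margin.  The sketch below assumes the margin is normally distributed around the spread with a standard deviation of 13 points (the value Section 5 uses for `sd_margin`).  That assumption and the helper names are illustrative, not part of the notebook's method.

```python
import math

def logistic_win_prob(spread: float, k: float = 0.18) -> float:
    """Same logistic mapping as spread_to_prob in the cell above."""
    return 1 / (1 + math.exp(-k * spread))

def normal_win_prob(spread: float, sd: float = 13.0) -> float:
    """P(margin > 0) if margin ~ Normal(spread, sd); an assumed model."""
    z = spread / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

for s in (3.0, 7.5, 10.0):
    print(f"spread {s:>4}: logistic {logistic_win_prob(s):.3f}, "
          f"normal {normal_win_prob(s):.3f}")
```

With these defaults the logistic curve is the more generous of the two at 7.5 points (roughly 0.79 versus 0.72), which shows how sensitive this check is to the choice of `k`.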

## 3) Injury and home/weather adjustments

In [ ]:
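# Probability penalty per listed player, by injury designation
weights = {'out':0.010,'doubtful':0.006,'questionable':0.002}
def injury_penalty(inj):
    return sum(weights[k]*inj.get(k,0) for k in weights)
pen_bal = injury_penalty(inputs['injuries_ravens'])
pen_mia = injury_penalty(inputs['injuries_dolphins'])
# Floored at 0: in this cell injuries can only help the Ravens, never hurt them
net_to_ravens = max(0.0, pen_mia - pen_bal)
# Dolphins are at home, so the bump is subtracted from the Ravens' probability
home_bump = 0.015
# Note: inputs['weather'] is recorded but no weather term is applied here
p_market_adj = max(0,min(1,p_fav_fair - home_bump + net_to_ravens))
p_spread_adj  = max(0,min(1,p_spread  - home_bump + net_to_ravens))
p_market_adj, p_spread_adj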
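
To make the size of these adjustments concrete, here is the same arithmetic written out for tonight's injury counts.  The weights, counts, and home bump are copied from the cells above; the variable names are illustrative and chosen so they do not overwrite the notebook's own.

```python
example_weights = {"out": 0.010, "doubtful": 0.006, "questionable": 0.002}
ravens_report = {"out": 0, "doubtful": 0, "questionable": 0}
dolphins_report = {"out": 1, "doubtful": 1, "questionable": 4}
example_home_bump = 0.015

def penalty(report: dict) -> float:
    """Sum of per-player penalties across injury designations."""
    return sum(w * report.get(status, 0) for status, w in example_weights.items())

pen_ravens = penalty(ravens_report)      # no listed players
pen_dolphins = penalty(dolphins_report)  # 0.010 + 0.006 + 4 * 0.002

# Net shift applied to the Ravens' probability: injury edge minus home bump.
net_shift = max(0.0, pen_dolphins - pen_ravens) - example_home_bump
print(f"Dolphins penalty: {pen_dolphins:.3f}, net shift toward Ravens: {net_shift:+.3f}")
```

The two effects nearly cancel: the injury edge of 0.024 less the 0.015 home bump moves the Ravens' probability by under one percentage point.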

## 4) Ensemble probability & pick

In [ ]:
p_final = 0.5*(p_market_adj + p_spread_adj)
winner = 'Ravens' if p_final>=0.5 else 'Dolphins'
p_final, winner

## 5) Monte Carlo cover probability

In [ ]:
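import numpy as np
spread = 7.5        # Ravens are laying 7.5
mu_margin = 9.5     # assumed mean Ravens margin (2 points above the spread)
sd_margin = 13.0    # assumed standard deviation of the final margin
N = 100_000
rng = np.random.default_rng(20251030)  # fixed seed for reproducibility
margins = rng.normal(loc=mu_margin, scale=sd_margin, size=N)
p_cover_sim = float(np.mean(margins > spread))  # share of sims where Ravens win by more than 7.5
p_win_sim = float(np.mean(margins > 0))         # share of sims where Ravens win outright
percentiles = np.percentile(margins,[5,25,50,75,95])
percentiles.tolist(), p_cover_sim, p_win_sim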
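
Because the simulated margin is a plain normal draw, both probabilities also have a closed form, which gives a quick check on the simulation.  The sketch below uses the same mean, standard deviation, and spread as the cell above.  The names `mu`, `sd`, `line`, and `normal_cdf` are illustrative.

```python
import math

def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

mu, sd, line = 9.5, 13.0, 7.5  # same values as mu_margin, sd_margin, spread

# P(margin > line) and P(margin > 0) for margin ~ Normal(mu, sd)
p_cover_exact = 1 - normal_cdf((line - mu) / sd)
p_win_exact = 1 - normal_cdf((0 - mu) / sd)

print(f"P(cover) = {p_cover_exact:.4f}, P(win) = {p_win_exact:.4f}")
```

`p_cover_sim` and `p_win_sim` should agree with these values to within Monte Carlo error, which is on the order of 0.002 at N = 100,000.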

In [ ]:
import matplotlib.pyplot as plt
plt.figure()
plt.hist(margins, bins=60, edgecolor='black', alpha=0.7)
plt.axvline(spread, linestyle='--', linewidth=2, label=f'Spread = {spread}')
plt.axvline(0, linestyle=':', linewidth=2, label='Even (0)')
plt.title('Simulated Margin of Victory (Ravens − Dolphins)')
plt.xlabel('Margin')
plt.ylabel('Frequency')
plt.legend()
plt.show()

## 6) Add EPA/play features (SumerSports / RBSDM)

Season‑to‑date 2025 EPA/play (manually entered here; replace with live pulls or CSV as desired):
- **Baltimore offense EPA/play:** ~ **0.00**  
- **Miami offense EPA/play:** ~ **-0.01**  
- **Baltimore defense EPA/play allowed:** ~ **+0.13** (higher is worse)  
- **Miami defense EPA/play allowed:** ~ **+0.11**  

Notes: league average ~0.  Negative on defense is good; positive is bad.

In [ ]:
epa = {
  'bal_off': 0.00,
  'mia_off': -0.01,
  'bal_def_allowed': 0.13,
  'mia_def_allowed': 0.11,
}
x_features = {
  'spread_abs': abs(inputs['spread_favorite']),
  'off_diff_bal_vs_miaD': epa['bal_off'] - epa['mia_def_allowed'],
  'off_diff_mia_vs_balD': epa['mia_off'] - epa['bal_def_allowed'],
}
x_features

## 7) Model comparison on simulation‑calibrated data: Logistic vs Gradient Boosting

Features: `[spread, off_diff_fav, off_diff_dog]` → label: favorite covered (1/0).  Ready to swap with real data in Section 8.

In [ ]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

def make_synth(n=20000, sd=13.0, seed=20251030):
    rng = np.random.default_rng(seed)
    spread = rng.uniform(0, 10, size=n)
    off_diff_fav = rng.normal(0, 0.10, size=n)
    off_diff_dog = rng.normal(0, 0.10, size=n)
    mu = 0.75*spread + 25*off_diff_fav - 20*off_diff_dog
    margin = rng.normal(loc=mu, scale=sd, size=n)
    covered = (margin > spread).astype(int)
    return pd.DataFrame({'spread':spread,'off_diff_fav':off_diff_fav,'off_diff_dog':off_diff_dog,'covered':covered})

df_train = make_synth()
X = df_train[['spread','off_diff_fav','off_diff_dog']].values
y = df_train['covered'].values

logit = LogisticRegression(max_iter=200)
gb = GradientBoostingClassifier(random_state=20251030)
logit.fit(X,y)
gb.fit(X,y)

# In-sample AUC: scored on the same rows used for fitting, so these are optimistic
auc_logit = roc_auc_score(y, logit.predict_proba(X)[:,1])
auc_gb = roc_auc_score(y, gb.predict_proba(X)[:,1])
print(f'AUC logistic: {auc_logit:.3f} | AUC GBM: {auc_gb:.3f}')

# Apply to tonight
fav_off_vs_dogD = x_features['off_diff_bal_vs_miaD']
dog_off_vs_favD = x_features['off_diff_mia_vs_balD']
X_tonight = [[x_features['spread_abs'], fav_off_vs_dogD, dog_off_vs_favD]]
# Index the single row directly; float() on a 1-element array is deprecated in recent NumPy
p_cover_logit = float(logit.predict_proba(X_tonight)[0, 1])
p_cover_gb = float(gb.predict_proba(X_tonight)[0, 1])
p_cover_models = 0.5*(p_cover_logit + p_cover_gb)
p_cover_logit, p_cover_gb, p_cover_models
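
Because the training data is synthetic, the cover probability it encodes is known in closed form.  `make_synth` draws the margin from a normal distribution with mean `0.75*spread + 25*off_diff_fav - 20*off_diff_dog` and standard deviation 13, so P(cover) is a normal tail probability.  The sketch below evaluates it for tonight's features as a reference point for the fitted models; the function name `synth_true_p_cover` is illustrative.

```python
import math

def synth_true_p_cover(spread, off_diff_fav, off_diff_dog, sd=13.0):
    """P(margin > spread) under the generator used by make_synth."""
    mu = 0.75 * spread + 25 * off_diff_fav - 20 * off_diff_dog
    z = (mu - spread) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Tonight's features from Section 6:
#   Ravens offense vs Dolphins defense:  0.00 - 0.11
#   Dolphins offense vs Ravens defense: -0.01 - 0.13
p_true = synth_true_p_cover(7.5, 0.00 - 0.11, -0.01 - 0.13)
print(f"Generator-implied P(cover): {p_true:.3f}")
```

Both fitted models are approximating this function, so a large gap between `p_cover_logit` or `p_cover_gb` and this value would suggest fitting error rather than signal.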

## 9) Dedicated Logistic Regression Analysis

This section fits a standalone **logistic regression model** for P(cover) on the same synthetic data and prints its coefficients, so the direction and size of each feature's effect can be read directly.

In [ ]:
from sklearn.metrics import RocCurveDisplay

# Reuse df_train or make fresh
df_lr = df_train.copy()
X_lr = df_lr[['spread','off_diff_fav','off_diff_dog']]
y_lr = df_lr['covered']
lr = LogisticRegression(max_iter=300)
lr.fit(X_lr, y_lr)
auc_lr = roc_auc_score(y_lr, lr.predict_proba(X_lr)[:,1])
print(f'AUC (logit, synthetic): {auc_lr:.3f}')
print('Coefficients:')
for n, c in zip(X_lr.columns, lr.coef_[0]):
    print(f'  {n}: {c:.3f}')

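# Tonight application (a DataFrame with the training column names avoids sklearn's feature-name warning)
X_tonight_lr = pd.DataFrame([[x_features['spread_abs'], fav_off_vs_dogD, dog_off_vs_favD]], columns=X_lr.columns)
p_cover_lr = float(lr.predict_proba(X_tonight_lr)[0, 1])
print(f'Predicted P(Ravens cover -7.5) (logit) = {p_cover_lr:.3f}')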
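
A logistic coefficient is easiest to read as an odds ratio: raising a feature by `delta` multiplies the odds of covering by `exp(coef * delta)`.  The sketch below shows the conversion.  The coefficient values in it are hypothetical placeholders, not output from the model above, and should be replaced with the printed coefficients.

```python
import math

# Hypothetical coefficients for illustration only; substitute the printed values.
example_coefs = {"spread": -0.03, "off_diff_fav": 3.0, "off_diff_dog": -2.5}

# Size of change to evaluate per feature (EPA/play differences are small numbers).
deltas = {"spread": 1.0, "off_diff_fav": 0.05, "off_diff_dog": 0.05}

odds_ratios = {
    name: math.exp(coef * deltas[name]) for name, coef in example_coefs.items()
}

for name, ratio in odds_ratios.items():
    print(f"{name}: +{deltas[name]} multiplies the odds of covering by {ratio:.3f}")
```

A ratio above 1 means the change makes a cover more likely, and a ratio below 1 means less likely.  Using a realistic `delta` matters because a full 1.0 change in EPA/play is far outside the observed range.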


## 8) Train on real nflfastR game data (optional)

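Drop a CSV at `/mnt/data/nflfastR_games.csv` with columns:

`season, week, spread_line, off_epa_fav, def_epa_fav, off_epa_dog, def_epa_dog, margin, covered`

The cell below requires only `spread_line`, the four EPA columns, and `covered`; rows missing any of these are dropped.  It computes EPA differentials, trains Logistic & GBM, prints in-sample AUCs, and outputs P(cover) for tonight.  Because tonight's spread is passed as a positive number (`spread_abs`), `spread_line` should follow the same favorite-positive convention.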
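
The header can be checked before loading anything into pandas.  The sketch below uses only the standard library and the required column set from this section's code cell (`needed`).  The function name `missing_columns` and the two header strings are illustrative.

```python
import csv
import io

REQUIRED = {"spread_line", "off_epa_fav", "def_epa_fav",
            "off_epa_dog", "def_epa_dog", "covered"}

def missing_columns(csv_text: str) -> set:
    """Return the required columns absent from the CSV's header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return REQUIRED - {name.strip() for name in header}

# Header exactly as documented above: nothing should be missing.
full_header = ("season,week,spread_line,off_epa_fav,def_epa_fav,"
               "off_epa_dog,def_epa_dog,margin,covered\n")
# Illustrative bad header with the label column left out.
short_header = ("season,week,spread_line,off_epa_fav,def_epa_fav,"
                "off_epa_dog,def_epa_dog,margin\n")

print(missing_columns(full_header))   # empty set
print(missing_columns(short_header))  # reports the missing label column
```

To run the same check on a file, pass its contents with `missing_columns(path.read_text())`.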

In [ ]:
from pathlib import Path
csv_real = Path('/mnt/data/nflfastR_games.csv')
if csv_real.exists():
    real = pd.read_csv(csv_real)
    needed = {'spread_line','off_epa_fav','def_epa_fav','off_epa_dog','def_epa_dog','covered'}
    if needed.issubset(real.columns):
        real = real.dropna(subset=list(needed)).copy()
        real['off_diff_fav'] = real['off_epa_fav'] - real['def_epa_dog']
        real['off_diff_dog'] = real['off_epa_dog'] - real['def_epa_fav']
        Xr = real[['spread_line','off_diff_fav','off_diff_dog']].values
        yr = real['covered'].values
        logit_r = LogisticRegression(max_iter=300)
        gb_r = GradientBoostingClassifier(random_state=42)
        logit_r.fit(Xr, yr)
        gb_r.fit(Xr, yr)
        # In-sample AUC: scored on the training rows, so treat these as optimistic
        auc_logit_r = roc_auc_score(yr, logit_r.predict_proba(Xr)[:,1])
        auc_gb_r = roc_auc_score(yr, gb_r.predict_proba(Xr)[:,1])
        print(f'AUC (real, logit): {auc_logit_r:.3f} | AUC (real, GBM): {auc_gb_r:.3f}')
        X_tonight_r = [[x_features['spread_abs'], fav_off_vs_dogD, dog_off_vs_favD]]
        # Index the single row directly; float() on a 1-element array is deprecated in recent NumPy
        p_cover_logit_r = float(logit_r.predict_proba(X_tonight_r)[0, 1])
        p_cover_gb_r = float(gb_r.predict_proba(X_tonight_r)[0, 1])
        p_cover_real = 0.5*(p_cover_logit_r + p_cover_gb_r)
        print(f'Predicted P(cover Ravens −7.5) with REAL-data models = {p_cover_real:.3f}')
    else:
        print('CSV missing required columns; see header above.')
else:
    print('No nflfastR_games.csv found.  Drop your CSV in /mnt/data and rerun this cell.')
