# TNF Prediction: Baltimore Ravens at Miami Dolphins (Oct 30, 2025)

This notebook builds a transparent ensemble using market odds, injuries, weather, **EPA/play features**, and simple ML models.

## Game
- **Matchup**: Baltimore Ravens @ Miami Dolphins  
- **Kick**: 8:15 PM ET (5:15 PM PT) on Oct 30, 2025  
- **Venue**: Hard Rock Stadium, Miami Gardens, FL

## Sources (verified today)
- **Odds**: consensus around **BAL −7.5**, total **~51.5**, moneyline **BAL −420 / MIA +330**.  
- **Injuries**: Reports indicate **Baltimore** with **no designations**; **Miami** carrying multiple designations.  
- **EPA/play** (season‑to‑date 2025): SumerSports team offense/defense pages.  
- **Weather**: warm, dry, breezy.

> We treat the market as a strong prior, layer in injuries/home/conditions, then add an EPA feature nudge and compare **logistic** vs **gradient-boosted** models trained on a simulation‑calibrated dataset.  If you drop in historical nflfastR game data, the same code will retrain on real games.


In [ ]:
from datetime import datetime
inputs = {
    'timestamp_local': '2025-10-30T17:38:16',
    'spread_favorite': -7.5,        # Ravens -7.5
    'total': 51.5,
    'moneyline_favorite': -420,     # Ravens -420
    'moneyline_underdog': 330,      # Dolphins +330
    'injuries_ravens': {'out': 0, 'doubtful': 0, 'questionable': 0},
    'injuries_dolphins': {'out': 1, 'doubtful': 1, 'questionable': 4},
    'weather': {'precip': False,'temp_f': 80,'breezy': True},
    'home_field': 'Dolphins'
}
inputs

## 1) Market‑implied probability (vig‑free)

In [ ]:
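def implied_prob(odds):
    # American odds -> implied probability (still includes the bookmaker's vig)
    return (-odds)/((-odds)+100) if odds<0 else 100/(odds+100)
p_fav = implied_prob(inputs['moneyline_favorite'])
p_dog = implied_prob(inputs['moneyline_underdog'])
vig = p_fav + p_dog - 1   # overround: amount by which the two sides sum above 1
# Dividing by (1 + vig) == (p_fav + p_dog) rescales the pair to sum to 1
p_fav_fair = p_fav/(1+vig)
p_dog_fair = p_dog/(1+vig)
p_fav, p_dog, vig, p_fav_fair, p_dog_fair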
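
The vig removal above is a proportional normalization: since `1 + vig` equals `p_fav + p_dog`, each raw probability is divided by their sum. A minimal standalone sketch of that step follows; the helper name `remove_vig` is hypothetical and not part of the notebook.

```python
def implied_prob(odds):
    """American moneyline odds -> implied probability (vig still included)."""
    return (-odds) / ((-odds) + 100) if odds < 0 else 100 / (odds + 100)


def remove_vig(odds_a, odds_b):
    """Hypothetical helper: rescale a two-way market so both sides sum to 1."""
    p_a, p_b = implied_prob(odds_a), implied_prob(odds_b)
    total = p_a + p_b  # equals 1 + vig
    return p_a / total, p_b / total


# Tonight's moneylines from the inputs cell: BAL -420 / MIA +330
fair_fav, fair_dog = remove_vig(-420, 330)
print(round(fair_fav, 4), round(fair_dog, 4))
```

Proportional normalization is only one de-vigging convention; it is the one the cell above uses.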

## 2) Spread→win sanity check

In [ ]:
import math
def spread_to_prob(spread,k=0.18):
    # Logistic map from point spread to favorite win probability; k is the slope per point
    return 1/(1+math.exp(-k*spread))
p_spread = spread_to_prob(abs(inputs['spread_favorite']))
p_spread
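
As a cross-check on the logistic mapping, the same spread can be converted with a normal margin model, P(win) = Φ(spread / σ). This sketch assumes σ = 13 points (the `sd_margin` used in the Monte Carlo section) and that the spread equals the mean margin. Both are assumptions, and `spread_to_prob_normal` is a hypothetical name.

```python
import math


def spread_to_prob_logistic(spread, k=0.18):
    """Logistic mapping with the same default slope as the cell above."""
    return 1 / (1 + math.exp(-k * spread))


def spread_to_prob_normal(spread, sd=13.0):
    """Assumed alternative: P(margin > 0) when margin ~ Normal(spread, sd)."""
    return 0.5 * (1 + math.erf(spread / (sd * math.sqrt(2))))


p_logistic = spread_to_prob_logistic(7.5)
p_normal = spread_to_prob_normal(7.5)
print(f"logistic: {p_logistic:.3f}, normal: {p_normal:.3f}")
```

With these settings the two mappings differ by several percentage points at a spread of 7.5, which shows how sensitive this sanity check is to the choice of `k` and σ.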

## 3) Injury and home/weather adjustments

In [ ]:
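# Win-probability penalty per listed player, by designation
weights = {'out':0.010,'doubtful':0.006,'questionable':0.002}
def injury_penalty(inj):
    return sum(weights[k]*inj.get(k,0) for k in weights)
pen_bal = injury_penalty(inputs['injuries_ravens'])
pen_mia = injury_penalty(inputs['injuries_dolphins'])
# Only credit the Ravens when Miami's penalty is larger; never shift toward Miami
net_to_ravens = max(0.0, pen_mia - pen_bal)
# Home-field adjustment, subtracted because the favorite (Ravens) is the road team
home_bump = 0.015
# Apply both adjustments, then clamp to [0, 1]
p_market_adj = max(0,min(1,p_fav_fair - home_bump + net_to_ravens))
p_spread_adj  = max(0,min(1,p_spread  - home_bump + net_to_ravens))
p_market_adj, p_spread_adj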
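
To make the size of these adjustments concrete, the arithmetic for tonight's inputs can be traced on its own. This sketch reuses the weights and `home_bump` from the cell above; `net_shift` is a hypothetical name for the combined adjustment.

```python
weights = {'out': 0.010, 'doubtful': 0.006, 'questionable': 0.002}


def injury_penalty(inj):
    """Sum of per-designation weights times the number of players listed."""
    return sum(weights[k] * inj.get(k, 0) for k in weights)


pen_bal = injury_penalty({'out': 0, 'doubtful': 0, 'questionable': 0})
pen_mia = injury_penalty({'out': 1, 'doubtful': 1, 'questionable': 4})
home_bump = 0.015

# Combined shift applied to the Ravens' win probability (hypothetical name)
net_shift = max(0.0, pen_mia - pen_bal) - home_bump
print(f"pen_mia={pen_mia:.3f}, net_shift={net_shift:+.3f}")
```

The injury credit and the home-field deduction nearly cancel, so the adjusted probabilities move by less than one percentage point.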

## 4) Ensemble probability & pick

In [ ]:
p_final = 0.5*(p_market_adj + p_spread_adj)
winner = 'Ravens' if p_final>=0.5 else 'Dolphins'
p_final, winner

## 5) Monte Carlo cover probability

In [ ]:
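import numpy as np
spread = abs(inputs['spread_favorite'])   # 7.5
mu_margin = 9.5      # assumed mean Ravens margin (points)
sd_margin = 13.0     # assumed standard deviation of the final margin (points)
N = 100_000
rng = np.random.default_rng(20251030)     # fixed seed for reproducibility
margins = rng.normal(loc=mu_margin, scale=sd_margin, size=N)
p_cover_sim = float(np.mean(margins > spread))   # Ravens win by more than the spread
p_win_sim = float(np.mean(margins > 0))          # Ravens win outright
np.percentile(margins,[5,25,50,75,95]).tolist(), p_cover_sim, p_win_sim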
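
Because the simulated margin is a single normal distribution, the same probabilities have a closed form, which gives a quick check on the simulation. A minimal sketch using the same `mu_margin`, `sd_margin`, and spread as the cell above; `normal_cdf` is a helper introduced here for illustration.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


spread = 7.5       # same values as the simulation cell
mu_margin = 9.5
sd_margin = 13.0

# P(margin > t) = 1 - Phi((t - mu) / sd)
p_cover_exact = 1 - normal_cdf((spread - mu_margin) / sd_margin)
p_win_exact = 1 - normal_cdf((0 - mu_margin) / sd_margin)
print(round(p_cover_exact, 3), round(p_win_exact, 3))
```

With 100,000 draws the simulated frequencies should agree with these values to within a few tenths of a percentage point; a larger gap would point to a bug rather than sampling noise.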

## 6) Add EPA/play features (SumerSports)

Season‑to‑date 2025 EPA/play (cited from SumerSports team tables):
- **Baltimore offense EPA/play:** ~ **0.00**  
- **Miami offense EPA/play:** ~ **-0.01**  
- **Baltimore defense EPA/play allowed:** ~ **+0.13** (higher is worse)  
- **Miami defense EPA/play allowed:** ~ **+0.11**  

Notes: league average ~0.  Negative on defense is good; positive is bad.


In [ ]:
epa = {
  'bal_off': 0.00,
  'mia_off': -0.01,
  'bal_def_allowed': 0.13,
  'mia_def_allowed': 0.11,
}
x_features = {
  'spread_abs': abs(inputs['spread_favorite']),
  'off_diff_bal_vs_miaD': epa['bal_off'] - epa['mia_def_allowed'],
  'off_diff_mia_vs_balD': epa['mia_off'] - epa['bal_def_allowed'],
}
x_features

## 7) Quick model comparison: Logistic vs Gradient Boosting

We create a **simulation‑calibrated** training set (ready to be replaced by real nflfastR game data if you provide `/mnt/data/nfl_games_train.csv`).  Features are `[spread, off_diff_fav, off_diff_dog]`, label is whether the favorite **covered**.


In [ ]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from pathlib import Path

def make_synth(n=20000, sd=13.0, seed=20251030):
    rng = np.random.default_rng(seed)
    spread = rng.uniform(0, 10, size=n)
    off_diff_fav = rng.normal(0, 0.10, size=n)
    off_diff_dog = rng.normal(0, 0.10, size=n)
    mu = 0.75*spread + 25*off_diff_fav - 20*off_diff_dog
    margin = rng.normal(loc=mu, scale=sd, size=n)
    covered = (margin > spread).astype(int)
    return pd.DataFrame({'spread':spread,'off_diff_fav':off_diff_fav,'off_diff_dog':off_diff_dog,'covered':covered})

df_train = make_synth()
csv_path = Path('/mnt/data/nfl_games_train.csv')
if csv_path.exists():
    try:
        real = pd.read_csv(csv_path)
        required = {'spread','off_diff_fav','off_diff_dog','covered'}
        if required.issubset(real.columns):
            df_train = real.copy()
            print('Loaded real training data from /mnt/data/nfl_games_train.csv')
        else:
            # Report the fallback instead of silently keeping the synthetic set
            print('CSV is missing required columns; using synthetic. Missing:', required - set(real.columns))
    except Exception as e:
        print('Failed to load real data; using synthetic. Error:', e)

X = df_train[['spread','off_diff_fav','off_diff_dog']].values
y = df_train['covered'].values

logit = LogisticRegression(max_iter=200)
gb = GradientBoostingClassifier(random_state=20251030)
logit.fit(X,y)
gb.fit(X,y)
# In-sample AUC: scored on the same rows used for fitting, so these values are optimistic
auc_logit = roc_auc_score(y, logit.predict_proba(X)[:,1])
auc_gb = roc_auc_score(y, gb.predict_proba(X)[:,1])
auc_logit, auc_gb
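
The AUCs above are computed on the training rows, which tends to flatter the more flexible model. A minimal sketch of a held-out comparison follows. It regenerates synthetic data with the same recipe as `make_synth` so that it stands alone; the 25% test split and `random_state=0` are assumptions chosen for illustration, not part of the notebook.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Same synthetic recipe as make_synth(), repeated so the sketch is self-contained
rng = np.random.default_rng(20251030)
n = 20_000
spread = rng.uniform(0, 10, size=n)
off_diff_fav = rng.normal(0, 0.10, size=n)
off_diff_dog = rng.normal(0, 0.10, size=n)
mu = 0.75 * spread + 25 * off_diff_fav - 20 * off_diff_dog
margin = rng.normal(loc=mu, scale=13.0, size=n)

X_all = np.column_stack([spread, off_diff_fav, off_diff_dog])
y_all = (margin > spread).astype(int)

# Hold out 25% of the rows (assumed split size) for evaluation
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y_all, test_size=0.25, random_state=0, stratify=y_all
)

models = {
    "logit": LogisticRegression(max_iter=200),
    "gb": GradientBoostingClassifier(random_state=20251030),
}
auc_test = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    auc_train = roc_auc_score(y_tr, model.predict_proba(X_tr)[:, 1])
    auc_test[name] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: train AUC {auc_train:.3f}, held-out AUC {auc_test[name]:.3f}")
```

Comparing the train and held-out columns for each model gives a rough read on how much of the in-sample score is overfitting.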

### Apply models to tonight

In [ ]:
fav_off_vs_dogD = x_features['off_diff_bal_vs_miaD']
dog_off_vs_favD = x_features['off_diff_mia_vs_balD']
X_tonight = [[x_features['spread_abs'], fav_off_vs_dogD, dog_off_vs_favD]]
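# Index the single row and the positive class directly; float() on a 1-element array is deprecated in recent NumPy
p_cover_logit = float(logit.predict_proba(X_tonight)[0, 1])
p_cover_gb = float(gb.predict_proba(X_tonight)[0, 1])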
p_cover_models = 0.5*(p_cover_logit + p_cover_gb)
p_cover_logit, p_cover_gb, p_cover_models
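
When the default synthetic training set is in use, the cover probability implied by its generator is known in closed form and can serve as a benchmark for the fitted models. This sketch assumes the `make_synth` recipe (mean margin `0.75*spread + 25*off_diff_fav - 20*off_diff_dog`, `sd = 13`) and does not apply if real data were loaded; `synth_cover_prob` is a hypothetical name.

```python
import math


def synth_cover_prob(spread, off_diff_fav, off_diff_dog, sd=13.0):
    """P(margin > spread) under the make_synth() generator (hypothetical helper)."""
    mu = 0.75 * spread + 25 * off_diff_fav - 20 * off_diff_dog
    z = (mu - spread) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Tonight's features from the EPA cell: spread 7.5,
# BAL offense vs MIA defense = 0.00 - 0.11, MIA offense vs BAL defense = -0.01 - 0.13
p_cover_benchmark = synth_cover_prob(7.5, 0.00 - 0.11, -0.01 - 0.13)
p_cover_spread_only = synth_cover_prob(7.5, 0.0, 0.0)
print(round(p_cover_benchmark, 3), round(p_cover_spread_only, 3))
```

Model outputs far from this benchmark would suggest over- or under-fitting to the synthetic sample rather than a real signal.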

### Interpretation
- **Logit** supplies an interpretable linear baseline; **GBM** can capture non-linearity and feature interactions when the training data contains them.  
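- With the inputs above, the EPA offense‑vs‑defense differentials are −0.11 for Baltimore and −0.14 for Miami, a slight edge to Baltimore. Under the synthetic generator's coefficients this moves the expected margin by only about +0.05 points, so the **increase** in cover probability relative to spread alone is small.  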
- You can replace the synthetic training set with a real nflfastR extract to get empirical coefficients.
