# Ball Knower v1.1 - Calibration & Historical Backtest

## Overview

This notebook demonstrates the Ball Knower v1.1 calibrated spread model.

### Model Evolution:
- **v1.0**: Deterministic model with fixed weights  
  `spread = 0.02*nfelo_diff + 0.5*substack_diff + 35*epa_off_diff - 35*epa_def_diff`

- **v1.1**: Calibrated weights learned from historical Vegas lines  
  Uses ordinary least squares to find the weights that minimize squared error against Vegas lines
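
To make the v1.0 formula concrete, here is a minimal sketch that evaluates it for a single matchup. The input values are hypothetical, chosen only for illustration, and the function name `v1_0_spread` is not part of the `ball_knower` package.

```python
def v1_0_spread(nfelo_diff, substack_diff, epa_off_diff, epa_def_diff):
    """Evaluate the fixed-weight v1.0 formula for one game."""
    return (
        0.02 * nfelo_diff
        + 0.5 * substack_diff
        + 35.0 * epa_off_diff
        - 35.0 * epa_def_diff
    )

# Hypothetical rating differences for one game (illustrative values only)
example_spread = v1_0_spread(
    nfelo_diff=50.0,
    substack_diff=2.0,
    epa_off_diff=0.04,
    epa_def_diff=-0.02,
)
print(f"v1.0 spread: {example_spread:.2f}")  # 1.0 + 1.0 + 1.4 + 0.7 = 4.10
```

Because the EPA weights are so large, a difference of a few hundredths of a point of EPA moves the line about as much as a 50-point nfelo gap does in this example.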

### This Notebook:
1. Load historical weeks (weeks 1-10 of the 2025 season)
2. Calibrate model weights using all historical games
3. Evaluate correlation with Vegas lines
4. Calculate Mean Absolute Error (MAE)
5. Backtest Against-The-Spread (ATS) performance with edge thresholds

---

## Section 1: Setup & Imports

In [None]:
import sys
from pathlib import Path
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Add project root to path
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root))

# Import Ball Knower v1.1 calibration module
from ball_knower.models.v1_1_calibration import (
    calibrate_weights,
    prepare_training_matrix,
    build_week_lines_v1_1,
    load_schedule_data
)

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.precision', 2)

# Plot settings
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)

print("✓ Imports complete")

## Section 2: Configuration

Define the historical weeks to use for calibration.

In [None]:
# Training configuration
SEASON = 2025
TRAINING_WEEKS = list(range(1, 11))  # Weeks 1-10

print(f"Calibration Configuration:")
print(f"  Season: {SEASON}")
print(f"  Training Weeks: {TRAINING_WEEKS}")
print(f"  Total Weeks: {len(TRAINING_WEEKS)}")

## Section 3: Calibrate Model Weights

Use historical weeks to solve for optimal weights via ordinary least squares.

The model solves:  
`vegas_line ≈ w_nfelo * nfelo_diff + w_substack * substack_diff + w_epa_off * epa_off_diff + w_epa_def * epa_def_diff + bias`
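
The fit itself can be sketched with `numpy.linalg.lstsq`. This is a minimal illustration on synthetic data, not the implementation inside `calibrate_weights`, whose internals are not shown in this notebook. The feature scales, the "true" weights, and the helper name `fit_ols_weights` are assumptions made for the demo.

```python
import numpy as np

def fit_ols_weights(X, y):
    """Return least-squares weights for y ≈ X @ w + bias (bias is last)."""
    X_with_bias = np.column_stack([X, np.ones(len(X))])
    solution, _, _, _ = np.linalg.lstsq(X_with_bias, y, rcond=None)
    return solution

rng = np.random.default_rng(0)
n_games = 150

# Synthetic feature columns: nfelo_diff, substack_diff, epa_off_diff, epa_def_diff
X_demo = np.column_stack([
    rng.normal(0, 80, n_games),
    rng.normal(0, 4, n_games),
    rng.normal(0, 0.08, n_games),
    rng.normal(0, 0.08, n_games),
])

# Made-up weights (last entry is the bias) used to generate fake Vegas lines
true_w = np.array([0.03, 0.4, 30.0, -30.0, -1.5])
noise = rng.normal(0, 0.5, n_games)
y_demo = np.column_stack([X_demo, np.ones(n_games)]) @ true_w + noise

fitted_w = fit_ols_weights(X_demo, y_demo)
print(np.round(fitted_w, 3))
```

Appending a column of ones lets the bias be estimated as one more weight, which matches how the prediction cell in Section 4 builds `X_with_bias`.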

In [None]:
# Calibrate weights using historical data
weights = calibrate_weights(SEASON, TRAINING_WEEKS)

print("\nCalibrated Weights Summary:")
print(f"  nfelo:          {weights['weight_nfelo']:.4f}")
print(f"  substack:       {weights['weight_substack']:.4f}")
print(f"  epa_offensive:  {weights['weight_epa_off']:.4f}")
print(f"  epa_defensive:  {weights['weight_epa_def']:.4f}")
print(f"  bias:           {weights['bias']:.4f}")

### Comparison: v1.0 vs v1.1 Weights

Let's compare the calibrated weights with the v1.0 fixed weights.

In [None]:
# Fixed weights from v1.0
v1_0_weights = {
    'weight_nfelo': 0.02,
    'weight_substack': 0.5,
    'weight_epa_off': 35.0,
    'weight_epa_def': -35.0,
    'bias': 0.0
}

# Create comparison table
comparison = pd.DataFrame({
    'Component': ['nfelo', 'substack', 'epa_off', 'epa_def', 'bias'],
    'v1.0 (Fixed)': [
        v1_0_weights['weight_nfelo'],
        v1_0_weights['weight_substack'],
        v1_0_weights['weight_epa_off'],
        v1_0_weights['weight_epa_def'],
        v1_0_weights['bias']
    ],
    'v1.1 (Calibrated)': [
        weights['weight_nfelo'],
        weights['weight_substack'],
        weights['weight_epa_off'],
        weights['weight_epa_def'],
        weights['bias']
    ]
})

comparison['Difference'] = comparison['v1.1 (Calibrated)'] - comparison['v1.0 (Fixed)']
comparison['% Change'] = (comparison['Difference'] / comparison['v1.0 (Fixed)'].replace(0, np.nan)) * 100

print("\n" + "="*80)
print("WEIGHT COMPARISON: v1.0 vs v1.1")
print("="*80)
display(comparison)
print("="*80)

## Section 4: Generate Predictions on Training Data

Apply the calibrated weights to the same historical games used for calibration to evaluate in-sample fit.

In [None]:
# Load training matrix to get predictions
X, y, games_df = prepare_training_matrix(SEASON, TRAINING_WEEKS)

# Calculate v1.1 predictions
X_with_bias = np.column_stack([X, np.ones(len(X))])
w_vector = np.array([
    weights['weight_nfelo'],
    weights['weight_substack'],
    weights['weight_epa_off'],
    weights['weight_epa_def'],
    weights['bias']
])

games_df['bk_line_v1_1'] = X_with_bias @ w_vector

# Calculate v1.0 predictions
games_df['bk_line_v1_0'] = (
    0.02 * games_df['nfelo_diff'] +
    0.5 * games_df['substack_power_diff'] +
    35.0 * games_df['epa_off_diff'] +
    -35.0 * games_df['epa_def_diff']
)

print(f"\n✓ Generated predictions for {len(games_df)} games")
print(f"\nSample predictions:")
display(games_df[[
    'week', 'away_team', 'home_team', 
    'vegas_line', 'bk_line_v1_1', 'bk_line_v1_0'
]].head(10))

## Section 5: Correlation Analysis

Visualize how well our model predictions correlate with Vegas lines.
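
Correlation alone cannot show that the lines agree point for point, because Pearson correlation is unchanged by rescaling or shifting the predictions. The sketch below uses made-up numbers to show a model line that is perfectly correlated with Vegas yet several points off on average, which is why the error metrics in Section 6 are reported alongside it.

```python
import numpy as np

# Made-up Vegas lines for five games (illustrative only)
vegas_demo = np.array([-7.0, -3.0, 0.0, 3.0, 7.0])

# A model that doubles every line and adds a one-point shift
model_demo = 2.0 * vegas_demo + 1.0

corr_demo = np.corrcoef(vegas_demo, model_demo)[0, 1]
mae_demo = np.mean(np.abs(vegas_demo - model_demo))

print(f"correlation: {corr_demo:.4f}")
print(f"MAE:         {mae_demo:.2f} points")
```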

In [None]:
# Calculate correlations
corr_v1_1 = np.corrcoef(games_df['vegas_line'], games_df['bk_line_v1_1'])[0, 1]
corr_v1_0 = np.corrcoef(games_df['vegas_line'], games_df['bk_line_v1_0'])[0, 1]

# Create scatter plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# v1.1 plot
ax1.scatter(games_df['vegas_line'], games_df['bk_line_v1_1'], alpha=0.6, s=50)
ax1.plot([-15, 15], [-15, 15], 'r--', label='Perfect Agreement', linewidth=2)
ax1.set_xlabel('Vegas Line', fontsize=12)
ax1.set_ylabel('Ball Knower v1.1 Line', fontsize=12)
ax1.set_title(f'v1.1 Calibrated Model\nCorrelation: {corr_v1_1:.4f}', fontsize=14)
ax1.legend()
ax1.grid(True, alpha=0.3)

# v1.0 plot
ax2.scatter(games_df['vegas_line'], games_df['bk_line_v1_0'], alpha=0.6, s=50, color='orange')
ax2.plot([-15, 15], [-15, 15], 'r--', label='Perfect Agreement', linewidth=2)
ax2.set_xlabel('Vegas Line', fontsize=12)
ax2.set_ylabel('Ball Knower v1.0 Line', fontsize=12)
ax2.set_title(f'v1.0 Fixed Weights Model\nCorrelation: {corr_v1_0:.4f}', fontsize=14)
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"\nCorrelation with Vegas Lines:")
print(f"  v1.1 (Calibrated): {corr_v1_1:.4f}")
print(f"  v1.0 (Fixed):      {corr_v1_0:.4f}")
print(f"  Improvement:       {corr_v1_1 - corr_v1_0:+.4f}")

## Section 6: Mean Absolute Error (MAE)

Calculate prediction error relative to Vegas lines.
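
MAE and RMSE, both computed in the cell below, respond differently to large misses. As a small worked example with made-up errors, a single 8-point miss raises RMSE much more than MAE:

```python
import numpy as np

# Made-up differences between Vegas and model lines, in points
errors_demo = np.array([1.0, -1.0, 1.0, -1.0, 8.0])

mae_example = np.mean(np.abs(errors_demo))
rmse_example = np.sqrt(np.mean(errors_demo ** 2))

print(f"MAE:  {mae_example:.2f}")   # (1 + 1 + 1 + 1 + 8) / 5 = 2.40
print(f"RMSE: {rmse_example:.2f}")  # sqrt((1 + 1 + 1 + 1 + 64) / 5) = sqrt(13.6)
```

A large gap between the two metrics therefore suggests a few games where the model and Vegas disagree sharply, rather than a uniform offset.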

In [None]:
# Calculate MAE for both models
mae_v1_1 = np.mean(np.abs(games_df['vegas_line'] - games_df['bk_line_v1_1']))
mae_v1_0 = np.mean(np.abs(games_df['vegas_line'] - games_df['bk_line_v1_0']))

# Calculate RMSE
rmse_v1_1 = np.sqrt(np.mean((games_df['vegas_line'] - games_df['bk_line_v1_1']) ** 2))
rmse_v1_0 = np.sqrt(np.mean((games_df['vegas_line'] - games_df['bk_line_v1_0']) ** 2))

print("\n" + "="*70)
print("PREDICTION ERROR ANALYSIS")
print("="*70)
print(f"\nMean Absolute Error (MAE):")
print(f"  v1.1 (Calibrated): {mae_v1_1:.3f} points")
print(f"  v1.0 (Fixed):      {mae_v1_0:.3f} points")
print(f"  Improvement:       {mae_v1_0 - mae_v1_1:+.3f} points")

print(f"\nRoot Mean Squared Error (RMSE):")
print(f"  v1.1 (Calibrated): {rmse_v1_1:.3f} points")
print(f"  v1.0 (Fixed):      {rmse_v1_0:.3f} points")
print(f"  Improvement:       {rmse_v1_0 - rmse_v1_1:+.3f} points")
print("="*70)

## Section 7: Against-The-Spread (ATS) Backtest

Simulate a simple betting strategy:
- Only bet when our model disagrees with Vegas by at least X points (edge threshold)
- Evaluate win rate and W-L-P record at different thresholds: 1, 2, 3, 4 points
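
The grading rule behind this strategy can be sketched for a single game as follows. It assumes `vegas_line` and the model line are both quoted from the home team's perspective, with negative values meaning the home team is favored, which is the convention implied by the cover check `actual_margin + vegas_line > 0` used in this notebook. The function names `grade_bet` and `break_even_rate` and the sample numbers are hypothetical.

```python
def grade_bet(model_line, vegas_line, home_margin, threshold):
    """Grade one ATS bet as 'win', 'loss', 'push', or 'no bet'."""
    edge = model_line - vegas_line
    if abs(edge) < threshold:
        return "no bet"
    ats_margin = home_margin + vegas_line  # > 0 means the home team covered
    if ats_margin == 0:
        return "push"
    bet_home = edge < 0  # model rates the home team higher than Vegas does
    home_covered = ats_margin > 0
    return "win" if bet_home == home_covered else "loss"

def break_even_rate(american_odds=-110):
    """Win rate needed to break even at negative American odds."""
    risk = abs(american_odds)
    return risk / (risk + 100)

# Hypothetical games: (model_line, vegas_line, home_margin)
print(grade_bet(-6.0, -3.0, 7, threshold=2))   # bet home, home wins by 7
print(grade_bet(-6.0, -3.0, 3, threshold=2))   # home wins by exactly 3
print(grade_bet(-1.0, -3.0, 1, threshold=2))   # bet away, home wins by 1
print(grade_bet(-3.5, -3.0, 10, threshold=2))  # edge below the threshold
print(f"{break_even_rate():.3f}")
```

Checking for a push before comparing the bet side with the cover result keeps pushes out of both the win and loss columns, regardless of which side was bet.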

In [None]:
# Add actual game results (margin from home team perspective)
games_df['actual_margin'] = games_df['home_score'] - games_df['away_score']

# Filter for games with actual results
games_with_results = games_df[games_df['actual_margin'].notna()].copy()

print(f"Games with actual results: {len(games_with_results)} / {len(games_df)}")

if len(games_with_results) == 0:
    print("\n⚠️  No game results available yet for ATS backtest.")
    print("This section requires completed games with scores.")
else:
    # Calculate edge (model line - vegas line)
    games_with_results['edge_v1_1'] = games_with_results['bk_line_v1_1'] - games_with_results['vegas_line']
    games_with_results['edge_v1_0'] = games_with_results['bk_line_v1_0'] - games_with_results['vegas_line']
    
    # Determine if home team covered the spread
    # Home covers if: actual_margin + vegas_line > 0
    games_with_results['home_covered'] = (games_with_results['actual_margin'] + games_with_results['vegas_line']) > 0
    
    # Determine which side we would bet on based on edge
    # If edge < 0: our model is more favorable to home than Vegas → bet home
    # If edge > 0: our model is less favorable to home than Vegas → bet away
    games_with_results['bet_home_v1_1'] = games_with_results['edge_v1_1'] < 0
    games_with_results['bet_home_v1_0'] = games_with_results['edge_v1_0'] < 0
    
    # Edge thresholds to test
    EDGE_THRESHOLDS = [1, 2, 3, 4]
    
    # Results storage
    ats_results = []
    
    for threshold in EDGE_THRESHOLDS:
        # Filter games where we have at least 'threshold' edge
        bets_v1_1 = games_with_results[games_with_results['edge_v1_1'].abs() >= threshold].copy()
        bets_v1_0 = games_with_results[games_with_results['edge_v1_0'].abs() >= threshold].copy()
        
        # v1.1 results (a push is neither a win nor a loss)
        if len(bets_v1_1) > 0:
            bets_v1_1['bet_push'] = (bets_v1_1['actual_margin'] + bets_v1_1['vegas_line']) == 0
            bets_v1_1['bet_won'] = (
                (bets_v1_1['bet_home_v1_1'] == bets_v1_1['home_covered']) & ~bets_v1_1['bet_push']
            )

            wins_v1_1 = int(bets_v1_1['bet_won'].sum())
            losses_v1_1 = int((~bets_v1_1['bet_won'] & ~bets_v1_1['bet_push']).sum())
            pushes_v1_1 = int(bets_v1_1['bet_push'].sum())
            total_v1_1 = len(bets_v1_1)
            # Win rate over decided bets only, so pushes do not count against it
            decided_v1_1 = wins_v1_1 + losses_v1_1
            win_rate_v1_1 = wins_v1_1 / decided_v1_1 if decided_v1_1 > 0 else 0
        else:
            wins_v1_1 = losses_v1_1 = pushes_v1_1 = total_v1_1 = 0
            win_rate_v1_1 = 0

        # v1.0 results (same grading rules as v1.1)
        if len(bets_v1_0) > 0:
            bets_v1_0['bet_push'] = (bets_v1_0['actual_margin'] + bets_v1_0['vegas_line']) == 0
            bets_v1_0['bet_won'] = (
                (bets_v1_0['bet_home_v1_0'] == bets_v1_0['home_covered']) & ~bets_v1_0['bet_push']
            )

            wins_v1_0 = int(bets_v1_0['bet_won'].sum())
            losses_v1_0 = int((~bets_v1_0['bet_won'] & ~bets_v1_0['bet_push']).sum())
            pushes_v1_0 = int(bets_v1_0['bet_push'].sum())
            total_v1_0 = len(bets_v1_0)
            # Win rate over decided bets only, so pushes do not count against it
            decided_v1_0 = wins_v1_0 + losses_v1_0
            win_rate_v1_0 = wins_v1_0 / decided_v1_0 if decided_v1_0 > 0 else 0
        else:
            wins_v1_0 = losses_v1_0 = pushes_v1_0 = total_v1_0 = 0
            win_rate_v1_0 = 0
        
        ats_results.append({
            'Edge Threshold': f"{threshold}+ pts",
            'v1.1 Bets': total_v1_1,
            'v1.1 Record': f"{wins_v1_1}-{losses_v1_1}-{pushes_v1_1}",
            'v1.1 Win %': f"{win_rate_v1_1*100:.1f}%",
            'v1.0 Bets': total_v1_0,
            'v1.0 Record': f"{wins_v1_0}-{losses_v1_0}-{pushes_v1_0}",
            'v1.0 Win %': f"{win_rate_v1_0*100:.1f}%"
        })
    
    ats_df = pd.DataFrame(ats_results)
    
    print("\n" + "="*100)
    print("ATS BACKTEST RESULTS (Against-The-Spread Performance)")
    print("="*100)
    print("\nBetting Strategy: Only bet when model disagrees with Vegas by >= edge threshold")
    print("Record Format: Wins-Losses-Pushes\n")
    display(ats_df)
    print("="*100)
    print("\nNote: Win rate above 52.4% is profitable at -110 odds")
    print("="*100)

## Section 8: Summary & Key Insights

In [None]:
print("\n" + "="*80)
print("BALL KNOWER v1.1 - SUMMARY")
print("="*80)
print(f"\n📊 Training Data:")
print(f"   Season: {SEASON}")
print(f"   Weeks: {min(TRAINING_WEEKS)} - {max(TRAINING_WEEKS)}")
print(f"   Games: {len(games_df)}")

print(f"\n🎯 Calibrated Weights:")
print(f"   nfelo:          {weights['weight_nfelo']:>8.4f}")
print(f"   substack:       {weights['weight_substack']:>8.4f}")
print(f"   epa_offensive:  {weights['weight_epa_off']:>8.4f}")
print(f"   epa_defensive:  {weights['weight_epa_def']:>8.4f}")
print(f"   bias:           {weights['bias']:>8.4f}")

print(f"\n📈 Model Performance:")
print(f"   Correlation with Vegas: {corr_v1_1:.4f}")
print(f"   MAE:  {mae_v1_1:.3f} points")
print(f"   RMSE: {rmse_v1_1:.3f} points")

print(f"\n✅ Improvement over v1.0:")
print(f"   Correlation: {corr_v1_1 - corr_v1_0:+.4f}")
print(f"   MAE:         {mae_v1_0 - mae_v1_1:+.3f} points")
print(f"   RMSE:        {rmse_v1_0 - rmse_v1_1:+.3f} points")

print("\n" + "="*80)
print("✓ Analysis Complete")
print("="*80)

## Next Steps

1. **Apply to Future Weeks**: Use `build_week_lines_v1_1()` to generate predictions for upcoming games
2. **Expand Training Data**: Include more historical seasons for more robust calibration
3. **Feature Engineering**: Consider additional components (rest days, QB adjustments, weather)
4. **Model v1.2**: Add ML correction layer on top of v1.1 base predictions

---

**Model Formula (v1.1):**

```
spread = w₁·nfelo_diff + w₂·substack_diff + w₃·epa_off_diff + w₄·epa_def_diff + bias
```

Where weights are learned from historical Vegas lines via OLS regression.

---