# Residualization Against Betting Odds: Calibrated Correction Modeling

## Approach: Uplift Modeling

This notebook implements a **residual-learning** approach to upset prediction:

### Core Formula:
```
uplift = logit(P_actual) - logit(P_market)
```

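Where:
- **P_market** = Upset probability implied by the betting odds (the market's prediction)
- **P_actual** = Observed outcome (1 if the upset occurred, 0 otherwise); because logit(0) and logit(1) are infinite, this value has to be clipped or smoothed to lie strictly between 0 and 1 before the logit is taken
- **uplift** = The log-odds residual between outcome and market, capturing where the market was wrong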
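
The definitions above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function names `logit` and `uplift_target` and the clipping constant `eps` are assumptions introduced here. Clipping is one simple way to keep the target finite, since the logit of an exact 0 or 1 is infinite.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def uplift_target(outcome: int, p_market: float, eps: float = 0.01) -> float:
    """Log-odds residual between the clipped outcome and the market probability.

    eps is an assumed smoothing constant: the 0/1 outcome is mapped to
    eps or 1 - eps so that its logit stays finite.
    """
    p_actual = min(max(float(outcome), eps), 1.0 - eps)
    return logit(p_actual) - logit(p_market)


# Hypothetical match: the market gave the underdog a 25% chance.
print(uplift_target(1, 0.25))  # upset happened: positive residual
print(uplift_target(0, 0.25))  # no upset: negative residual
```

The size of `eps` sets how extreme the targets become, so it acts as a tuning choice rather than a fixed part of the method.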

### Why This Approach?

1. **Preserves market efficiency as a prior** ✓
   - Betting markets aggregate information from many bettors and bookmakers
   - We don't discard this valuable signal
   
2. **Quantifies contextual factors the market misses** ✓
   - Train model on non-odds features to predict uplift
   - Identifies systematic market biases
   
3. **Avoids data leakage** ✓
   - Model never takes odds as an input feature; odds enter only through the uplift target and the final combination step
   - Only uses pre-match observable features (rankings, stats, context)
   
4. **Interpretable insights** ✓
   - SHAP values reveal when/why market underestimates underdogs
   - Can identify surface-specific, player-style, or tournament biases

### Workflow:
1. Calculate market probabilities from betting odds
2. Compute uplift = logit(P_actual) - logit(P_market), with the 0/1 outcome clipped or smoothed so its logit is finite
3. Train model on engineered features to predict uplift
4. Combine: P_final = inverse_logit(logit(P_market) + uplift_predicted)
5. Evaluate whether P_final predicts upsets better than P_market alone
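
Steps 1, 4, and 5 of the workflow can be sketched as follows. This is a hedged illustration under stated assumptions: it assumes two-way decimal odds and rescales the raw implied probabilities so they sum to 1, which is one common way to strip the bookmaker margin (the notebook may use a different method). All function names and the example numbers are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def market_probability(odds_underdog: float, odds_favorite: float) -> float:
    """Step 1: underdog win probability implied by decimal odds.

    Assumption: raw implied probabilities (1 / odds) are rescaled to sum
    to 1, removing the bookmaker margin proportionally.
    """
    raw_underdog = 1.0 / odds_underdog
    raw_favorite = 1.0 / odds_favorite
    return raw_underdog / (raw_underdog + raw_favorite)


def combine(p_market: float, uplift_predicted: float) -> float:
    """Step 4: shift the market log-odds by the predicted uplift."""
    return inverse_logit(logit(p_market) + uplift_predicted)


def log_loss(outcomes, probs) -> float:
    """Step 5 metric: mean negative log-likelihood of 0/1 outcomes (lower is better)."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(outcomes)


# Hypothetical match: underdog at 4.00, favorite at 1.25 (decimal odds).
p_market = market_probability(4.00, 1.25)
# Hypothetical model output: a positive uplift raises the upset probability.
p_final = combine(p_market, 0.3)
print(p_market, p_final)
```

For step 5, calling `log_loss` once with the market probabilities and once with the combined probabilities on the same set of matches gives a direct comparison. A predicted uplift of zero leaves the market probability unchanged, so the market acts as the default prediction.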