# Advanced Predictions & Scenario Analysis

This notebook demonstrates advanced prediction workflows beyond basic match outcomes.

**Topics**:
1. Match prediction basics
2. Multi-match lookahead predictions
3. Team strength evolution within tournaments
4. Injury scenario modeling
5. Custom prediction workflows
6. Prediction calibration on held-out matches

In [None]:
from rugby_ranking.notebook_utils import setup_notebook_environment, load_model_and_trace
from rugby_ranking.model.predictions import MatchPredictor
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Setup
dataset, df, model_dir = setup_notebook_environment()

## 1. Match Prediction Basics

Load the model and make predictions for upcoming matches.

In [None]:
model, trace = load_model_and_trace("latest")

# Create predictor
predictor = MatchPredictor(model, trace, dataset)

# Predict a match between two teams
team_a = "Leinster"
team_b = "Munster"

prediction = predictor.predict_match(
    team_a=team_a,
    team_b=team_b,
    lineup_a=None,  # Team effects only (no lineup specified)
    lineup_b=None,
)

print(f"Match: {team_a} vs {team_b}")
print(f"  P(Team A wins): {prediction['p_team_a_wins']:.3f}")
print(f"  P(Team B wins): {prediction['p_team_b_wins']:.3f}")
print(f"  Expected scores: {team_a} {prediction['expected_score_a']:.1f} - {prediction['expected_score_b']:.1f} {team_b}")

## 2. Multi-Match Lookahead

Predict outcomes for a sequence of matches and compute path probabilities.

In [None]:
# Define a sequence of matches
matches = [
    {"team_a": "Leinster", "team_b": "Munster"},
    {"team_a": "Leinster", "team_b": "Ulster"},
    {"team_a": "Leinster", "team_b": "Connacht"},
]

# Predict each match and compute joint probabilities
match_predictions = []
for match in matches:
    pred = predictor.predict_match(
        team_a=match["team_a"],
        team_b=match["team_b"],
    )
    match_predictions.append({
        "matchup": f"{match['team_a']} vs {match['team_b']}",
        "p_team_a_wins": pred['p_team_a_wins'],
        "score_a": pred['expected_score_a'],
        "score_b": pred['expected_score_b'],
    })

predictions_df = pd.DataFrame(match_predictions)
print("Multi-Match Predictions:")
print(predictions_df)

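# Compute path probability (all wins).
# Multiplying per-match probabilities treats the matches as independent.
all_wins_prob = predictions_df['p_team_a_wins'].prod()
print(f"\nP({matches[0]['team_a']} wins all {len(matches)} matches): {all_wins_prob:.4f}")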
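
The all-wins product is one corner of a larger distribution. As a minimal sketch, assuming the matches are independent, the probability of every possible win count can be built by repeated convolution. The function name `win_count_distribution` and the example probabilities are illustrative, not part of `rugby_ranking` and not model output.

```python
import numpy as np


def win_count_distribution(win_probs):
    """Return P(exactly k wins) for k = 0..n, assuming independent matches."""
    dist = np.array([1.0])  # Before any match: zero wins with probability 1.
    for p in win_probs:
        # Convolve with the two-point distribution [P(loss), P(win)].
        dist = np.convolve(dist, [1.0 - p, p])
    return dist


# Illustrative probabilities only; in the notebook these would come from
# predictions_df['p_team_a_wins'].
example_probs = [0.70, 0.80, 0.90]
win_dist = win_count_distribution(example_probs)

for k, prob in enumerate(win_dist):
    print(f"P({k} wins) = {prob:.4f}")
```

The last entry equals the plain product of the win probabilities, so this generalizes the all-wins calculation rather than replacing it.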

## 3. Tournament Strength Evolution

Analyze how a team's strength changes over the course of a season.

In [None]:
# Get matches for a specific team over a season
season = "2024"
team = "Leinster"
team_matches = df[(df['team'] == team) & (df['season'] == season)].copy()
team_matches = team_matches.sort_values('date')

print(f"{team} matches in {season}: {len(team_matches)}")
print(f"Date range: {team_matches['date'].min().date()} to {team_matches['date'].max().date()}\n")

# If time-varying effects are available, extract team trend
if hasattr(model, 'config') and getattr(model.config, 'time_varying_effects', False):
    print("Time-varying effects available (form changes captured in posterior)")
    # TODO: extract this team's trend from the trace here
else:
    print("Note: Static model (no within-season form changes)")
    print("Consider training with time_varying_effects=True for form evolution")
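
Once a trend has been pulled out of the trace, it can be summarized as a posterior mean with a percentile band per time step. The sketch below assumes the samples arrive as a NumPy array of shape `(n_draws, n_steps)`. That layout, the helper name `summarize_trend`, and the synthetic random-walk data are all assumptions for illustration, since how the model stores time-varying effects is not shown here.

```python
import numpy as np


def summarize_trend(samples, lower=3.0, upper=97.0):
    """Posterior mean and percentile band for each time step.

    samples: array-like of shape (n_draws, n_steps).
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=0)
    lo, hi = np.percentile(samples, [lower, upper], axis=0)
    return mean, lo, hi


# Synthetic stand-in for posterior draws of one team's effect over 10 rounds
# (a random walk). Real draws would come from the fitted trace.
rng = np.random.default_rng(0)
fake_samples = 0.5 + np.cumsum(rng.normal(0.0, 0.05, size=(2000, 10)), axis=1)

trend_mean, trend_lo, trend_hi = summarize_trend(fake_samples)
for step, (m, l, h) in enumerate(zip(trend_mean, trend_lo, trend_hi), start=1):
    print(f"Round {step}: mean={m:.3f}, 94% band=({l:.3f}, {h:.3f})")
```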

## 4. Injury Scenario Modeling

Simulate the impact of a player's unavailability on match predictions.

In [None]:
# Example: Predict match with and without a key player

def predict_with_injury(team_a, team_b, injured_player=None, dataset=None, predictor=None):
    """
    Predict match assuming a player is injured (unavailable).
    
    Parameters:
    - team_a, team_b: Team names
    - injured_player: Player name to exclude
    - dataset: MatchDataset instance
    - predictor: MatchPredictor instance
    
    Returns:
    - None for now (placeholder). Once implemented, a prediction dict
      for the injury scenario.
    """
    # TODO: Implement lineup-based prediction
    # This would:
    # 1. Get typical lineup for team_a
    # 2. Remove injured_player from lineup
    # 3. Optionally add replacement player (if known)
    # 4. Call predictor.predict_match(lineup_a=modified_lineup, lineup_b=...)
    # 5. Compare to baseline prediction
    
    print(f"Injury scenario: {injured_player} unavailable for {team_a}")
    print("(Implementation pending: requires lineup-based prediction)")
    return None

# Example call
predict_with_injury("Leinster", "Munster", injured_player="Johnny Sexton", 
                   dataset=dataset, predictor=predictor)
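
Steps 2 and 3 of the TODO (remove the injured player, optionally add a replacement) can be sketched independently of the predictor. This assumes a lineup is an ordered list of player names, which may not match the format `predict_match` actually expects. The helper `apply_injury` and the placeholder names are hypothetical.

```python
def apply_injury(lineup, injured_player, replacement=None):
    """Return a new lineup with injured_player removed or swapped out.

    The input list is left unchanged so the baseline lineup can still be
    used for comparison.
    """
    if injured_player not in lineup:
        raise ValueError(f"{injured_player!r} is not in the lineup")
    if replacement is None:
        # No known replacement: drop the player.
        return [p for p in lineup if p != injured_player]
    # Keep positional order by substituting in place.
    return [replacement if p == injured_player else p for p in lineup]


# Placeholder names, not real squad data.
baseline_lineup = ["Player A", "Player B", "Player C"]
modified_lineup = apply_injury(baseline_lineup, "Player B", replacement="Player D")
print(modified_lineup)
```

If the predictor accepts lineups in this form, the modified list would be passed as `lineup_a` and the result compared against the baseline prediction.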

## 5. Custom Prediction Workflows

Build custom analyses combining multiple predictions.

In [None]:
# Example: Find most impactful matches for a team's tournament placement

def analyze_tournament_impact(team, remaining_matches, predictor):
    """
    Analyze impact of each remaining match on final tournament standing.
    
    Parameters:
    - team: Team name
    - remaining_matches: List of dicts with 'opponent' and 'date'
    - predictor: MatchPredictor instance
    
    Goal: ΔP(top 4) for each match outcome.
    Currently only prints P(win) for each remaining match.
    """
    print(f"Tournament impact analysis for {team}:")
    print(f"Remaining matches: {len(remaining_matches)}\n")
    
    for i, match in enumerate(remaining_matches, 1):
        opponent = match.get('opponent', '?')
        prediction = predictor.predict_match(team, opponent)
        print(f"{i}. vs {opponent}: P(win) = {prediction['p_team_a_wins']:.3f}")
    
    print("\n(Full impact calculation requires season simulator)")

# Example
remaining = [
    {"opponent": "Munster", "date": "2026-02-01"},
    {"opponent": "Ulster", "date": "2026-02-08"},
    {"opponent": "Connacht", "date": "2026-02-15"},
]
analyze_tournament_impact("Leinster", remaining, predictor)
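
Until a season simulator is available, a rough proxy for match impact can be computed from win probabilities alone. The sketch below replaces "finishes in the top 4" with "wins at least `wins_needed` of the remaining matches" and assumes independent matches. Both are simplifying assumptions, and `match_leverage` is a hypothetical helper, not part of `rugby_ranking`. For each match it reports P(target | win) minus P(target | loss).

```python
import numpy as np


def wins_distribution(win_probs):
    """P(exactly k wins) for k = 0..n, assuming independent matches."""
    dist = np.array([1.0])
    for p in win_probs:
        dist = np.convolve(dist, [1.0 - p, p])
    return dist


def match_leverage(win_probs, wins_needed):
    """Change in P(at least wins_needed wins) from winning vs losing each match."""
    win_probs = list(win_probs)
    leverage = []
    for i in range(len(win_probs)):
        # Distribution of wins across all the other matches.
        others = wins_distribution(win_probs[:i] + win_probs[i + 1:])
        # If match i is won, one fewer win is needed from the others.
        p_if_win = others[max(wins_needed - 1, 0):].sum()
        p_if_loss = others[max(wins_needed, 0):].sum()
        leverage.append(float(p_if_win - p_if_loss))
    return leverage


# Illustrative probabilities only, not model output.
example_probs = [0.70, 0.80, 0.90]
leverage = match_leverage(example_probs, wins_needed=2)
for i, delta in enumerate(leverage, start=1):
    print(f"Match {i}: delta P(target) = {delta:.3f}")
```

A true ΔP(top 4) also depends on other teams' results and bonus points, which this proxy ignores.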

## 6. Calibration & Validation

Evaluate prediction accuracy on held-out data.

In [None]:
# TODO: Implement calibration analysis
# This would:
# 1. Split data into train/test (prefer a temporal split so no future
#    matches leak into training)
# 2. Train model on train set
# 3. Make predictions for test set matches
# 4. Compute metrics:
#    - Calibration (predicted probs vs actual outcomes)
#    - Log-likelihood
#    - Brier score
#    - AUC-ROC
# 5. Generate calibration plot
# 6. Show prediction-interval coverage

print("Calibration analysis structure:")
print("- Temporal split: Use first 80% of matches for training")
print("- Make predictions for remaining 20%")
print("- Compute: P(outcome), log-likelihood, calibration metrics")
print("- Generate calibration plot (predicted prob vs actual frequency)")
print("\n(Implementation pending in validation.py module)")
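
The metrics in step 4 can be sketched with NumPy alone, given predicted win probabilities and binary outcomes for the held-out matches. The function names and the toy data below are illustrative and are not the pending `validation.py` implementation.

```python
import numpy as np


def brier_score(p, y):
    """Mean squared difference between predicted probability and outcome."""
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    return float(np.mean((p - y) ** 2))


def log_loss(p, y, eps=1e-12):
    """Mean negative log-likelihood of binary outcomes (lower is better)."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y, dtype=float)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def calibration_table(p, y, n_bins=5):
    """Rows of (bin index, count, mean predicted prob, observed win rate)."""
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Use interior edges so that p == 1.0 lands in the last bin.
    bin_idx = np.digitize(p, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():  # Skip empty bins.
            rows.append((b, int(mask.sum()), float(p[mask].mean()), float(y[mask].mean())))
    return rows


# Toy held-out predictions and outcomes (1 = team A won), for illustration only.
pred_probs = [0.9, 0.8, 0.3, 0.6, 0.1]
outcomes = [1, 1, 0, 0, 0]

print(f"Brier score: {brier_score(pred_probs, outcomes):.3f}")
print(f"Log loss:    {log_loss(pred_probs, outcomes):.3f}")
for b, n, mean_p, win_rate in calibration_table(pred_probs, outcomes):
    print(f"bin {b}: n={n}, mean predicted={mean_p:.2f}, observed={win_rate:.2f}")
```

A well-calibrated model shows mean predicted probability close to the observed win rate in each bin. With real data, each bin needs enough matches for the comparison to be meaningful.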