# Stage 8a — Multi-GW Belief Rollup

## Why Multi-GW Rollup?

Single-GW decisions (captaincy, transfer-IN) showed that availability adjustment (`p_play × mu_points`) does **not** improve ranking outcomes. The top-ranked players by `mu_points` are already nailed-on starters, so multiplying by `p_play ≈ 0.95` scales their scores down by a near-identical factor and leaves the ranking essentially unchanged.

But what happens over **multiple gameweeks**?

If a player has `p_play = 0.90` in each GW, and appearances are treated as independent across GWs, then over 5 GWs the probability of playing every match is:

$$P(\text{plays all 5}) = 0.90^5 \approx 0.59$$

This suggests that **availability risk compounds over time**. A player who is slightly rotation-prone may miss 1-2 GWs in a 5-GW window, significantly reducing their actual returns.
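
The compounding argument above can be checked numerically. The sketch below assumes a constant per-GW `p_play` and independence across GWs, the same simplification used in the formula; `p_plays_all` is a hypothetical helper name introduced only for illustration.

```python
def p_plays_all(p_play: float, horizon: int) -> float:
    """P(player features in every GW), assuming independent GWs with constant p_play."""
    return p_play ** horizon


# Illustrative value from the text: p_play = 0.90 in every GW.
for horizon in range(1, 6):
    p_all = p_plays_all(0.90, horizon)
    print(f"H={horizon}: P(plays all)={p_all:.2f}, P(misses at least one)={1 - p_all:.2f}")
```

At `H=5` the first figure is about 0.59, matching the calculation above.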

Stage 8a computes multi-GW belief rollups to test whether availability adjustment becomes valuable over longer horizons.

## Rollup Definitions

For each `(player_id, gw_start, horizon H)`:

| Metric | Formula | Meaning |
|--------|---------|--------|
| `cum_mu_points` | Σ mu_points | Total expected points if the player plays every GW |
| `cum_ev` | Σ (p_play × mu_points) | Availability-adjusted expected value |
| `cum_play_prob` | Π p_play | Probability of playing all GWs |
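
As a rough illustration of the table, the three metrics could be computed from per-GW beliefs as sketched below. This is not necessarily how `beliefs_multigw.csv` was produced: the `gw` column name, the window convention `gw_start .. gw_start + H - 1`, the helper name `rollup`, and the toy values are all assumptions made for the example.

```python
import pandas as pd


def rollup(beliefs: pd.DataFrame, gw_start: int, horizon: int) -> pd.DataFrame:
    """Roll per-GW beliefs up over GWs gw_start .. gw_start + horizon - 1 (assumed window)."""
    in_window = (beliefs["gw"] >= gw_start) & (beliefs["gw"] < gw_start + horizon)
    window = beliefs[in_window].copy()
    # Per-GW availability-adjusted expected value.
    window["ev"] = window["p_play"] * window["mu_points"]
    out = window.groupby("player_id").agg(
        cum_mu_points=("mu_points", "sum"),  # Σ mu_points
        cum_ev=("ev", "sum"),                # Σ (p_play × mu_points)
        cum_play_prob=("p_play", "prod"),    # Π p_play
    ).reset_index()
    out.insert(1, "gw_start", gw_start)
    out.insert(2, "horizon", horizon)
    return out


# Toy per-GW beliefs for two hypothetical players.
toy = pd.DataFrame({
    "player_id": [1, 1, 1, 2, 2, 2],
    "gw": [5, 6, 7, 5, 6, 7],
    "p_play": [0.9, 0.9, 0.9, 1.0, 0.5, 1.0],
    "mu_points": [6.0, 5.0, 4.0, 3.0, 3.0, 3.0],
})
print(rollup(toy, gw_start=5, horizon=3).to_string(index=False))
```

Note the difference in how a single doubtful GW affects the metrics: player 2's one `p_play = 0.5` halves `cum_play_prob`, but only removes half of that single GW's points from `cum_ev`.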

In [None]:
import pandas as pd
import numpy as np

# Load rollup data
rollups = pd.read_csv("../storage/datasets/beliefs_multigw.csv")
beliefs = pd.read_csv("../storage/datasets/beliefs.csv")

print(f"Rollup rows: {len(rollups):,}")
print(f"Horizons: {sorted(rollups['horizon'].unique())}")
print(f"Players: {rollups['player_id'].nunique():,}")

## Example: One Player Across Horizons

Let's look at how beliefs accumulate for a single player across different horizons.

In [None]:
# Find a player with high mu_points to use as example
avg_mu = beliefs.groupby("player_id")["mu_points"].mean().sort_values(ascending=False)
example_player = avg_mu.index[0]

# Identify the player by ID (no player-name lookup is performed here)
print(f"Example player ID: {example_player}")
print(f"Average mu_points: {avg_mu.iloc[0]:.2f}")
print()

# Show rollup for this player starting from GW 5
gw_start = 5
player_rollup = rollups[
    (rollups["player_id"] == example_player) & 
    (rollups["gw_start"] == gw_start)
].sort_values("horizon")

print(f"Player {example_player} rollup from GW {gw_start}:")
print()
print(player_rollup[["horizon", "cum_mu_points", "cum_ev", "cum_play_prob"]].to_string(index=False))

## Summary: Average Rollups by Horizon

In [None]:
# Compute averages by horizon
summary = rollups.groupby("horizon").agg(
    avg_cum_mu_points=("cum_mu_points", "mean"),
    avg_cum_ev=("cum_ev", "mean"),
    avg_cum_play_prob=("cum_play_prob", "mean"),
    n_rows=("player_id", "count"),
).reset_index()

# Add EV ratio: share of raw expected points retained after availability
# adjustment (1 - ev_ratio is the share "lost" to availability)
summary["ev_ratio"] = summary["avg_cum_ev"] / summary["avg_cum_mu_points"]

print("Average Rollups by Horizon")
print("=" * 70)
print(summary.to_string(index=False))

## Interpretation

### Key Observations

1. **cum_play_prob decays rapidly**: The average probability of playing all GWs drops to ~17-30% even over short horizons. Each additional GW multiplies in another factor no greater than 1, and the player pool includes many rotation-prone players with low per-GW `p_play`.

2. **cum_ev ≤ cum_mu_points**: Because every `p_play ≤ 1`, the availability-adjusted expected value never exceeds the raw sum (for non-negative `mu_points`). The gap reflects the risk of missing games.

3. **EV ratio decreases with horizon**: Over longer horizons, a larger fraction of expected value is "lost" to availability risk.
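
To make the relationship between these quantities concrete, the sketch below evaluates the rollup definitions for one player over a 5-GW window. The per-GW `p_play` and `mu_points` values are hypothetical, chosen only to illustrate the arithmetic.

```python
import numpy as np

# Hypothetical per-GW beliefs for one player over a 5-GW window.
p_play = np.array([0.90, 0.80, 0.95, 0.85, 0.90])
mu_points = np.array([5.0, 4.0, 6.0, 3.0, 5.0])

cum_mu_points = mu_points.sum()
cum_ev = (p_play * mu_points).sum()
cum_play_prob = p_play.prod()

# cum_ev / cum_mu_points is exactly the mu_points-weighted mean of p_play.
ev_ratio = cum_ev / cum_mu_points
weighted_mean_p = np.average(p_play, weights=mu_points)

print(f"cum_mu_points   = {cum_mu_points:.2f}")
print(f"cum_ev          = {cum_ev:.2f}")
print(f"cum_play_prob   = {cum_play_prob:.3f}")
print(f"ev_ratio        = {ev_ratio:.3f}")
print(f"weighted mean p = {weighted_mean_p:.3f}")
```

`cum_play_prob` is a product, so it falls below the smallest per-GW `p_play`, whereas `ev_ratio` is a weighted mean and stays within the range of the per-GW values. The definitions alone therefore do not force `ev_ratio` to fall with horizon; any decline in the summary table comes from the data.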

### What This Enables

Stage 8b can now evaluate:

> "Does ranking players by `cum_ev` outperform ranking by `cum_mu_points` over multi-GW horizons?"

If availability compounds meaningfully, we expect `cum_ev` to produce better rankings for hold-length decisions (e.g., "which player should I transfer in and keep for 5 GWs?").
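
Before scoring anything against realised points, a cheap preliminary check is whether the two rankings differ at all for a given `(gw_start, horizon)`. The sketch below is not the Stage 8b evaluation; `top_n_overlap` is a hypothetical helper, and the toy frame only mimics the rollup columns used in this notebook.

```python
import pandas as pd


def top_n_overlap(rollups: pd.DataFrame, gw_start: int, horizon: int, n: int = 10) -> float:
    """Share of the top-n players by cum_mu_points that are also top-n by cum_ev."""
    pool = rollups[(rollups["gw_start"] == gw_start) & (rollups["horizon"] == horizon)]
    top_mu = set(pool.nlargest(n, "cum_mu_points")["player_id"])
    top_ev = set(pool.nlargest(n, "cum_ev")["player_id"])
    # Guard against an empty pool to avoid division by zero.
    return len(top_mu & top_ev) / max(len(top_mu), 1)


# Toy rollup rows: player 1 has the highest raw sum but a large availability haircut.
toy_rollups = pd.DataFrame({
    "player_id": [1, 2, 3, 4],
    "gw_start": [5, 5, 5, 5],
    "horizon": [5, 5, 5, 5],
    "cum_mu_points": [30.0, 28.0, 25.0, 10.0],
    "cum_ev": [20.0, 27.0, 24.0, 9.0],
})
overlap = top_n_overlap(toy_rollups, gw_start=5, horizon=5, n=2)
print(f"Top-2 overlap: {overlap:.2f}")
```

An overlap close to 1 would mean the two metrics select nearly the same players, in which case `cum_ev` has little room to improve on `cum_mu_points` regardless of how outcomes turn out.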

### Caveats

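- Beliefs are treated as independent across GWs (no autocorrelation), so a player who misses one GW is assumed no more likely to miss the next
- No time decay or per-GW weighting is applied; every GW in the window counts equally
- Squad constraints are not considered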

This stage is a pure belief transformation. Evaluation happens in Stage 8b.