# 🧪 QEPC Sandbox v2

A safe playground to experiment with the QEPC NBA engine:

1. Environment & project setup  
2. (Optional) System diagnostics  
3. Load the NBA schedule  
4. Select games to model  
5. Build advanced team strengths  
6. (Optional) Apply injury overrides  
7. Compute λ (expected points)  
8. Run Poisson simulations and view results  
9. (Optional) Interactive widgets  
10. Data-driven injury impact from PlayerStatistics


## 1. Environment & Project Setup


In [1]:
import sys
from pathlib import Path

# Try to detect the QEPC project root
cwd = Path.cwd()
candidate_roots = [cwd, cwd.parent, cwd.parent.parent]

project_root = None
for cand in candidate_roots:
    if (cand / "qepc").is_dir() and (cand / "qepc_autoload.py").exists():
        project_root = cand
        break

if project_root is None:
    project_root = cwd

if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

print("📁 QEPC project root set to:", project_root)

try:
    from notebook_context import *
    print("✅ notebook_context imported.")
except ImportError:
    print("ℹ️ notebook_context not found; continuing without it.")

try:
    import qepc_autoload as qa
    print("✅ qepc_autoload imported as qa.")
except Exception as e:
    print("❌ Error importing qepc_autoload:", e)


📁 QEPC project root set to: C:\Users\wdors\qepc_project
[QEPC Paths] Project Root set: C:\Users\wdors\qepc_project
[QEPC] Autoload complete.
[QEPC] Root Shim Restored. Forwarding to qepc.autoload...
✅ notebook_context imported.
✅ qepc_autoload imported as qa.
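
The root detection in the cell above can also be written as a small reusable function. This is a minimal sketch, not part of QEPC: `find_project_root` and its `max_levels` parameter are hypothetical names, and it assumes the same two markers the cell checks (a `qepc/` directory next to `qepc_autoload.py`).

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, max_levels: int = 2) -> Optional[Path]:
    """Check `start` and up to `max_levels` parents for the QEPC markers.

    Hypothetical helper: mirrors the candidate list [cwd, parent, grandparent]
    used in the notebook cell, but returns None instead of silently
    falling back to the current directory.
    """
    candidates = [start, *list(start.parents)[:max_levels]]
    for cand in candidates:
        if (cand / "qepc").is_dir() and (cand / "qepc_autoload.py").exists():
            return cand
    return None  # the caller decides whether a fallback is acceptable


root = find_project_root(Path.cwd())
print("Detected root:", root if root is not None else "not found")
```

Returning `None` rather than defaulting to the working directory makes a misplaced notebook visible immediately, instead of surfacing later as an import error.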


## 2. System Diagnostics (Optional)


In [2]:
# === QEPC Sandbox: System Diagnostics (Optional) ===

from qepc.utils.diagnostics import run_system_check

diagnostic_report = run_system_check()
diagnostic_report  # Shows root/files/modules info as a dict


🚀 QEPC SYSTEM DIAGNOSTICS INITIALIZED...

✅ Project Root: OK – Resolved to C:\Users\wdors\qepc_project

🔍 Checking required files...
✅ Canonical Schedule: OK – C:\Users\wdors\qepc_project\data\Games.csv
✅ Raw Player Stats: OK – C:\Users\wdors\qepc_project\data\raw\PlayerStatistics.csv
✅ Raw Team Stats: OK – C:\Users\wdors\qepc_project\data\raw\TeamStatistics.csv
✅ Autoload Context: OK – C:\Users\wdors\qepc_project\qepc_autoload.py
✅ Restore Guide (Notebook): OK – C:\Users\wdors\qepc_project\RESTORE_GUIDE.ipynb
✅ Restore Guide (Markdown): OK – C:\Users\wdors\qepc_project\notebooks\RESTORE_GUIDE.md


Check,Status,Details
Canonical Schedule,OK,C:\Users\wdors\qepc_project\data\Games.csv
Raw Player Stats,OK,C:\Users\wdors\qepc_project\data\raw\PlayerStatistics.csv
Raw Team Stats,OK,C:\Users\wdors\qepc_project\data\raw\TeamStatistics.csv
Autoload Context,OK,C:\Users\wdors\qepc_project\qepc_autoload.py
Restore Guide (Notebook),OK,C:\Users\wdors\qepc_project\RESTORE_GUIDE.ipynb
Restore Guide (Markdown),OK,C:\Users\wdors\qepc_project\notebooks\RESTORE_GUIDE.md



📊 Checking data schemas (where files exist)...
✅ Schema: data/Games.csv: OK – All expected columns present.
✅ Schema: data/Team_Stats.csv: OK – All expected columns present.
✅ Schema: data/raw/PlayerStatistics.csv: OK – All expected columns present.
✅ Schema: data/raw/TeamStatistics.csv: OK – All expected columns present.


Check,Status,Details
data/Games.csv,OK,All expected columns present.
data/Team_Stats.csv,OK,All expected columns present.
data/raw/PlayerStatistics.csv,OK,All expected columns present.
data/raw/TeamStatistics.csv,OK,All expected columns present.



🧪 Checking key module imports...
✅ Module: qepc.autoload.paths: OK – Loaded
✅ Module: qepc.core.lambda_engine: OK – Loaded
✅ Module: qepc.core.simulator: OK – Loaded
✅ Module: qepc.sports.nba.sim: OK – Loaded
✅ Module: qepc.sports.nba.strengths_v2: OK – Loaded
✅ Module: qepc.sports.nba.player_data: OK – Loaded
✅ Module: qepc.sports.nba.opponent_data: OK – Loaded
✅ Module: qepc.utils.backup: OK – Loaded
✅ Module: qepc.backtest.backtest_engine: OK – Loaded


Check,Status,Details
qepc.autoload.paths,OK,Loaded successfully
qepc.core.lambda_engine,OK,Loaded successfully
qepc.core.simulator,OK,Loaded successfully
qepc.sports.nba.sim,OK,Loaded successfully
qepc.sports.nba.strengths_v2,OK,Loaded successfully
qepc.sports.nba.player_data,OK,Loaded successfully
qepc.sports.nba.opponent_data,OK,Loaded successfully
qepc.utils.backup,OK,Loaded successfully
qepc.backtest.backtest_engine,OK,Loaded successfully



✨ DIAGNOSTICS COMPLETE.


{'project_root': 'C:\\Users\\wdors\\qepc_project',
 'files': [('Canonical Schedule',
   'OK',
   'C:\\Users\\wdors\\qepc_project\\data\\Games.csv'),
  ('Raw Player Stats',
   'OK',
   'C:\\Users\\wdors\\qepc_project\\data\\raw\\PlayerStatistics.csv'),
  ('Raw Team Stats',
   'OK',
   'C:\\Users\\wdors\\qepc_project\\data\\raw\\TeamStatistics.csv'),
  ('Autoload Context',
   'OK',
   'C:\\Users\\wdors\\qepc_project\\qepc_autoload.py'),
  ('Restore Guide (Notebook)',
   'OK',
   'C:\\Users\\wdors\\qepc_project\\RESTORE_GUIDE.ipynb'),
  ('Restore Guide (Markdown)',
   'OK',
   'C:\\Users\\wdors\\qepc_project\\notebooks\\RESTORE_GUIDE.md')],
 'schemas': [('data/Games.csv', 'OK', 'All expected columns present.'),
  ('data/Team_Stats.csv', 'OK', 'All expected columns present.'),
  ('data/raw/PlayerStatistics.csv', 'OK', 'All expected columns present.'),
  ('data/raw/TeamStatistics.csv', 'OK', 'All expected columns present.')],
 'modules': [('qepc.autoload.paths', 'OK', 'Loaded successfully')
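
Since the report is a plain dict of `(name, status, details)` tuples, failing checks can be pulled out programmatically. The sketch below assumes only the structure visible in the output above; `failed_checks` is a hypothetical helper, and the `"MISSING"` status in the sample is made up for illustration.

```python
from typing import Dict, List, Tuple


def failed_checks(report: Dict[str, object]) -> List[Tuple[str, str, str, str]]:
    """Return (section, name, status, details) for every entry that is not OK."""
    failures = []
    for section in ("files", "schemas", "modules"):
        for name, status, details in report.get(section, []):
            if status != "OK":
                failures.append((section, name, status, details))
    return failures


# Illustrative report; the failing entry and its status string are invented.
sample_report = {
    "project_root": "/path/to/qepc_project",
    "files": [("Canonical Schedule", "OK", "data/Games.csv")],
    "schemas": [("data/Team_Stats.csv", "MISSING", "File not found")],
    "modules": [("qepc.core.simulator", "OK", "Loaded successfully")],
}

for section, name, status, details in failed_checks(sample_report):
    print(f"[{section}] {name}: {status} ({details})")
```

In the notebook this would be called as `failed_checks(diagnostic_report)`; an empty list means every check passed.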

## 3. Load NBA Schedule


In [3]:
# === QEPC Sandbox: Load NBA Schedule ===

import qepc_autoload as qa

schedule = qa.load_nba_schedule()
print("Number of games in schedule:", len(schedule))
schedule.head()


[QEPC NBA Sim] Successfully loaded and parsed 771 games from original format.
Number of games in schedule: 771


Unnamed: 0,Date,Time,Away Team,Home Team,Venue,Notes,gameDate
0,10/21/2025,7:30 PM,Houston Rockets,Oklahoma City Thunder,Paycom Center,Regular Season,2025-10-21 19:30:00
1,10/21/2025,10:00 PM,Golden State Warriors,Los Angeles Lakers,Crypto.com Arena,Regular Season,2025-10-21 22:00:00
2,10/22/2025,7:00 PM,Brooklyn Nets,Charlotte Hornets,Spectrum Center,Regular Season,2025-10-22 19:00:00
3,10/22/2025,7:00 PM,Cleveland Cavaliers,New York Knicks,Madison Square Garden,Regular Season,2025-10-22 19:00:00
4,10/22/2025,7:00 PM,Miami Heat,Orlando Magic,Kia Center,Regular Season,2025-10-22 19:00:00
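
The `gameDate` column combines the `Date` and `Time` strings into one timestamp. How `load_nba_schedule` does this internally is not shown here; the sketch below is one way to reproduce the column with pandas, assuming the `MM/DD/YYYY` and 12-hour `H:MM AM/PM` formats seen in the output.

```python
import pandas as pd

# Two rows copied from the schedule preview above.
sample = pd.DataFrame(
    {
        "Date": ["10/21/2025", "10/21/2025"],
        "Time": ["7:30 PM", "10:00 PM"],
        "Away Team": ["Houston Rockets", "Golden State Warriors"],
        "Home Team": ["Oklahoma City Thunder", "Los Angeles Lakers"],
    }
)

# An explicit format avoids day/month ambiguity and fails loudly on bad rows.
sample["gameDate"] = pd.to_datetime(
    sample["Date"] + " " + sample["Time"],
    format="%m/%d/%Y %I:%M %p",
)
print(sample[["Away Team", "Home Team", "gameDate"]])
```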


## 4. Select Games to Model


In [4]:
# === QEPC Sandbox: Select Games to Model ===

# Option A: first 4 games (e.g., opening night)
games_to_model = schedule.head(4).copy()

print("Using these games:")
games_to_model


Using these games:


Unnamed: 0,Date,Time,Away Team,Home Team,Venue,Notes,gameDate
0,10/21/2025,7:30 PM,Houston Rockets,Oklahoma City Thunder,Paycom Center,Regular Season,2025-10-21 19:30:00
1,10/21/2025,10:00 PM,Golden State Warriors,Los Angeles Lakers,Crypto.com Arena,Regular Season,2025-10-21 22:00:00
2,10/22/2025,7:00 PM,Brooklyn Nets,Charlotte Hornets,Spectrum Center,Regular Season,2025-10-22 19:00:00
3,10/22/2025,7:00 PM,Cleveland Cavaliers,New York Knicks,Madison Square Garden,Regular Season,2025-10-22 19:00:00
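
Option A takes the first four rows, which here span two calendar days. A possible alternative is to select by date. The helper below is a sketch with a hypothetical name (`games_on`); it assumes the schedule has a `gameDate` column that pandas can parse.

```python
import pandas as pd


def games_on(schedule: pd.DataFrame, day: str) -> pd.DataFrame:
    """Return a copy of the games whose `gameDate` falls on `day` (YYYY-MM-DD)."""
    game_dt = pd.to_datetime(schedule["gameDate"])
    return schedule.loc[game_dt.dt.normalize() == pd.Timestamp(day)].copy()


# Three rows taken from the schedule preview, reduced to the relevant columns.
schedule_demo = pd.DataFrame(
    {
        "Away Team": ["Houston Rockets", "Golden State Warriors", "Brooklyn Nets"],
        "Home Team": ["Oklahoma City Thunder", "Los Angeles Lakers", "Charlotte Hornets"],
        "gameDate": ["2025-10-21 19:30:00", "2025-10-21 22:00:00", "2025-10-22 19:00:00"],
    }
)

opening_night = games_on(schedule_demo, "2025-10-21")
print(opening_night)
```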


## 5. Build Advanced Team Strengths


In [5]:
# === QEPC Sandbox: Build Advanced Team Strengths ===

from qepc.sports.nba.strengths_v2 import calculate_advanced_strengths

advanced_strengths = calculate_advanced_strengths()
print("Raw advanced_strengths shape:", advanced_strengths.shape)

# Collapse to one row per team
advanced_team_strengths = (
    advanced_strengths
    .groupby("Team", as_index=False)
    .mean(numeric_only=True)
)

print("Unique teams in advanced strengths:", len(advanced_team_strengths))
advanced_team_strengths.head()


[QEPC Strength V2] Starting Advanced Calculation (Cutoff: Now)...
[QEPC PlayerData] Successfully loaded 1635462 rows from PlayerStatistics.csv.
[QEPC Opponent Processor] Loading raw team data for Weighted DRtg...
[QEPC Opponent Processor] Calculated Weighted DRtg for 30 teams.
[QEPC Strength V2] Calculated Time-Travel Strengths for 30 teams.
Raw advanced_strengths shape: (30, 5)
Unique teams in advanced strengths: 30


Unnamed: 0,Team,ORtg,DRtg,Pace,Volatility
0,Atlanta Hawks,122.0,109.682123,71.94,10.262725
1,Boston Celtics,122.0,107.711196,68.08,12.410859
2,Brooklyn Nets,122.0,118.555811,73.044706,9.097107
3,Charlotte Hornets,122.0,116.165573,70.528421,13.022697
4,Chicago Bulls,122.0,113.762988,69.166667,9.479424
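
All five rows shown above have an `ORtg` of exactly 122.0. Whether that is intended (a cap or default, for instance) is not stated, and the remaining 25 teams are not displayed, so a quick check for constant columns may be worth running before the values feed into λ. A minimal sketch, with `constant_columns` as a hypothetical helper:

```python
import pandas as pd


def constant_columns(df: pd.DataFrame) -> list:
    """Return the numeric columns that contain a single distinct value."""
    numeric = df.select_dtypes(include="number")
    return [col for col in numeric.columns if numeric[col].nunique(dropna=False) <= 1]


# The five rows from the preview above.
demo = pd.DataFrame(
    {
        "Team": ["Atlanta Hawks", "Boston Celtics", "Brooklyn Nets",
                 "Charlotte Hornets", "Chicago Bulls"],
        "ORtg": [122.0, 122.0, 122.0, 122.0, 122.0],
        "DRtg": [109.682123, 107.711196, 118.555811, 116.165573, 113.762988],
        "Pace": [71.94, 68.08, 73.044706, 70.528421, 69.166667],
    }
)

print("Constant numeric columns:", constant_columns(demo))
```

Running `constant_columns(advanced_team_strengths)` on the full table would show whether the pattern holds for all 30 teams.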


## 6. Injury Overrides (Optional)


In [28]:
# === 6. Injury Overrides (Optional) ===

import pandas as pd
from pathlib import Path

# Prefer data-driven overrides if present, fall back to manual CSV
base_inj_path = project_root / "data" / "Injury_Overrides.csv"
dd_inj_path = project_root / "data" / "Injury_Overrides_data_driven.csv"

inj_path = None
if dd_inj_path.exists():
    inj_path = dd_inj_path
    print("✅ Using DATA-DRIVEN injury overrides:", inj_path)
elif base_inj_path.exists():
    inj_path = base_inj_path
    print("ℹ️ Data-driven file not found. Using MANUAL injury overrides:", inj_path)
else:
    print("⚠️ No Injury_Overrides file found in data/.")
    print("   Proceeding with unadjusted advanced strengths.")
    team_strengths_for_lambda = advanced_team_strengths.copy()

if inj_path is not None:
    injuries = pd.read_csv(inj_path)
    print("\nLoaded injury overrides:")
    display(injuries)

    # Collapse to team-level impact (multiply impacts if multiple injured players on a team)
    team_injury_impact = (
        injuries
        .groupby("Team", as_index=False)["Impact"]
        .prod()
        .rename(columns={"Impact": "InjuryImpact"})
    )

    print("\nTeam-level InjuryImpact:")
    display(team_injury_impact)

    # Merge into advanced strengths
    inj_adjusted = advanced_team_strengths.merge(
        team_injury_impact,
        on="Team",
        how="left"
    )

    # Teams with no overrides get impact = 1.0 (no change)
    inj_adjusted["InjuryImpact"] = inj_adjusted["InjuryImpact"].fillna(1.0)

    # Apply impact to ORtg (you can also apply to Pace if you want later)
    inj_adjusted["ORtg_inj"] = inj_adjusted["ORtg"] * inj_adjusted["InjuryImpact"]

    print("\nInjury-adjusted team strengths (first 5 rows):")
    display(inj_adjusted.head())

    # This is the version we feed into the lambda engine
    team_strengths_for_lambda = inj_adjusted.copy()
    team_strengths_for_lambda["ORtg"] = team_strengths_for_lambda["ORtg_inj"]

    # Optional cleanup: drop helper columns (ignored if already absent)
    team_strengths_for_lambda = team_strengths_for_lambda.drop(
        columns=["ORtg_inj", "InjuryImpact"], errors="ignore"
    )

✅ Using DATA-DRIVEN injury overrides: C:\Users\wdors\qepc_project\data\Injury_Overrides_data_driven.csv

Loaded injury overrides:


Unnamed: 0,Team,PlayerName,Status,Impact,Note
0,Indiana Pacers,Tyrese Haliburton,out_for_season,1.0,Torn Achilles (Right) – out for 2025-26 season...
1,Los Angeles Clippers,Bradley Beal,out_for_season,1.0,Hip fracture – season-ending surgery after ear...
2,Houston Rockets,Fred VanVleet,out_for_season,1.0,Torn ACL – suffered during Sept 2025 offseason...
3,Dallas Mavericks,Dante Exum,out_for_season,1.0,Knee injury – ruled out indefinitely shortly a...
4,Oklahoma City Thunder,Thomas Sorber,out_for_season,1.0,Torn ACL – rookie center injured in Sept 2025
5,Oklahoma City Thunder,Nikola Topic,out_for_season,1.0,ACL recovery – sitting out rookie season to re...
6,Boston Celtics,Jayson Tatum,long_term_out,1.0,Torn Achilles – injured in May 2025 playoffs; ...
7,New Orleans Pelicans,Dejounte Murray,long_term_out,1.0,Torn Achilles – injured Jan 2025; targeting Ja...
8,Dallas Mavericks,Kyrie Irving,long_term_out,1.0,Torn ACL (Left) – injured March 2025; aiming f...
9,Denver Nuggets,Aaron Gordon,long_term_out,1.0,Severe hamstring strain – multi-month absence;...



Team-level InjuryImpact:


Unnamed: 0,Team,InjuryImpact
0,Atlanta Hawks,1.0
1,Boston Celtics,1.0
2,Brooklyn Nets,1.0
3,Cleveland Cavaliers,1.0
4,Dallas Mavericks,1.0
5,Denver Nuggets,1.0
6,Houston Rockets,1.0
7,Indiana Pacers,1.0
8,Los Angeles Clippers,1.0
9,Milwaukee Bucks,1.0



Injury-adjusted team strengths (first 5 rows):


Unnamed: 0,Team,ORtg,DRtg,Pace,Volatility,InjuryImpact,ORtg_inj
0,Atlanta Hawks,122.0,109.682123,71.94,10.262725,1.0,122.0
1,Boston Celtics,122.0,107.711196,68.08,12.410859,1.0,122.0
2,Brooklyn Nets,122.0,118.555811,73.044706,9.097107,1.0,122.0
3,Charlotte Hornets,122.0,116.165573,70.528421,13.022697,1.0,122.0
4,Chicago Bulls,122.0,113.762988,69.166667,9.479424,1.0,122.0
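
The cell combines several injured players on one team by multiplying their `Impact` values, then scales `ORtg` by the product. The worked example below uses only the standard library. The impact values are borrowed from the data-driven table in section 10 for illustration, and 122.0 is the `ORtg` shown in the section 5 preview.

```python
from collections import defaultdict
from math import prod

# (team, player, impact) rows; impacts taken from the section 10 output.
overrides = [
    ("Oklahoma City Thunder", "Thomas Sorber", 0.95),
    ("Oklahoma City Thunder", "Nikola Topic", 0.909993),
    ("Denver Nuggets", "Aaron Gordon", 0.902918),
]

# Group impacts per team, then multiply them (same idea as groupby(...).prod()).
by_team = defaultdict(list)
for team, _player, impact in overrides:
    by_team[team].append(impact)

team_impact = {team: prod(impacts) for team, impacts in by_team.items()}

base_ortg = 122.0  # ORtg from the section 5 preview
for team, impact in team_impact.items():
    print(f"{team}: impact={impact:.4f}, adjusted ORtg={base_ortg * impact:.2f}")
```

Multiplying means each additional injury removes a fraction of what is left, so two 5% reductions give slightly less than a 10% reduction in total.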


## 7. Compute Lambda (Expected Points)


In [29]:
# === QEPC Sandbox: Compute Lambda (Expected Points) ===

from qepc.core.lambda_engine import compute_lambda

lambda_df = compute_lambda(games_to_model, team_strengths_for_lambda)

print("Lambda dataframe columns:")
print(lambda_df.columns.tolist())

# Adjust these column names if needed based on the printout above
display(
    lambda_df[
        ["Away Team", "Home Team", "lambda_away", "lambda_home", "vol_away", "vol_home"]
    ]
)


[QEPC Lambda] Computed lambda & volatility for 4 games.
Lambda dataframe columns:
['Date', 'Time', 'Away Team', 'Home Team', 'Venue', 'Notes', 'gameDate', 'lambda_home', 'lambda_away', 'vol_home', 'vol_away']


Unnamed: 0,Away Team,Home Team,lambda_away,lambda_home,vol_away,vol_home
0,Houston Rockets,Oklahoma City Thunder,86.559386,95.911747,32.325436,14.033294
1,Golden State Warriors,Los Angeles Lakers,99.676064,100.119971,10.256705,13.236314
2,Brooklyn Nets,Charlotte Hornets,106.314731,111.540341,9.097107,13.022697
3,Cleveland Cavaliers,New York Knicks,102.89436,106.596302,12.175616,13.069948


## 8. Run Simulation & View Results


In [30]:
# === QEPC Sandbox: Run QEPC Simulation ===

from qepc.core.simulator import run_qepc_simulation

sim_results = run_qepc_simulation(lambda_df, num_trials=20000)

print("Simulation result columns:")
print(sim_results.columns.tolist())
sim_results.head()


[QEPC Simulator] Running 20000 trials for 4 games (Chaos Engine Active)...
[QEPC Simulator] Simulation complete.
Simulation result columns:
['Date', 'Time', 'Away Team', 'Home Team', 'Venue', 'Notes', 'gameDate', 'lambda_home', 'lambda_away', 'vol_home', 'vol_away', 'Home_Win_Prob', 'Away_Win_Prob', 'Tie_Prob', 'Expected_Score_Total', 'Expected_Spread', 'Sim_Home_Score', 'Sim_Away_Score']


Unnamed: 0,Date,Time,Away Team,Home Team,Venue,Notes,gameDate,lambda_home,lambda_away,vol_home,vol_away,Home_Win_Prob,Away_Win_Prob,Tie_Prob,Expected_Score_Total,Expected_Spread,Sim_Home_Score,Sim_Away_Score
0,10/21/2025,7:30 PM,Houston Rockets,Oklahoma City Thunder,Paycom Center,Regular Season,2025-10-21 19:30:00,95.911747,86.559386,14.033294,32.325436,0.6591,0.32345,0.01745,182.4419,9.3097,95.8758,86.5661
1,10/21/2025,10:00 PM,Golden State Warriors,Los Angeles Lakers,Crypto.com Arena,Regular Season,2025-10-21 22:00:00,100.119971,99.676064,13.236314,10.256705,0.4944,0.48265,0.02295,199.5923,0.2096,99.90095,99.69135
2,10/22/2025,7:00 PM,Brooklyn Nets,Charlotte Hornets,Spectrum Center,Regular Season,2025-10-22 19:00:00,111.540341,106.314731,13.022697,9.097107,0.6068,0.37055,0.02265,217.746,5.1222,111.4341,106.3119
3,10/22/2025,7:00 PM,Cleveland Cavaliers,New York Knicks,Madison Square Garden,Regular Season,2025-10-22 19:00:00,106.596302,102.89436,13.069948,12.175616,0.576,0.40155,0.02245,209.1723,3.6751,106.4237,102.7486
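
For intuition about what a Poisson simulation does with λ, here is a plain independent-Poisson Monte Carlo for a single game. It is not QEPC's simulator: it ignores `vol_home` / `vol_away` and whatever the "Chaos Engine" in the log applies, so its probabilities will generally differ from the table above. The function name and return keys are hypothetical.

```python
import numpy as np


def simulate_poisson_game(lambda_home, lambda_away, num_trials=20000, seed=0):
    """Sample independent Poisson scores and summarize the outcomes."""
    rng = np.random.default_rng(seed)
    home = rng.poisson(lambda_home, size=num_trials)
    away = rng.poisson(lambda_away, size=num_trials)
    return {
        "home_win_prob": float(np.mean(home > away)),
        "away_win_prob": float(np.mean(home < away)),
        "tie_prob": float(np.mean(home == away)),
        "expected_total": float(np.mean(home + away)),
        "expected_spread": float(np.mean(home - away)),
    }


# λ values for Houston @ Oklahoma City, from the section 7 output.
result = simulate_poisson_game(lambda_home=95.911747, lambda_away=86.559386)
for key, value in result.items():
    print(f"{key}: {value:.4f}")
```

A Poisson variable has variance equal to its mean, so this baseline cannot express team-specific volatility; that is presumably the gap the `vol_*` columns are meant to fill.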


## 9. Interactive QEPC Controls (Widgets)


In [31]:
import ipywidgets as widgets
from IPython.display import display, clear_output
import pandas as pd
from datetime import datetime, timedelta

from qepc.core.lambda_engine import compute_lambda
from qepc.core.simulator import run_qepc_simulation


# --- Build a filtered "upcoming games" view from schedule ---

def ensure_game_datetime(df: pd.DataFrame) -> pd.Series:
    """
    Ensure we have a datetime column for each game.
    Prefer 'gameDate' if it exists; otherwise parse from Date + Time.
    """
    if "gameDate" in df.columns:
        return pd.to_datetime(df["gameDate"])
    # Fallback: parse Date + Time strings
    return pd.to_datetime(df["Date"] + " " + df["Time"])


def format_game_option(row):
    """Turn a schedule row into a nice label for the dropdown."""
    return f"{row['Date']} {row['Time']} – {row['Away Team']} @ {row['Home Team']}"


# 1) Compute game datetimes
game_dt = ensure_game_datetime(schedule)

# 2) Define "today" and the 3-day window
today = pd.Timestamp.today().normalize()
horizon = today + pd.Timedelta(days=3)

mask = (game_dt >= today) & (game_dt < horizon)
upcoming = schedule.loc[mask].copy()

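# If no games in the next 3 days, fall back to the next 20 games from today
# (or the earliest 20 on the schedule if no future games remain).
if upcoming.empty:
    print("ℹ️ No games found in the next 3 days. Showing next 20 games instead.")
    # Sort on game_dt itself: game_dt.name is None when parsed from Date + Time
    future_idx = game_dt[game_dt >= today].sort_values().index
    fallback_idx = future_idx if len(future_idx) > 0 else game_dt.sort_values().index
    upcoming = schedule.loc[fallback_idx[:20]].copy()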

# Build (label, original_index) options
game_options = []
for idx, row in upcoming.iterrows():
    label = format_game_option(row)
    game_options.append((label, idx))

game_dropdown = widgets.Dropdown(
    options=game_options,
    description='Game:',
    layout=widgets.Layout(width='95%')
)

# Keep the trials slider
num_trials_slider = widgets.IntSlider(
    value=10000,
    min=1000,
    max=50000,
    step=1000,
    description='Trials:',
    continuous_update=False
)

run_button = widgets.Button(
    description="Run QEPC Sim",
    button_style='success',
    tooltip='Run QEPC for the selected game'
)

output = widgets.Output()


def on_run_clicked(b):
    with output:
        clear_output()

        # 1) Get the selected game (index refers back to full schedule)
        game_idx = game_dropdown.value
        game_row = schedule.loc[[game_idx]].copy()  # DataFrame with a single row

        print("Running QEPC for:")
        display(game_row[["Date", "Time", "Away Team", "Home Team", "Venue", "Notes"]])

        # 2) Compute lambda for this single game
        lambda_df = compute_lambda(game_row, team_strengths_for_lambda)

        print("\nLambda (expected points) for this game:")
        display(
            lambda_df[
                ["Away Team", "Home Team", "lambda_away", "lambda_home", "vol_away", "vol_home"]
            ]
        )

        # 3) Run simulation
        trials = num_trials_slider.value
        print(f"\nRunning simulation with {trials} trials...\n")
        sim_results = run_qepc_simulation(lambda_df, num_trials=trials)

        # 4) Show a compact summary (adapt names if your sim uses slightly different ones)
        cols = [
            "Away Team", "Home Team",
            "Home_Win_Prob", "Away_Win_Prob",
            "Expected_Score_Total", "Expected_Spread",
            "Sim_Home_Score", "Sim_Away_Score"
        ]
        cols = [c for c in cols if c in sim_results.columns]

        print("QEPC summary for this matchup:")
        display(sim_results[cols])


run_button.on_click(on_run_clicked)

controls = widgets.VBox([
    widgets.HTML("<h3>🎛 QEPC Interactive Controls</h3>"),
    game_dropdown,
    num_trials_slider,
    run_button
])

display(widgets.VBox([controls, output]))

VBox(children=(VBox(children=(HTML(value='<h3>🎛 QEPC Interactive Controls</h3>'), Dropdown(description='Game:'…
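
The "upcoming games" filter is a half-open time window: games at or after the start, and strictly before start plus three days. Isolating it in a function with an explicit start date makes the behavior reproducible. This is a sketch; `window_mask` is a hypothetical name.

```python
import pandas as pd


def window_mask(game_dt: pd.Series, start: pd.Timestamp, days: int = 3) -> pd.Series:
    """True where start <= game time < start + days (end is exclusive)."""
    end = start + pd.Timedelta(days=days)
    return (game_dt >= start) & (game_dt < end)


game_dt = pd.Series(
    pd.to_datetime(
        [
            "2025-10-21 19:30:00",  # inside the window
            "2025-10-22 19:00:00",  # inside the window
            "2025-10-24 00:00:00",  # exactly at the end, so excluded
        ]
    )
)

mask = window_mask(game_dt, pd.Timestamp("2025-10-21"), days=3)
print(mask.tolist())  # [True, True, False]
```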

## 10. Data-Driven Injury Impact from PlayerStatistics

This section uses `data/raw/PlayerStatistics.csv` to estimate how much of each
team's offense an injured player was responsible for, and uses that to set
the `Impact` multiplier instead of guessing by hand.


In [32]:
import pandas as pd

# --- 10.1 Load PlayerStatistics from the QEPC data folder ---

ps_path = project_root / "data" / "raw" / "PlayerStatistics.csv"
print("Loading PlayerStatistics from:", ps_path)

player_stats = pd.read_csv(
    ps_path,
    low_memory=False,
    parse_dates=["gameDate"]
)

player_stats.head()

Loading PlayerStatistics from: C:\Users\wdors\qepc_project\data\raw\PlayerStatistics.csv


Unnamed: 0,firstName,lastName,personId,gameId,gameDate,playerteamCity,playerteamName,opponentteamCity,opponentteamName,gameType,...,threePointersPercentage,freeThrowsAttempted,freeThrowsMade,freeThrowsPercentage,reboundsDefensive,reboundsOffensive,reboundsTotal,foulsPersonal,turnovers,plusMinusPoints
0,Jamal,Murray,1627750,22500248,2025-11-17T21:00:00Z,Denver,Nuggets,Chicago,Bulls,,...,0.455,5.0,5.0,1.0,11.0,0.0,11.0,3.0,2.0,-1.0
1,Bruce,Brown,1628971,22500248,2025-11-17T21:00:00Z,Denver,Nuggets,Chicago,Bulls,,...,0.0,0.0,0.0,0.0,2.0,0.0,2.0,1.0,0.0,-17.0
2,Jevon,Carter,1628975,22500248,2025-11-17T21:00:00Z,Chicago,Bulls,Denver,Nuggets,,...,0.5,0.0,0.0,0.0,3.0,1.0,4.0,2.0,1.0,20.0
3,Kevin,Huerter,1628989,22500248,2025-11-17T21:00:00Z,Chicago,Bulls,Denver,Nuggets,,...,0.444,2.0,2.0,1.0,2.0,0.0,2.0,0.0,1.0,-21.0
4,Jalen,Pickett,1629618,22500248,2025-11-17T21:00:00Z,Denver,Nuggets,Chicago,Bulls,,...,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,9.0


In [33]:

# --- 10.2 Filter to recent games (e.g. 2024-25 & 2025-26) ---

import pandas as pd

# 1) Convert gameDate to timezone-aware datetime (UTC)
player_stats["gameDate"] = pd.to_datetime(
    player_stats["gameDate"],
    errors="coerce",
    utc=True   # make everything tz-aware in UTC
)

# 2) Define cutoff_date as timezone-aware (also UTC)
cutoff_date = pd.Timestamp("2024-10-01", tz="UTC")

# 3) Now the comparison is tz-aware vs tz-aware → valid
ps_recent = player_stats[player_stats["gameDate"] >= cutoff_date].copy()

print("Recent rows:", len(ps_recent))
ps_recent.head()

Recent rows: 8024


Unnamed: 0,firstName,lastName,personId,gameId,gameDate,playerteamCity,playerteamName,opponentteamCity,opponentteamName,gameType,...,threePointersPercentage,freeThrowsAttempted,freeThrowsMade,freeThrowsPercentage,reboundsDefensive,reboundsOffensive,reboundsTotal,foulsPersonal,turnovers,plusMinusPoints
0,Jamal,Murray,1627750,22500248,2025-11-17 21:00:00+00:00,Denver,Nuggets,Chicago,Bulls,,...,0.455,5.0,5.0,1.0,11.0,0.0,11.0,3.0,2.0,-1.0
1,Bruce,Brown,1628971,22500248,2025-11-17 21:00:00+00:00,Denver,Nuggets,Chicago,Bulls,,...,0.0,0.0,0.0,0.0,2.0,0.0,2.0,1.0,0.0,-17.0
2,Jevon,Carter,1628975,22500248,2025-11-17 21:00:00+00:00,Chicago,Bulls,Denver,Nuggets,,...,0.5,0.0,0.0,0.0,3.0,1.0,4.0,2.0,1.0,20.0
3,Kevin,Huerter,1628989,22500248,2025-11-17 21:00:00+00:00,Chicago,Bulls,Denver,Nuggets,,...,0.444,2.0,2.0,1.0,2.0,0.0,2.0,0.0,1.0,-21.0
4,Jalen,Pickett,1629618,22500248,2025-11-17 21:00:00+00:00,Denver,Nuggets,Chicago,Bulls,,...,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,9.0


In [34]:
# --- 10.3 Build PlayerName & offensive contribution metric ---

import numpy as np

# Make sure these columns exist and are numeric
for col in ["points", "assists", "reboundsOffensive", "turnovers"]:
    if col not in ps_recent.columns:
        print(f"Warning: column {col} not found in PlayerStatistics.")
        ps_recent[col] = 0
    else:
        ps_recent[col] = pd.to_numeric(ps_recent[col], errors="coerce").fillna(0)

ps_recent["PlayerName"] = (
    ps_recent["firstName"].astype(str) + " " + ps_recent["lastName"].astype(str)
)

ps_recent["off_contrib"] = (
    ps_recent["points"]
    + 0.7 * ps_recent["assists"]
    + 0.7 * ps_recent["reboundsOffensive"]
    - 0.9 * ps_recent["turnovers"]
)

ps_recent[["PlayerName", "playerteamName", "gameId", "off_contrib"]].head()

Unnamed: 0,PlayerName,playerteamName,gameId,off_contrib
0,Jamal Murray,Nuggets,22500248,35.0
1,Bruce Brown,Nuggets,22500248,0.0
2,Jevon Carter,Bulls,22500248,15.5
3,Kevin Huerter,Bulls,22500248,19.8
4,Jalen Pickett,Nuggets,22500248,3.7
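
The `off_contrib` metric is a weighted sum of box-score counts: points, plus 0.7 per assist and per offensive rebound, minus 0.9 per turnover. A worked example with a made-up stat line (the numbers are hypothetical, not taken from the data):

```python
def off_contrib(points: float, assists: float, off_rebounds: float, turnovers: float) -> float:
    """Offensive contribution with the weights used in cell 10.3."""
    return points + 0.7 * assists + 0.7 * off_rebounds - 0.9 * turnovers


# Hypothetical line: 20 points, 5 assists, 2 offensive rebounds, 3 turnovers.
# 20 + 0.7*5 + 0.7*2 - 0.9*3 = 20 + 3.5 + 1.4 - 2.7 = 22.2
value = off_contrib(points=20, assists=5, off_rebounds=2, turnovers=3)
print(f"off_contrib = {value:.1f}")
```

Because turnovers carry a negative weight, a low-usage player with a few turnovers can end up with a negative contribution.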


In [35]:
# --- 10.4 Team totals per game and player share within team ---

team_totals = (
    ps_recent
    .groupby(["gameId", "playerteamName"], as_index=False)["off_contrib"]
    .sum()
    .rename(columns={"off_contrib": "team_off_contrib"})
)

ps_recent = ps_recent.merge(
    team_totals,
    on=["gameId", "playerteamName"],
    how="left"
)

ps_recent["team_off_contrib"] = ps_recent["team_off_contrib"].replace({0: np.nan})
ps_recent["off_share"] = ps_recent["off_contrib"] / ps_recent["team_off_contrib"]

ps_recent[["PlayerName", "playerteamName", "gameId", "off_contrib", "off_share"]].head()

Unnamed: 0,PlayerName,playerteamName,gameId,off_contrib,off_share
0,Jamal Murray,Nuggets,22500248,35.0,0.243733
1,Bruce Brown,Nuggets,22500248,0.0,0.0
2,Jevon Carter,Bulls,22500248,15.5,0.101506
3,Kevin Huerter,Bulls,22500248,19.8,0.129666
4,Jalen Pickett,Nuggets,22500248,3.7,0.025766
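
The per-game share can also be computed without a merge, using `groupby(...).transform("sum")` to broadcast each team total back onto the player rows. This is an alternative sketch on made-up numbers, not a change to the cell above.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame(
    {
        "gameId": [1, 1, 1, 2, 2],
        "playerteamName": ["Nuggets", "Nuggets", "Bulls", "Bulls", "Bulls"],
        "off_contrib": [30.0, 10.0, 20.0, 0.0, 0.0],
    }
)

# Team total per game, aligned row-for-row with the player rows.
team_total = demo.groupby(["gameId", "playerteamName"])["off_contrib"].transform("sum")

# A zero team total becomes NaN so the share is undefined rather than a division error.
demo["off_share"] = demo["off_contrib"] / team_total.replace(0, np.nan)
print(demo)
```

Within each team and game the defined shares sum to 1, which is an easy invariant to assert on the real data.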


In [36]:
# --- 10.5 Map short team names to full team names used in QEPC ---

TEAM_NAME_MAP = {
    "Hawks": "Atlanta Hawks",
    "Celtics": "Boston Celtics",
    "Nets": "Brooklyn Nets",
    "Hornets": "Charlotte Hornets",
    "Bulls": "Chicago Bulls",
    "Cavaliers": "Cleveland Cavaliers",
    "Mavericks": "Dallas Mavericks",
    "Nuggets": "Denver Nuggets",
    "Pistons": "Detroit Pistons",
    "Warriors": "Golden State Warriors",
    "Rockets": "Houston Rockets",
    "Pacers": "Indiana Pacers",
    "Clippers": "Los Angeles Clippers",
    "Lakers": "Los Angeles Lakers",
    "Grizzlies": "Memphis Grizzlies",
    "Heat": "Miami Heat",
    "Bucks": "Milwaukee Bucks",
    "Timberwolves": "Minnesota Timberwolves",
    "Pelicans": "New Orleans Pelicans",
    "Knicks": "New York Knicks",
    "Thunder": "Oklahoma City Thunder",
    "Magic": "Orlando Magic",
    "76ers": "Philadelphia 76ers",
    "Suns": "Phoenix Suns",
    "Trail Blazers": "Portland Trail Blazers",
    "Kings": "Sacramento Kings",
    "Spurs": "San Antonio Spurs",
    "Raptors": "Toronto Raptors",
    "Jazz": "Utah Jazz",
    "Wizards": "Washington Wizards",
}

ps_recent["TeamFull"] = ps_recent["playerteamName"].map(TEAM_NAME_MAP)

# Just sanity-check a few rows
ps_recent[["playerteamName", "TeamFull", "PlayerName"]].head()

Unnamed: 0,playerteamName,TeamFull,PlayerName
0,Nuggets,Denver Nuggets,Jamal Murray
1,Nuggets,Denver Nuggets,Bruce Brown
2,Bulls,Chicago Bulls,Jevon Carter
3,Bulls,Chicago Bulls,Kevin Huerter
4,Nuggets,Denver Nuggets,Jalen Pickett
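
`Series.map` returns NaN for any name missing from the dictionary, and those rows would later drop out of the groupby without warning. A small check for unmapped names is sketched below in plain Python; the helper name and the variant spellings ("Blazers", "Sixers") are hypothetical examples, not values known to occur in the file.

```python
# Subset of the full map, enough for the demonstration.
TEAM_NAME_SUBSET = {
    "Nuggets": "Denver Nuggets",
    "Bulls": "Chicago Bulls",
    "Trail Blazers": "Portland Trail Blazers",
    "76ers": "Philadelphia 76ers",
}


def unmapped_names(names, mapping):
    """Return the sorted set of names that have no entry in `mapping`."""
    return sorted({name for name in names if name not in mapping})


observed = ["Nuggets", "Bulls", "Blazers", "Nuggets", "Sixers"]
print("Unmapped team names:", unmapped_names(observed, TEAM_NAME_SUBSET))
```

In the notebook, the equivalent call would be `unmapped_names(ps_recent["playerteamName"].dropna().unique(), TEAM_NAME_MAP)`.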


In [37]:
# --- 10.6 Aggregate to per-team, per-player offensive share over the period ---

player_offense = (
    ps_recent
    .groupby(["TeamFull", "PlayerName"], as_index=False)
    .agg(
        total_off_contrib=("off_contrib", "sum"),
        total_team_off_contrib=("team_off_contrib", "sum")
    )
)

player_offense["off_share_season"] = (
    player_offense["total_off_contrib"] / player_offense["total_team_off_contrib"]
)

player_offense = player_offense.rename(columns={"TeamFull": "Team"})

print("Player offense share table shape:", player_offense.shape)
player_offense.sort_values("off_share_season", ascending=False).head(10)

Player offense share table shape: (643, 5)


Unnamed: 0,Team,PlayerName,total_off_contrib,total_team_off_contrib,off_share_season
492,Philadelphia 76ers,Tyrese Maxey,542.3,2165.1,0.250473
292,Los Angeles Lakers,Luka Doncic,439.3,1794.1,0.244858
447,Oklahoma City Thunder,Shai Gilgeous-Alexander,587.3,2405.1,0.244189
355,Milwaukee Bucks,Giannis Antetokounmpo,493.5,2075.8,0.23774
165,Denver Nuggets,Nikola Jokic,542.5,2325.3,0.233303
174,Detroit Pistons,Cade Cunningham,423.4,1972.7,0.21463
610,Utah Jazz,Lauri Markkanen,449.6,2153.4,0.208786
498,Phoenix Suns,Devin Booker,492.8,2378.2,0.207216
264,Los Angeles Clippers,James Harden,405.7,1980.9,0.204806
113,Cleveland Cavaliers,Donovan Mitchell,464.9,2284.6,0.203493
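
Note what `off_share_season` measures: it is a ratio of sums (the player's total contribution divided by the team's total in the games that player appeared in), not the average of the per-game shares. The two differ when team totals vary from game to game, as this made-up example shows:

```python
# (player_contrib, team_contrib) for each game the player appeared in; invented numbers.
games = [(30.0, 100.0), (10.0, 50.0)]

# Ratio of sums: what cell 10.6 computes. High-scoring games weigh more.
ratio_of_sums = sum(p for p, _ in games) / sum(t for _, t in games)

# Mean of per-game ratios: every game weighs the same.
mean_of_ratios = sum(p / t for p, t in games) / len(games)

print(f"ratio of sums  = {ratio_of_sums:.4f}")
print(f"mean of ratios = {mean_of_ratios:.4f}")
```

Because the denominator only includes games the player appeared in, a player who missed most of the period can still show a high share.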


In [38]:
# === 10. Data-driven Impact + Save CSV (combined) ===

import pandas as pd

# Load your original manual overrides
inj_path = project_root / "data" / "Injury_Overrides.csv"
injuries = pd.read_csv(inj_path)

print("Original injury overrides:")
display(injuries)

# Merge per-player offensive share into the injuries table
# player_offense was built earlier and has:
#   Team, PlayerName, off_share_season
inj_with_share = injuries.merge(
    player_offense[["Team", "PlayerName", "off_share_season"]],
    on=["Team", "PlayerName"],
    how="left"
)

print("Injuries + offensive share (before Impact calc):")
display(inj_with_share)

# Status → severity scale: how much of the player's share we apply
status_scale = {
    "out_for_season": 1.0,
    "long_term_out": 0.7,
    "mid_term_out": 0.5,
}

inj_with_share["status_scale"] = inj_with_share["Status"].map(status_scale).fillna(0.5)

# If we don't find a share (rookie / no recent games), assume small share
inj_with_share["off_share_season"] = inj_with_share["off_share_season"].fillna(0.05)

# Impact = 1 - scale * share, clipped between 0.6 and 1.0
raw_impact = 1 - inj_with_share["status_scale"] * inj_with_share["off_share_season"]
inj_with_share["Impact"] = raw_impact.clip(lower=0.6, upper=1.0)

print("Injuries with data-driven Impact:")
display(
    inj_with_share[
        ["Team", "PlayerName", "Status", "off_share_season", "Impact", "Note"]
    ]
)

# Save as a NEW file so the original Injury_Overrides.csv stays untouched
out_path = project_root / "data" / "Injury_Overrides_data_driven.csv"
inj_with_share.drop(columns=["status_scale"]).to_csv(out_path, index=False)
print("\n✅ Saved data-driven injury overrides to:", out_path)


Original injury overrides:


Unnamed: 0,Team,PlayerName,Status,Impact,Note
0,Indiana Pacers,Tyrese Haliburton,out_for_season,1.0,Torn Achilles (Right) – out for 2025-26 season...
1,Los Angeles Clippers,Bradley Beal,out_for_season,1.0,Hip fracture – season-ending surgery after ear...
2,Houston Rockets,Fred VanVleet,out_for_season,1.0,Torn ACL – suffered during Sept 2025 offseason...
3,Dallas Mavericks,Dante Exum,out_for_season,1.0,Knee injury – ruled out indefinitely shortly a...
4,Oklahoma City Thunder,Thomas Sorber,out_for_season,1.0,Torn ACL – rookie center injured in Sept 2025
5,Oklahoma City Thunder,Nikola Topic,out_for_season,1.0,ACL recovery – sitting out rookie season to re...
6,Boston Celtics,Jayson Tatum,long_term_out,1.0,Torn Achilles – injured in May 2025 playoffs; ...
7,New Orleans Pelicans,Dejounte Murray,long_term_out,1.0,Torn Achilles – injured Jan 2025; targeting Ja...
8,Dallas Mavericks,Kyrie Irving,long_term_out,1.0,Torn ACL (Left) – injured March 2025; aiming f...
9,Denver Nuggets,Aaron Gordon,long_term_out,1.0,Severe hamstring strain – multi-month absence;...


Injuries + offensive share (before Impact calc):


Unnamed: 0,Team,PlayerName,Status,Impact,Note,off_share_season
0,Indiana Pacers,Tyrese Haliburton,out_for_season,1.0,Torn Achilles (Right) – out for 2025-26 season...,
1,Los Angeles Clippers,Bradley Beal,out_for_season,1.0,Hip fracture – season-ending surgery after ear...,0.051547
2,Houston Rockets,Fred VanVleet,out_for_season,1.0,Torn ACL – suffered during Sept 2025 offseason...,0.0
3,Dallas Mavericks,Dante Exum,out_for_season,1.0,Knee injury – ruled out indefinitely shortly a...,0.0
4,Oklahoma City Thunder,Thomas Sorber,out_for_season,1.0,Torn ACL – rookie center injured in Sept 2025,
5,Oklahoma City Thunder,Nikola Topic,out_for_season,1.0,ACL recovery – sitting out rookie season to re...,0.090007
6,Boston Celtics,Jayson Tatum,long_term_out,1.0,Torn Achilles – injured in May 2025 playoffs; ...,0.0
7,New Orleans Pelicans,Dejounte Murray,long_term_out,1.0,Torn Achilles – injured Jan 2025; targeting Ja...,0.0
8,Dallas Mavericks,Kyrie Irving,long_term_out,1.0,Torn ACL (Left) – injured March 2025; aiming f...,0.0
9,Denver Nuggets,Aaron Gordon,long_term_out,1.0,Severe hamstring strain – multi-month absence;...,0.138688


Injuries with data-driven Impact:


Unnamed: 0,Team,PlayerName,Status,off_share_season,Impact,Note
0,Indiana Pacers,Tyrese Haliburton,out_for_season,0.05,0.95,Torn Achilles (Right) – out for 2025-26 season...
1,Los Angeles Clippers,Bradley Beal,out_for_season,0.051547,0.948453,Hip fracture – season-ending surgery after ear...
2,Houston Rockets,Fred VanVleet,out_for_season,0.0,1.0,Torn ACL – suffered during Sept 2025 offseason...
3,Dallas Mavericks,Dante Exum,out_for_season,0.0,1.0,Knee injury – ruled out indefinitely shortly a...
4,Oklahoma City Thunder,Thomas Sorber,out_for_season,0.05,0.95,Torn ACL – rookie center injured in Sept 2025
5,Oklahoma City Thunder,Nikola Topic,out_for_season,0.090007,0.909993,ACL recovery – sitting out rookie season to re...
6,Boston Celtics,Jayson Tatum,long_term_out,0.0,1.0,Torn Achilles – injured in May 2025 playoffs; ...
7,New Orleans Pelicans,Dejounte Murray,long_term_out,0.0,1.0,Torn Achilles – injured Jan 2025; targeting Ja...
8,Dallas Mavericks,Kyrie Irving,long_term_out,0.0,1.0,Torn ACL (Left) – injured March 2025; aiming f...
9,Denver Nuggets,Aaron Gordon,long_term_out,0.138688,0.902918,Severe hamstring strain – multi-month absence;...



✅ Saved data-driven injury overrides to: C:\Users\wdors\qepc_project\data\Injury_Overrides_data_driven.csv


___________________________

## 7b. Script Superposition Prototype (Grind / Base / Chaos)

This section models each game as a weighted mixture (a "superposition") of three possible game scripts:

- **Grind**: lower scoring, lower variance  
- **Base**: normal game (your current λ)  
- **Chaos**: higher scoring, higher variance  

Each script scales the base λ and volatility by its own factors and carries a script weight.
QEPC runs the simulation once per script, then combines the per-script results
as a weighted average using the script weights.
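
As a minimal sketch of the mixing step: if each script produces a value for the same game (for example a home-win probability), the mixed value is the weight-averaged one. The helper name `mix_scripts` and the sample probabilities below are hypothetical and only illustrate the arithmetic; they are not part of QEPC.

```python
def mix_scripts(values, weights):
    """Weighted average of per-script values; weights are renormalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w / total for v, w in zip(values, weights))


# Hypothetical home-win probabilities under Grind, Base, Chaos (made-up numbers)
home_win_by_script = [0.60, 0.58, 0.55]
script_weights = [0.25, 0.50, 0.25]

mixed_home_win = mix_scripts(home_win_by_script, script_weights)
print(round(mixed_home_win, 4))
```

Because the weights are renormalized inside the helper, passing raw weights such as `[1, 2, 1]` gives the same result as `[0.25, 0.50, 0.25]`.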


In [39]:
# === 7b. Script Superposition Prototype (Grind / Base / Chaos) ===

import pandas as pd
from qepc.core.simulator import run_qepc_simulation

# 1) Define your scripts
SCRIPT_CONFIGS = [
    {
        "id": "GRIND",
        "name": "Grind (low total, low variance)",
        "lambda_scale": 0.92,   # lower scoring
        "vol_scale": 0.90,      # slightly less volatile
        "weight": 0.25,         # 25% of universes
    },
    {
        "id": "BASE",
        "name": "Base (normal game)",
        "lambda_scale": 1.00,
        "vol_scale": 1.00,
        "weight": 0.50,         # 50% of universes
    },
    {
        "id": "CHAOS",
        "name": "Chaos (high total, high variance)",
        "lambda_scale": 1.08,   # higher scoring
        "vol_scale": 1.20,      # more swingy
        "weight": 0.25,         # 25% of universes
    },
]

# Make sure weights sum to 1.0
total_w = sum(s["weight"] for s in SCRIPT_CONFIGS)
for s in SCRIPT_CONFIGS:
    s["weight"] = s["weight"] / total_w


def build_script_lambda(lambda_base: pd.DataFrame, script: dict) -> pd.DataFrame:
    """
    Return a copy of lambda_base with lambda and volatility scaled
    according to the script configuration.
    """
    df = lambda_base.copy()

    lam_scale = script["lambda_scale"]
    vol_scale = script["vol_scale"]

    # These column names are based on your existing lambda_df
    # Adjust here if your lambda engine uses different names
    for col in ["lambda_home", "lambda_away"]:
        if col in df.columns:
            df[col] = df[col] * lam_scale

    for col in ["vol_home", "vol_away"]:
        if col in df.columns:
            df[col] = df[col] * vol_scale

    return df


def run_qepc_multiscript(lambda_base: pd.DataFrame,
                         script_configs: list[dict],
                         num_trials: int = 20000) -> pd.DataFrame:
    """
    For each script:
      - build script-specific lambda_df
      - run run_qepc_simulation(lambda_df_script, num_trials)
    Then mix the results across scripts using script weights.

    We assume run_qepc_simulation returns one row per game with numeric
    columns representing expectations/probabilities.
    """
    script_results = []

    for script in script_configs:
        print(f"\nRunning script: {script['id']} – {script['name']}")
        lam_s = build_script_lambda(lambda_base, script)
        sim_s = run_qepc_simulation(lam_s, num_trials=num_trials)

        # Tag which script these results came from (for debugging / analysis)
        sim_s = sim_s.copy()
        sim_s["script_id"] = script["id"]
        sim_s["script_weight"] = script["weight"]
        script_results.append(sim_s)

    # Concatenate all script results
    all_scripts_df = pd.concat(script_results, axis=0, ignore_index=True)

    # Now compute a weighted average across scripts for each game.
    # Rows are matched across scripts by the identifying game columns,
    # using whichever of the candidate key columns are present.
    candidate_keys = ["Date", "Time", "Away Team", "Home Team", "Venue", "Notes", "gameDate"]
    group_keys = [c for c in candidate_keys if c in all_scripts_df.columns]

    if not group_keys:
        # Fallback: use each row's position within its script as an implicit game id
        # (this assumes every script returns the games in the same order)
        all_scripts_df["game_idx"] = (
            all_scripts_df.groupby("script_id").cumcount()
        )
        group_keys = ["game_idx"]

    # Separate numeric vs non-numeric columns
    numeric_cols = all_scripts_df.select_dtypes(include="number").columns.tolist()
    # Exclude script_weight from averaging: it is the weight, not a value to mix
    numeric_cols = [c for c in numeric_cols if c not in ["script_weight"]]

    # Weighted average across scripts for each game
    def weighted_agg(group: pd.DataFrame) -> pd.Series:
        # Each row in the group is one script's result for this game
        weights = group["script_weight"]
        # Normalize weights in case they don't sum exactly to 1 per game
        w = weights / weights.sum()

        result = {}
        # For non-numeric columns, just take the first (they should be identical across scripts)
        for col in group.columns:
            if col in numeric_cols or col == "script_weight":
                continue
            if col in group_keys:
                # keep keys as they are, we'll set them later from index
                continue
            # just take the first non-key non-numeric as representative
            result[col] = group[col].iloc[0]

        # For numeric columns, compute weighted average
        for col in numeric_cols:
            result[col] = (group[col] * w).sum()

        return pd.Series(result)

    combined = (
        all_scripts_df
        .groupby(group_keys, as_index=False)
        .apply(weighted_agg)
        .reset_index()
    )

    # Bring group keys back as columns if they were in the index
    # (pandas groupby/apply quirks; we already used as_index=False, but let's be safe)
    for key in group_keys:
        if key not in combined.columns and key in all_scripts_df.columns:
            combined[key] = all_scripts_df.groupby(group_keys)[key].first().values

    print("\n✅ Multi-script QEPC results computed.")
    return combined
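

The `groupby(...).apply(weighted_agg)` step above can be slow on large slates and, in recent pandas versions, may emit a deprecation warning about operating on the grouping columns. A possible alternative is sketched below: multiply each numeric column by the script weight, sum per game, then divide by the summed weight. The function name `weighted_mix` and the toy data are hypothetical; this is an illustration of the technique, not the QEPC implementation, and it only returns the key columns plus the numeric columns.

```python
import pandas as pd


def weighted_mix(df, keys, weight_col="script_weight"):
    """Weighted mean of numeric columns per game, without groupby.apply."""
    numeric = [
        c for c in df.select_dtypes(include="number").columns
        if c != weight_col and c not in keys
    ]
    # Scale every numeric value by its row's script weight
    weighted = df[numeric].mul(df[weight_col], axis=0)
    weighted[weight_col] = df[weight_col]
    for k in keys:
        weighted[k] = df[k]
    # Sum weighted values and weights per game, then normalize
    sums = weighted.groupby(keys, sort=False).sum()
    return sums[numeric].div(sums[weight_col], axis=0).reset_index()


# Toy data: one game, three scripts (values are made up)
toy = pd.DataFrame({
    "Home Team": ["A", "A", "A"],
    "script_weight": [0.25, 0.50, 0.25],
    "Home_Win_Prob": [0.60, 0.58, 0.55],
})
mixed = weighted_mix(toy, keys=["Home Team"])
print(mixed)
```

Dividing by the summed weight per game plays the same role as the `w = weights / weights.sum()` normalization in `weighted_agg`.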


In [40]:
# --- Run multi-script QEPC on the current games_to_model ---

# lambda_df should already exist from your normal pipeline:
#   lambda_df = compute_lambda(games_to_model, team_strengths_for_lambda)

multi_script_results = run_qepc_multiscript(
    lambda_base=lambda_df,
    script_configs=SCRIPT_CONFIGS,
    num_trials=20000
)

print("Columns in multi-script results:")
print(multi_script_results.columns.tolist())

multi_script_results.head()



Running script: GRIND – Grind (low total, low variance)
[QEPC Simulator] Running 20000 trials for 4 games (Chaos Engine Active)...
[QEPC Simulator] Simulation complete.

Running script: BASE – Base (normal game)
[QEPC Simulator] Running 20000 trials for 4 games (Chaos Engine Active)...
[QEPC Simulator] Simulation complete.

Running script: CHAOS – Chaos (high total, high variance)
[QEPC Simulator] Running 20000 trials for 4 games (Chaos Engine Active)...
[QEPC Simulator] Simulation complete.

✅ Multi-script QEPC results computed.
Columns in multi-script results:
['index', 'Date', 'Time', 'Away Team', 'Home Team', 'Venue', 'Notes', 'gameDate', 'script_id', 'lambda_home', 'lambda_away', 'vol_home', 'vol_away', 'Home_Win_Prob', 'Away_Win_Prob', 'Tie_Prob', 'Expected_Score_Total', 'Expected_Spread', 'Sim_Home_Score', 'Sim_Away_Score']




Unnamed: 0,index,Date,Time,Away Team,Home Team,Venue,Notes,gameDate,script_id,lambda_home,lambda_away,vol_home,vol_away,Home_Win_Prob,Away_Win_Prob,Tie_Prob,Expected_Score_Total,Expected_Spread,Sim_Home_Score,Sim_Away_Score
0,0,10/21/2025,10:00 PM,Golden State Warriors,Los Angeles Lakers,Crypto.com Arena,Regular Season,2025-10-21 22:00:00,GRIND,100.119971,99.676064,13.567222,10.513123,0.500312,0.474938,0.02475,199.769013,0.545038,100.157025,99.611987
1,1,10/21/2025,7:30 PM,Houston Rockets,Oklahoma City Thunder,Paycom Center,Regular Season,2025-10-21 19:30:00,GRIND,95.911747,86.559386,14.384126,33.133572,0.657537,0.326362,0.0161,182.4639,9.478675,95.971288,86.492613
2,2,10/22/2025,7:00 PM,Brooklyn Nets,Charlotte Hornets,Spectrum Center,Regular Season,2025-10-22 19:00:00,GRIND,111.540341,106.314731,13.348265,9.324535,0.60875,0.3691,0.02215,217.891613,5.215962,111.553787,106.337825
3,3,10/22/2025,7:00 PM,Cleveland Cavaliers,New York Knicks,Madison Square Garden,Regular Season,2025-10-22 19:00:00,GRIND,106.596302,102.89436,13.396696,12.480006,0.573062,0.404725,0.022212,209.501175,3.731375,106.616275,102.8849


In [41]:
# Look at the core outputs from the multi-script engine
view_cols = [
    "Away Team",
    "Home Team",
    "Home_Win_Prob",
    "Away_Win_Prob",
    "Expected_Score_Total",
    "Expected_Spread",
    "Sim_Home_Score",
    "Sim_Away_Score",
]

# Keep only columns that actually exist (in case names differ)
view_cols = [c for c in view_cols if c in multi_script_results.columns]

multi_script_results[view_cols]


Unnamed: 0,Away Team,Home Team,Home_Win_Prob,Away_Win_Prob,Expected_Score_Total,Expected_Spread,Sim_Home_Score,Sim_Away_Score
0,Golden State Warriors,Los Angeles Lakers,0.500312,0.474938,199.769013,0.545038,100.157025,99.611987
1,Houston Rockets,Oklahoma City Thunder,0.657537,0.326362,182.4639,9.478675,95.971288,86.492613
2,Brooklyn Nets,Charlotte Hornets,0.60875,0.3691,217.891613,5.215962,111.553787,106.337825
3,Cleveland Cavaliers,New York Knicks,0.573062,0.404725,209.501175,3.731375,106.616275,102.8849


In [42]:
# Assuming your original one-script sim_results still exists:
cols = [c for c in view_cols if c in sim_results.columns]

print("Single-script QEPC:")
display(sim_results[cols])

print("\nMulti-script QEPC:")
display(multi_script_results[cols])


Single-script QEPC:


Unnamed: 0,Away Team,Home Team,Home_Win_Prob,Away_Win_Prob,Expected_Score_Total,Expected_Spread,Sim_Home_Score,Sim_Away_Score
0,Houston Rockets,Oklahoma City Thunder,0.6591,0.32345,182.4419,9.3097,95.8758,86.5661
1,Golden State Warriors,Los Angeles Lakers,0.4944,0.48265,199.5923,0.2096,99.90095,99.69135
2,Brooklyn Nets,Charlotte Hornets,0.6068,0.37055,217.746,5.1222,111.4341,106.3119
3,Cleveland Cavaliers,New York Knicks,0.576,0.40155,209.1723,3.6751,106.4237,102.7486



Multi-script QEPC:


Unnamed: 0,Away Team,Home Team,Home_Win_Prob,Away_Win_Prob,Expected_Score_Total,Expected_Spread,Sim_Home_Score,Sim_Away_Score
0,Golden State Warriors,Los Angeles Lakers,0.500312,0.474938,199.769013,0.545038,100.157025,99.611987
1,Houston Rockets,Oklahoma City Thunder,0.657537,0.326362,182.4639,9.478675,95.971288,86.492613
2,Brooklyn Nets,Charlotte Hornets,0.60875,0.3691,217.891613,5.215962,111.553787,106.337825
3,Cleveland Cavaliers,New York Knicks,0.573062,0.404725,209.501175,3.731375,106.616275,102.8849


In [43]:
# === 7c. Compare Single-Script vs Multi-Script QEPC ===

import pandas as pd

# 1) Choose keys that identify each game
key_cols = [
    c for c in ["Date", "Time", "Away Team", "Home Team"]
    if c in sim_results.columns and c in multi_script_results.columns
]

single = sim_results.copy()
multi = multi_script_results.copy()

single["source"] = "single"
multi["source"] = "multi"

# 2) Select the main metric columns we care about
metric_cols = [
    "Home_Win_Prob",
    "Away_Win_Prob",
    "Expected_Score_Total",
    "Expected_Spread",
]

metric_cols = [c for c in metric_cols if c in single.columns and c in multi.columns]

# 3) Rename metrics for merge (so we get side-by-side columns)
single_renamed = single[key_cols + metric_cols].copy()
multi_renamed = multi[key_cols + metric_cols].copy()

single_renamed = single_renamed.rename(columns={c: f"{c}_single" for c in metric_cols})
multi_renamed = multi_renamed.rename(columns={c: f"{c}_multi" for c in metric_cols})

# 4) Merge on game keys
compare_df = pd.merge(
    single_renamed,
    multi_renamed,
    on=key_cols,
    how="inner",
)

# 5) Compute deltas (multi - single)
for c in metric_cols:
    compare_df[f"{c}_delta"] = compare_df[f"{c}_multi"] - compare_df[f"{c}_single"]

# 6) Sort by biggest change in expected total (or any metric you like)
sort_col = "Expected_Score_Total_delta" if "Expected_Score_Total_delta" in compare_df.columns else None
if sort_col:
    compare_df = compare_df.sort_values(sort_col, ascending=False)

compare_df


Unnamed: 0,Date,Time,Away Team,Home Team,Home_Win_Prob_single,Away_Win_Prob_single,Expected_Score_Total_single,Expected_Spread_single,Home_Win_Prob_multi,Away_Win_Prob_multi,Expected_Score_Total_multi,Expected_Spread_multi,Home_Win_Prob_delta,Away_Win_Prob_delta,Expected_Score_Total_delta,Expected_Spread_delta
3,10/22/2025,7:00 PM,Cleveland Cavaliers,New York Knicks,0.576,0.40155,209.1723,3.6751,0.573062,0.404725,209.501175,3.731375,-0.002938,0.003175,0.328875,0.056275
1,10/21/2025,10:00 PM,Golden State Warriors,Los Angeles Lakers,0.4944,0.48265,199.5923,0.2096,0.500312,0.474938,199.769013,0.545038,0.005912,-0.007713,0.176713,0.335438
2,10/22/2025,7:00 PM,Brooklyn Nets,Charlotte Hornets,0.6068,0.37055,217.746,5.1222,0.60875,0.3691,217.891613,5.215962,0.00195,-0.00145,0.145612,0.093762
0,10/21/2025,7:30 PM,Houston Rockets,Oklahoma City Thunder,0.6591,0.32345,182.4419,9.3097,0.657537,0.326362,182.4639,9.478675,-0.001563,0.002912,0.022,0.168975


In [44]:
# === 7d. Highlight games most affected by multi-script QEPC ===

# Set how big a change you care about
prob_threshold = 0.02      # 0.02 = 2 percentage points
total_threshold = 1.0      # 1.0 = 1 point of total

cols = compare_df.columns

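# Build a mask safely depending on which delta columns exist.
# Start from an all-False boolean Series so the filter still works
# even if none of the delta columns are present.
mask = pd.Series(False, index=compare_df.index)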

if "Home_Win_Prob_delta" in cols:
    mask = mask | (compare_df["Home_Win_Prob_delta"].abs() >= prob_threshold)

if "Away_Win_Prob_delta" in cols:
    mask = mask | (compare_df["Away_Win_Prob_delta"].abs() >= prob_threshold)

if "Expected_Score_Total_delta" in cols:
    mask = mask | (compare_df["Expected_Score_Total_delta"].abs() >= total_threshold)

# Apply mask
big_moves = compare_df[mask].copy()

print(f"Games with ≥ {prob_threshold*100:.1f} pp change in win prob or ≥ {total_threshold:.1f} pts change in total:")
big_moves


Games with ≥ 2.0 pp change in win prob or ≥ 1.0 pts change in total:


Unnamed: 0,Date,Time,Away Team,Home Team,Home_Win_Prob_single,Away_Win_Prob_single,Expected_Score_Total_single,Expected_Spread_single,Home_Win_Prob_multi,Away_Win_Prob_multi,Expected_Score_Total_multi,Expected_Spread_multi,Home_Win_Prob_delta,Away_Win_Prob_delta,Expected_Score_Total_delta,Expected_Spread_delta
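

No game clears either threshold here. One possible explanation is that the configured `lambda_scale` values, averaged by their weights, come out to 1.0, so the mixture leaves the expected score roughly unchanged. The check below copies the scale and weight values from `SCRIPT_CONFIGS`; it assumes simulated mean scores scale linearly with λ, which is an assumption about `run_qepc_simulation` rather than something confirmed in this notebook.

```python
# (lambda_scale, weight) pairs copied from SCRIPT_CONFIGS
scripts = {"GRIND": (0.92, 0.25), "BASE": (1.00, 0.50), "CHAOS": (1.08, 0.25)}

total_weight = sum(w for _, w in scripts.values())
effective_scale = sum(scale * w for scale, w in scripts.values()) / total_weight

print(f"Effective lambda scale: {effective_scale:.4f}")
```

If you want the multi-script run to shift expected totals, one option is to make the scales or weights asymmetric (for example a heavier Chaos weight), so the effective scale moves away from 1.0.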


In [45]:
# === 11. Injury Impact Inspector ===

import pandas as pd

dd_inj_path = project_root / "data" / "Injury_Overrides_data_driven.csv"
inj_dd = pd.read_csv(dd_inj_path)

print("Data-driven injury overrides loaded from:", dd_inj_path)
display(inj_dd)

def show_injury_impact(team: str, player: str):
    """Quick helper to inspect one injured player's impact row."""
    row = inj_dd[(inj_dd["Team"] == team) & (inj_dd["PlayerName"] == player)]
    if row.empty:
        print(f"No injury override found for {player} on {team}")
    else:
        display(row)

# Examples: tweak these to whoever you care about
show_injury_impact("Indiana Pacers", "Tyrese Haliburton")
show_injury_impact("Boston Celtics", "Jayson Tatum")
show_injury_impact("Los Angeles Clippers", "Bradley Beal")


Data-driven injury overrides loaded from: C:\Users\wdors\qepc_project\data\Injury_Overrides_data_driven.csv


Unnamed: 0,Team,PlayerName,Status,Impact,Note,off_share_season
0,Indiana Pacers,Tyrese Haliburton,out_for_season,0.95,Torn Achilles (Right) – out for 2025-26 season...,0.05
1,Los Angeles Clippers,Bradley Beal,out_for_season,0.948453,Hip fracture – season-ending surgery after ear...,0.051547
2,Houston Rockets,Fred VanVleet,out_for_season,1.0,Torn ACL – suffered during Sept 2025 offseason...,0.0
3,Dallas Mavericks,Dante Exum,out_for_season,1.0,Knee injury – ruled out indefinitely shortly a...,0.0
4,Oklahoma City Thunder,Thomas Sorber,out_for_season,0.95,Torn ACL – rookie center injured in Sept 2025,0.05
5,Oklahoma City Thunder,Nikola Topic,out_for_season,0.909993,ACL recovery – sitting out rookie season to re...,0.090007
6,Boston Celtics,Jayson Tatum,long_term_out,1.0,Torn Achilles – injured in May 2025 playoffs; ...,0.0
7,New Orleans Pelicans,Dejounte Murray,long_term_out,1.0,Torn Achilles – injured Jan 2025; targeting Ja...,0.0
8,Dallas Mavericks,Kyrie Irving,long_term_out,1.0,Torn ACL (Left) – injured March 2025; aiming f...,0.0
9,Denver Nuggets,Aaron Gordon,long_term_out,0.902918,Severe hamstring strain – multi-month absence;...,0.138688


Unnamed: 0,Team,PlayerName,Status,Impact,Note,off_share_season
0,Indiana Pacers,Tyrese Haliburton,out_for_season,0.95,Torn Achilles (Right) – out for 2025-26 season...,0.05


Unnamed: 0,Team,PlayerName,Status,Impact,Note,off_share_season
6,Boston Celtics,Jayson Tatum,long_term_out,1.0,Torn Achilles – injured in May 2025 playoffs; ...,0.0


Unnamed: 0,Team,PlayerName,Status,Impact,Note,off_share_season
1,Los Angeles Clippers,Bradley Beal,out_for_season,0.948453,Hip fracture – season-ending surgery after ear...,0.051547
