# Live Calibration Session

**Purpose**: Fast iteration cycle for live parameter calibration

**Workflow**: Load → Modify → Run → Visualize → Repeat

**This is throwaway code** - for the SCRI session only

---

## Setup: Import Engine

In [None]:
import sys
sys.path.insert(0, '../..')  # Assumes this notebook sits two levels below the package root

from seleensim.entities import Site, Trial, PatientFlow
from seleensim.distributions import Triangular, Gamma, Bernoulli
from seleensim.simulation import SimulationEngine
from seleensim.constraints import (
    BudgetThrottlingConstraint,
    ResourceCapacityConstraint,
    LinearResponseCurve,
    LinearCapacityDegradation,
    NoCapacityDegradation
)

import json
import matplotlib.pyplot as plt
import numpy as np
from datetime import datetime

print("✓ Engine imported (read-only)")

---

## Cell 1: Load Assumption Parameters

**What this does**: Loads the baseline trial configuration and constraint parameters from an inline dict (or, optionally, a JSON file)

**Instructions**: Run once at start of session

In [None]:
# Baseline assumption parameters
# These represent "Week 1" expert guesses before calibration

assumptions = {
    "trial": {
        "trial_id": "LIVE_CALIBRATION",
        "target_enrollment": 200,
        "site": {
            "activation_time": {"low": 30, "mode": 45, "high": 90},  # Days
            "enrollment_rate": {"shape": 2, "scale": 1.5},             # Patients/month
            "dropout_rate": 0.15                                        # Probability
        },
        "patient_flow": {
            "visit_duration": {"low": 90, "mode": 180, "high": 365}  # Days
        }
    },
    "constraints": {
        "budget": {
            "budget_per_day": 50000,        # $/day available
            "min_speed_ratio": 0.5           # 0.5 = work can slow to 50% (2x longer)
        },
        "capacity": {
            "resource_id": "CRA",
            "use_degradation": True,         # True = efficiency degrades, False = queueing only
            "threshold": 0.8,                # Degradation starts at 80% utilization
            "max_multiplier": 2.0,           # 2x slower at max utilization
            "max_utilization": 1.5           # Max penalty at 150% utilization
        }
    },
    "simulation": {
        "num_runs": 100,                     # Monte Carlo iterations
        "master_seed": 42                    # For reproducibility
    }
}

# Optional: Load from JSON file instead
# with open('baseline_assumptions.json', 'r') as f:
#     assumptions = json.load(f)

print("✓ Baseline assumptions loaded")
print(f"\nTrial: {assumptions['trial']['target_enrollment']} patients")
print(f"Budget: ${assumptions['constraints']['budget']['budget_per_day']:,}/day")
print(f"Budget max slowdown: {1/assumptions['constraints']['budget']['min_speed_ratio']:.1f}x")
print(f"Capacity degradation: {'Yes' if assumptions['constraints']['capacity']['use_degradation'] else 'No'}")
if assumptions['constraints']['capacity']['use_degradation']:
    print(f"  Threshold: {assumptions['constraints']['capacity']['threshold']*100:.0f}%")
    print(f"  Max slowdown: {assumptions['constraints']['capacity']['max_multiplier']:.1f}x")
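
Values in this dict get edited by hand during the session, so a quick sanity check before running the engine can catch typos early. The following is a minimal sketch: `validate_assumptions` is a hypothetical helper, not part of `seleensim`, and the rules it applies (ordered triangular bounds, probabilities in [0, 1], a speed ratio in (0, 1]) are assumptions about what sensible inputs look like rather than documented engine requirements.

```python
def validate_assumptions(a):
    """Return a list of human-readable problems; an empty list means no issues found.

    Hypothetical helper: the rules below are assumptions, not engine requirements.
    """
    problems = []

    def check_triangular(name, spec):
        # A triangular distribution needs low <= mode <= high
        if not (spec["low"] <= spec["mode"] <= spec["high"]):
            problems.append(f"{name}: need low <= mode <= high, got {spec}")

    site = a["trial"]["site"]
    check_triangular("activation_time", site["activation_time"])
    check_triangular("visit_duration", a["trial"]["patient_flow"]["visit_duration"])

    if not (0.0 <= site["dropout_rate"] <= 1.0):
        problems.append("dropout_rate must be a probability in [0, 1]")

    budget = a["constraints"]["budget"]
    if not (0.0 < budget["min_speed_ratio"] <= 1.0):
        problems.append("min_speed_ratio must be in (0, 1]")
    if budget["budget_per_day"] <= 0:
        problems.append("budget_per_day must be positive")

    cap = a["constraints"]["capacity"]
    if cap["use_degradation"]:
        if not (0.0 <= cap["threshold"] < cap["max_utilization"]):
            problems.append("need 0 <= threshold < max_utilization")
        if cap["max_multiplier"] < 1.0:
            problems.append("max_multiplier must be >= 1")
    return problems


# Trimmed copy of the baseline dict, so the sketch runs on its own
example = {
    "trial": {
        "site": {
            "activation_time": {"low": 30, "mode": 45, "high": 90},
            "dropout_rate": 0.15,
        },
        "patient_flow": {"visit_duration": {"low": 90, "mode": 180, "high": 365}},
    },
    "constraints": {
        "budget": {"budget_per_day": 50000, "min_speed_ratio": 0.5},
        "capacity": {
            "use_degradation": True,
            "threshold": 0.8,
            "max_multiplier": 2.0,
            "max_utilization": 1.5,
        },
    },
}

print(validate_assumptions(example) or "No problems found")
```

In the notebook, calling `validate_assumptions(assumptions)` at the end of Cell 1 and Cell 2 would surface bad values before a simulation run is spent on them.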

---

## Cell 2: Modify Key Parameters

**What this does**: Changes the four key parameters (plus an optional degradation toggle) that SCRI wants to explore

**Instructions**: 
1. SCRI suggests new parameter values
2. Update the values below
3. Re-run this cell
4. Proceed to Cell 3 (run simulation)

**Key parameters** (change these based on SCRI feedback):

In [None]:
# 👉 SCRI: CHANGE THESE VALUES DURING SESSION

# Parameter 1: Budget constraint - how much can work slow down?
# 0.5 = 2x slower max, 0.2 = 5x slower max, 0.1 = 10x slower max
min_speed_ratio = 0.5

# Parameter 2: Daily budget available
# Typical range: 25,000 to 150,000
budget_per_day = 50000

# Parameter 3: Capacity degradation threshold
# At what utilization % does efficiency start degrading?
# 0.8 = 80%, 0.7 = 70%, 0.9 = 90%
capacity_threshold = 0.8

# Parameter 4: Capacity max degradation
# How much slower at maximum utilization?
# 2.0 = 2x slower, 3.0 = 3x slower, 5.0 = 5x slower
capacity_max_multiplier = 2.0

# Optional: Toggle capacity degradation on/off
use_capacity_degradation = True

# Update assumptions dict with new values
assumptions['constraints']['budget']['min_speed_ratio'] = min_speed_ratio
assumptions['constraints']['budget']['budget_per_day'] = budget_per_day
assumptions['constraints']['capacity']['threshold'] = capacity_threshold
assumptions['constraints']['capacity']['max_multiplier'] = capacity_max_multiplier
assumptions['constraints']['capacity']['use_degradation'] = use_capacity_degradation

print("✓ Parameters updated:")
print(f"\n  Budget:")
print(f"    Daily rate: ${budget_per_day:,}")
print(f"    Max slowdown: {1/min_speed_ratio:.1f}x")
print(f"\n  Capacity:")
if use_capacity_degradation:
    print(f"    Model: Linear degradation")
    print(f"    Threshold: {capacity_threshold*100:.0f}% utilization")
    print(f"    Max slowdown: {capacity_max_multiplier:.1f}x")
else:
    print(f"    Model: Queueing only (no degradation)")

print(f"\n  → Ready to run simulation")
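
To make the capacity parameters easier to discuss with SCRI, it can help to see how `threshold`, `max_multiplier`, and `max_utilization` might combine into a single slowdown factor. The sketch below assumes a piecewise-linear shape: no slowdown up to the threshold, a linear ramp up to `max_multiplier` at `max_utilization`, and a cap beyond that. This is one plausible reading of the parameter comments in Cell 1, not the confirmed behavior of `LinearCapacityDegradation`; both function names are hypothetical.

```python
def capacity_multiplier(utilization, threshold=0.8, max_multiplier=2.0, max_utilization=1.5):
    """Duration multiplier for a given utilization (assumed piecewise-linear shape)."""
    if utilization <= threshold:
        return 1.0  # Below the threshold: no efficiency loss
    if utilization >= max_utilization:
        return max_multiplier  # Penalty is capped at max_utilization
    # Linear ramp between (threshold, 1.0) and (max_utilization, max_multiplier)
    fraction = (utilization - threshold) / (max_utilization - threshold)
    return 1.0 + fraction * (max_multiplier - 1.0)


def max_slowdown(min_speed_ratio):
    """Worst-case duration multiplier implied by a minimum speed ratio."""
    return 1.0 / min_speed_ratio


for u in (0.5, 0.8, 1.15, 1.5, 2.0):
    print(f"utilization {u:.2f} -> {capacity_multiplier(u):.2f}x duration")
print(f"min_speed_ratio 0.2 -> up to {max_slowdown(0.2):.0f}x slower")
```

Printing a few points like this before each run gives SCRI a concrete picture of what a proposed threshold or multiplier means.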

---

## Cell 3: Run Simulation

**What this does**: Builds the trial and constraints from the parameters, then runs the Monte Carlo simulation

**Instructions**: Run this after Cell 2 to see impact of parameter changes

**Expected runtime**: ~5-10 seconds for 100 runs

In [None]:
# Build trial from assumptions
site = Site(
    site_id="SITE_001",
    activation_time=Triangular(**assumptions['trial']['site']['activation_time']),
    enrollment_rate=Gamma(**assumptions['trial']['site']['enrollment_rate']),
    dropout_rate=Bernoulli(p=assumptions['trial']['site']['dropout_rate'])
)

flow = PatientFlow(
    flow_id="STANDARD_FLOW",
    states={"enrolled", "completed"},
    initial_state="enrolled",
    terminal_states={"completed"},
    transition_times={
        ("enrolled", "completed"): Triangular(**assumptions['trial']['patient_flow']['visit_duration'])
    }
)

trial = Trial(
    trial_id=assumptions['trial']['trial_id'],
    target_enrollment=assumptions['trial']['target_enrollment'],
    sites=[site],
    patient_flow=flow
)

# Build constraints from assumptions
budget_constraint = BudgetThrottlingConstraint(
    budget_per_day=assumptions['constraints']['budget']['budget_per_day'],
    response_curve=LinearResponseCurve(
        min_speed_ratio=assumptions['constraints']['budget']['min_speed_ratio']
    )
)

if assumptions['constraints']['capacity']['use_degradation']:
    capacity_response = LinearCapacityDegradation(
        threshold=assumptions['constraints']['capacity']['threshold'],
        max_multiplier=assumptions['constraints']['capacity']['max_multiplier'],
        max_utilization=assumptions['constraints']['capacity']['max_utilization']
    )
else:
    capacity_response = NoCapacityDegradation()

capacity_constraint = ResourceCapacityConstraint(
    resource_id=assumptions['constraints']['capacity']['resource_id'],
    capacity_response=capacity_response
)

constraints = [budget_constraint, capacity_constraint]

# Run simulation
print("Running simulation...")
engine = SimulationEngine(
    master_seed=assumptions['simulation']['master_seed'],
    constraints=constraints
)
results = engine.run(trial, num_runs=assumptions['simulation']['num_runs'])

print("\n✓ Simulation complete")
print(f"\nResults ({assumptions['simulation']['num_runs']} runs):")
print(f"  P10 completion: {results.completion_time_p10:.1f} days")
print(f"  P50 completion: {results.completion_time_p50:.1f} days")
print(f"  P90 completion: {results.completion_time_p90:.1f} days")
print(f"  Mean events rescheduled: {results.mean_events_rescheduled:.1f}")
print(f"\n  → Proceed to Cell 4 to visualize")
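
The P10/P50/P90 figures above are percentiles taken across the per-run completion times. The sketch below shows that summary step on synthetic data, using only the standard library so it runs outside the notebook. The triangular draws are a stand-in for engine output, `summarize` is a hypothetical helper, and whether `results.completion_time_p10` and its siblings are computed this way inside `seleensim` is an assumption.

```python
import random
import statistics


def summarize(completion_times):
    """Return P10/P50/P90 and the P10-P90 spread for a list of run durations."""
    # method="inclusive" interpolates linearly between sorted data points,
    # which is also NumPy's default percentile behavior
    deciles = statistics.quantiles(completion_times, n=10, method="inclusive")
    p10, p50, p90 = deciles[0], deciles[4], deciles[8]
    return {"p10": p10, "p50": p50, "p90": p90, "spread": p90 - p10}


# Synthetic stand-in for 100 Monte Carlo runs (NOT real simulation output)
rng = random.Random(42)
synthetic_times = [rng.triangular(300, 900, 450) for _ in range(100)]

summary = summarize(synthetic_times)
print({name: round(value, 1) for name, value in summary.items()})
```

With only 100 runs, the tail percentiles rest on a handful of samples, so small P10 or P90 shifts between iterations may not be meaningful.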

---

## Cell 4: Visualize Outputs

**What this does**: Simple plots showing completion time distribution and key percentiles

**Instructions**: 
1. Review plots with SCRI
2. Ask: "Does this match your experience?"
3. If not, return to Cell 2, adjust parameters, re-run
4. Repeat until SCRI says "That feels about right"

In [None]:
# Create figure with 2 subplots
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Histogram of completion times
axes[0].hist(results.completion_times, bins=30, alpha=0.7, edgecolor='black', color='steelblue')
axes[0].axvline(results.completion_time_p10, color='green', linestyle='--', linewidth=2, label='P10')
axes[0].axvline(results.completion_time_p50, color='red', linestyle='--', linewidth=2, label='P50')
axes[0].axvline(results.completion_time_p90, color='orange', linestyle='--', linewidth=2, label='P90')
axes[0].set_xlabel('Completion Time (days)', fontsize=12)
axes[0].set_ylabel('Frequency', fontsize=12)
axes[0].set_title('Trial Completion Time Distribution', fontsize=14, fontweight='bold')
axes[0].legend(fontsize=10)
axes[0].grid(alpha=0.3)

# Plot 2: Box plot summary
box_data = [results.completion_times]
bp = axes[1].boxplot(box_data, widths=0.5, patch_artist=True)
bp['boxes'][0].set_facecolor('steelblue')
bp['boxes'][0].set_alpha(0.7)
axes[1].set_ylabel('Completion Time (days)', fontsize=12)
axes[1].set_title('Distribution Summary', fontsize=14, fontweight='bold')
axes[1].set_xticklabels(['Trial'])
axes[1].grid(alpha=0.3, axis='y')

# Add text annotations
textstr = f"P10: {results.completion_time_p10:.1f} days\n"
textstr += f"P50: {results.completion_time_p50:.1f} days\n"
textstr += f"P90: {results.completion_time_p90:.1f} days\n"
textstr += f"Events rescheduled: {results.mean_events_rescheduled:.1f}"
axes[1].text(0.02, 0.98, textstr, transform=axes[1].transAxes, fontsize=10,
             verticalalignment='top', bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

plt.tight_layout()
plt.show()

# Summary statistics
print("\n" + "="*50)
print("SUMMARY STATISTICS")
print("="*50)
print(f"\nCompletion Time (days):")
print(f"  Min:  {results.completion_times.min():.1f}")
print(f"  P10:  {results.completion_time_p10:.1f}")
print(f"  P25:  {np.percentile(results.completion_times, 25):.1f}")
print(f"  P50:  {results.completion_time_p50:.1f}")
print(f"  P75:  {np.percentile(results.completion_times, 75):.1f}")
print(f"  P90:  {results.completion_time_p90:.1f}")
print(f"  Max:  {results.completion_times.max():.1f}")
print(f"\nRange:")
print(f"  P10-P90: {results.completion_time_p90 - results.completion_time_p10:.1f} days")
print(f"  Std dev: {results.completion_times.std():.1f} days")
print(f"\nConstraint Impact:")
print(f"  Events rescheduled: {results.mean_events_rescheduled:.1f} (avg)")
print("\n" + "="*50)
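
Before returning to Cell 2, it is worth capturing the current iteration so the Iteration History table in the Session Notes can be filled in afterwards. A minimal sketch, assuming a hypothetical `history_row` helper that formats one Markdown table row in that table's column order; the P50 value and feedback text in the example are placeholders, not real results.

```python
def history_row(iteration, params, p50_days, feedback=""):
    """Format one row for the Iteration History table (hypothetical helper)."""
    cells = [
        str(iteration),
        f"{params['min_speed_ratio']:g}",
        f"{params['budget_per_day']:,}",
        f"{params['capacity_threshold']:g}",
        f"{params['capacity_max_multiplier']:g}",
        f"{p50_days:.1f}",
        feedback,
    ]
    return "| " + " | ".join(cells) + " |"


history = []  # One formatted row per iteration, in order

# Placeholder values for illustration only
history.append(
    history_row(
        1,
        {
            "min_speed_ratio": 0.5,
            "budget_per_day": 50000,
            "capacity_threshold": 0.8,
            "capacity_max_multiplier": 2.0,
        },
        p50_days=480.0,
        feedback="placeholder feedback",
    )
)
print("\n".join(history))
```

In the notebook, the `params` dict would be built from the Cell 2 variables and `p50_days` from `results.completion_time_p50`.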

---

## Iteration Questions for SCRI

After viewing results, ask SCRI:

1. **Does this match your experience?**
   - Is P50 too high? Too low? About right?
   - Is the range (P10-P90) realistic?

2. **What would you change?**
   - Budget slowdown: More aggressive (lower `min_speed_ratio`)? Less aggressive?
   - Capacity threshold: Should degradation start earlier or later?
   - Max slowdown: 2x feels right? Or more like 3x? 5x?

3. **Direction test:**
   - If we increase max slowdown, should P90 increase? (Yes)
   - If we increase budget, should P90 decrease? (Yes)
   - Does this make intuitive sense to you?

**If SCRI says "adjust X":**
- Go back to Cell 2
- Change parameter X
- Re-run Cell 2 → Cell 3 → Cell 4
- Repeat until convergence
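
The direction test above can also be checked mechanically: sweep one parameter, collect P90 for each value, and confirm the sequence moves the expected way. The sketch below uses a toy formula in place of re-running Cell 3, so `toy_p90` and its constants are purely illustrative assumptions; only `is_monotonic` would carry over to real results.

```python
def is_monotonic(values, direction):
    """True if values never move against the given direction."""
    pairs = list(zip(values, values[1:]))
    if direction == "increasing":
        return all(later >= earlier for earlier, later in pairs)
    if direction == "decreasing":
        return all(later <= earlier for earlier, later in pairs)
    raise ValueError("direction must be 'increasing' or 'decreasing'")


def toy_p90(max_multiplier, base_days=400.0, congested_share=0.3):
    """Toy stand-in for Cell 3: a fixed share of the work takes max_multiplier times longer."""
    return base_days * ((1.0 - congested_share) + congested_share * max_multiplier)


sweep = [2.0, 3.0, 5.0]  # The max slowdown values suggested in the questions above
p90_by_multiplier = [toy_p90(m) for m in sweep]
passes = is_monotonic(p90_by_multiplier, "increasing")
print("Direction test passed" if passes else "Direction test FAILED")
```

With real Monte Carlo output, a small reversal between neighboring sweep values may be sampling noise rather than a modeling problem, so a failed check is a prompt to look closer, not a verdict.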

---

## Save Calibrated Parameters

**When to use**: After SCRI says "That feels about right"

**What this does**: Saves final parameter values to JSON file

In [None]:
# Capture final calibrated parameters
calibrated = {
    "session_date": datetime.now().isoformat(),
    "status": "SCRI_APPROVED",  # Change this if not approved
    "parameters": assumptions,
    "results": {
        "p10_days": float(results.completion_time_p10),
        "p50_days": float(results.completion_time_p50),
        "p90_days": float(results.completion_time_p90),
        "mean_events_rescheduled": float(results.mean_events_rescheduled),
        "std_dev_days": float(results.completion_times.std())
    },
    "notes": "Add SCRI feedback here"
}

# Save to file
filename = f"calibrated_params_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
with open(filename, 'w') as f:
    json.dump(calibrated, f, indent=2)

print(f"✓ Calibrated parameters saved to: {filename}")
print("\nFinal parameter values:")
print(f"\n  Budget:")
print(f"    Daily rate: ${calibrated['parameters']['constraints']['budget']['budget_per_day']:,}")
print(f"    Max slowdown: {1/calibrated['parameters']['constraints']['budget']['min_speed_ratio']:.1f}x")
print(f"\n  Capacity:")
if calibrated['parameters']['constraints']['capacity']['use_degradation']:
    print(f"    Model: Linear degradation")
    print(f"    Threshold: {calibrated['parameters']['constraints']['capacity']['threshold']*100:.0f}%")
    print(f"    Max slowdown: {calibrated['parameters']['constraints']['capacity']['max_multiplier']:.1f}x")
else:
    print(f"    Model: Queueing only")
print(f"\n  Results:")
print(f"    P50: {calibrated['results']['p50_days']:.1f} days")
print(f"    P90: {calibrated['results']['p90_days']:.1f} days")
print(f"\n  ✓ Session complete")
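
Since the saved file is the main artifact of the session, a quick round-trip check confirms it can be read back unchanged. A minimal sketch, assuming a hypothetical `save_and_reload` helper and a trimmed record whose `p50_days` value is a placeholder rather than a real result; it writes to a temporary directory so it leaves no files behind.

```python
import json
import os
import tempfile
from datetime import datetime


def save_and_reload(record, directory):
    """Write record as timestamped JSON, then read it back (hypothetical helper)."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = os.path.join(directory, f"calibrated_params_{stamp}.json")
    with open(path, "w") as f:
        json.dump(record, f, indent=2)
    with open(path) as f:
        return path, json.load(f)


# Trimmed record; p50_days is a placeholder, not a simulation result
record = {
    "status": "SCRI_APPROVED",
    "parameters": {
        "constraints": {"budget": {"budget_per_day": 50000, "min_speed_ratio": 0.5}}
    },
    "results": {"p50_days": 0.0},
}

with tempfile.TemporaryDirectory() as tmp:
    path, reloaded = save_and_reload(record, tmp)
    round_trip_ok = reloaded == record

print("Round trip OK" if round_trip_ok else "Round trip MISMATCH")
```

The comparison works here because the record holds only JSON-native types (dicts, strings, numbers, booleans); tuples or NumPy scalars would need converting first, which is why the cell above wraps the results in `float(...)`.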

---

## Session Notes

**Document SCRI feedback here:**

### Iteration History
| Iteration | min_speed_ratio | budget_per_day | capacity_threshold | capacity_max_mult | P50 (days) | SCRI Feedback |
|-----------|----------------|----------------|-------------------|-------------------|------------|---------------|
| 1         |                |                |                   |                   |            |               |
| 2         |                |                |                   |                   |            |               |
| 3         |                |                |                   |                   |            |               |
| FINAL     |                |                |                   |                   |            | ✓ Approved    |

### Key Insights
- What did SCRI find surprising?
- What matched their intuition?
- What parameters were most sensitive?
- What's missing from the model?

### Next Steps
- Architecture validated? Yes / No / Partial
- Gaps identified:
- Follow-up needed? Yes / No

---

**End of live calibration session**