# Baseline Calibration Session

**Purpose**: Live parameter manipulation for SCRI calibration

**This is a throwaway notebook** - designed for a one-time SCRI session, not for production use.

---

## Setup

In [None]:
# Import from existing engine (READ-ONLY)
import sys
sys.path.insert(0, '../..')

from seleensim.entities import Site, Trial, PatientFlow
from seleensim.distributions import Triangular, Gamma, Bernoulli
from seleensim.simulation import SimulationEngine
from seleensim.constraints import (
    BudgetThrottlingConstraint,
    ResourceCapacityConstraint,
    LinearResponseCurve,
    LinearCapacityDegradation,
    NoCapacityDegradation
)

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Define Baseline Trial

**Question for SCRI**: Do these distributions feel right?
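
To ground that question, the summary statistics implied by the baseline parameters can be checked with plain NumPy before running the engine. This is a minimal sketch, assuming `seleensim`'s `Triangular` and `Gamma` use the standard parameterizations (low/mode/high and shape/scale); it does not call the engine itself.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the check is repeatable
n = 100_000

# Same parameter values as the baseline site, sampled with NumPy directly.
# ASSUMPTION: seleensim uses the standard triangular and gamma parameterizations.
activation_days = rng.triangular(left=30, mode=45, right=90, size=n)
enrollment_per_month = rng.gamma(shape=2, scale=1.5, size=n)

# Closed-form means for comparison.
triangular_mean = (30 + 45 + 90) / 3  # mean of a triangular distribution
gamma_mean = 2 * 1.5                  # mean of a gamma distribution = shape * scale

print(f"Activation: mean {activation_days.mean():.1f} days (theory {triangular_mean:.1f})")
print(f"Enrollment: mean {enrollment_per_month.mean():.2f}/month (theory {gamma_mean:.2f})")
```

Quoting the means (about 55 days to activate, about 3 patients per month) is often an easier prompt for SCRI than the raw distribution parameters.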

In [None]:
# Baseline site configuration
site = Site(
    site_id="SITE_001",
    activation_time=Triangular(low=30, mode=45, high=90),  # Days to activate
    enrollment_rate=Gamma(shape=2, scale=1.5),              # Patients per month
    dropout_rate=Bernoulli(p=0.15)                          # 15% dropout
)

# Patient flow (simple: enrolled → completed)
flow = PatientFlow(
    flow_id="STANDARD_FLOW",
    states={"enrolled", "completed"},
    initial_state="enrolled",
    terminal_states={"completed"},
    transition_times={
        ("enrolled", "completed"): Triangular(low=90, mode=180, high=365)
    }
)

# Trial
trial = Trial(
    trial_id="BASELINE_TRIAL",
    target_enrollment=200,
    sites=[site],
    patient_flow=flow
)

print("Baseline trial configured")
print(f"Target enrollment: {trial.target_enrollment}")
print(f"Number of sites: {len(trial.sites)}")

## SCRI Exercise 1: Budget Constraint

**Question**: When budget is tight, how much does work slow down?

### Change these parameters and re-run:

In [None]:
# 👉 SCRI: CHANGE THESE VALUES
budget_per_day = 50000        # Daily budget available
min_speed_ratio = 0.5         # 0.5 = work can slow to 50% speed (2x longer)
                              # 0.2 = work can slow to 20% speed (5x longer)

# Create constraint with your parameters
budget_constraint = BudgetThrottlingConstraint(
    budget_per_day=budget_per_day,
    response_curve=LinearResponseCurve(min_speed_ratio=min_speed_ratio)
)

print("Budget constraint configured:")
print(f"  Daily budget: ${budget_per_day:,.0f}")
print(f"  Max slowdown: {1/min_speed_ratio:.1f}x")

### Run simulation with budget constraint:

In [None]:
# Run simulation
engine = SimulationEngine(master_seed=42, constraints=[budget_constraint])
results = engine.run(trial, num_runs=100)

# Show results
print("\nSimulation Results (with budget constraint):")
print(f"  P10 completion: {results.completion_time_p10:.1f} days")
print(f"  P50 completion: {results.completion_time_p50:.1f} days")
print(f"  P90 completion: {results.completion_time_p90:.1f} days")
print(f"  Mean events rescheduled: {results.mean_events_rescheduled:.1f}")

# Visualize
plt.figure(figsize=(10, 4))
plt.hist(results.completion_times, bins=30, alpha=0.7, edgecolor='black')
plt.axvline(results.completion_time_p50, color='red', linestyle='--', label='P50')
plt.axvline(results.completion_time_p90, color='orange', linestyle='--', label='P90')
plt.xlabel('Completion Time (days)')
plt.ylabel('Frequency')
plt.title('Trial Completion Time Distribution')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

**Ask SCRI**: 
- Does this feel right? Too fast? Too slow?
- Should max slowdown be more like 3x? 5x? 10x?
- Go back and change `min_speed_ratio` above, then re-run
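
If SCRI answers in terms of "how many times longer", that number can be converted to `min_speed_ratio` with the same reciprocal relationship the notebook prints (`1/min_speed_ratio`). A small sketch; the helper names are hypothetical and not part of `seleensim`:

```python
def max_slowdown(min_speed_ratio: float) -> float:
    """Return the worst-case slowdown factor implied by a minimum speed ratio."""
    if not 0 < min_speed_ratio <= 1:
        raise ValueError("min_speed_ratio must be in (0, 1]")
    return 1.0 / min_speed_ratio


def min_speed_ratio_for(slowdown: float) -> float:
    """Return the min_speed_ratio that yields the given worst-case slowdown."""
    if slowdown < 1:
        raise ValueError("slowdown must be >= 1 (1 means no slowdown)")
    return 1.0 / slowdown


# Candidate answers from SCRI, expressed as "work takes N times longer".
for target in (2, 3, 5, 10):
    print(f"{target:>2}x slowdown -> min_speed_ratio = {min_speed_ratio_for(target):.3f}")
```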

## SCRI Exercise 2: Capacity Constraint

**Question**: When you have 1 CRA for 10 sites vs 1 for 5 sites, what happens?

### Change these parameters and re-run:

In [None]:
# 👉 SCRI: CHANGE THESE VALUES

# Model choice:
#   False = Option A: no degradation, queueing only (the conservative option)
#   True  = Option B: linear degradation, using the parameters below
use_degradation = True

# Option B: Linear degradation parameters (ignored when use_degradation is False)
threshold = 0.8               # Degradation starts at 80% utilization
max_multiplier = 2.0          # Work becomes 2x slower at max
max_utilization = 1.5         # Max penalty at 150% utilization

# Create constraint
if use_degradation:
    capacity_response = LinearCapacityDegradation(
        threshold=threshold,
        max_multiplier=max_multiplier,
        max_utilization=max_utilization
    )
    print("Using LINEAR DEGRADATION model")
    print(f"  Degradation starts: {threshold*100:.0f}% utilization")
    print(f"  Max slowdown: {max_multiplier:.1f}x")
else:
    capacity_response = NoCapacityDegradation()
    print("Using QUEUEING-ONLY model (no degradation)")

capacity_constraint = ResourceCapacityConstraint(
    resource_id="CRA",
    capacity_response=capacity_response
)

### Run simulation with capacity constraint:

In [None]:
# Run simulation
engine = SimulationEngine(master_seed=42, constraints=[capacity_constraint])
results = engine.run(trial, num_runs=100)

# Show results
print("\nSimulation Results (with capacity constraint):")
print(f"  P10 completion: {results.completion_time_p10:.1f} days")
print(f"  P50 completion: {results.completion_time_p50:.1f} days")
print(f"  P90 completion: {results.completion_time_p90:.1f} days")
print(f"  Mean events rescheduled: {results.mean_events_rescheduled:.1f}")

# Visualize
plt.figure(figsize=(10, 4))
plt.hist(results.completion_times, bins=30, alpha=0.7, edgecolor='black')
plt.axvline(results.completion_time_p50, color='red', linestyle='--', label='P50')
plt.axvline(results.completion_time_p90, color='orange', linestyle='--', label='P90')
plt.xlabel('Completion Time (days)')
plt.ylabel('Frequency')
plt.title('Trial Completion Time Distribution')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

**Ask SCRI**:
- Does efficiency degrade, or does work just queue up?
- If it degrades, at what utilization % does it start?
- Go back and change parameters above, then re-run
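
To make the three degradation parameters concrete during the discussion, the curve they describe can be tabulated by hand. The sketch below is an assumption about how `LinearCapacityDegradation` interprets `threshold`, `max_multiplier`, and `max_utilization`, inferred from the comments in the parameter cell; the function is hypothetical and the engine's actual formula may differ.

```python
def linear_degradation_multiplier(
    utilization: float,
    threshold: float = 0.8,
    max_multiplier: float = 2.0,
    max_utilization: float = 1.5,
) -> float:
    """Duration multiplier under an ASSUMED piecewise-linear degradation curve."""
    if utilization <= threshold:
        return 1.0  # below the threshold: no slowdown
    if utilization >= max_utilization:
        return max_multiplier  # penalty is capped at max_utilization
    # Interpolate linearly between (threshold, 1.0) and (max_utilization, max_multiplier).
    fraction = (utilization - threshold) / (max_utilization - threshold)
    return 1.0 + fraction * (max_multiplier - 1.0)


for u in (0.5, 0.8, 1.0, 1.15, 1.5, 2.0):
    print(f"utilization {u:4.0%} -> work takes {linear_degradation_multiplier(u):.2f}x as long")
```

Reading a few rows aloud (for example, "at 115% utilization work takes 1.5x as long") gives SCRI something specific to accept or correct.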

## SCRI Exercise 3: Combined Constraints

**Question**: When BOTH budget AND capacity are constrained, what happens?

In [None]:
# Use the parameters you configured above
constraints = [budget_constraint, capacity_constraint]

# Run simulation
engine = SimulationEngine(master_seed=42, constraints=constraints)
results = engine.run(trial, num_runs=100)

# Show results
print("\nSimulation Results (with BOTH constraints):")
print(f"  P10 completion: {results.completion_time_p10:.1f} days")
print(f"  P50 completion: {results.completion_time_p50:.1f} days")
print(f"  P90 completion: {results.completion_time_p90:.1f} days")
print(f"  Mean events rescheduled: {results.mean_events_rescheduled:.1f}")

# Visualize
plt.figure(figsize=(10, 4))
plt.hist(results.completion_times, bins=30, alpha=0.7, edgecolor='black')
plt.axvline(results.completion_time_p50, color='red', linestyle='--', label='P50')
plt.axvline(results.completion_time_p90, color='orange', linestyle='--', label='P90')
plt.xlabel('Completion Time (days)')
plt.ylabel('Frequency')
plt.title('Trial Completion Time Distribution (Combined Constraints)')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

**Ask SCRI**:
- Is this more realistic than individual constraints?
- Do constraints interact in expected ways?
- Does the combined effect feel right?
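
When judging whether a result feels right, the spread between P10 and P90 is often as informative as the median. A minimal sketch of computing both from raw completion times, assuming `results.completion_times` is an array-like of durations in days; the data here is a synthetic placeholder, not engine output.

```python
import numpy as np

rng = np.random.default_rng(42)

# PLACEHOLDER data standing in for results.completion_times; NOT engine output.
completion_times = rng.triangular(left=300, mode=400, right=700, size=100)

# Percentiles and the P10-to-P90 spread, all in days.
p10, p50, p90 = np.percentile(completion_times, [10, 50, 90])
spread = p90 - p10

print(f"P10 {p10:.1f} | P50 {p50:.1f} | P90 {p90:.1f} days")
print(f"P10-P90 spread: {spread:.1f} days ({spread / p50:.0%} of the median)")
```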

## Comparison: Baseline vs Constrained

Run an unconstrained simulation for comparison:

In [None]:
# Unconstrained baseline
engine_baseline = SimulationEngine(master_seed=42, constraints=[])
results_baseline = engine_baseline.run(trial, num_runs=100)

# Constrained
engine_constrained = SimulationEngine(master_seed=42, constraints=[budget_constraint, capacity_constraint])
results_constrained = engine_constrained.run(trial, num_runs=100)

# Compare
comparison = pd.DataFrame({
    'Metric': ['P10 (days)', 'P50 (days)', 'P90 (days)', 'Events rescheduled'],
    'Unconstrained': [
        results_baseline.completion_time_p10,
        results_baseline.completion_time_p50,
        results_baseline.completion_time_p90,
        results_baseline.mean_events_rescheduled
    ],
    'Constrained': [
        results_constrained.completion_time_p10,
        results_constrained.completion_time_p50,
        results_constrained.completion_time_p90,
        results_constrained.mean_events_rescheduled
    ]
})

comparison['Difference'] = comparison['Constrained'] - comparison['Unconstrained']
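# Treat a zero baseline (e.g. no events rescheduled) as NaN so '% Change' is blank rather than inf
comparison['% Change'] = (comparison['Difference'] / comparison['Unconstrained'].replace(0, np.nan) * 100).round(1)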

print("\nComparison: Unconstrained vs Constrained")
print(comparison.to_string(index=False))

# Visualize side-by-side
fig, axes = plt.subplots(1, 2, figsize=(14, 4))

# Unconstrained
axes[0].hist(results_baseline.completion_times, bins=30, alpha=0.7, edgecolor='black')
axes[0].axvline(results_baseline.completion_time_p50, color='red', linestyle='--')
axes[0].set_xlabel('Completion Time (days)')
axes[0].set_ylabel('Frequency')
axes[0].set_title('Unconstrained')
axes[0].grid(alpha=0.3)

# Constrained
axes[1].hist(results_constrained.completion_times, bins=30, alpha=0.7, edgecolor='black', color='orange')
axes[1].axvline(results_constrained.completion_time_p50, color='red', linestyle='--')
axes[1].set_xlabel('Completion Time (days)')
axes[1].set_ylabel('Frequency')
axes[1].set_title('Constrained (Budget + Capacity)')
axes[1].grid(alpha=0.3)

plt.tight_layout()
plt.show()

## Save Calibrated Parameters

Once SCRI says "That feels about right", save the parameters:

In [None]:
import json
from datetime import datetime

# Capture final parameter choices
calibrated_params = {
    "session_date": datetime.now().isoformat(),
    "trial_id": trial.trial_id,
    "budget_constraint": {
        "budget_per_day": budget_per_day,
        "response_curve": {
            "type": "LinearResponseCurve",
            "min_speed_ratio": min_speed_ratio,
            "max_speed_ratio": 1.0
        }
    },
    "capacity_constraint": {
        "resource_id": "CRA",
        "capacity_response": {
            "type": "LinearCapacityDegradation" if use_degradation else "NoCapacityDegradation",
            "threshold": threshold if use_degradation else None,
            "max_multiplier": max_multiplier if use_degradation else None,
            "max_utilization": max_utilization if use_degradation else None
        }
    },
    "results": {
        "p10_days": results_constrained.completion_time_p10,
        "p50_days": results_constrained.completion_time_p50,
        "p90_days": results_constrained.completion_time_p90,
        "mean_events_rescheduled": results_constrained.mean_events_rescheduled
    }
}

# Save to file
filename = f"scri_calibrated_params_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
with open(filename, 'w') as f:
    json.dump(calibrated_params, f, indent=2)

print(f"\nCalibrated parameters saved to: {filename}")
print("\nFinal parameter values:")
print(json.dumps(calibrated_params, indent=2))

---

## Session Notes

**Document SCRI feedback here**:

### Budget Constraint
- Initial guess: min_speed_ratio = ?
- SCRI feedback: ?
- Final value: ?
- Reasoning: ?

### Capacity Constraint
- Degradation model: Yes / No
- If yes, threshold = ?
- SCRI feedback: ?
- Reasoning: ?

### Gaps Identified
- What's missing from the model?
- What features did SCRI request?
- What assumptions don't match reality?

### Next Steps
- Architecture validated? Yes / No / Partial
- If gaps: What needs to be added?
- Follow-up session needed? Yes / No