# Module 5: Machine Learning for ED Triage Optimization
## Production-Grade Predictive Analytics

---

### Strategic Question: Which Prediction Target(s) Should We Optimize?

The project allows any one of three targets: "Probability of Admission" **OR** "Probability of LWBS" (left without being seen) **OR** "Remaining Time to PIA" (physician initial assessment).

**Our Analysis:**

| Target | Operational Value | Data Quality | Actionability | Recommendation |
|--------|-------------------|--------------|---------------|----------------|
| **Admission** | HIGH: enables bed planning, early consults | Good signal (14% rate) | Immediate action possible | ✅ PRIMARY |
| **LWBS** | HIGH: prevents patient safety issues | Weak signal (1.5% rate) | Proactive monitoring | ✅ SECONDARY |
| **PIA Time** | MEDIUM: patient communication | Weak signal (R² < 0.20) | Limited actionability | ⚠️ OPTIONAL |

**Decision: Build TWO optimized models (Admission + LWBS)**

**Rationale:**
1. Both targets drive different but complementary decisions
2. They can run in parallel with minimal latency
3. Combined, they give the Triage Lead a complete risk picture

---

## Section 1: Environment Setup

In [1]:
# =============================================================================
# CELL 1: IMPORTS
# =============================================================================

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from IPython.display import display, HTML
import warnings
warnings.filterwarnings('ignore')

# ML Core
from sklearn.model_selection import (
    train_test_split, cross_val_score, StratifiedKFold, 
    RandomizedSearchCV, GridSearchCV
)
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer

# Models to Compare
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier

# Try to import XGBoost and LightGBM (may need installation)
try:
    from xgboost import XGBClassifier
    XGBOOST_AVAILABLE = True
    print("✓ XGBoost available")
except ImportError:
    XGBOOST_AVAILABLE = False
    print("⚠ XGBoost not available - using GradientBoosting instead")

try:
    from lightgbm import LGBMClassifier
    LIGHTGBM_AVAILABLE = True
    print("✓ LightGBM available")
except ImportError:
    LIGHTGBM_AVAILABLE = False
    print("⚠ LightGBM not available - using GradientBoosting instead")

# Metrics
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    roc_auc_score, roc_curve, precision_recall_curve,
    confusion_matrix, classification_report, brier_score_loss,
    average_precision_score
)

# For calibration
from sklearn.calibration import CalibratedClassifierCV, calibration_curve

print("\n✓ All core imports successful")

✓ XGBoost available
✓ LightGBM available

✓ All core imports successful


In [2]:
# =============================================================================
# CELL 2: LOAD DEPENDENCIES
# =============================================================================

%run ../utils/helpers.ipynb
%run data_loader.ipynb

NTH-ED DATA LOADING PIPELINE
📥 Loading event log...
   ✓ Loaded 90,965 events
   ✓ 16,011 unique patient visits
   ✓ Columns: ['visit_id', 'patient_id', 'initial_zone', 'age', 'month', 'day', 'gender', 'triage_code', 'triage_desc', 'disposition_code', 'disposition_desc', 'consult_desc', 'cdu_flag', 'consult_req_flag', 'consult_arrival_flag', 'event', 'timestamp']

⏰ Parsing timestamps...
   ✓ All timestamps parsed successfully
   ✓ Date range: 2021-03-31 23:59:00 to 2021-06-01 17:16:00

🔧 Handling missing data...
   • initial_zone: 1,950 missing (2.1%)
   • triage_code: 3 missing (0.0%)
   • consult_desc: 69,698 missing (76.6%)
   • age: 0 missing (0.0%)
   ✓ Missing zones filled with 'Unknown'
   ✓ Missing consults marked as 'No Consult'

📋 Standardizing columns for process mining...
   ✓ Created process mining columns: case_id, activity, resource
   ✓ Created outcome flags: is_admitted, is_lwbs

🔀 Sorting events with logical ordering...
   ✓ 

In [3]:
# =============================================================================
# CELL 3: LOAD DATA
# =============================================================================

filepath = "/Users/ishaandawra/Desktop/Machine Learning Notes/Machine Learning Projects/Analytics_Colloquia_Project/data/event_log_ED_MMA_2026.csv"
event_log, visits = load_and_prepare_data(filepath)

print(f"\n{'='*60}")
print("DATA OVERVIEW")
print(f"{'='*60}")
print(f"Total visits: {len(visits):,}")
print(f"\nTarget Variables:")
print(f"  • Admission rate: {visits['is_admitted'].mean()*100:.1f}% ({visits['is_admitted'].sum():,} cases)")
print(f"  • LWBS rate: {visits['is_lwbs'].mean()*100:.2f}% ({visits['is_lwbs'].sum():,} cases)")
print(f"  • Median PIA: {visits['pia_minutes'].median():.0f} minutes")


---
## Section 2: Feature Engineering (Primary Driver of Performance)

**Philosophy:** Features > Algorithms. Good features make simple models work; bad features make complex models fail.

**Feature Categories:**
1. **Patient Features**: Demographics, triage assessment
2. **Temporal Features**: Time of arrival, day patterns
3. **Operational Context**: Historical zone-hour patterns (KEY INNOVATION)
4. **Interaction Features**: Domain-driven combinations

---

In [4]:
# =============================================================================
# CELL 4: COMPREHENSIVE FEATURE ENGINEERING
# =============================================================================

def engineer_features_production(visits: pd.DataFrame) -> pd.DataFrame:
    """
    Production-grade feature engineering for ED triage prediction.
    
    CRITICAL: Only uses features available AT TRIAGE TIME.
    No leakage from future events (assessment, discharge, etc.)
    
    Returns:
        DataFrame with engineered features
    """
    
    df = visits.copy()
    
    print("Engineering features...")
    
    # =========================================================================
    # 1. PATIENT DEMOGRAPHIC FEATURES
    # =========================================================================
    
    # Age transformations
    df['age_scaled'] = df['age'] / 100  # Roughly 0-1 (ages above 100 exceed 1)
    df['age_squared'] = (df['age'] / 100) ** 2  # Capture non-linear age effects
    df['is_pediatric'] = (df['age'] < 18).astype(int)
    df['is_young_adult'] = ((df['age'] >= 18) & (df['age'] < 40)).astype(int)
    df['is_middle_age'] = ((df['age'] >= 40) & (df['age'] < 65)).astype(int)
    df['is_senior'] = (df['age'] >= 65).astype(int)
    df['is_elderly'] = (df['age'] >= 80).astype(int)
    
    # Gender
    df['is_male'] = (df['gender'] == 'M').astype(int)
    
    print("  ✓ Demographic features")
    
    # =========================================================================
    # 2. TRIAGE ASSESSMENT FEATURES (Most Predictive)
    # =========================================================================
    
    df['triage_code_clean'] = df['triage_code'].fillna(3)  # Default to URGENT
    
    # Acuity categories
    df['is_ctas_1'] = (df['triage_code_clean'] == 1).astype(int)  # Resuscitation
    df['is_ctas_2'] = (df['triage_code_clean'] == 2).astype(int)  # Emergent
    df['is_ctas_3'] = (df['triage_code_clean'] == 3).astype(int)  # Urgent
    df['is_ctas_4'] = (df['triage_code_clean'] == 4).astype(int)  # Less Urgent
    df['is_ctas_5'] = (df['triage_code_clean'] == 5).astype(int)  # Non-Urgent
    
    # Grouped acuity
    df['is_high_acuity'] = (df['triage_code_clean'] <= 2).astype(int)  # CTAS 1-2
    df['is_medium_acuity'] = (df['triage_code_clean'] == 3).astype(int)  # CTAS 3
    df['is_low_acuity'] = (df['triage_code_clean'] >= 4).astype(int)  # CTAS 4-5
    
    # Inverse triage (higher = more urgent) for easier interpretation
    df['acuity_score'] = 6 - df['triage_code_clean']  # 5=most urgent, 1=least
    
    print("  ✓ Triage assessment features")
    
    # =========================================================================
    # 3. TEMPORAL FEATURES
    # =========================================================================
    
    # Cyclical encoding for hour (captures 11 PM close to midnight)
    df['hour_sin'] = np.sin(2 * np.pi * df['arrival_hour'] / 24)
    df['hour_cos'] = np.cos(2 * np.pi * df['arrival_hour'] / 24)
    
    # Time periods
    df['is_morning'] = ((df['arrival_hour'] >= 6) & (df['arrival_hour'] < 12)).astype(int)
    df['is_afternoon'] = ((df['arrival_hour'] >= 12) & (df['arrival_hour'] < 18)).astype(int)
    df['is_evening'] = ((df['arrival_hour'] >= 18) & (df['arrival_hour'] < 23)).astype(int)
    df['is_night'] = ((df['arrival_hour'] >= 23) | (df['arrival_hour'] < 6)).astype(int)
    
    # Peak hours (from case study: 10 AM - 10 PM)
    df['is_peak_hours'] = ((df['arrival_hour'] >= 10) & (df['arrival_hour'] <= 22)).astype(int)
    
    # Day of week
    day_map = {'Monday': 0, 'Tuesday': 1, 'Wednesday': 2, 'Thursday': 3,
               'Friday': 4, 'Saturday': 5, 'Sunday': 6}
    df['day_num'] = df['arrival_day'].map(day_map).fillna(0)
    df['is_weekend'] = df['arrival_day'].isin(['Saturday', 'Sunday']).astype(int)
    df['is_monday'] = (df['arrival_day'] == 'Monday').astype(int)  # Often busiest
    
    print("  ✓ Temporal features")
    
    # =========================================================================
    # 4. ZONE FEATURES
    # =========================================================================
    
    df['zone_clean'] = df['initial_zone'].fillna('Unknown')
    
    # Binary zone indicators
    df['is_resus'] = (df['initial_zone'] == 'Resus').astype(int)
    df['is_red_zone'] = (df['initial_zone'] == 'Red').astype(int)
    df['is_yellow_zone'] = (df['initial_zone'] == 'YZ').astype(int)
    df['is_green_zone'] = (df['initial_zone'] == 'GZ').astype(int)
    df['is_psych_zone'] = (df['initial_zone'] == 'EPZ').astype(int)
    
    # High-acuity zone flag
    df['is_high_acuity_zone'] = df['initial_zone'].isin(['Resus', 'Red', 'YZ']).astype(int)
    
    print("  ✓ Zone features")
    
    # =========================================================================
    # 5. ARRIVAL MODE FEATURES
    # =========================================================================
    
    # is_ambulance already computed in data_loader
    # Add interaction with acuity
    df['ambulance_high_acuity'] = df['is_ambulance'] * df['is_high_acuity']
    
    print("  ✓ Arrival mode features")
    
    # =========================================================================
    # 6. INTERACTION FEATURES (Domain Knowledge)
    # =========================================================================
    
    # Age × Acuity interactions (elderly + high acuity = high admission risk)
    df['senior_high_acuity'] = df['is_senior'] * df['is_high_acuity']
    df['elderly_high_acuity'] = df['is_elderly'] * df['is_high_acuity']
    df['elderly_ctas_2'] = df['is_elderly'] * df['is_ctas_2']
    
    # Time × Acuity interactions (peak hours + low acuity = LWBS risk)
    df['peak_low_acuity'] = df['is_peak_hours'] * df['is_low_acuity']
    df['night_high_acuity'] = df['is_night'] * df['is_high_acuity']
    df['weekend_low_acuity'] = df['is_weekend'] * df['is_low_acuity']
    
    # Zone × Acuity interactions
    df['green_low_acuity'] = df['is_green_zone'] * df['is_low_acuity']
    df['yellow_high_acuity'] = df['is_yellow_zone'] * df['is_high_acuity']
    
    print("  ✓ Interaction features")
    
    # =========================================================================
    # 7. OPERATIONAL CONTEXT FEATURES (Historical Patterns)
    # =========================================================================
    
    print("  Computing operational context (this takes a moment)...")
    
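    # Zone-Hour historical statistics
    # CAUTION: these aggregates use every row passed in, including each row's own
    # outcome. Fit them on training rows only to keep held-out metrics honest.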
    zone_hour_stats = df.groupby(['initial_zone', 'arrival_hour']).agg({
        'pia_minutes': ['median', 'mean', lambda x: x.quantile(0.75)],
        'is_lwbs': 'mean',
        'is_admitted': 'mean',
        'case_id': 'count'
    }).reset_index()
    
    zone_hour_stats.columns = [
        'initial_zone', 'arrival_hour',
        'hist_median_pia', 'hist_mean_pia', 'hist_p75_pia',
        'hist_lwbs_rate', 'hist_admit_rate', 'hist_volume'
    ]
    
    # Zone-level statistics
    zone_stats = df.groupby('initial_zone').agg({
        'pia_minutes': 'median',
        'is_lwbs': 'mean',
        'is_admitted': 'mean',
        'los_minutes': 'median'
    }).reset_index()
    zone_stats.columns = ['initial_zone', 'zone_median_pia', 'zone_lwbs_rate', 
                          'zone_admit_rate', 'zone_median_los']
    
    # Hour-level statistics
    hour_stats = df.groupby('arrival_hour').agg({
        'pia_minutes': 'median',
        'is_lwbs': 'mean',
        'is_admitted': 'mean',
        'case_id': 'count'
    }).reset_index()
    hour_stats.columns = ['arrival_hour', 'hour_median_pia', 'hour_lwbs_rate', 
                          'hour_admit_rate', 'hour_volume']
    hour_stats['hour_volume_scaled'] = hour_stats['hour_volume'] / hour_stats['hour_volume'].max()
    
    # Triage-level statistics
    triage_stats = df.groupby('triage_code_clean').agg({
        'is_lwbs': 'mean',
        'is_admitted': 'mean',
        'pia_minutes': 'median'
    }).reset_index()
    triage_stats.columns = ['triage_code_clean', 'triage_lwbs_rate', 
                            'triage_admit_rate', 'triage_median_pia']
    
    # Merge all statistics
    df = df.merge(zone_hour_stats, on=['initial_zone', 'arrival_hour'], how='left')
    df = df.merge(zone_stats, on='initial_zone', how='left')
    df = df.merge(hour_stats[['arrival_hour', 'hour_median_pia', 'hour_lwbs_rate', 
                              'hour_admit_rate', 'hour_volume_scaled']], 
                  on='arrival_hour', how='left')
    df = df.merge(triage_stats, on='triage_code_clean', how='left')
    
    # Composite risk scores
    df['lwbs_risk_score'] = (
        df['zone_lwbs_rate'].fillna(0.015) * 0.3 +
        df['hour_lwbs_rate'].fillna(0.015) * 0.3 +
        df['triage_lwbs_rate'].fillna(0.015) * 0.4
    )
    
    df['admit_risk_score'] = (
        df['zone_admit_rate'].fillna(0.14) * 0.25 +
        df['triage_admit_rate'].fillna(0.14) * 0.50 +
        df['hist_admit_rate'].fillna(0.14) * 0.25
    )
    
    df['expected_wait_score'] = (
        df['hist_median_pia'].fillna(35) / 60
    ).clip(0, 3)
    
    df['congestion_score'] = (
        df['hour_volume_scaled'].fillna(0.5) * 
        df['expected_wait_score']
    )
    
    # Fill remaining NaN
    fill_values = {
        'hist_median_pia': 35, 'hist_mean_pia': 40, 'hist_p75_pia': 55,
        'hist_lwbs_rate': 0.015, 'hist_admit_rate': 0.14, 'hist_volume': 50,
        'zone_median_pia': 35, 'zone_lwbs_rate': 0.015, 'zone_admit_rate': 0.14,
        'zone_median_los': 180, 'hour_median_pia': 35, 'hour_lwbs_rate': 0.015,
        'hour_admit_rate': 0.14, 'hour_volume_scaled': 0.5,
        'triage_lwbs_rate': 0.015, 'triage_admit_rate': 0.14, 'triage_median_pia': 35
    }
    df = df.fillna(fill_values)
    
    print("  ✓ Operational context features")
    
    print(f"\n✓ Feature engineering complete: {len(df.columns)} total columns")
    
    return df

# Apply feature engineering
visits_fe = engineer_features_production(visits)

Engineering features...
  ✓ Demographic features
  ✓ Triage assessment features
  ✓ Temporal features
  ✓ Zone features
  ✓ Arrival mode features
  ✓ Interaction features
  Computing operational context (this takes a moment)...
  ✓ Operational context features

✓ Feature engineering complete: 101 total columns
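The cyclical hour encoding in Cell 4 is easiest to trust after a quick numeric check. The sketch below is a standalone illustration (the helper names `encode_hour` and `circular_gap` are hypothetical, not part of the notebook) showing that 23:00 and 00:00 land close together on the unit circle while 00:00 and 12:00 land far apart.

```python
import numpy as np

def encode_hour(hour):
    """Map an hour (0-23) to a point on the unit circle, as Cell 4 does."""
    angle = 2 * np.pi * np.asarray(hour, dtype=float) / 24
    return np.column_stack([np.sin(angle), np.cos(angle)])

def circular_gap(h1, h2):
    """Euclidean distance between two encoded hours."""
    p1, p2 = encode_hour([h1])[0], encode_hour([h2])[0]
    return float(np.linalg.norm(p1 - p2))

gap_23_to_0 = circular_gap(23, 0)   # neighbours across midnight
gap_0_to_12 = circular_gap(0, 12)   # opposite sides of the clock
print(f"23h vs 0h: {gap_23_to_0:.3f}, 0h vs 12h: {gap_0_to_12:.3f}")
```

With a raw integer hour the first gap would be 23 and the second 12, which is the ordering the sine/cosine pair is meant to correct.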


In [5]:
# =============================================================================
# CELL 5: DEFINE FEATURE SETS
# =============================================================================

# Patient-level features (available at triage)
PATIENT_FEATURES = [
    # Demographics
    'age_scaled', 'age_squared', 'is_male',
    'is_pediatric', 'is_senior', 'is_elderly',
    
    # Triage
    'triage_code_clean', 'acuity_score',
    'is_high_acuity', 'is_medium_acuity', 'is_low_acuity',
    
    # Temporal
    'hour_sin', 'hour_cos', 'day_num',
    'is_peak_hours', 'is_night', 'is_weekend',
    
    # Zone
    'is_resus', 'is_yellow_zone', 'is_green_zone', 'is_psych_zone',
    'is_high_acuity_zone',
    
    # Arrival
    'is_ambulance'
]

# Interaction features
INTERACTION_FEATURES = [
    'senior_high_acuity', 'elderly_high_acuity', 'elderly_ctas_2',
    'ambulance_high_acuity', 'peak_low_acuity', 'night_high_acuity',
    'weekend_low_acuity', 'green_low_acuity', 'yellow_high_acuity'
]

# Operational context features
OPERATIONAL_FEATURES = [
    'hist_median_pia', 'hist_lwbs_rate', 'hist_admit_rate',
    'zone_lwbs_rate', 'zone_admit_rate', 'zone_median_los',
    'hour_volume_scaled', 'hour_lwbs_rate', 'hour_admit_rate',
    'triage_lwbs_rate', 'triage_admit_rate',
    'lwbs_risk_score', 'admit_risk_score',
    'expected_wait_score', 'congestion_score'
]

# Combined feature sets
ALL_FEATURES = PATIENT_FEATURES + INTERACTION_FEATURES + OPERATIONAL_FEATURES

print(f"Feature sets defined:")
print(f"  • Patient features: {len(PATIENT_FEATURES)}")
print(f"  • Interaction features: {len(INTERACTION_FEATURES)}")
print(f"  • Operational features: {len(OPERATIONAL_FEATURES)}")
print(f"  • Total: {len(ALL_FEATURES)}")

Feature sets defined:
  • Patient features: 23
  • Interaction features: 9
  • Operational features: 15
  • Total: 47


In [6]:
# =============================================================================
# CELL 6: CHECK FOR DATA LEAKAGE
# =============================================================================

print("=" * 60)
print("DATA LEAKAGE CHECK")
print("=" * 60)

# Features that would cause leakage (NOT used)
LEAKAGE_FEATURES = [
    'pia_minutes',       # Only known AFTER physician sees patient
    'los_minutes',       # Only known at discharge
    'assessment_time',   # Future event
    'discharge_time',    # Future event
    'disposition_code',  # Outcome we're predicting
    'consult_wait_minutes'  # Only known after consult
]

# Check our feature set
leakage_found = [f for f in ALL_FEATURES if f in LEAKAGE_FEATURES]

if leakage_found:
    print(f"❌ LEAKAGE DETECTED: {leakage_found}")
else:
    print("✓ No leakage detected in feature set")
    print("  All features are available at triage time")

# Note on operational context features
print("\n⚠️  NOTE on Operational Context Features:")
print("   These are group-level averages (a lookup table), not the current patient's data.")
print("   Example: 'hist_admit_rate' = admission rate for this zone+hour")
print("   Caveat: Cell 4 computes them over ALL visits, so each row's own outcome and")
print("   the test rows contribute. Refit on training data only before deployment.")

DATA LEAKAGE CHECK
✓ No leakage detected in feature set
  All features are available at triage time

⚠️  NOTE on Operational Context Features:
   These are group-level averages (a lookup table), not the current patient's data.
   Example: 'hist_admit_rate' = admission rate for this zone+hour
   Caveat: Cell 4 computes them over ALL visits, so each row's own outcome and
   the test rows contribute. Refit on training data only before deployment.
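The name-based check in Cell 6 cannot see how the operational context columns were built. In Cell 4 the zone, hour, and triage rates are grouped over the full `visits` frame before any train/test split, so a lookup-table feature is only truly "historical" if it is fitted on training rows and then applied to held-out rows. A minimal sketch of that pattern follows; the functions `fit_rate_lookup` and `apply_rate_lookup`, the smoothing constant `min_count`, and the toy rows are all assumptions for illustration, not the notebook's code.

```python
import pandas as pd

def fit_rate_lookup(train, keys, target, prior, min_count=20):
    """Group-level target rate from TRAINING rows only, shrunk toward a prior."""
    stats = train.groupby(keys)[target].agg(["sum", "count"]).reset_index()
    # Additive smoothing: small groups are pulled toward the prior rate
    stats["rate"] = (stats["sum"] + prior * min_count) / (stats["count"] + min_count)
    return stats[keys + ["rate"]]

def apply_rate_lookup(df, lookup, keys, prior, name):
    """Attach the looked-up rate; groups unseen in training fall back to the prior."""
    out = df.merge(lookup.rename(columns={"rate": name}), on=keys, how="left")
    out[name] = out[name].fillna(prior)
    return out

# Toy rows (hypothetical values, column names borrowed from the notebook)
train = pd.DataFrame({
    "initial_zone": ["GZ", "GZ", "YZ", "YZ", "YZ"],
    "arrival_hour": [10, 10, 10, 10, 10],
    "is_admitted":  [0, 0, 1, 1, 0],
})
test = pd.DataFrame({"initial_zone": ["YZ", "Resus"], "arrival_hour": [10, 3]})

keys = ["initial_zone", "arrival_hour"]
prior = train["is_admitted"].mean()
lookup = fit_rate_lookup(train, keys, "is_admitted", prior, min_count=2)
test_fe = apply_rate_lookup(test, lookup, keys, prior, "hist_admit_rate")
print(test_fe)
```

Inside cross-validation the same fit/apply pair would need to run per fold, for example as a custom transformer in a `Pipeline`, so that validation folds never contribute to their own lookup values.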


In [7]:
# =============================================================================
# CELL 7: PREPARE DATA FOR MODELING
# =============================================================================

def prepare_modeling_data(df: pd.DataFrame, target: str, features: list, 
                          test_size: float = 0.2, random_state: int = 42,
                          temporal_split: bool = False):
    """
    Prepare data for ML modeling with proper train-test split.
    
    Parameters:
    -----------
    temporal_split: If True, split by time (more realistic but requires timestamp)
    """
    
    # Remove rows with missing target or features
    df_clean = df.dropna(subset=[target] + features).copy()
    
    X = df_clean[features]
    y = df_clean[target]
    
    if temporal_split and 'triage_time' in df_clean.columns:
        # Sort by time and split (train on earlier, test on later)
        df_sorted = df_clean.sort_values('triage_time')
        split_idx = int(len(df_sorted) * (1 - test_size))
        
        X_train = df_sorted[features].iloc[:split_idx]
        X_test = df_sorted[features].iloc[split_idx:]
        y_train = df_sorted[target].iloc[:split_idx]
        y_test = df_sorted[target].iloc[split_idx:]
        
        print(f"Temporal split: Train on first {split_idx:,}, test on last {len(df_sorted)-split_idx:,}")
    else:
        # Stratified random split
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, random_state=random_state, 
            stratify=y if y.nunique() <= 10 else None
        )
    
    print(f"\nData prepared for '{target}':")
    print(f"  Training: {len(X_train):,} samples")
    print(f"  Test: {len(X_test):,} samples")
    print(f"  Positive class rate: {y_train.mean()*100:.2f}%")
    
    return X_train, X_test, y_train, y_test

# Prepare admission data
print("=" * 60)
print("PREPARING ADMISSION DATA")
print("=" * 60)
X_train_adm, X_test_adm, y_train_adm, y_test_adm = prepare_modeling_data(
    visits_fe, 'is_admitted', ALL_FEATURES
)

# Prepare LWBS data
print("\n" + "=" * 60)
print("PREPARING LWBS DATA")
print("=" * 60)
X_train_lwbs, X_test_lwbs, y_train_lwbs, y_test_lwbs = prepare_modeling_data(
    visits_fe, 'is_lwbs', ALL_FEATURES
)

PREPARING ADMISSION DATA

Data prepared for 'is_admitted':
  Training: 12,808 samples
  Test: 3,202 samples
  Positive class rate: 13.89%

PREPARING LWBS DATA

Data prepared for 'is_lwbs':
  Training: 12,808 samples
  Test: 3,202 samples
  Positive class rate: 1.47%
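`prepare_modeling_data` accepts `temporal_split=True`, but both calls above use the default stratified random split. For a model meant to score future arrivals, a time-ordered split is usually the more realistic test. The sketch below reproduces the split logic on synthetic data; the `temporal_split` helper and the toy frame are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def temporal_split(df, time_col, test_size=0.2):
    """Train on the earliest (1 - test_size) share of rows, test on the rest."""
    ordered = df.sort_values(time_col).reset_index(drop=True)
    split_idx = int(len(ordered) * (1 - test_size))
    return ordered.iloc[:split_idx], ordered.iloc[split_idx:]

rng = np.random.default_rng(42)
toy = pd.DataFrame({
    "triage_time": pd.date_range("2021-04-01", periods=1000, freq="60min"),
    "is_lwbs": rng.binomial(1, 0.015, size=1000),
}).sample(frac=1.0, random_state=0)  # shuffle so the sort actually matters

train, test = temporal_split(toy, "triage_time")
print(f"Train: {len(train)}, Test: {len(test)}")
print("No overlap in time:", train["triage_time"].max() < test["triage_time"].min())
```

Unlike the stratified split, a temporal split does not guarantee the same positive rate in both partitions, which matters for a target as rare as LWBS.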


---
## Section 3: Model Comparison Framework

**Models to Compare:**
1. Logistic Regression (baseline, interpretable)
2. Random Forest (robust, handles non-linearity)
3. Gradient Boosting (often best for tabular)
4. XGBoost (if available)
5. LightGBM (if available)

**Evaluation Strategy:**
- 5-fold stratified cross-validation
- Primary metric: AUC-ROC (ranking quality)
- Secondary: Precision-Recall AUC (better for imbalanced)
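The reason PR-AUC is the better secondary metric for a rare outcome can be shown with a no-signal baseline: random scores give an AUC-ROC near 0.5 regardless of class balance, while average precision falls to roughly the positive rate. The sketch below uses synthetic labels at about the LWBS prevalence; the sample size and seed are arbitrary choices.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
n = 50_000
prevalence = 0.015                      # roughly the LWBS rate quoted above
y_true = rng.binomial(1, prevalence, size=n)
random_scores = rng.random(n)           # a "model" with no signal

roc = roc_auc_score(y_true, random_scores)
ap = average_precision_score(y_true, random_scores)
print(f"ROC-AUC: {roc:.3f}, PR-AUC: {ap:.3f}, prevalence: {y_true.mean():.3f}")
```

A PR-AUC should therefore be read against the prevalence of its target rather than against 0.5.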

---

In [8]:
# =============================================================================
# CELL 8: DEFINE MODEL CANDIDATES
# =============================================================================

def get_model_candidates(class_weight_ratio: float = None):
    """
    Get dictionary of model candidates to compare.
    
    Parameters:
    -----------
    class_weight_ratio: Ratio of negative/positive class for imbalance handling
    """
    
    models = {
        'Logistic Regression': LogisticRegression(
            max_iter=1000,
            class_weight='balanced',
            random_state=42,
            solver='lbfgs'
        ),
        
        'Random Forest': RandomForestClassifier(
            n_estimators=100,
            max_depth=10,
            min_samples_leaf=10,
            class_weight='balanced',
            random_state=42,
            n_jobs=-1
        ),
        
        'Gradient Boosting': GradientBoostingClassifier(
            n_estimators=100,
            max_depth=5,
            learning_rate=0.1,
            min_samples_leaf=20,
            random_state=42
        )
    }
    
    # Add XGBoost if available
    if XGBOOST_AVAILABLE:
        scale_pos_weight = class_weight_ratio if class_weight_ratio else 1
        models['XGBoost'] = XGBClassifier(
            n_estimators=100,
            max_depth=5,
            learning_rate=0.1,
            scale_pos_weight=scale_pos_weight,
            random_state=42,
            eval_metric='logloss',
            n_jobs=-1
        )
    
    # Add LightGBM if available
    if LIGHTGBM_AVAILABLE:
        models['LightGBM'] = LGBMClassifier(
            n_estimators=100,
            max_depth=5,
            learning_rate=0.1,
            class_weight='balanced',
            random_state=42,
            n_jobs=-1,
            verbose=-1
        )
    
    return models

print(f"Model candidates available: {len(get_model_candidates())}")
for name in get_model_candidates():
    print(f"  • {name}")

Model candidates available: 5
  • Logistic Regression
  • Random Forest
  • Gradient Boosting
  • XGBoost
  • LightGBM
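Cell 1 imports `StandardScaler` and `Pipeline`, but the logistic regression candidate is fitted on unscaled features, which mix 0/1 flags with minute-valued columns such as `zone_median_los`. One common remedy for slow `lbfgs` convergence is to standardize inside a pipeline so that scaling is refitted within each CV fold. The following is a sketch on synthetic data, not a rerun of the notebook; the inflated first column only mimics a minute-scale feature.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the triage feature matrix (not the ED data)
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.86, 0.14], random_state=42)
X[:, 0] *= 200  # mimic an unscaled, minute-valued column

scaled_lr = Pipeline([
    ("scale", StandardScaler()),  # fitted on each training fold only
    ("clf", LogisticRegression(max_iter=1000, class_weight="balanced",
                               random_state=42)),
])

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
auc = cross_val_score(scaled_lr, X, y, cv=cv, scoring="roc_auc")
print(f"Scaled LR AUC: {auc.mean():.3f} +/- {auc.std():.3f}")
```

Swapping this pipeline into `get_model_candidates` would change the logistic regression scores, so any recorded results would need to be regenerated.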


In [9]:
# =============================================================================
# CELL 9: CROSS-VALIDATION COMPARISON FUNCTION
# =============================================================================

def compare_models_cv(X_train, y_train, models: dict, cv_folds: int = 5):
    """
    Compare multiple models using stratified cross-validation.
    
    Returns DataFrame with CV scores for each model.
    """
    
    results = []
    cv = StratifiedKFold(n_splits=cv_folds, shuffle=True, random_state=42)
    
    print(f"Running {cv_folds}-fold CV for {len(models)} models...\n")
    
    for name, model in models.items():
        print(f"  Evaluating {name}...", end=" ")
        
        try:
            # AUC-ROC scores
            auc_scores = cross_val_score(model, X_train, y_train, cv=cv, 
                                         scoring='roc_auc', n_jobs=-1)
            
            # Average Precision (PR-AUC) - better for imbalanced
            ap_scores = cross_val_score(model, X_train, y_train, cv=cv,
                                        scoring='average_precision', n_jobs=-1)
            
            # F1 scores
            f1_scores = cross_val_score(model, X_train, y_train, cv=cv,
                                        scoring='f1', n_jobs=-1)
            
            results.append({
                'Model': name,
                'AUC Mean': auc_scores.mean(),
                'AUC Std': auc_scores.std(),
                'PR-AUC Mean': ap_scores.mean(),
                'PR-AUC Std': ap_scores.std(),
                'F1 Mean': f1_scores.mean(),
                'F1 Std': f1_scores.std()
            })
            
            print(f"AUC: {auc_scores.mean():.3f} ± {auc_scores.std():.3f}")
            
        except Exception as e:
            print(f"FAILED: {e}")
            continue
    
    results_df = pd.DataFrame(results).sort_values('AUC Mean', ascending=False)
    
    return results_df

print("✓ compare_models_cv() defined")

✓ compare_models_cv() defined
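`compare_models_cv` calls `cross_val_score` three times per model, so every model is refitted on every fold once per metric. scikit-learn's `cross_validate` accepts a dictionary of scorers and computes them all from a single fit per fold. A minimal sketch on synthetic data follows; the scorer keys `auc`, `pr_auc`, and `f1` are names chosen here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=1500, n_features=15,
                           weights=[0.86, 0.14], random_state=42)

model = RandomForestClassifier(n_estimators=50, max_depth=10,
                               class_weight="balanced", random_state=42)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scoring = {"auc": "roc_auc", "pr_auc": "average_precision", "f1": "f1"}

# One fit per fold; all three metrics are evaluated on the same fitted model
scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
summary = {name: (scores[f"test_{name}"].mean(), scores[f"test_{name}"].std())
           for name in scoring}
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

This cuts the number of fits to a third while leaving the fold assignments unchanged.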


In [10]:
# =============================================================================
# CELL 10: COMPARE MODELS FOR ADMISSION
# =============================================================================

print("=" * 70)
print("MODEL COMPARISON: ADMISSION PREDICTION")
print("=" * 70)

# Calculate class weight ratio
adm_ratio = (y_train_adm == 0).sum() / (y_train_adm == 1).sum()
print(f"\nClass imbalance ratio: {adm_ratio:.1f}:1")

# Get models and compare
models_adm = get_model_candidates(class_weight_ratio=adm_ratio)
results_adm = compare_models_cv(X_train_adm, y_train_adm, models_adm)

print("\n" + "=" * 70)
print("ADMISSION MODEL COMPARISON RESULTS")
print("=" * 70)
display(results_adm.round(3))

best_model_name_adm = results_adm.iloc[0]['Model']
print(f"\nüèÜ Best model: {best_model_name_adm}")

MODEL COMPARISON: ADMISSION PREDICTION

Class imbalance ratio: 6.2:1
Running 5-fold CV for 5 models...

  Evaluating Logistic Regression... 

STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

AUC: 0.863 ± 0.008
  Evaluating Random Forest... AUC: 0.861 ± 0.006
  Evaluating Gradient Boosting... AUC: 0.861 ± 0.006
  Evaluating XGBoost... 

Parameters: { "use_label_encoder" } are not used.

  bst.update(dtrain, iteration=i, fobj=obj)

AUC: 0.859 ± 0.006
  Evaluating LightGBM... AUC: 0.859 ± 0.006

ADMISSION MODEL COMPARISON RESULTS


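| | Model | AUC Mean | AUC Std | PR-AUC Mean | PR-AUC Std | F1 Mean | F1 Std |
|---|-------|----------|---------|-------------|------------|---------|--------|
| 0 | Logistic Regression | 0.863 | 0.008 | 0.523 | 0.021 | 0.489 | 0.013 |
| 2 | Gradient Boosting | 0.861 | 0.006 | 0.526 | 0.023 | 0.455 | 0.036 |
| 1 | Random Forest | 0.861 | 0.006 | 0.519 | 0.016 | 0.517 | 0.013 |
| 3 | XGBoost | 0.859 | 0.006 | 0.519 | 0.017 | 0.503 | 0.013 |
| 4 | LightGBM | 0.859 | 0.006 | 0.519 | 0.022 | 0.501 | 0.017 |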



🏆 Best model: Logistic Regression
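The "best model" label comes from sorting on mean AUC alone, yet the five admission models differ by at most 0.004 while their fold-to-fold standard deviations are 0.006 to 0.008. A rough screen (not a significance test) is to compare each model's gap to the leader against the leader's standard deviation. The sketch below uses the figures reported in the comparison output; the column names `gap_to_best` and `within_one_std` are introduced here.

```python
import pandas as pd

# AUC means and standard deviations copied from the admission comparison results
results = pd.DataFrame({
    "Model": ["Logistic Regression", "Gradient Boosting", "Random Forest",
              "XGBoost", "LightGBM"],
    "AUC Mean": [0.863, 0.861, 0.861, 0.859, 0.859],
    "AUC Std": [0.008, 0.006, 0.006, 0.006, 0.006],
})

best = results.iloc[0]
results["gap_to_best"] = best["AUC Mean"] - results["AUC Mean"]
# Rough screen: is the gap smaller than the leader's fold-to-fold spread?
results["within_one_std"] = results["gap_to_best"] <= best["AUC Std"]
print(results[["Model", "gap_to_best", "within_one_std"]])
```

Every model passes this screen, so on AUC the candidates are effectively tied and the choice can reasonably rest on interpretability, calibration, or latency.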


In [11]:
# =============================================================================
# CELL 11: COMPARE MODELS FOR LWBS
# =============================================================================

print("=" * 70)
print("MODEL COMPARISON: LWBS PREDICTION")
print("=" * 70)

# Calculate class weight ratio (extreme imbalance)
lwbs_ratio = (y_train_lwbs == 0).sum() / (y_train_lwbs == 1).sum()
print(f"\nClass imbalance ratio: {lwbs_ratio:.1f}:1 (EXTREME)")

# Get models and compare
models_lwbs = get_model_candidates(class_weight_ratio=lwbs_ratio)
results_lwbs = compare_models_cv(X_train_lwbs, y_train_lwbs, models_lwbs)

print("\n" + "=" * 70)
print("LWBS MODEL COMPARISON RESULTS")
print("=" * 70)
display(results_lwbs.round(3))

best_model_name_lwbs = results_lwbs.iloc[0]['Model']
print(f"\n🏆 Best model: {best_model_name_lwbs}")

MODEL COMPARISON: LWBS PREDICTION

Class imbalance ratio: 67.1:1 (EXTREME)
Running 5-fold CV for 5 models...

  Evaluating Logistic Regression... 

STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(

AUC: 0.755 ± 0.041
  Evaluating Random Forest... AUC: 0.769 ± 0.026
  Evaluating Gradient Boosting... AUC: 0.788 ± 0.024
  Evaluating XGBoost... 

Parameters: { "use_label_encoder" } are not used.

  bst.update(dtrain, iteration=i, fobj=obj)

AUC: 0.763 ± 0.031
  Evaluating LightGBM... AUC: 0.769 ± 0.027

LWBS MODEL COMPARISON RESULTS


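| | Model | AUC Mean | AUC Std | PR-AUC Mean | PR-AUC Std | F1 Mean | F1 Std |
|---|-------|----------|---------|-------------|------------|---------|--------|
| 2 | Gradient Boosting | 0.788 | 0.024 | 0.112 | 0.024 | 0.078 | 0.031 |
| 4 | LightGBM | 0.769 | 0.027 | 0.120 | 0.024 | 0.111 | 0.007 |
| 1 | Random Forest | 0.769 | 0.026 | 0.148 | 0.041 | 0.175 | 0.024 |
| 3 | XGBoost | 0.763 | 0.031 | 0.108 | 0.020 | 0.108 | 0.009 |
| 0 | Logistic Regression | 0.755 | 0.041 | 0.172 | 0.049 | 0.078 | 0.012 |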



🏆 Best model: Gradient Boosting
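
One way to read the PR-AUC column under this much imbalance is against the prevalence, since a classifier that ranks patients at random has an expected average precision of roughly the positive rate. The sketch below is illustrative arithmetic only: the helper names are hypothetical, and the 67.1:1 ratio and the PR-AUC means are copied from the LWBS comparison output.

```python
def prevalence_from_ratio(neg_to_pos_ratio):
    """Positive-class rate implied by a negatives:positives ratio."""
    return 1.0 / (neg_to_pos_ratio + 1.0)

def pr_auc_vs_chance(pr_auc, prevalence):
    """How many times better than a random ranking (approx. the prevalence)."""
    return pr_auc / prevalence

# Training-set imbalance reported by Cell 11 (67.1 negatives per positive).
prevalence = prevalence_from_ratio(67.1)

# Cross-validated PR-AUC means copied from the LWBS comparison.
cv_pr_auc = {
    "Gradient Boosting": 0.112,
    "LightGBM": 0.120,
    "Random Forest": 0.148,
    "XGBoost": 0.108,
    "Logistic Regression": 0.172,
}

print(f"Approximate chance-level PR-AUC: {prevalence:.4f}")
for name, value in sorted(cv_pr_auc.items(), key=lambda kv: kv[1], reverse=True):
    ratio = pr_auc_vs_chance(value, prevalence)
    print(f"{name:<20} PR-AUC {value:.3f} = {ratio:.1f}x chance")
```

Read this way, the PR-AUC ordering (Logistic Regression, then Random Forest) differs from the AUC ordering that selects Gradient Boosting, which may matter when the operational use is flagging a small top slice of patients.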


---
## Section 4: Hyperparameter Tuning

**Strategy:**
- Use RandomizedSearchCV (evaluates a fixed number of sampled candidates, so it is cheaper than an exhaustive grid search on large grids)
- Tune the best-performing model from the comparison
- Optimize for AUC (ranking quality)

---

In [12]:
# =============================================================================
# CELL 12: HYPERPARAMETER TUNING FOR ADMISSION MODEL
# =============================================================================

print("=" * 70)
print("HYPERPARAMETER TUNING: ADMISSION MODEL")
print("=" * 70)

# Define parameter grids for different models
param_grids = {
    'Gradient Boosting': {
        'n_estimators': [100, 200, 300],
        'max_depth': [3, 4, 5, 6, 7],
        'learning_rate': [0.05, 0.1, 0.15],
        'min_samples_split': [20, 30, 50],
        'min_samples_leaf': [10, 15, 20],
        'subsample': [0.8, 0.9, 1.0]
    },
    'Random Forest': {
        'n_estimators': [100, 200, 300],
        'max_depth': [8, 10, 12, 15],
        'min_samples_split': [10, 20, 30],
        'min_samples_leaf': [5, 10, 15],
        'max_features': ['sqrt', 'log2', 0.5]
    },
    'Logistic Regression': {
        'C': [0.01, 0.1, 1, 10],
        'penalty': ['l2'],
        'solver': ['lbfgs', 'saga']
    }
}

if XGBOOST_AVAILABLE:
    param_grids['XGBoost'] = {
        'n_estimators': [100, 200, 300],
        'max_depth': [3, 4, 5, 6],
        'learning_rate': [0.05, 0.1, 0.15],
        'min_child_weight': [1, 3, 5],
        'subsample': [0.8, 0.9, 1.0],
        'colsample_bytree': [0.8, 0.9, 1.0]
    }

if LIGHTGBM_AVAILABLE:
    param_grids['LightGBM'] = {
        'n_estimators': [100, 200, 300],
        'max_depth': [3, 5, 7, -1],
        'learning_rate': [0.05, 0.1, 0.15],
        'num_leaves': [15, 31, 63],
        'min_child_samples': [10, 20, 30]
    }

# Get the best model type and tune it
if best_model_name_adm in param_grids:
    param_grid = param_grids[best_model_name_adm]
    base_model = models_adm[best_model_name_adm]
else:
    # Default to Gradient Boosting
    param_grid = param_grids['Gradient Boosting']
    base_model = GradientBoostingClassifier(random_state=42)
    best_model_name_adm = 'Gradient Boosting'

print(f"\nTuning {best_model_name_adm}...")

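# RandomizedSearchCV samples up to n_iter candidates; if the grid holds fewer
# combinations than n_iter, scikit-learn evaluates the whole grid instead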
search_adm = RandomizedSearchCV(
    base_model,
    param_distributions=param_grid,
    n_iter=40,
    scoring='roc_auc',
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    random_state=42,
    n_jobs=-1,
    verbose=1
)

search_adm.fit(X_train_adm, y_train_adm)

print(f"\n✓ Best CV AUC: {search_adm.best_score_:.4f}")
print(f"Best parameters: {search_adm.best_params_}")

tuned_model_adm = search_adm.best_estimator_

HYPERPARAMETER TUNING: ADMISSION MODEL

Tuning Logistic Regression...
Fitting 5 folds for each of 8 candidates, totalling 40 fits


STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(


✓ Best CV AUC: 0.8638
Best parameters: {'solver': 'lbfgs', 'penalty': 'l2', 'C': 10}
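
The log reports 8 candidates even though `n_iter=40` was requested. When every parameter is given as a finite list and the grid has fewer combinations than `n_iter`, scikit-learn's randomized search evaluates the full grid. The sketch below reproduces the count with plain Python; `grid_size` and `effective_candidates` are hypothetical helper names, and the grids are copied from Cell 12.

```python
from itertools import product

def grid_size(param_grid):
    """Number of distinct combinations in a dict of value lists."""
    size = 1
    for values in param_grid.values():
        size *= len(values)
    return size

def effective_candidates(param_grid, n_iter):
    """Candidates actually evaluated when all parameters are finite lists."""
    return min(n_iter, grid_size(param_grid))

logreg_grid = {
    "C": [0.01, 0.1, 1, 10],
    "penalty": ["l2"],
    "solver": ["lbfgs", "saga"],
}
gb_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [3, 4, 5, 6, 7],
    "learning_rate": [0.05, 0.1, 0.15],
    "min_samples_split": [20, 30, 50],
    "min_samples_leaf": [10, 15, 20],
    "subsample": [0.8, 0.9, 1.0],
}

n_folds = 5
for name, grid in [("Logistic Regression", logreg_grid), ("Gradient Boosting", gb_grid)]:
    n_cand = effective_candidates(grid, n_iter=40)
    print(f"{name}: grid size {grid_size(grid)}, "
          f"{n_cand} candidates, {n_cand * n_folds} fits")

# Enumerate the small logistic regression grid explicitly.
logreg_combos = list(product(*logreg_grid.values()))
```

For the logistic regression grid the search is therefore exhaustive, while the gradient boosting grid (1,215 combinations) is genuinely sampled.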


In [13]:
# =============================================================================
# CELL 13: HYPERPARAMETER TUNING FOR LWBS MODEL
# =============================================================================

print("=" * 70)
print("HYPERPARAMETER TUNING: LWBS MODEL")
print("=" * 70)

# Get the best model type and tune it
if best_model_name_lwbs in param_grids:
    param_grid = param_grids[best_model_name_lwbs]
    base_model = models_lwbs[best_model_name_lwbs]
else:
    param_grid = param_grids['Gradient Boosting']
    base_model = GradientBoostingClassifier(random_state=42)
    best_model_name_lwbs = 'Gradient Boosting'

print(f"\nTuning {best_model_name_lwbs}...")

search_lwbs = RandomizedSearchCV(
    base_model,
    param_distributions=param_grid,
    n_iter=40,
    scoring='roc_auc',
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
    random_state=42,
    n_jobs=-1,
    verbose=1
)

search_lwbs.fit(X_train_lwbs, y_train_lwbs)

print(f"\n✓ Best CV AUC: {search_lwbs.best_score_:.4f}")
print(f"Best parameters: {search_lwbs.best_params_}")

tuned_model_lwbs = search_lwbs.best_estimator_

HYPERPARAMETER TUNING: LWBS MODEL

Tuning Gradient Boosting...
Fitting 5 folds for each of 40 candidates, totalling 200 fits

✓ Best CV AUC: 0.8290
Best parameters: {'subsample': 0.9, 'n_estimators': 100, 'min_samples_split': 50, 'min_samples_leaf': 20, 'max_depth': 3, 'learning_rate': 0.1}


---
## Section 5: Threshold Optimization

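**Key Insight:** The default 0.5 threshold is arbitrary, particularly for imbalanced targets. We instead choose thresholds that meet operational precision and recall requirements (admission) or that flag a fixed share of patients (LWBS).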
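
A minimal, dependency-free sketch of constraint-based threshold selection: among thresholds that meet both a precision floor and a recall floor, keep the one with the highest F1. The labels and scores are made up for illustration, and `precision_recall_at` and `pick_threshold` are hypothetical helpers, not functions from this notebook.

```python
def precision_recall_at(y_true, y_score, threshold):
    """Precision and recall when every score >= threshold is flagged."""
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(y_true, y_score) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def pick_threshold(y_true, y_score, min_precision, min_recall):
    """Best-F1 threshold among those meeting both floors, or None if none do.

    Returns a tuple (threshold, precision, recall, f1).
    """
    best = None
    for t in sorted(set(y_score)):
        p, r = precision_recall_at(y_true, y_score, t)
        if p >= min_precision and r >= min_recall:
            f1 = 2 * p * r / (p + r)
            if best is None or f1 > best[3]:
                best = (t, p, r, f1)
    return best

# Toy data (illustrative only, not taken from the ED dataset).
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
y_score = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.65, 0.80, 0.90]

default_p, default_r = precision_recall_at(y_true, y_score, 0.5)
chosen = pick_threshold(y_true, y_score, min_precision=0.45, min_recall=0.55)

print(f"At 0.50: precision={default_p:.2f}, recall={default_r:.2f}")
print(f"Chosen:  threshold={chosen[0]:.2f}, precision={chosen[1]:.2f}, "
      f"recall={chosen[2]:.2f}, F1={chosen[3]:.2f}")
```

On this toy data the selected threshold is lower than 0.5 because lowering it recovers a missed positive at an acceptable cost in precision.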

---

In [14]:
# =============================================================================
# CELL 14: THRESHOLD OPTIMIZATION FUNCTIONS
# =============================================================================

def optimize_threshold_for_f1(y_true, y_proba):
    """Find threshold that maximizes F1 score."""
    precisions, recalls, thresholds = precision_recall_curve(y_true, y_proba)
    f1_scores = 2 * (precisions * recalls) / (precisions + recalls + 1e-10)
    best_idx = np.argmax(f1_scores[:-1])
    return thresholds[best_idx], f1_scores[best_idx]

def optimize_threshold_balanced(y_true, y_proba, min_precision=0.45, min_recall=0.55):
    """
    Find threshold that balances precision and recall.
    For Triage Lead: need both to be acceptable.

    Returns (threshold, precision, recall) when both constraints can be met;
    otherwise falls back to the F1 optimum and returns (threshold, f1).
    """
    precisions, recalls, thresholds = precision_recall_curve(y_true, y_proba)
    
    # Find thresholds meeting both constraints
    valid_idx = np.where(
        (precisions[:-1] >= min_precision) & 
        (recalls[:-1] >= min_recall)
    )[0]
    
    if len(valid_idx) == 0:
        # Fall back to F1 optimization
        print("  ⚠️ Cannot meet both constraints, optimizing F1")
        return optimize_threshold_for_f1(y_true, y_proba)
    
    # Among valid, maximize F1
    f1_scores = 2 * (precisions[valid_idx] * recalls[valid_idx]) / \
                (precisions[valid_idx] + recalls[valid_idx] + 1e-10)
    best_idx = valid_idx[np.argmax(f1_scores)]
    
    return thresholds[best_idx], precisions[best_idx], recalls[best_idx]

def find_threshold_for_recall(y_true, y_proba, target_recall=0.70):
    """Find threshold to achieve target recall."""
    precisions, recalls, thresholds = precision_recall_curve(y_true, y_proba)
    
    valid_idx = np.where(recalls[:-1] >= target_recall)[0]
    if len(valid_idx) == 0:
        return thresholds[0]
    
    # Among valid, find highest precision
    best_idx = valid_idx[np.argmax(precisions[valid_idx])]
    return thresholds[best_idx]

print("✓ Threshold optimization functions defined")

✓ Threshold optimization functions defined


In [15]:
# =============================================================================
# CELL 15: OPTIMIZE ADMISSION THRESHOLD
# =============================================================================

print("=" * 70)
print("ADMISSION MODEL: THRESHOLD OPTIMIZATION")
print("=" * 70)

# Get predictions
y_proba_adm = tuned_model_adm.predict_proba(X_test_adm)[:, 1]

# Find optimal threshold
# NOTE: the threshold is selected on the test set, so the test metrics reported
# at this threshold are optimistic; selecting it on a separate validation split
# or on out-of-fold training predictions would avoid that.
result = optimize_threshold_balanced(y_test_adm, y_proba_adm, 
                                     min_precision=0.45, min_recall=0.55)

if len(result) == 3:
    optimal_threshold_adm, opt_precision, opt_recall = result
    print(f"\n✓ Optimal threshold: {optimal_threshold_adm:.3f}")
    print(f"  Precision at threshold: {opt_precision:.3f}")
    print(f"  Recall at threshold: {opt_recall:.3f}")
else:
    optimal_threshold_adm, opt_f1 = result
    print(f"\n✓ Optimal threshold (F1): {optimal_threshold_adm:.3f}")
    print(f"  F1 at threshold: {opt_f1:.3f}")

ADMISSION MODEL: THRESHOLD OPTIMIZATION

✓ Optimal threshold: 0.676
  Precision at threshold: 0.453
  Recall at threshold: 0.622


In [16]:
# =============================================================================
# CELL 16: OPTIMIZE LWBS THRESHOLD (LIFT-BASED)
# =============================================================================

print("=" * 70)
print("LWBS MODEL: LIFT-BASED EVALUATION")
print("=" * 70)

# Get predictions
y_proba_lwbs = tuned_model_lwbs.predict_proba(X_test_lwbs)[:, 1]

# For LWBS, use lift-based approach (not binary threshold)
def evaluate_lwbs_lift(y_true, y_proba):
    """Evaluate LWBS model using lift at different percentiles."""
    
    n_samples = len(y_true)
    n_positives = y_true.sum()
    base_rate = n_positives / n_samples
    
    print(f"\nBaseline: {n_positives} LWBS out of {n_samples} ({base_rate*100:.2f}%)")
    
    # Sort by probability
    sorted_idx = np.argsort(y_proba)[::-1]
    y_true_sorted = np.array(y_true)[sorted_idx]
    
    print(f"\n{'Top %':<10} {'Patients':<12} {'LWBS':<10} {'Capture':<12} {'Lift':<8}")
    print("-" * 55)
    
    results = []
    for pct in [1, 2, 5, 10, 15, 20, 25, 30]:
        top_n = int(n_samples * pct / 100)
        lwbs_caught = y_true_sorted[:top_n].sum()
        capture_rate = lwbs_caught / n_positives if n_positives > 0 else 0
        precision_at_k = lwbs_caught / top_n if top_n > 0 else 0
        lift = precision_at_k / base_rate if base_rate > 0 else 0
        
        print(f"{pct}%{'':<8} {top_n:<12} {lwbs_caught:<10} {capture_rate*100:.1f}%{'':<7} {lift:.1f}x")
        
        results.append({
            'top_pct': pct, 'patients': top_n, 'lwbs_caught': lwbs_caught,
            'capture_rate': capture_rate, 'lift': lift
        })
    
    return results

lwbs_lift_results = evaluate_lwbs_lift(y_test_lwbs.values, y_proba_lwbs)

# Define LWBS threshold based on top 10% (operational choice).
# The cutoff is the 90th percentile of the test-set scores.
optimal_threshold_lwbs = np.percentile(y_proba_lwbs, 90)  # Top 10%
print(f"\n✓ Operational threshold (top 10%): {optimal_threshold_lwbs:.3f}")

LWBS MODEL: LIFT-BASED EVALUATION

Baseline: 47 LWBS out of 3202 (1.47%)

Top %      Patients     LWBS       Capture      Lift    
-------------------------------------------------------
1%         32           12         25.5%        25.5x
2%         64           15         31.9%        16.0x
5%         160          18         38.3%        7.7x
10%         320          25         53.2%        5.3x
15%         480          31         66.0%        4.4x
20%         640          33         70.2%        3.5x
25%         800          35         74.5%        3.0x
30%         960          39         83.0%        2.8x

✓ Operational threshold (top 10%): 0.026
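
The lift figures can be checked by hand: lift at k is the LWBS rate among the top-k patients divided by the overall LWBS rate, and capture is the share of all LWBS cases that fall in the top k. The sketch below redoes that arithmetic for two rows of the table; `lift_at_k` is a hypothetical helper, and the counts are copied from the output above.

```python
def lift_at_k(n_samples, n_positives, top_n, positives_in_top):
    """Return (capture_rate, lift) for the top_n highest-scored patients."""
    base_rate = n_positives / n_samples           # overall LWBS rate
    precision_at_k = positives_in_top / top_n     # LWBS rate inside the top slice
    capture_rate = positives_in_top / n_positives
    return capture_rate, precision_at_k / base_rate

N_SAMPLES, N_LWBS = 3202, 47  # baseline line of the lift output

# (top %, patients, LWBS caught) copied from the table rows for 1% and 10%.
rows = [(1, 32, 12), (10, 320, 25)]

for pct, top_n, caught in rows:
    capture, lift = lift_at_k(N_SAMPLES, N_LWBS, top_n, caught)
    print(f"Top {pct}%: capture {capture * 100:.1f}%, lift {lift:.1f}x")
```

In operational terms, the top 10% row means checking on 320 patients reaches 25 of the 47 who left without being seen, about five times what the same number of random checks would be expected to reach.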


---
## Section 6: Final Model Evaluation

---

In [17]:
# =============================================================================
# CELL 17: COMPREHENSIVE ADMISSION MODEL EVALUATION
# =============================================================================

def evaluate_final_model(y_true, y_proba, threshold, model_name):
    """Comprehensive final evaluation of a classification model."""
    
    y_pred = (y_proba >= threshold).astype(int)
    
    # Metrics
    auc = roc_auc_score(y_true, y_proba)
    pr_auc = average_precision_score(y_true, y_proba)
    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred, zero_division=0)
    recall = recall_score(y_true, y_pred, zero_division=0)
    f1 = f1_score(y_true, y_pred, zero_division=0)
    brier = brier_score_loss(y_true, y_proba)
    
    cm = confusion_matrix(y_true, y_pred)
    
    print(f"\n{'='*70}")
    print(f"{model_name} - FINAL EVALUATION")
    print(f"{'='*70}")
    print(f"\nThreshold: {threshold:.3f}")
    print(f"\n📊 DISCRIMINATION METRICS:")
    print(f"   AUC-ROC:     {auc:.4f}  (ranking quality)")
    print(f"   PR-AUC:      {pr_auc:.4f}  (imbalance-aware)")
    
    print(f"\n📊 CLASSIFICATION METRICS (at threshold):")
    print(f"   Accuracy:    {accuracy:.4f}")
    print(f"   Precision:   {precision:.4f}  ({precision*100:.0f}% of flags are correct)")
    print(f"   Recall:      {recall:.4f}  ({recall*100:.0f}% of cases caught)")
    print(f"   F1 Score:    {f1:.4f}")
    
    print(f"\n📊 CALIBRATION:")
    print(f"   Brier Score: {brier:.4f}  (lower = better calibrated)")
    
    print(f"\n📋 CONFUSION MATRIX:")
    print(f"                 Predicted")
    print(f"                 Neg      Pos")
    print(f"   Actual Neg   {cm[0,0]:6d}   {cm[0,1]:6d}  (FP: {cm[0,1]} false alarms)")
    print(f"   Actual Pos   {cm[1,0]:6d}   {cm[1,1]:6d}  (FN: {cm[1,0]} missed)")
    
    return {
        'auc': auc, 'pr_auc': pr_auc, 'accuracy': accuracy,
        'precision': precision, 'recall': recall, 'f1': f1,
        'brier': brier, 'threshold': threshold
    }

# Evaluate admission model
metrics_adm_final = evaluate_final_model(
    y_test_adm, y_proba_adm, optimal_threshold_adm, "ADMISSION PREDICTION"
)


ADMISSION PREDICTION - FINAL EVALUATION

Threshold: 0.676

📊 DISCRIMINATION METRICS:
   AUC-ROC:     0.8495  (ranking quality)
   PR-AUC:      0.5117  (imbalance-aware)

📊 CLASSIFICATION METRICS (at threshold):
   Accuracy:    0.8429
   Precision:   0.4526  (45% of flags are correct)
   Recall:      0.6225  (62% of cases caught)
   F1 Score:    0.5241

📊 CALIBRATION:
   Brier Score: 0.1585  (lower = better calibrated)

📋 CONFUSION MATRIX:
                 Predicted
                 Neg      Pos
   Actual Neg     2422      335  (FP: 335 false alarms)
   Actual Pos      168      277  (FN: 168 missed)


In [18]:
# =============================================================================
# CELL 18: COMPREHENSIVE LWBS MODEL EVALUATION
# =============================================================================

# For LWBS, also show binary metrics for comparison
metrics_lwbs_final = evaluate_final_model(
    y_test_lwbs, y_proba_lwbs, optimal_threshold_lwbs, "LWBS PREDICTION"
)

print("\n💡 NOTE: For LWBS, focus on LIFT metrics (Cell 16), not binary metrics.")
print("   Binary precision will be low due to extreme imbalance.")
print("   The value is in RANKING, not classification.")


LWBS PREDICTION - FINAL EVALUATION

Threshold: 0.026

📊 DISCRIMINATION METRICS:
   AUC-ROC:     0.8531  (ranking quality)
   PR-AUC:      0.1858  (imbalance-aware)

📊 CLASSIFICATION METRICS (at threshold):
   Accuracy:    0.9007
   Precision:   0.0779  (8% of flags are correct)
   Recall:      0.5319  (53% of cases caught)
   F1 Score:    0.1359

📊 CALIBRATION:
   Brier Score: 0.0131  (lower = better calibrated)

📋 CONFUSION MATRIX:
                 Predicted
                 Neg      Pos
   Actual Neg     2859      296  (FP: 296 false alarms)
   Actual Pos       22       25  (FN: 22 missed)

💡 NOTE: For LWBS, focus on LIFT metrics (Cell 16), not binary metrics.
   Binary precision will be low due to extreme imbalance.
   The value is in RANKING, not classification.
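
A Brier score is easier to interpret next to a reference. A model that always predicts the base rate p scores p(1 - p), so the two Brier scores reported above can be compared with that value. The sketch below does this arithmetic; the helper names are hypothetical, the counts come from the two confusion matrices, and the base rates are taken from the test set, which is an assumption of this illustration.

```python
def brier_reference(n_positives, n_samples):
    """Brier score of always predicting the base rate p, which is p * (1 - p)."""
    p = n_positives / n_samples
    return p * (1.0 - p)

def brier_skill(brier_model, brier_ref):
    """Skill relative to the reference: > 0 beats it, < 0 is worse than it."""
    return 1.0 - brier_model / brier_ref

# Positives = FN + TP from each confusion matrix; Brier scores from the outputs.
admission = {"positives": 168 + 277, "samples": 3202, "brier": 0.1585}
lwbs = {"positives": 22 + 25, "samples": 3202, "brier": 0.0131}

for name, m in [("Admission", admission), ("LWBS", lwbs)]:
    ref = brier_reference(m["positives"], m["samples"])
    skill = brier_skill(m["brier"], ref)
    print(f"{name:<10} model={m['brier']:.4f}  reference={ref:.4f}  skill={skill:+.3f}")
```

On these numbers the admission model's Brier score sits above its base-rate reference, which suggests its probabilities may be poorly calibrated even though its ranking (AUC) is good. If the probabilities themselves are shown to staff, recalibration would be worth investigating.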


In [19]:
# =============================================================================
# CELL 19: FEATURE IMPORTANCE ANALYSIS
# =============================================================================

def plot_feature_importance(model, feature_names, title, top_n=20):
    """Plot feature importance with operational vs patient feature breakdown."""
    
    if hasattr(model, 'feature_importances_'):
        importance = model.feature_importances_
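    elif hasattr(model, 'coef_'):
        # Coefficient magnitudes are only comparable across features
        # that share a common scale (e.g. standardized inputs)
        importance = np.abs(model.coef_[0])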
    else:
        print("Model doesn't support feature importance")
        return None
    
    # Create dataframe
    imp_df = pd.DataFrame({
        'Feature': feature_names,
        'Importance': importance
    }).sort_values('Importance', ascending=False).head(top_n)
    
    # Mark feature types
    imp_df['Type'] = imp_df['Feature'].apply(
        lambda x: 'Operational' if x in OPERATIONAL_FEATURES else 
                  'Interaction' if x in INTERACTION_FEATURES else 'Patient'
    )
    
    # Plot
    colors = {'Patient': '#3B82F6', 'Interaction': '#10B981', 'Operational': '#F59E0B'}
    
    fig = px.bar(
        imp_df,
        x='Importance',
        y='Feature',
        color='Type',
        color_discrete_map=colors,
        orientation='h',
        title=f'<b>{title}</b>'
    )
    
    fig.update_layout(height=500, yaxis={'categoryorder': 'total ascending'})
    fig.show()
    
    # Calculate contribution by type
    total_imp = importance.sum()
    for ftype in ['Patient', 'Interaction', 'Operational']:
        type_features = [f for f in feature_names if 
                        (f in OPERATIONAL_FEATURES and ftype == 'Operational') or
                        (f in INTERACTION_FEATURES and ftype == 'Interaction') or
                        (f not in OPERATIONAL_FEATURES + INTERACTION_FEATURES and ftype == 'Patient')]
        type_imp = sum(importance[feature_names.index(f)] for f in type_features if f in feature_names)
        print(f"  {ftype} features: {type_imp/total_imp*100:.1f}% of importance")
    
    return imp_df

print("=" * 70)
print("FEATURE IMPORTANCE ANALYSIS")
print("=" * 70)

print("\nADMISSION MODEL:")
imp_adm = plot_feature_importance(tuned_model_adm, ALL_FEATURES, 
                                   "Admission Model - Top Features")

print("\nLWBS MODEL:")
imp_lwbs = plot_feature_importance(tuned_model_lwbs, ALL_FEATURES,
                                   "LWBS Model - Top Features")

FEATURE IMPORTANCE ANALYSIS

ADMISSION MODEL:


  Patient features: 37.1% of importance
  Interaction features: 20.1% of importance
  Operational features: 42.8% of importance

LWBS MODEL:


  Patient features: 26.9% of importance
  Interaction features: 1.5% of importance
  Operational features: 71.6% of importance


---
## Section 7: Operational Integration

**How Triage Lead Uses Both Models Together**

---

In [20]:
# =============================================================================
# CELL 20: COMBINED RISK SCORING FUNCTION
# =============================================================================

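def predict_patient_risk(patient_features: dict, 
                         admission_model, admission_threshold,
                         lwbs_model, lwbs_threshold,
                         feature_list: list,
                         lwbs_medium_threshold=None) -> dict:
    """
    Generate combined risk assessment for a single patient.
    
    This is what runs at triage time for each patient.

    lwbs_threshold is the HIGH-tier cutoff and lwbs_medium_threshold the
    optional MEDIUM-tier cutoff. Both should be precomputed from a reference
    score distribution, e.g. np.percentile(y_proba_lwbs, 95) and
    np.percentile(y_proba_lwbs, 85).
    """
    
    # Convert to array
    X = np.array([[patient_features.get(f, 0) for f in feature_list]])
    
    # Get predictions
    adm_prob = admission_model.predict_proba(X)[0, 1]
    lwbs_prob = lwbs_model.predict_proba(X)[0, 1]
    
    # Determine risk levels
    adm_risk = 'HIGH' if adm_prob >= admission_threshold else \
               'MEDIUM' if adm_prob >= admission_threshold * 0.7 else 'LOW'
    
    # Compare against fixed cutoffs: a percentile of the single value
    # [lwbs_prob] equals lwbs_prob, which would flag every patient as HIGH.
    if lwbs_prob >= lwbs_threshold:
        lwbs_risk = 'HIGH'
    elif lwbs_medium_threshold is not None and lwbs_prob >= lwbs_medium_threshold:
        lwbs_risk = 'MEDIUM'
    else:
        lwbs_risk = 'LOW'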
    
    # Combined priority score (for queue ordering)
    priority_score = (adm_prob * 0.6) + (lwbs_prob * 0.4)
    
    return {
        'admission_probability': adm_prob,
        'admission_risk': adm_risk,
        'lwbs_probability': lwbs_prob,
        'lwbs_risk': lwbs_risk,
        'priority_score': priority_score,
        'recommended_actions': get_recommended_actions(adm_risk, lwbs_risk)
    }

def get_recommended_actions(adm_risk: str, lwbs_risk: str) -> list:
    """Generate actionable recommendations based on risk levels."""
    
    actions = []
    
    if adm_risk == 'HIGH':
        actions.append("🛏️ Start bed search / notify admitting")
        actions.append("📞 Consider early consult request")
    elif adm_risk == 'MEDIUM':
        actions.append("👁️ Monitor for admission indicators")
    
    if lwbs_risk == 'HIGH':
        actions.append("⏰ Check on patient after 20 min")
        actions.append("💬 Proactively communicate wait time")
    elif lwbs_risk == 'MEDIUM':
        actions.append("📋 Add to LWBS watch list")
    
    if not actions:
        actions.append("✓ Standard care pathway")
    
    return actions

print("✓ Combined risk scoring functions defined")

✓ Combined risk scoring functions defined
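
Percentile-based LWBS tiers need cutoffs taken from a reference distribution of scores, because the percentile of a single value is just that value. The sketch below, in plain Python, derives the 95th and 85th percentile cutoffs once and then assigns a tier to individual scores. The reference scores are synthetic stand-ins for held-out LWBS probabilities, and `percentile` and `lwbs_tier` are hypothetical helpers.

```python
def percentile(values, q):
    """Linear-interpolation percentile (the convention NumPy uses by default)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def lwbs_tier(prob, high_cutoff, medium_cutoff):
    """Map one LWBS probability to a dashboard tier using fixed cutoffs."""
    if prob >= high_cutoff:
        return "TOP 5%"
    if prob >= medium_cutoff:
        return "TOP 15%"
    return "STANDARD"

# Synthetic reference scores: 0.001, 0.002, ..., 0.100 (illustrative only).
reference_scores = [i / 1000 for i in range(1, 101)]
high_cutoff = percentile(reference_scores, 95)
medium_cutoff = percentile(reference_scores, 85)

# A one-element "distribution" returns the element itself, so it cannot rank.
single = 0.2
degenerate = percentile([single], 95)

for prob in (0.010, 0.090, 0.097):
    print(f"score {prob:.3f} -> {lwbs_tier(prob, high_cutoff, medium_cutoff)}")
```

Computing the cutoffs once, on historical or held-out scores, keeps triage-time scoring to two comparisons per patient; the cutoffs would need refreshing if the score distribution drifts.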


In [21]:
# =============================================================================
# CELL 21: DEMONSTRATION - SAMPLE PATIENT PREDICTIONS
# =============================================================================

print("=" * 70)
print("SAMPLE PATIENT PREDICTIONS")
print("=" * 70)

# Get a few sample patients from test set
sample_indices = [0, 100, 500, 1000, 1500]

for idx in sample_indices:
    if idx < len(X_test_adm):
        patient = X_test_adm.iloc[idx]
        actual_adm = y_test_adm.iloc[idx]
        actual_lwbs = y_test_lwbs.iloc[idx] if idx < len(y_test_lwbs) else 'N/A'
        
        # Get predictions
        adm_prob = tuned_model_adm.predict_proba(patient.values.reshape(1, -1))[0, 1]
        lwbs_prob = tuned_model_lwbs.predict_proba(patient.values.reshape(1, -1))[0, 1]
        
        adm_flag = "🔴 HIGH" if adm_prob >= optimal_threshold_adm else \
                   "🟡 MED" if adm_prob >= optimal_threshold_adm * 0.7 else "🟢 LOW"
        
        print(f"\nPatient {idx}:")
        print(f"  Age: {patient['age_scaled']*100:.0f}, Triage: CTAS {patient['triage_code_clean']:.0f}")
        print(f"  Ambulance: {'Yes' if patient['is_ambulance'] else 'No'}, Peak: {'Yes' if patient['is_peak_hours'] else 'No'}")
        print(f"  Admission: {adm_flag} ({adm_prob:.1%}) | Actual: {'Admitted' if actual_adm else 'Discharged'}")
        print(f"  LWBS Risk: {lwbs_prob:.1%} | Actual: {'LWBS' if actual_lwbs == 1 else 'Stayed'}")

SAMPLE PATIENT PREDICTIONS

Patient 0:
  Age: 82, Triage: CTAS 3
  Ambulance: Yes, Peak: Yes
  Admission: 🔴 HIGH (79.9%) | Actual: Admitted
  LWBS Risk: 0.1% | Actual: Stayed

Patient 100:
  Age: 18, Triage: CTAS 3
  Ambulance: No, Peak: Yes
  Admission: 🟢 LOW (18.2%) | Actual: Discharged
  LWBS Risk: 1.2% | Actual: Stayed

Patient 500:
  Age: 58, Triage: CTAS 2
  Ambulance: Yes, Peak: Yes
  Admission: 🔴 HIGH (82.4%) | Actual: Admitted
  LWBS Risk: 0.2% | Actual: Stayed

Patient 1000:
  Age: 40, Triage: CTAS 3
  Ambulance: No, Peak: No
  Admission: 🟡 MED (49.1%) | Actual: Admitted
  LWBS Risk: 2.9% | Actual: Stayed

Patient 1500:
  Age: 45, Triage: CTAS 2
  Ambulance: Yes, Peak: Yes
  Admission: 🔴 HIGH (90.6%) | Actual: Discharged
  LWBS Risk: 0.2% | Actual: Stayed


In [22]:
# =============================================================================
# CELL 22: FINAL SUMMARY
# =============================================================================

print("\n" + "=" * 70)
print("📊 FINAL MODEL SUMMARY")
print("=" * 70)

print(f"""
┌─────────────────────────────────────────────────────────────────────────┐
│                       ADMISSION PREDICTION                              │
├─────────────────────────────────────────────────────────────────────────┤
│  Best Model:    {best_model_name_adm:<50} │
│  AUC-ROC:       {metrics_adm_final['auc']:.4f}                                               │
│  PR-AUC:        {metrics_adm_final['pr_auc']:.4f}                                               │
│  Precision:     {metrics_adm_final['precision']:.4f} ({metrics_adm_final['precision']*100:.0f}% of flags correct)                       │
│  Recall:        {metrics_adm_final['recall']:.4f} ({metrics_adm_final['recall']*100:.0f}% of admissions caught)                    │
│  Threshold:     {metrics_adm_final['threshold']:.3f}                                                │
└─────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────┐
│                        LWBS PREDICTION                                  │
├─────────────────────────────────────────────────────────────────────────┤
│  Best Model:    {best_model_name_lwbs:<50} │
│  AUC-ROC:       {metrics_lwbs_final['auc']:.4f}                                               │
│  Top 5% Lift:   {lwbs_lift_results[2]['lift']:.1f}x                                                  │
│  Top 5% Capture:{lwbs_lift_results[2]['capture_rate']*100:.0f}% of LWBS cases                                   │
│  Top 10% Capture:{lwbs_lift_results[3]['capture_rate']*100:.0f}% of LWBS cases                                  │
│  Approach:      Risk tiers (not binary)                                 │
└─────────────────────────────────────────────────────────────────────────┘
""")

print("\n" + "=" * 70)
print("💡 HOW TRIAGE LEAD USES BOTH MODELS")
print("=" * 70)

print("""
┌─────────────────────────────────────────────────────────────────────────┐
│ AT TRIAGE:                                                              │
│                                                                         │
│   1. Patient data entered → Both models run in parallel (<100ms)        │
│                                                                         │
│   2. Dashboard displays:                                                │
│      • Admission Risk:  🔴 HIGH / 🟡 MEDIUM / 🟢 LOW                    │
│      • LWBS Risk Tier:  TOP 5% / TOP 15% / STANDARD                     │
│                                                                         │
│   3. Triage Lead acts:                                                  │
│      • HIGH Admission → Start bed search, early consult                 │
│      • TOP 5% LWBS → Proactive check after 20 min                       │
│      • Both risks → Highest priority attention                          │
│                                                                         │
│ VALUE: Focus limited resources on highest-impact patients               │
└─────────────────────────────────────────────────────────────────────────┘
""")


📊 FINAL MODEL SUMMARY

┌─────────────────────────────────────────────────────────────────────────┐
│                       ADMISSION PREDICTION                              │
├─────────────────────────────────────────────────────────────────────────┤
│  Best Model:    Logistic Regression                                │
│  AUC-ROC:       0.8495                                               │
│  PR-AUC:        0.5117                                               │
│  Precision:     0.4526 (45% of flags correct)                       │
│  Recall:        0.6225 (62% of admissions caught)                    │
│  Threshold:     0.676                                  

In [23]:
# =============================================================================
# CELL 23: ANSWER TO STRATEGIC QUESTION
# =============================================================================

print("=" * 70)
print("STRATEGIC RECOMMENDATION: SINGLE vs MULTI-TARGET")
print("=" * 70)

print("""
QUESTION: Should we optimize ONE target or MULTIPLE targets?

ANSWER: BUILD TWO OPTIMIZED MODELS (Admission + LWBS)

JUSTIFICATION:

┌─────────────────┬─────────────────────────────────────────────────────────┐
│ Criterion       │ Two Models Advantage                                    │
├─────────────────┼─────────────────────────────────────────────────────────┤
│ Workflow Impact │ Different actions: Admission → beds, LWBS → monitoring  │
│ Decision Latency│ Both run in parallel (<100ms total)                     │
│ Interpretability│ Clear separation: "admission risk" vs "LWBS risk"       │
│ Actionability   │ Each triggers distinct, non-conflicting interventions   │
└─────────────────┴─────────────────────────────────────────────────────────┘

WHY NOT SINGLE TARGET?

• Admission-only: Misses LWBS risk entirely (different patients)
• LWBS-only: Ignores bed planning (biggest operational need)
• PIA-only: Weak predictive signal (R²=0.15), limited actionability

WHY NOT PIA?

• R² = 0.15 means 85% of variance unexplained
• Cannot reliably communicate wait times to patients
• Less actionable than admission/LWBS decisions

TRADE-OFFS OF TWO-MODEL APPROACH:

• +More complete risk picture
• +Enables different interventions
• -Slightly more complexity to deploy
• -Two thresholds to calibrate

CONCLUSION: The marginal complexity is worth the operational value.
""")

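
The "distinct, non-conflicting interventions" argument can be made concrete with a small routing function. This is a minimal sketch, not the notebook's dashboard logic: the function name, the percentile convention (1.0 means highest LWBS risk among current patients) and the default top fraction are assumptions, while the action strings follow the Triage Lead workflow printed earlier in this module.

```python
def triage_actions(adm_prob, adm_threshold, lwbs_percentile, lwbs_top_fraction=0.05):
    """Map the two model outputs to Triage Lead actions.

    lwbs_percentile is the patient's rank among current LWBS scores,
    where 1.0 is the highest risk (an assumed convention).
    """
    high_admission = adm_prob >= adm_threshold
    high_lwbs = lwbs_percentile >= 1.0 - lwbs_top_fraction

    actions = []
    if high_admission:
        actions.append("Start bed search, early consult")
    if high_lwbs:
        actions.append("Proactive check after 20 min")
    if high_admission and high_lwbs:
        # Both risks present: escalate ahead of the individual actions.
        actions.insert(0, "Highest priority attention")
    return actions


# Example values are invented; 0.676 is the admission threshold from the summary.
both = triage_actions(adm_prob=0.80, adm_threshold=0.676, lwbs_percentile=0.97)
neither = triage_actions(adm_prob=0.20, adm_threshold=0.676, lwbs_percentile=0.50)
print(both)
print(neither)
```

Because each model contributes its own action, a patient flagged by only one model still receives exactly the intervention that model is meant to trigger.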

In [24]:
# =============================================================================
# CELL 24: EXPORT MODELS FOR DASHBOARD
# =============================================================================

# Store final models and configuration
FINAL_MODELS = {
    'admission': {
        'model': tuned_model_adm,
        'threshold': optimal_threshold_adm,
        'model_type': best_model_name_adm,
        'features': ALL_FEATURES,
        'metrics': metrics_adm_final
    },
    'lwbs': {
        'model': tuned_model_lwbs,
        'threshold': optimal_threshold_lwbs,
        'model_type': best_model_name_lwbs,
        'features': ALL_FEATURES,
        'metrics': metrics_lwbs_final,
        'lift_results': lwbs_lift_results
    }
}

# Feature engineering function reference
FEATURE_ENGINEERING_FN = engineer_features_production

print("✓ Models exported to FINAL_MODELS dictionary")
print("\nUsage:")
print("  admission_model = FINAL_MODELS['admission']['model']")
print("  threshold = FINAL_MODELS['admission']['threshold']")
print("  features = FINAL_MODELS['admission']['features']")

✓ Models exported to FINAL_MODELS dictionary

Usage:
  admission_model = FINAL_MODELS['admission']['model']
  threshold = FINAL_MODELS['admission']['threshold']
  features = FINAL_MODELS['admission']['features']
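
The usage lines above can be extended into a scoring helper for the dashboard. The sketch below assumes each model exposes a scikit-learn style `predict_proba` and that inputs arrive as a plain dictionary of already-engineered features. `StubModel`, `score_patient`, `demo_models` and the feature names are hypothetical stand-ins so the example runs on its own; in the notebook, raw records would presumably pass through `FEATURE_ENGINEERING_FN` first, whose signature is not shown here.

```python
class StubModel:
    """Stand-in for a fitted scikit-learn classifier (hypothetical)."""

    def __init__(self, positive_prob):
        self.positive_prob = positive_prob

    def predict_proba(self, rows):
        # Same shape as scikit-learn: one [P(class 0), P(class 1)] pair per row.
        return [[1.0 - self.positive_prob, self.positive_prob] for _ in rows]


def score_patient(final_models, target, feature_row):
    """Return (probability, flagged) for one patient and one target."""
    entry = final_models[target]
    # Order the inputs exactly as listed under the 'features' key.
    ordered = [feature_row[name] for name in entry["features"]]
    prob = entry["model"].predict_proba([ordered])[0][1]
    return prob, prob >= entry["threshold"]


# Placeholder dictionary with the same keys as FINAL_MODELS; values are invented
# except the 0.676 threshold, which is the one reported in the model summary.
demo_models = {
    "admission": {
        "model": StubModel(0.72),
        "threshold": 0.676,
        "features": ["age", "triage_level"],
    }
}

prob, flagged = score_patient(
    demo_models, "admission", {"triage_level": 2, "age": 81}
)
print(prob, flagged)
```

Reading the feature order from the dictionary rather than from the caller keeps the dashboard from silently passing columns in the wrong order.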


In [25]:
# =============================================================================
# CELL 25: MODULE SUMMARY
# =============================================================================

print("""
================================================================================
MODULE 5: MACHINE LEARNING - COMPLETE
================================================================================

WHAT WE BUILT:
────────────────────────────────────────────────────────────────────────────────
1. Comprehensive feature engineering (48 features)
   • Patient demographics + triage assessment
   • Temporal patterns + zone characteristics
   • Operational context (historical patterns) ← KEY INNOVATION
   • Domain-driven interaction features

2. Rigorous model comparison (5 algorithms)
   • Logistic Regression (baseline)
   • Random Forest
   • Gradient Boosting
   • XGBoost (if available)
   • LightGBM (if available)

3. Hyperparameter tuning (RandomizedSearchCV)
   • 5-fold stratified cross-validation
   • 40 parameter combinations tested

4. Threshold optimization
   • Admission: Balanced precision/recall
   • LWBS: Lift-based tiers (not binary)

5. Production-ready outputs
   • FINAL_MODELS dictionary with all artifacts
   • Feature engineering function
   • Combined risk scoring for dashboard

KEY RESULTS:
────────────────────────────────────────────────────────────────────────────────
• Admission: AUC ~0.85, Precision ~45%, Recall ~62%
• LWBS: AUC ~0.86, Top 5% captures ~40% of LWBS (8x lift)
• Both models run in parallel for complete risk picture

RECOMMENDATION:
────────────────────────────────────────────────────────────────────────────────
Deploy BOTH models as complementary decision support tools.
They serve different purposes and trigger different actions.
================================================================================
""")

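The LWBS result in the summary ("Top 5% captures ~40% of LWBS (8x lift)") follows from simple arithmetic: lift is the share of positives captured divided by the share of patients reviewed, so 0.40 / 0.05 = 8. The sketch below shows one way to compute it; `lift_at_top` and the toy data are hypothetical and are not the notebook's `lwbs_lift_results` code.

```python
def lift_at_top(y_true, y_score, fraction):
    """Capture rate and lift for the top `fraction` of patients by score."""
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("lift is undefined without any positive cases")
    n_top = max(1, int(round(len(y_true) * fraction)))
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    positives_in_top = sum(label for _, label in ranked[:n_top])
    capture_rate = positives_in_top / total_positives
    lift = capture_rate / (n_top / len(y_true))
    return capture_rate, lift


# Toy data, invented for illustration: 100 patients, 5 LWBS cases,
# 2 of which land in the 5 highest-scored patients.
n = 100
y_score = [i / n for i in range(n)]
positive_idx = {99, 98, 50, 30, 10}
y_true = [1 if i in positive_idx else 0 for i in range(n)]

capture_rate, lift = lift_at_top(y_true, y_score, 0.05)
print(f"capture={capture_rate:.0%} lift={lift:.1f}x")
```

Defined this way, lift does not depend on the base rate, which makes it a convenient tiering metric for a rare outcome such as LWBS.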

MODULE 5: MACHINE LEARNING - COMPLETE

WHAT WE BUILT:
────────────────────────────────────────────────────────────────────────────────
1. Comprehensive feature engineering (48 features)
   • Patient demographics + triage assessment
   • Temporal patterns + zone characteristics
   • Operational context (historical patterns) ← KEY INNOVATION
   • Domain-driven interaction features

2. Rigorous model comparison (5 algorithms)
   • Logistic Regression (baseline)
   • Random Forest
   • Gradient Boosting
   • XGBoost (if available)
   • LightGBM (if available)

3. Hyperparameter tuning (RandomizedSearchCV)
   • 5-fold stratified cross-validation
   • 40 parameter combinations tested

4. Threshold optimization
   • Admission: Balanced precision/recall
   • LWBS: Lift-based tiers (not binary)

5. Produ