# ICU Patient Deterioration Detection - Data Exploration & Feature Extraction

**Project Goal:** Build an explainable early warning system that predicts ICU patient deterioration using machine learning and generates clinical explanations using LLMs.

**This Notebook:**
- Explores the MIMIC-III Waveform Database
- Filters for usable segments (>= 4 hours with both ECG and arterial BP signals)
- Extracts vital sign features for ML training

**Dataset:** MIMIC-III Waveform Database (adult ICU patients)

## Step 1: Load Adult Patient List

The RECORDS-adults file contains paths to all adult ICU patients in the database. We'll load this list to identify which patients we've downloaded.

In [2]:
import os

# Load adult patient list
with open('RECORDS-adults', 'r') as f:
    adult_patients = [line.strip() for line in f]

# Check which ones you actually downloaded
downloaded = []
for patient in adult_patients:
    # Remove trailing slash and split
    patient_clean = patient.rstrip('/')
    parts = patient_clean.split('/')
    
    if len(parts) == 2:
        folder, patient_id = parts
        if os.path.exists(f'data/{folder}/{patient_id}'):
            downloaded.append(patient_clean)

print(f"Total adult patients available: {len(adult_patients)}")
print(f"You downloaded: {len(downloaded)} adult patients")
print(f"\nFirst 10 you downloaded:")
for i in range(min(10, len(downloaded))):
    print(downloaded[i])

Total adult patients available: 59344
You downloaded: 2273 adult patients

First 10 you downloaded:
30/3000003
30/3000031
30/3000060
30/3000063
30/3000065
30/3000086
30/3000100
30/3000103
30/3000105
30/3000125
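
The path handling in the cell above can be isolated into a small helper. This is only a sketch: `parse_record_entry` is a hypothetical name, and it assumes every line of `RECORDS-adults` has the form `folder/patient_id/`, as the printed entries suggest.

```python
def parse_record_entry(line):
    """Split a RECORDS-style line such as '30/3000003/' into (folder, patient_id).

    Returns None for blank or unexpectedly shaped lines instead of raising.
    """
    parts = line.strip().rstrip('/').split('/')
    if len(parts) != 2 or not all(parts):
        return None
    return parts[0], parts[1]


print(parse_record_entry('30/3000003/\n'))  # ('30', '3000003')
print(parse_record_entry(''))               # None
```

Returning `None` for malformed lines keeps the download check explicit about what it skipped, rather than silently ignoring entries whose length is not 2.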


## Step 2: Identify Downloaded Patients

Check which adult patients from the RECORDS-adults list actually exist in our local data folder. We downloaded folders 30-39, giving us access to thousands of patients, but we'll work with a manageable subset.

In [3]:
# Let's pick the first 30 patients you downloaded
working_patients = downloaded[:30]

print(f"Working with: {len(working_patients)} patients")
print("\nYour working set:")
for p in working_patients:
    print(p)

Working with: 30 patients

Your working set:
30/3000003
30/3000031
30/3000060
30/3000063
30/3000065
30/3000086
30/3000100
30/3000103
30/3000105
30/3000125
30/3000126
30/3000142
30/3000154
30/3000189
30/3000190
30/3000203
30/3000221
30/3000282
30/3000336
30/3000393
30/3000397
30/3000428
30/3000435
30/3000458
30/3000480
30/3000484
30/3000497
30/3000531
30/3000577
30/3000596


## Step 3: Select Working Set of Patients

For this project, we'll work with 30 patients. This is manageable within our 4-week timeline while still providing sufficient data for ML training (~100-150 segments expected).

In [4]:
import wfdb

# Find all segments for your working patients
all_segments = []

for patient in working_patients[:5]:  # Start with just 5 to test
    folder, patient_id = patient.split('/')
    patient_path = f'data/{folder}/{patient_id}'
    
    # Find all segment files
    files = os.listdir(patient_path)
    segment_files = [f for f in files if f.endswith('.hea') and '_' in f]
    
    print(f"\n{patient_id}:")
    for seg_file in segment_files[:5]:  # Show first 5 segments
        seg_name = seg_file.replace('.hea', '')
        print(f"  - {seg_name}")


3000003:
  - 3000003_0001
  - 3000003_0002
  - 3000003_0003
  - 3000003_0004
  - 3000003_0005

3000031:

3000060:
  - 3000060_0001
  - 3000060_0002
  - 3000060_0003
  - 3000060_0004
  - 3000060_0005

3000063:
  - 3000063_0001
  - 3000063_0002
  - 3000063_0003
  - 3000063_0004
  - 3000063_0005

3000065:
  - 3000065_0001
  - 3000065_layout
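
Note that the `'_' in f` filter also lets the layout header through (`3000065_layout` above). A stricter filter can match numbered segment headers only. The sketch below assumes segment headers are always named `<patient_id>_<digits>.hea`, as in this listing; `segment_names` is a hypothetical helper.

```python
import re

# Matches e.g. '3000003_0001.hea' but not '3000065_layout.hea'
SEGMENT_HEADER = re.compile(r'^(\d+_\d+)\.hea$')


def segment_names(filenames):
    """Return sorted segment names (without '.hea') from a directory listing."""
    names = []
    for name in filenames:
        match = SEGMENT_HEADER.match(name)
        if match:
            names.append(match.group(1))
    return sorted(names)


listing = ['3000065_layout.hea', '3000065_0001.hea', '3000065_0001.dat']
print(segment_names(listing))  # ['3000065_0001']
```

Sorting also makes the segment order deterministic, since `os.listdir` returns entries in arbitrary order.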


## Step 4: Explore Segments and Filter for Usability

Each patient has multiple recording segments representing different time periods during their ICU stay.

**What we're checking:**
- Which segments exist for each patient
- Segment duration (need >= 4 hours to see trends)
- Available signals (need ECG for heart rate + ABP for blood pressure)

**Usability criteria:**
- ✅ Duration >= 4 hours
- ✅ Has ECG signal (II or ECG)
- ✅ Has ABP signal (blood pressure)

Many segments are too short or missing required signals.
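
The three criteria can be written as one predicate over header fields. This is a sketch under assumptions: `is_usable` is a hypothetical helper, the accepted channel labels are guesses based on the names seen in this dataset, and it matches names exactly, whereas the notebook's cells use substring matching.

```python
ECG_NAMES = {'II', 'ECG'}   # assumed ECG channel labels
BP_NAMES = {'ABP'}          # assumed arterial pressure label


def is_usable(sig_len, fs, sig_names, min_hours=4.0):
    """True if the segment is long enough and has both an ECG and an ABP channel."""
    duration_hrs = sig_len / fs / 3600
    has_ecg = any(name in ECG_NAMES for name in sig_names)
    has_bp = any(name in BP_NAMES for name in sig_names)
    return duration_hrs >= min_hours and has_ecg and has_bp


# A 5-hour recording at 125 Hz with ECG and ABP
print(is_usable(5 * 3600 * 125, 125, ['II', 'V', 'ABP']))  # True
# Long enough, but no blood pressure channel
print(is_usable(11 * 3600 * 125, 125, ['II']))             # False
```

Because only the sample count, sampling rate, and signal names are needed, these values could probably be read from the header alone (for example with `wfdb.rdheader`) without loading the waveform samples.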

In [5]:
# Check segment durations and signals
usable_segments = []

for patient in working_patients[:5]:  # Test with first 5
    folder, patient_id = patient.split('/')
    patient_path = f'data/{folder}/{patient_id}'
    
    files = os.listdir(patient_path)
    segment_files = [f.replace('.hea', '') for f in files if f.endswith('.hea') and '_' in f and 'layout' not in f]
    
    print(f"\n{patient_id}:")
    
    for seg_name in segment_files:
        try:
            # Load segment
            record = wfdb.rdrecord(f'{patient_path}/{seg_name}')
            duration_hrs = record.sig_len / record.fs / 3600
            
            # Check for ECG and BP
            has_ecg = any('II' in s or 'ECG' in s for s in record.sig_name)
            has_bp = any('ABP' in s or 'BP' in s for s in record.sig_name)
            
            status = "✅" if (duration_hrs >= 4 and has_ecg and has_bp) else "❌"
            
            print(f"  {status} {seg_name}: {duration_hrs:.2f}h, signals: {record.sig_name}")
            
            if duration_hrs >= 4 and has_ecg and has_bp:
                usable_segments.append(f"{patient}/{seg_name}")
                
        except Exception:  # a bare except would also swallow KeyboardInterrupt
            print(f"  ⚠️  {seg_name}: Error loading")

print(f"\n\n✅ Total usable segments (>= 4hrs with ECG+BP): {len(usable_segments)}")


3000003:
  ❌ 3000003_0001: 0.04h, signals: ['II', 'V']
  ❌ 3000003_0002: 0.00h, signals: ['II', 'V']
  ❌ 3000003_0003: 0.00h, signals: ['II', 'V']
  ❌ 3000003_0004: 0.00h, signals: ['II', 'V']
  ❌ 3000003_0005: 0.98h, signals: ['II', 'V']
  ❌ 3000003_0006: 0.01h, signals: ['II', 'V', 'ABP']
  ✅ 3000003_0007: 4.79h, signals: ['II', 'V', 'ABP']
  ❌ 3000003_0008: 0.48h, signals: ['II', 'V', 'ABP']
  ❌ 3000003_0009: 1.95h, signals: ['II', 'V', 'ABP']
  ❌ 3000003_0010: 2.99h, signals: ['II', 'V', 'ABP']
  ❌ 3000003_0011: 0.00h, signals: ['II', 'V', 'ABP']
  ❌ 3000003_0012: 0.00h, signals: ['II', 'V', 'ABP']
  ❌ 3000003_0013: 1.16h, signals: ['II', 'V', 'ABP']
  ✅ 3000003_0014: 5.33h, signals: ['II', 'ABP']
  ❌ 3000003_0015: 11.58h, signals: ['II']
  ❌ 3000003_0016: 11.43h, signals: ['II']
  ❌ 3000003_0017: 0.22h, signals: ['II', 'V']

3000031:

3000060:
  ❌ 3000060_0001: 0.08h, signals: ['II']
  ❌ 3000060_0002: 0.77h, signals: ['II', 'PLETH']
  ❌ 300

## Step 6: Scan All 30 Working Patients

Now we systematically check all 30 patients to find every usable segment that meets our criteria.

**This creates our final dataset:**
- Each usable segment becomes one training example
- Expected: ~30-50 segments from 30 patients
- Result: Our ML-ready dataset for feature extraction

In [6]:
# Check all 30 working patients
all_usable_segments = []

for patient in working_patients:
    folder, patient_id = patient.split('/')
    patient_path = f'data/{folder}/{patient_id}'
    
    try:
        files = os.listdir(patient_path)
        segment_files = [f.replace('.hea', '') for f in files 
                        if f.endswith('.hea') and '_' in f and 'layout' not in f]
        
        for seg_name in segment_files:
            try:
                record = wfdb.rdrecord(f'{patient_path}/{seg_name}')
                duration_hrs = record.sig_len / record.fs / 3600
                
                has_ecg = any('II' in s or 'ECG' in s for s in record.sig_name)
                has_bp = any('ABP' in s or 'BP' in s for s in record.sig_name)
                
                if duration_hrs >= 4 and has_ecg and has_bp:
                    all_usable_segments.append({
                        'patient': patient,
                        'segment': seg_name,
                        'duration': duration_hrs,
                        'signals': record.sig_name
                    })
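            except Exception:  # skip unreadable segments, but let Ctrl-C through
                pass
    except Exception:  # skip patients whose folder cannot be listed
        pass

print(f"✅ Total usable segments from 30 patients: {len(all_usable_segments)}")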
print(f"\nFirst 10:")
for i, seg in enumerate(all_usable_segments[:10]):
    print(f"{i+1}. {seg['patient']}/{seg['segment']} - {seg['duration']:.1f}h")

✅ Total usable segments from 30 patients: 34

First 10:
1. 30/3000003/3000003_0007 - 4.8h
2. 30/3000003/3000003_0014 - 5.3h
3. 30/3000063/3000063_0020 - 10.6h
4. 30/3000063/3000063_0022 - 13.0h
5. 30/3000063/3000063_0025 - 5.3h
6. 30/3000063/3000063_0029 - 19.7h
7. 30/3000126/3000126_0007 - 14.1h
8. 30/3000126/3000126_0009 - 7.0h
9. 30/3000126/3000126_0012 - 13.9h
10. 30/3000126/3000126_0013 - 18.1h


## Step 7: Save Usable Segments List

Save the list of usable segments to a CSV file so we can easily load it in future sessions without re-scanning all patients.

This gives us a permanent record of our 34 usable segments.

In [7]:
# Save the usable segments list
import pandas as pd

df = pd.DataFrame(all_usable_segments)
df.to_csv('usable_segments.csv', index=False)

print("✅ Saved to: usable_segments.csv")
print(f"\nYou have {len(df)} segments to work with!")

✅ Saved to: usable_segments.csv

You have 34 segments to work with!
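
One caveat when reloading this file: the `signals` column holds Python lists, which CSV stores in their string form (for example `"['II', 'V', 'ABP']"`). A minimal sketch of restoring them, using only the standard library and an in-memory stand-in for the file:

```python
import ast
import csv
import io

# Stand-in for usable_segments.csv, using one row from the results above
csv_text = (
    "patient,segment,duration,signals\n"
    "30/3000003,3000003_0007,4.79,\"['II', 'V', 'ABP']\"\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    row['duration'] = float(row['duration'])
    # literal_eval safely parses the list's string form back into a list
    row['signals'] = ast.literal_eval(row['signals'])

print(rows[0]['signals'])  # ['II', 'V', 'ABP']
```

With pandas, the same idea would be `df['signals'].apply(ast.literal_eval)` after `pd.read_csv`.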


# Feature Extraction

Now we'll extract meaningful features from the raw waveform data. Each segment contains millions of samples (a 4-hour recording at 125 Hz has 1.8 million samples per signal), which we compress into about 20 summary features that ML models can use.

## Step 8: Load Example Segment

Load the first usable segment to develop and test our feature extraction pipeline.

**What we'll see:**
- Patient ID and segment name
- Recording duration
- Available signals (ECG, BP, etc.)
- Sampling rate

In [8]:
import numpy as np
import wfdb
import matplotlib.pyplot as plt

# Load the first usable segment
seg = all_usable_segments[0]
patient_path = f"data/{seg['patient']}"  # seg['patient'] is already 'folder/patient_id'
seg_name = seg['segment']

print(f"Loading: {seg['patient']}/{seg_name}")
print(f"Duration: {seg['duration']:.2f} hours")

# Load the record
record = wfdb.rdrecord(f"{patient_path}/{seg_name}")
print(f"Signals: {record.sig_name}")
print(f"Sampling rate: {record.fs} Hz")

Loading: 30/3000003/3000003_0007
Duration: 4.79 hours
Signals: ['II', 'V', 'ABP']
Sampling rate: 125 Hz


## Step 9: Extract Heart Rate from ECG

**Process:**
1. Load ECG signal (Lead II)
2. Detect R-peaks (the tall spikes in ECG = heartbeats)
3. Calculate time between peaks (RR intervals)
4. Convert to beats per minute (BPM)

**Features extracted:**
- Mean HR, Min HR, Max HR, Standard deviation

These statistics tell us if the heart rate is normal, abnormal, or highly variable.

In [9]:
# Get ECG signal (Lead II)
ecg = record.p_signal[:, 0]  # First signal = Lead II

# Simple heart rate extraction
# Count peaks in ECG to get heart rate
from scipy.signal import find_peaks

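# Find R-peaks in ECG (the tall spikes)
# Note: distance enforces at least 0.5 s between peaks, so this detector
# cannot report heart rates above roughly 120 bpm.
peaks, _ = find_peaks(ecg, distance=record.fs*0.5, height=0.3)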

# Calculate heart rate from peaks
# Time between peaks = RR interval
rr_intervals = np.diff(peaks) / record.fs  # in seconds
heart_rates = 60 / rr_intervals  # convert to beats per minute

print(f"Total heartbeats detected: {len(peaks)}")
print(f"Heart rate statistics:")
print(f"  Mean: {np.mean(heart_rates):.1f} bpm")
print(f"  Min: {np.min(heart_rates):.1f} bpm")
print(f"  Max: {np.max(heart_rates):.1f} bpm")
print(f"  Std: {np.std(heart_rates):.1f} bpm")

Total heartbeats detected: 19887
Heart rate statistics:
  Mean: 69.5 bpm
  Min: 31.2 bpm
  Max: 119.0 bpm
  Std: 4.3 bpm
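
The conversion from peak positions to heart rate is plain arithmetic, sketched below with made-up peak indices. It also illustrates a side effect of `distance=record.fs*0.5`: peaks closer than 0.5 s apart are never reported, so at 125 Hz the highest rate this detector can produce is about 119 bpm if the 62.5-sample spacing is rounded up to 63 samples, which is consistent with the maximum printed above. `peaks_to_bpm` is a hypothetical helper.

```python
def peaks_to_bpm(peak_indices, fs):
    """Convert R-peak sample indices to instantaneous heart rates in bpm."""
    rates = []
    for prev, curr in zip(peak_indices, peak_indices[1:]):
        rr_seconds = (curr - prev) / fs   # RR interval in seconds
        rates.append(60.0 / rr_seconds)
    return rates


fs = 125
# Illustrative peaks: 1.0 s apart, then 0.8 s, then a 63-sample gap
peaks = [0, 125, 225, 288]
print([round(r, 1) for r in peaks_to_bpm(peaks, fs)])  # [60.0, 75.0, 119.0]
```

A patient with true tachycardia above this ceiling would therefore be under-reported, so the minimum spacing is worth revisiting before relying on `hr_max`.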


## Step 10: Extract Blood Pressure from ABP

**Process:**
1. Load ABP (Arterial Blood Pressure) signal
2. Detect systolic peaks (high points in waveform)
3. Detect diastolic troughs (low points in waveform)
4. Calculate statistics

**Clinical significance:**
- **Systolic:** Peak pressure when heart contracts
- **Diastolic:** Minimum pressure when heart relaxes

Normal ranges: Systolic ~120 mmHg, Diastolic ~80 mmHg

In [10]:
# Get ABP signal (Arterial Blood Pressure)
abp = record.p_signal[:, 2]  # Third signal = ABP

# Find systolic peaks (high points)
systolic_peaks, _ = find_peaks(abp, distance=record.fs*0.5, height=50)
systolic_values = abp[systolic_peaks]

# Find diastolic troughs (low points)
# Invert signal to find valleys
diastolic_peaks, _ = find_peaks(-abp, distance=record.fs*0.5)
diastolic_values = abp[diastolic_peaks]

print(f"Blood Pressure statistics:")
print(f"  Systolic mean: {np.mean(systolic_values):.1f} mmHg")
print(f"  Systolic min: {np.min(systolic_values):.1f} mmHg")
print(f"  Systolic max: {np.max(systolic_values):.1f} mmHg")
print(f"  Diastolic mean: {np.mean(diastolic_values):.1f} mmHg")

Blood Pressure statistics:
  Systolic mean: 102.0 mmHg
  Systolic min: 50.4 mmHg
  Systolic max: 180.0 mmHg
  Diastolic mean: 58.5 mmHg
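
The trough detection relies on a simple identity: a local minimum of a signal is a local maximum of its negation. The sketch below shows this on a toy pressure trace with a tiny pure-Python peak finder. `local_maxima` is a hypothetical stand-in that is far simpler than `scipy.signal.find_peaks` (no `distance` or `height` handling), and the trace values are made up.

```python
def local_maxima(values):
    """Indices of samples strictly greater than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


# Toy ABP trace in mmHg covering two beats
abp = [80, 100, 120, 105, 90, 78, 95, 118, 100, 82, 76, 90]

systolic_idx = local_maxima(abp)
diastolic_idx = local_maxima([-v for v in abp])   # maxima of -abp are minima of abp

print([abp[i] for i in systolic_idx])    # [120, 118]
print([abp[i] for i in diastolic_idx])   # [78, 76]
```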


## Step 11: Calculate Trends (Early vs Late)

**Critical for deterioration detection!**

Compare the first half vs second half of the recording to detect changes over time:
- Is heart rate increasing or decreasing?
- Is blood pressure rising or falling?
- What is the magnitude of change?

**Clinical patterns:**
- ✅ **Stable:** Minimal changes (<5 bpm HR, <10 mmHg BP)
- ⚠️ **HR increasing + BP decreasing:** Possible compensatory shock/deterioration
- 📊 **Other changes:** May indicate clinical events requiring investigation

**Why this matters:**
Deterioration happens gradually over hours. Comparing early vs late reveals trends that single-time measurements miss.

In [11]:
# Split data into first half vs second half
midpoint_time = len(heart_rates) // 2
midpoint_bp = len(systolic_values) // 2

# Heart rate: early vs late
hr_early = heart_rates[:midpoint_time]
hr_late = heart_rates[midpoint_time:]

hr_early_mean = np.mean(hr_early)
hr_late_mean = np.mean(hr_late)
hr_change = hr_late_mean - hr_early_mean
hr_percent_change = (hr_change / hr_early_mean) * 100

# Blood pressure: early vs late
bp_early = systolic_values[:midpoint_bp]
bp_late = systolic_values[midpoint_bp:]

bp_early_mean = np.mean(bp_early)
bp_late_mean = np.mean(bp_late)
bp_change = bp_late_mean - bp_early_mean
bp_percent_change = (bp_change / bp_early_mean) * 100

print("TRENDS (First Half → Second Half):")
print(f"\nHeart Rate:")
print(f"  Early: {hr_early_mean:.1f} bpm")
print(f"  Late: {hr_late_mean:.1f} bpm")
print(f"  Change: {hr_change:+.1f} bpm ({hr_percent_change:+.1f}%)")

print(f"\nBlood Pressure:")
print(f"  Early: {bp_early_mean:.1f} mmHg")
print(f"  Late: {bp_late_mean:.1f} mmHg")
print(f"  Change: {bp_change:+.1f} mmHg ({bp_percent_change:+.1f}%)")

# Clinical interpretation
# Check stability first so that negligible drifts (e.g. +0.1 bpm, -0.1 mmHg)
# are not reported as possible deterioration.
if abs(hr_change) < 5 and abs(bp_change) < 10:
    print("\n✅ PATTERN: Stable (minimal changes)")
elif hr_change > 0 and bp_change < 0:
    print("\n⚠️ PATTERN: HR increasing + BP decreasing (possible deterioration)")
else:
    print("\n📊 PATTERN: Changes detected, further analysis needed")

TRENDS (First Half → Second Half):

Heart Rate:
  Early: 69.5 bpm
  Late: 69.4 bpm
  Change: -0.1 bpm (-0.1%)

Blood Pressure:
  Early: 101.8 mmHg
  Late: 102.3 mmHg
  Change: +0.4 mmHg (+0.4%)

✅ PATTERN: Stable (minimal changes)
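
The early-versus-late comparison applies to any per-beat series. A minimal sketch in plain Python follows; `half_split_trend` is a hypothetical helper and the input values are made up.

```python
def half_split_trend(values):
    """Compare the mean of the first half of a series with the mean of the second half.

    Returns (early_mean, late_mean, change, percent_change).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to split into halves")
    mid = len(values) // 2
    early, late = values[:mid], values[mid:]
    early_mean = sum(early) / len(early)
    late_mean = sum(late) / len(late)
    change = late_mean - early_mean
    return early_mean, late_mean, change, change / early_mean * 100


# Made-up heart rates rising from 70 to 84 bpm
early, late, change, pct = half_split_trend([70, 70, 70, 84, 84, 84])
print(f"{early:.1f} -> {late:.1f} bpm ({change:+.1f}, {pct:+.1f}%)")
# 70.0 -> 84.0 bpm (+14.0, +20.0%)
```

Note that the split is by beat count rather than by clock time, so if the heart rate changes a lot, the two halves cover unequal durations.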


## Step 12: Create Reusable Feature Extraction Function

Package all extraction logic into a single function that can process any segment.

**What this function does:**
1. Loads a segment's waveform data
2. Extracts heart rate from ECG signal
3. Extracts blood pressure from ABP signal
4. Calculates trends (early vs late comparison)
5. Computes the shock index (mean heart rate divided by mean systolic BP)
6. Returns a dictionary of 20 fields: 2 identifiers plus 18 numeric features

**Why we need this:**
We have 34 segments to process. This function lets us extract features from all of them systematically and consistently.

**Output:** One feature dictionary per segment (ready for ML training)

In [12]:
def extract_features(patient, segment_name):
    """Extract HR, BP, trend, and shock-index features from one segment.

    Raises ValueError if the ECG or ABP signal is missing or too few peaks are detected.
    """
    
    # Load record
    folder, patient_id = patient.split('/')
    record = wfdb.rdrecord(f"data/{folder}/{patient_id}/{segment_name}")
    
    # Find ECG signal (could be at different positions)
    ecg_idx = None
    for i, name in enumerate(record.sig_name):
        if 'II' in name or 'ECG' in name:
            ecg_idx = i
            break
    
    # Find ABP signal (could be at different positions)
    abp_idx = None
    for i, name in enumerate(record.sig_name):
        if 'ABP' in name or 'BP' in name:
            abp_idx = i
            break
    
    if ecg_idx is None or abp_idx is None:
        raise ValueError(f"Missing required signals. Available: {record.sig_name}")
    
    # Get signals
    ecg = record.p_signal[:, ecg_idx]
    abp = record.p_signal[:, abp_idx]
    
    # Extract heart rate
    peaks, _ = find_peaks(ecg, distance=record.fs*0.5, height=0.3)
    
    if len(peaks) < 10:  # Need at least 10 heartbeats
        raise ValueError(f"Too few heartbeats detected: {len(peaks)}")
    
    rr_intervals = np.diff(peaks) / record.fs
    heart_rates = 60 / rr_intervals
    
    # Extract blood pressure
    systolic_peaks, _ = find_peaks(abp, distance=record.fs*0.5, height=50)
    
    if len(systolic_peaks) < 10:  # Need at least 10 BP measurements
        raise ValueError(f"Too few BP peaks detected: {len(systolic_peaks)}")
    
    systolic_values = abp[systolic_peaks]
    
    # Calculate trends (early vs late)
    midpoint_hr = len(heart_rates) // 2
    midpoint_bp = len(systolic_values) // 2
    
    hr_early_mean = np.mean(heart_rates[:midpoint_hr])
    hr_late_mean = np.mean(heart_rates[midpoint_hr:])
    
    bp_early_mean = np.mean(systolic_values[:midpoint_bp])
    bp_late_mean = np.mean(systolic_values[midpoint_bp:])
    
    # Create feature dictionary
    features = {
        'patient_id': patient,
        'segment': segment_name,
        'duration_hours': record.sig_len / record.fs / 3600,
        
        # Heart rate features
        'hr_mean': np.mean(heart_rates),
        'hr_std': np.std(heart_rates),
        'hr_min': np.min(heart_rates),
        'hr_max': np.max(heart_rates),
        'hr_early_mean': hr_early_mean,
        'hr_late_mean': hr_late_mean,
        'hr_change': hr_late_mean - hr_early_mean,
        'hr_percent_change': ((hr_late_mean - hr_early_mean) / hr_early_mean) * 100,
        
        # Blood pressure features
        'bp_systolic_mean': np.mean(systolic_values),
        'bp_systolic_std': np.std(systolic_values),
        'bp_systolic_min': np.min(systolic_values),
        'bp_systolic_max': np.max(systolic_values),
        'bp_early_mean': bp_early_mean,
        'bp_late_mean': bp_late_mean,
        'bp_change': bp_late_mean - bp_early_mean,
        'bp_percent_change': ((bp_late_mean - bp_early_mean) / bp_early_mean) * 100,
        
        # Clinical metrics
        'shock_index': np.mean(heart_rates) / np.mean(systolic_values)
    }
    
    return features
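
The `shock_index` feature is the ratio of mean heart rate to mean systolic pressure. As a worked check, the sketch below uses the first segment's values as reported in the labeled results later in this notebook (mean HR 69.467815 bpm, mean systolic BP 102.030018 mmHg). The standalone `shock_index` function is a hypothetical helper for illustration.

```python
def shock_index(hr_mean, sbp_mean):
    """Shock index = heart rate (bpm) / systolic blood pressure (mmHg)."""
    if sbp_mean <= 0:
        raise ValueError("systolic pressure must be positive")
    return hr_mean / sbp_mean


print(round(shock_index(69.467815, 102.030018), 3))  # 0.681
```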

## Step 13: Extract Features from All Segments

Apply the feature extraction function to all 34 usable segments. This creates our ML-ready dataset.

**What happens:**
- Process each of the 34 segments
- Extract 20 features per segment
- Handle any errors gracefully (some segments might fail)
- Save results to CSV

**Result:** A dataset with 34 rows (segments) × 20 columns (features)

**Processing time:** ~2-5 minutes for 34 segments

In [13]:
# Extract features from all usable segments
print("Extracting features from all 34 segments (with improved error handling)...")

all_features = []

for i, seg_info in enumerate(all_usable_segments):
    try:
        features = extract_features(seg_info['patient'], seg_info['segment'])
        all_features.append(features)
        if (i + 1) % 5 == 0:
            print(f"✅ Processed {i + 1}/{len(all_usable_segments)}")
    except Exception as e:
        print(f"❌ {seg_info['patient']}/{seg_info['segment']}: {str(e)}")

print(f"\n✅ Complete! Successfully processed: {len(all_features)}/{len(all_usable_segments)}")

features_df = pd.DataFrame(all_features)
features_df.to_csv('extracted_features.csv', index=False)
print(f"💾 Saved {len(features_df)} segments to: extracted_features.csv")

Extracting features from all 34 segments (with improved error handling)...
✅ Processed 5/34
✅ Processed 10/34
✅ Processed 15/34
✅ Processed 20/34
✅ Processed 25/34
✅ Processed 30/34

✅ Complete! Successfully processed: 34/34
💾 Saved 34 segments to: extracted_features.csv


# Check results

In [14]:
# Load and examine the dataset
features_df = pd.read_csv('extracted_features.csv')

print(f"📊 Dataset Summary:")
print(f"   Segments: {len(features_df)}")
print(f"   Features: {len(features_df.columns)}")
print(f"\n📋 Feature names:")
for col in features_df.columns:
    print(f"   - {col}")

print(f"\n📈 Statistical Summary:")
print(features_df[['hr_mean', 'hr_change', 'bp_systolic_mean', 'bp_change', 'shock_index']].describe())

📊 Dataset Summary:
   Segments: 34
   Features: 20

📋 Feature names:
   - patient_id
   - segment
   - duration_hours
   - hr_mean
   - hr_std
   - hr_min
   - hr_max
   - hr_early_mean
   - hr_late_mean
   - hr_change
   - hr_percent_change
   - bp_systolic_mean
   - bp_systolic_std
   - bp_systolic_min
   - bp_systolic_max
   - bp_early_mean
   - bp_late_mean
   - bp_change
   - bp_percent_change
   - shock_index

📈 Statistical Summary:
          hr_mean  hr_change  bp_systolic_mean  bp_change  shock_index
count   34.000000  34.000000         34.000000  34.000000    34.000000
mean    80.226847   1.065668        124.807971   1.869751     0.650514
std     11.046956   8.882720         14.527857   8.878541     0.112361
min     62.421025 -19.645487        100.509744 -16.690726     0.468115
25%     72.691654  -2.515789        112.515052  -3.953715     0.579054
50%     78.623016  -0.136152        125.156093   1.947831     0.636076
75%     85.990876   5.831549        134.957482   7.

# Data Labeling

Now we need to label each segment: Did the patient deteriorate or stay stable?

## Step 14: Rule-Based Labeling

We'll use clinical patterns to create labels:

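**Deterioration indicators (any one of these labels the segment as deteriorating):**
- ⚠️ HR increasing (>10 bpm) + BP decreasing (>5 mmHg drop) (compensatory shock)
- ⚠️ HR very high (mean >100 bpm) and rising (>5 bpm)
- ⚠️ BP very low (mean systolic <100 mmHg) and falling (>5 mmHg drop)
- ⚠️ High shock index (>0.9)
- ⚠️ Large HR increase (>20 bpm)
- ⚠️ Large BP drop (>15 mmHg)

**Stable indicators (descriptive only; the code labels a segment stable whenever no deterioration rule fires):**
- ✅ Minimal changes in HR and BP
- ✅ Normal ranges maintained
- ✅ Low shock index (<0.7)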

In [15]:
# Create labels based on clinical patterns
def label_segment(row):
    """
    Label segment as deteriorating (1) or stable (0)
    Based on clinical deterioration patterns
    """
    
    # Pattern 1: HR increasing AND BP decreasing (classic deterioration)
    if row['hr_change'] > 10 and row['bp_change'] < -5:
        return 1
    
    # Pattern 2: Very high HR (tachycardia) and getting worse
    if row['hr_mean'] > 100 and row['hr_change'] > 5:
        return 1
    
    # Pattern 3: Low BP (hypotension) and getting worse
    if row['bp_systolic_mean'] < 100 and row['bp_change'] < -5:
        return 1
    
    # Pattern 4: High shock index (>0.9 = concerning)
    if row['shock_index'] > 0.9:
        return 1
    
    # Pattern 5: Large HR increase (>20 bpm)
    if row['hr_change'] > 20:
        return 1
    
    # Pattern 6: Large BP drop (>15 mmHg)
    if row['bp_change'] < -15:
        return 1
    
    # Otherwise: Stable
    return 0

# Apply labeling
features_df['deterioration'] = features_df.apply(label_segment, axis=1)

# Check distribution
print("📊 Label Distribution:")
print(f"   Stable (0): {(features_df['deterioration'] == 0).sum()} segments")
print(f"   Deteriorating (1): {(features_df['deterioration'] == 1).sum()} segments")
print(f"\n   Deterioration rate: {features_df['deterioration'].mean()*100:.1f}%")

# Save labeled dataset
features_df.to_csv('labeled_features.csv', index=False)
print(f"\n💾 Saved labeled dataset to: labeled_features.csv")

# Show examples of each class
print("\n✅ Example STABLE segments:")
print(features_df[features_df['deterioration']==0][['patient_id', 'hr_mean', 'hr_change', 'bp_systolic_mean', 'bp_change', 'shock_index']].head(3))

print("\n⚠️  Example DETERIORATING segments:")
print(features_df[features_df['deterioration']==1][['patient_id', 'hr_mean', 'hr_change', 'bp_systolic_mean', 'bp_change', 'shock_index']].head(3))

📊 Label Distribution:
   Stable (0): 30 segments
   Deteriorating (1): 4 segments

   Deterioration rate: 11.8%

💾 Saved labeled dataset to: labeled_features.csv

✅ Example STABLE segments:
   patient_id     hr_mean  hr_change  bp_systolic_mean  bp_change  shock_index
0  30/3000003   69.467815  -0.062783        102.030018   0.448434     0.680857
1  30/3000003   66.065046   5.424126        100.509744  13.713921     0.657300
5  30/3000063  103.762332 -11.996534        128.044793   9.685414     0.810360

⚠️  Example DETERIORATING segments:
   patient_id     hr_mean  hr_change  bp_systolic_mean  bp_change  shock_index
2  30/3000063   98.907667   6.456491        109.826543 -14.026278     0.900581
3  30/3000063   79.143347  35.405384        117.546987  11.274592     0.673291
4  30/3000063  112.330839  -2.551825        118.427341  -4.305803     0.948521
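
Since the project aims for explainable predictions, it may help to record which rule produced each label. The sketch below mirrors the thresholds in `label_segment` but also returns the name of the first rule that fired. `label_with_reason` is a hypothetical variant, not part of the notebook; the example rows are copied from the output above.

```python
def label_with_reason(row):
    """Return (label, reason) using the same thresholds and order as label_segment."""
    rules = [
        ("HR up >10 bpm and BP down >5 mmHg",
         row['hr_change'] > 10 and row['bp_change'] < -5),
        ("tachycardia (>100 bpm) and rising",
         row['hr_mean'] > 100 and row['hr_change'] > 5),
        ("hypotension (<100 mmHg) and falling",
         row['bp_systolic_mean'] < 100 and row['bp_change'] < -5),
        ("shock index >0.9", row['shock_index'] > 0.9),
        ("HR increase >20 bpm", row['hr_change'] > 20),
        ("BP drop >15 mmHg", row['bp_change'] < -15),
    ]
    for reason, fired in rules:
        if fired:
            return 1, reason
    return 0, "no rule fired"


row_0 = {'hr_mean': 69.467815, 'hr_change': -0.062783,
         'bp_systolic_mean': 102.030018, 'bp_change': 0.448434,
         'shock_index': 0.680857}
row_3 = {'hr_mean': 79.143347, 'hr_change': 35.405384,
         'bp_systolic_mean': 117.546987, 'bp_change': 11.274592,
         'shock_index': 0.673291}

print(label_with_reason(row_0))  # (0, 'no rule fired')
print(label_with_reason(row_3))  # (1, 'HR increase >20 bpm')
```

Like `label_segment`, this works on both plain dictionaries and DataFrame rows, because it only uses key lookup.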


# Finding More Deteriorating Patients

With only 4 deteriorating segments, our ML model can't learn effectively. Let's scan more patients to find more deterioration cases.

## Step 16: Scan Additional Patients for Deterioration

We'll process 50 more patients (80 in total), keeping every segment that shows a deterioration pattern plus stable segments until the list reaches 50 entries.

In [16]:
# Use next 50 patients from our downloaded list
additional_patients = downloaded[30:80]  # Patients 31-80

print(f"🔍 Scanning {len(additional_patients)} additional patients for deterioration...")
print("Looking specifically for segments with deterioration patterns...")
print("This may take 5-10 minutes...\n")

new_segments = []
deteriorating_found = 0

for i, patient in enumerate(additional_patients):
    folder, patient_id = patient.split('/')
    patient_path = f'data/{folder}/{patient_id}'
    
    try:
        files = os.listdir(patient_path)
        segment_files = [f.replace('.hea', '') for f in files 
                        if f.endswith('.hea') and '_' in f and 'layout' not in f]
        
        for seg_name in segment_files:
            try:
                # Quick check: load and verify
                record = wfdb.rdrecord(f'{patient_path}/{seg_name}')
                duration_hrs = record.sig_len / record.fs / 3600
                
                # Only process segments >= 4 hours with ECG + BP
                has_ecg = any('II' in s or 'ECG' in s for s in record.sig_name)
                has_bp = any('ABP' in s or 'BP' in s for s in record.sig_name)
                
                if duration_hrs >= 4 and has_ecg and has_bp:
                    # Extract features
                    features = extract_features(patient, seg_name)
                    
                    # Check if shows deterioration
                    label = label_segment(features)
                    features['deterioration'] = label
                    
                    if label == 1:  # Deteriorating!
                        new_segments.append(features)
                        deteriorating_found += 1
                        print(f"✅ Found deteriorating: {patient}/{seg_name}")
                    elif len(new_segments) < 50:  # Also keep some stable for balance
                        new_segments.append(features)
                        
            except Exception:  # skip segments that fail to load or extract
                pass
                
        if (i + 1) % 10 == 0:
            print(f"Scanned {i + 1}/{len(additional_patients)} patients... Found {deteriorating_found} deteriorating")
            
    except Exception:  # skip patients whose folder cannot be listed
        pass

print(f"\n✅ Scan complete!")
print(f"   New deteriorating segments found: {deteriorating_found}")
print(f"   Total new segments: {len(new_segments)}")

🔍 Scanning 50 additional patients for deterioration...
Looking specifically for segments with deterioration patterns...
This may take 5-10 minutes...

✅ Found deteriorating: 30/3000717/3000717_0059
✅ Found deteriorating: 30/3000717/3000717_0069
✅ Found deteriorating: 30/3000717/3000717_0077
✅ Found deteriorating: 30/3000717/3000717_0078
Scanned 10/50 patients... Found 4 deteriorating
✅ Found deteriorating: 30/3000860/3000860_0007
Scanned 20/50 patients... Found 5 deteriorating
✅ Found deteriorating: 30/3001099/3001099_0008
Scanned 30/50 patients... Found 6 deteriorating
✅ Found deteriorating: 30/3001203/3001203_0018
✅ Found deteriorating: 30/3001203/3001203_0044
Scanned 40/50 patients... Found 8 deteriorating
✅ Found deteriorating: 30/3001281/3001281_0003
Scanned 50/50 patients... Found 9 deteriorating

✅ Scan complete!
   New deteriorating segments found: 9
   Total new segments: 44


In [17]:
# Scan MORE patients, ONLY keep deteriorating segments
print(f"🔍 Scanning patients 80-200 for DETERIORATING segments only...")
print("Target: Find ~20-30 more deteriorating cases\n")

more_patients = downloaded[80:200]  # Next 120 patients
deteriorating_segments = []

for i, patient in enumerate(more_patients):
    folder, patient_id = patient.split('/')
    patient_path = f'data/{folder}/{patient_id}'
    
    try:
        files = os.listdir(patient_path)
        segment_files = [f.replace('.hea', '') for f in files 
                        if f.endswith('.hea') and '_' in f and 'layout' not in f]
        
        for seg_name in segment_files:
            try:
                record = wfdb.rdrecord(f'{patient_path}/{seg_name}')
                duration_hrs = record.sig_len / record.fs / 3600
                
                has_ecg = any('II' in s or 'ECG' in s for s in record.sig_name)
                has_bp = any('ABP' in s or 'BP' in s for s in record.sig_name)
                
                if duration_hrs >= 4 and has_ecg and has_bp:
                    features = extract_features(patient, seg_name)
                    label = label_segment(features)
                    
                    # ONLY keep deteriorating!
                    if label == 1:
                        features['deterioration'] = label
                        deteriorating_segments.append(features)
                        print(f"✅ Found #{len(deteriorating_segments)}: {patient}/{seg_name}")
                        
            except Exception:  # skip segments that fail to load or extract
                pass
                
        if (i + 1) % 20 == 0:
            print(f"Scanned {i + 1}/{len(more_patients)} patients... Found {len(deteriorating_segments)} deteriorating")
            
        # Stop if we found enough
        if len(deteriorating_segments) >= 30:
            print(f"\n🎯 Target reached! Found {len(deteriorating_segments)} deteriorating segments")
            break
            
    except Exception:  # skip patients whose folder cannot be listed
        pass

print(f"\n✅ Scan complete!")
print(f"   Total deteriorating segments found: {len(deteriorating_segments)}")

🔍 Scanning patients 80-200 for DETERIORATING segments only...
Target: Find ~20-30 more deteriorating cases

✅ Found #1: 30/3001570/3001570_0002
Scanned 20/120 patients... Found 1 deteriorating
✅ Found #2: 30/3001937/3001937_0002
✅ Found #3: 30/3001937/3001937_0009
✅ Found #4: 30/3002090/3002090_0008
Scanned 40/120 patients... Found 4 deteriorating
Scanned 60/120 patients... Found 4 deteriorating
✅ Found #5: 31/3100038/3100038_0117
✅ Found #6: 31/3100038/3100038_0157
Scanned 80/120 patients... Found 6 deteriorating
✅ Found #7: 31/3100198/3100198_0013
✅ Found #8: 31/3100198/3100198_0015
✅ Found #9: 31/3100237/3100237_0043
✅ Found #10: 31/3100240/3100240_0006
✅ Found #11: 31/3100305/3100305_0004
Scanned 100/120 patients... Found 11 deteriorating
✅ Found #12: 31/3100461/3100461_0010
✅ Found #13: 31/3100618/3100618_0030
Scanned 120/120 patients... Found 13 deteriorating

✅ Scan complete!
   Total deteriorating segments found: 13


In [18]:
# First, check what we have
print("📁 Available CSV files:")
import os
for f in os.listdir('.'):
    if f.endswith('.csv'):
        print(f"  - {f}")

# Load what exists
labeled = pd.read_csv('labeled_features.csv')
print(f"\n✅ Loaded labeled_features.csv: {len(labeled)} segments")

# Check if we have the additional data in memory
if 'new_segments' in locals() and len(new_segments) > 0:
    print(f"✅ new_segments in memory: {len(new_segments)}")
    all_data = pd.concat([labeled, pd.DataFrame(new_segments)], ignore_index=True)
    print(f"Combined: {len(all_data)} segments")
else:
    print("⚠️ Only using labeled_features.csv (no new_segments in memory)")
    all_data = labeled

if 'deteriorating_segments' in locals() and len(deteriorating_segments) > 0:
    print(f"✅ deteriorating_segments in memory: {len(deteriorating_segments)}")
    all_data = pd.concat([all_data, pd.DataFrame(deteriorating_segments)], ignore_index=True)
    print(f"Combined: {len(all_data)} segments")

# Save master dataset
all_data.to_csv('MASTER_DATASET.csv', index=False)
print(f"\n💾 Created MASTER_DATASET.csv: {len(all_data)} segments")
print(f"   Stable: {(all_data['deterioration']==0).sum()}")
print(f"   Deteriorating: {(all_data['deterioration']==1).sum()}")

📁 Available CSV files:
  - complete_features_all_vitals.csv
  - complete_segments_with_all_vitals.csv
  - extracted_features.csv
  - FINAL_COMPLETE_DATASET.csv
  - labeled_features.csv
  - MASTER_DATASET.csv
  - ML_CLAUDE_EXPLANATIONS.csv
  - usable_segments.csv

✅ Loaded labeled_features.csv: 34 segments
✅ new_segments in memory: 44
Combined: 78 segments
✅ deteriorating_segments in memory: 13
Combined: 91 segments

💾 Created MASTER_DATASET.csv: 91 segments
   Stable: 65
   Deteriorating: 26


In [19]:
# Load the master dataset
all_data = pd.read_csv('MASTER_DATASET.csv')

print(f"✅ Loaded: {len(all_data)} segments")
print(f"   Stable: {(all_data['deterioration']==0).sum()}")
print(f"   Deteriorating: {(all_data['deterioration']==1).sum()}")
print(f"   Deterioration rate: {(all_data['deterioration'].mean())*100:.1f}%")

# Check what signals are available
print(f"\n🔍 Checking signals in all {len(all_data)} segments...")

signal_counts = {}

for i, row in all_data.iterrows():
    patient = row['patient_id']
    segment = row['segment']
    
    try:
        folder, patient_id = patient.split('/')
        record = wfdb.rdrecord(f"data/{folder}/{patient_id}/{segment}")
        
        for sig in record.sig_name:
            signal_counts[sig] = signal_counts.get(sig, 0) + 1
            
    except Exception:
        # Skip segments whose record cannot be read
        pass
    
    if (i+1) % 20 == 0:
        print(f"  Checked {i+1}/{len(all_data)}...")

print(f"\n📊 Signal Availability Across Your {len(all_data)} Segments:")
print("="*60)
for sig, count in sorted(signal_counts.items(), key=lambda x: -x[1]):
    pct = (count/len(all_data))*100
    print(f"  {sig:15s}: {count:3d}/{len(all_data)} segments ({pct:.1f}%)")

print(f"\n🎯 Checking for Professor's Required Signals:")
print(f"  ✅ Heart Rate (ECG): {'✅ YES' if any('II' in s or 'ECG' in s for s in signal_counts.keys()) else '❌ NO'}")
print(f"  ✅ Blood Pressure (ABP): {'✅ YES' if 'ABP' in signal_counts else '❌ NO'}")
print(f"  ⚠️  Respiration (RESP): {'✅ YES' if 'RESP' in signal_counts else '❌ NO'}")
print(f"  ⚠️  Temperature (TEMP): {'✅ YES' if 'TEMP' in signal_counts else '❌ NO'}")
print(f"  ⚠️  Oxygen (SpO2/PLETH): {'✅ YES' if any(s in signal_counts for s in ['SpO2', 'PLETH']) else '❌ NO'}")

✅ Loaded: 91 segments
   Stable: 65
   Deteriorating: 26
   Deterioration rate: 28.6%

🔍 Checking signals in all 91 segments...
  Checked 20/91...
  Checked 40/91...
  Checked 60/91...
  Checked 80/91...

📊 Signal Availability Across Your 91 Segments:
  ABP            :  91/91 segments (100.0%)
  II             :  90/91 segments (98.9%)
  PLETH          :  47/91 segments (51.6%)
  RESP           :  42/91 segments (46.2%)
  V              :  35/91 segments (38.5%)
  AVR            :  15/91 segments (16.5%)
  III            :  11/91 segments (12.1%)
  CVP            :   9/91 segments (9.9%)
  PAP            :   7/91 segments (7.7%)
  AVF            :   3/91 segments (3.3%)
  ICP            :   1/91 segments (1.1%)

🎯 Checking for Professor's Required Signals:
  ✅ Heart Rate (ECG): ✅ YES
  ✅ Blood Pressure (ABP): ✅ YES
  ⚠️  Respiration (RESP): ✅ YES
  ⚠️  Temperature (TEMP): ❌ NO
  ⚠️  Oxygen (SpO2/PLETH): ✅ YES


In [20]:
# Search for segments with complete vital signs
print("🔍 Searching 2,273 patients for segments with COMPLETE vital signs...")
print("Required: ECG + ABP + RESP + PLETH/SpO2 + >= 4 hours\n")

complete_segments = []

for i, patient in enumerate(downloaded[:500]):  # Start with first 500 patients
    folder, patient_id = patient.split('/')
    patient_path = f'data/{folder}/{patient_id}'
    
    try:
        files = os.listdir(patient_path)
        segment_files = [f.replace('.hea', '') for f in files 
                        if f.endswith('.hea') and '_' in f and 'layout' not in f]
        
        for seg_name in segment_files:
            try:
                record = wfdb.rdrecord(f'{patient_path}/{seg_name}')
                duration_hrs = record.sig_len / record.fs / 3600
                
                # Check for ALL required signals
                has_ecg = any('II' in s or 'ECG' in s for s in record.sig_name)
                has_abp = any('ABP' in s or 'BP' in s for s in record.sig_name)
                has_resp = any('RESP' in s for s in record.sig_name)
                has_spo2 = any('PLETH' in s or 'SpO2' in s for s in record.sig_name)
                
                # Check duration
                if duration_hrs >= 4 and has_ecg and has_abp and has_resp and has_spo2:
                    complete_segments.append({
                        'patient': patient,
                        'segment': seg_name,
                        'duration': duration_hrs,
                        'signals': record.sig_name
                    })
                    print(f"✅ #{len(complete_segments)}: {patient}/{seg_name} ({duration_hrs:.1f}h)")
                    
                    # Stop if we found enough
                    if len(complete_segments) >= 100:
                        print(f"\n🎯 Target reached! Found {len(complete_segments)} complete segments")
                        break
                        
            except Exception:
                # Skip segments that cannot be read
                pass
                
        if len(complete_segments) >= 100:
            break
            
        if (i + 1) % 50 == 0:
            print(f"Scanned {i + 1} patients... Found {len(complete_segments)} complete segments")
            
    except Exception:
        # Skip patients whose folder is missing or unreadable
        pass

print(f"\n✅ Search complete!")
print(f"   Found {len(complete_segments)} segments with ALL required signals")

🔍 Searching 2,273 patients for segments with COMPLETE vital signs...
Required: ECG + ABP + RESP + PLETH/SpO2 + >= 4 hours

✅ #1: 30/3000393/3000393_0005 (4.8h)
✅ #2: 30/3000393/3000393_0008 (4.1h)
✅ #3: 30/3000393/3000393_0020 (5.2h)
✅ #4: 30/3000480/3000480_0018 (18.0h)
✅ #5: 30/3000714/3000714_0012 (8.3h)
✅ #6: 30/3000714/3000714_0035 (7.3h)
✅ #7: 30/3000781/3000781_0005 (6.9h)
✅ #8: 30/3000860/3000860_0007 (7.7h)
Scanned 50 patients... Found 8 complete segments
✅ #9: 30/3000989/3000989_0002 (28.0h)
✅ #10: 30/3000989/3000989_0004 (14.4h)
✅ #11: 30/3000989/3000989_0006 (16.2h)
✅ #12: 30/3000989/3000989_0012 (17.6h)
✅ #13: 30/3000989/3000989_0017 (7.2h)
✅ #14: 30/3000989/3000989_0019 (10.4h)
✅ #15: 30/3000989/3000989_0022 (13.3h)
✅ #16: 30/3000989/3000989_0025 (23.0h)
✅ #17: 30/3000989/3000989_0028 (11.2h)
✅ #18: 30/3001203/3001203_0015 (9.1h)
✅ #19: 30/3001203/3001203_0018 (10.4h)
✅ #20: 30/3001203/3001203_0044 (6.1h)
✅ #21: 30/3001203/3001

In [21]:
# Save complete segments list
complete_df = pd.DataFrame(complete_segments)
complete_df.to_csv('complete_segments_with_all_vitals.csv', index=False)

print(f"💾 Saved {len(complete_segments)} complete segments")
print(f"\n📊 Preview:")
print(complete_df.head(10))

💾 Saved 100 complete segments

📊 Preview:
      patient       segment   duration                              signals
0  30/3000393  3000393_0005   4.813056       [ABP, RESP, PLETH, II, V, AVR]
1  30/3000393  3000393_0008   4.051389       [ABP, RESP, PLETH, II, V, AVR]
2  30/3000393  3000393_0020   5.158333       [ABP, RESP, PLETH, II, V, AVR]
3  30/3000480  3000480_0018  18.029722       [ABP, RESP, V, III, II, PLETH]
4  30/3000714  3000714_0012   8.267222  [PLETH, RESP, ABP, II, V, AVR, CVP]
5  30/3000714  3000714_0035   7.296111  [PLETH, RESP, ABP, II, V, AVR, CVP]
6  30/3000781  3000781_0005   6.898056       [PLETH, RESP, II, V, AVR, ABP]
7  30/3000860  3000860_0007   7.743333       [RESP, PLETH, II, V, AVR, ABP]
8  30/3000989  3000989_0002  27.978056       [RESP, ABP, II, III, V, PLETH]
9  30/3000989  3000989_0004  14.375000       [RESP, ABP, II, III, V, PLETH]


# Complete Feature Extraction with ALL Vital Signs

Now we extract features from ECG, ABP, RESP, and PLETH to meet the professor's requirements.

## Step 18: Enhanced Feature Extraction Function

Extract features from all 4 vital sign categories:
1. Heart Rate (from ECG)
2. Blood Pressure (from ABP)
3. Respiration Rate (from RESP)
4. Oxygenation proxy (pulse waveform statistics from PLETH; the waveform itself is not an SpO2 percentage)
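
The heart rate, systolic pressure, and respiration features all rest on the same idea: detect peaks, measure the time between consecutive peaks, and convert that spacing to a per-minute rate. The sketch below shows the idea on a synthetic breathing trace. It uses a plain-Python peak picker (`simple_peaks`, a hypothetical helper) as a stand-in for `scipy.signal.find_peaks`, and the 125 Hz sampling rate and 0.25 Hz breathing frequency are assumptions chosen for illustration.

```python
import math

def simple_peaks(signal, min_distance, min_height):
    """Indices of local maxima above min_height, at least min_distance samples apart.

    A deliberately simple stand-in for scipy.signal.find_peaks (assumes a
    clean signal without plateaus); it is not equivalent to the SciPy routine.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= min_height:
            if not peaks or i - peaks[-1] >= min_distance:
                peaks.append(i)
    return peaks

def rates_per_minute(peak_indices, fs):
    """Convert the spacing between consecutive peaks (in samples) to events per minute."""
    intervals = [(b - a) / fs for a, b in zip(peak_indices, peak_indices[1:])]
    return [60.0 / dt for dt in intervals]

# Synthetic "breathing" trace: 0.25 Hz sine (15 breaths/min), 60 s at 125 Hz.
fs = 125
resp = [math.sin(2 * math.pi * 0.25 * n / fs) for n in range(fs * 60)]

peaks = simple_peaks(resp, min_distance=fs * 2, min_height=0.0)
rates = rates_per_minute(peaks, fs)
mean_rate = sum(rates) / len(rates)
print(f"{len(peaks)} peaks, mean rate {mean_rate:.1f} breaths/min")
```

On real ICU waveforms the minimum distance and height thresholds matter a great deal, which is one likely reason some segments yield too few peaks to compute a rate.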

In [22]:
def extract_complete_features(patient, segment_name):
    """Extract features from ALL vital signs including RESP and SpO2"""
    
    # Load record
    folder, patient_id = patient.split('/')
    record = wfdb.rdrecord(f"data/{folder}/{patient_id}/{segment_name}")
    
    # Find signal indices
    ecg_idx = None
    abp_idx = None
    resp_idx = None
    pleth_idx = None
    
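    # NOTE: names are matched by substring and the last match wins, so
    # 'II' also matches lead 'III' and 'BP' matches any name containing it.
    for i, name in enumerate(record.sig_name):
        if 'II' in name or 'ECG' in name:
            ecg_idx = i
        if 'ABP' in name or 'BP' in name:
            abp_idx = i
        if 'RESP' in name:
            resp_idx = i
        if 'PLETH' in name or 'SpO2' in name:
            pleth_idx = i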
    
    if ecg_idx is None or abp_idx is None or resp_idx is None or pleth_idx is None:
        raise ValueError(f"Missing required signals")
    
    # Get signals
    ecg = record.p_signal[:, ecg_idx]
    abp = record.p_signal[:, abp_idx]
    resp = record.p_signal[:, resp_idx]
    pleth = record.p_signal[:, pleth_idx]
    
    # === HEART RATE ===
    peaks_ecg, _ = find_peaks(ecg, distance=record.fs*0.5, height=0.3)
    if len(peaks_ecg) < 10:
        raise ValueError(f"Too few heartbeats")
    rr_intervals = np.diff(peaks_ecg) / record.fs
    heart_rates = 60 / rr_intervals
    
    # === BLOOD PRESSURE ===
    systolic_peaks, _ = find_peaks(abp, distance=record.fs*0.5, height=50)
    if len(systolic_peaks) < 10:
        raise ValueError(f"Too few BP peaks")
    systolic_values = abp[systolic_peaks]
    
    # === RESPIRATION RATE (FIXED!) ===
    resp_peaks, _ = find_peaks(resp, distance=record.fs*2, height=np.mean(resp))
    if len(resp_peaks) < 5:
        raise ValueError(f"Too few respiratory cycles")
    resp_intervals = np.diff(resp_peaks) / record.fs  # seconds between breaths
    resp_rates = 60 / resp_intervals  # breaths per minute
    
    # === PLETH WAVEFORM (pulse amplitude variability; not an SpO2 percentage) ===
    pleth_peaks, _ = find_peaks(pleth, distance=record.fs*0.5)
    if len(pleth_peaks) > 10:
        pleth_amplitudes = pleth[pleth_peaks]
        pleth_quality = np.std(pleth_amplitudes)
    else:
        pleth_quality = 0
    
    # Calculate trends
    midpoint_hr = len(heart_rates) // 2
    midpoint_bp = len(systolic_values) // 2
    midpoint_resp = len(resp_rates) // 2
    
    hr_early = np.mean(heart_rates[:midpoint_hr])
    hr_late = np.mean(heart_rates[midpoint_hr:])
    
    bp_early = np.mean(systolic_values[:midpoint_bp])
    bp_late = np.mean(systolic_values[midpoint_bp:])
    
    resp_early = np.mean(resp_rates[:midpoint_resp])
    resp_late = np.mean(resp_rates[midpoint_resp:])
    
    # Create features
    features = {
        'patient_id': patient,
        'segment': segment_name,
        'duration_hours': record.sig_len / record.fs / 3600,
        
        'hr_mean': np.mean(heart_rates),
        'hr_std': np.std(heart_rates),
        'hr_min': np.min(heart_rates),
        'hr_max': np.max(heart_rates),
        'hr_early_mean': hr_early,
        'hr_late_mean': hr_late,
        'hr_change': hr_late - hr_early,
        'hr_percent_change': ((hr_late - hr_early) / hr_early) * 100,
        
        'bp_systolic_mean': np.mean(systolic_values),
        'bp_systolic_std': np.std(systolic_values),
        'bp_systolic_min': np.min(systolic_values),
        'bp_systolic_max': np.max(systolic_values),
        'bp_early_mean': bp_early,
        'bp_late_mean': bp_late,
        'bp_change': bp_late - bp_early,
        'bp_percent_change': ((bp_late - bp_early) / bp_early) * 100,
        
        'resp_rate_mean': np.mean(resp_rates),
        'resp_rate_std': np.std(resp_rates),
        'resp_rate_min': np.min(resp_rates),
        'resp_rate_max': np.max(resp_rates),
        'resp_early_mean': resp_early,
        'resp_late_mean': resp_late,
        'resp_change': resp_late - resp_early,
        'resp_percent_change': ((resp_late - resp_early) / resp_early) * 100,
        
        'pleth_quality': pleth_quality,
        'pleth_mean': np.mean(pleth),
        'pleth_std': np.std(pleth),
        
        'shock_index': np.mean(heart_rates) / np.mean(systolic_values)
    }
    
    return features

print("✅ Fixed function defined!")

✅ Fixed function defined!
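
One detail of `extract_complete_features` is worth a closer look: channel names are matched by substring and the last hit is kept, so a record that lists `II` before `III` ends up using lead `III` for heart rate. A minimal sketch of exact-name matching follows. The helper `find_signal` and the accepted-name tuple are hypothetical and not part of the notebook or of `wfdb`.

```python
# Hypothetical helper: exact (case-insensitive) matching of channel names.
# The accepted-name tuple is an assumption for illustration.
ECG_NAMES = ("II", "ECG")

def find_signal(sig_names, accepted):
    """Return the index of the first channel whose name is in `accepted`, else None."""
    wanted = {name.upper() for name in accepted}
    for idx, name in enumerate(sig_names):
        if name.strip().upper() in wanted:
            return idx
    return None

# Channel order copied from the preview of segment 3000989_0002.
names = ["RESP", "ABP", "II", "III", "V", "PLETH"]
ecg_idx = find_signal(names, ECG_NAMES)

# Substring matching where the last match wins, as in the function above.
substring_idx = None
for i, name in enumerate(names):
    if "II" in name:
        substring_idx = i

print(ecg_idx, substring_idx)  # 2 (lead II) versus 3 (lead III)
```

Switching the notebook to exact matching could change which lead is analysed for some segments, so extracted values may differ slightly from the outputs recorded here.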


## Step 19: Test Feature Extraction Function

Before processing all 100 segments, we test the enhanced feature extraction function on ONE patient to verify it works correctly.

**What this code does:**
1. Loads the list of 100 complete segments (with all vital signs)
2. Selects the first patient as a test case
3. Runs the `extract_complete_features()` function
4. Displays sample features to verify output

**What we're checking:**
- ✅ All 31 features are extracted correctly
- ✅ Heart Rate, Blood Pressure, Respiration, and PLETH values are reasonable
- ✅ No errors occur
- ✅ Feature values make clinical sense

**Expected output:**
- Feature count: 31 features
- Sample values showing HR (~73 bpm), BP (~153 mmHg), Respiration (~21 br/min), PLETH quality (~0.33), Shock Index (~0.48)

**Why this is important:**
Testing on one patient before processing all 100 saves time if there are bugs. If this works, we know the function is ready for batch processing.
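
The "values are reasonable" check can also be automated so that it runs on every segment, not only the test case. The sketch below flags features that are missing or fall outside wide plausibility bounds. The function name `out_of_range` and the bounds themselves are assumptions chosen for illustration, not clinical reference ranges.

```python
# Hypothetical plausibility bounds; these ranges are assumptions for a
# sanity check, not clinical reference values.
PLAUSIBLE = {
    "hr_mean": (20, 250),            # beats per minute
    "bp_systolic_mean": (40, 300),   # mmHg
    "resp_rate_mean": (2, 60),       # breaths per minute
    "shock_index": (0.1, 3.0),       # HR / systolic BP
}

def out_of_range(features, bounds=PLAUSIBLE):
    """Return {name: value} for features that are missing or outside their bounds."""
    problems = {}
    for name, (low, high) in bounds.items():
        value = features.get(name)
        # NaN fails both comparisons, so it is flagged as well.
        if value is None or not (low <= value <= high):
            problems[name] = value
    return problems

# Approximate values from the expected output listed above.
sample = {"hr_mean": 73, "bp_systolic_mean": 153,
          "resp_rate_mean": 21, "shock_index": 0.48}
print(out_of_range(sample))                      # {}
print(out_of_range({**sample, "hr_mean": 400}))  # {'hr_mean': 400}
```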

In [23]:
# Test enhanced feature extraction on first complete segment
print("🧪 Testing enhanced feature extraction with ALL vitals...")

# Load complete segments
complete_df = pd.read_csv('complete_segments_with_all_vitals.csv')
print(f"✅ Loaded {len(complete_df)} complete segments\n")

# Test on first segment
test_seg = complete_df.iloc[0]
print(f"Testing: {test_seg['patient']}/{test_seg['segment']}")

try:
    test_features = extract_complete_features(test_seg['patient'], test_seg['segment'])
    
    print(f"\n✅ SUCCESS! Extracted {len(test_features)} features")
    print(f"\n📊 Sample features:")
    print(f"  Heart Rate: {test_features['hr_mean']:.1f} bpm (change: {test_features['hr_change']:+.1f})")
    print(f"  Blood Pressure: {test_features['bp_systolic_mean']:.1f} mmHg (change: {test_features['bp_change']:+.1f})")
    print(f"  Respiration: {test_features['resp_rate_mean']:.1f} breaths/min (change: {test_features['resp_change']:+.1f})")
    print(f"  PLETH Quality: {test_features['pleth_quality']:.2f}")
    print(f"  Shock Index: {test_features['shock_index']:.2f}")
    
except Exception as e:
    print(f"❌ ERROR: {e}")
    print("\nLet's debug this!")

🧪 Testing enhanced feature extraction with ALL vitals...
✅ Loaded 100 complete segments

Testing: 30/3000393/3000393_0005

✅ SUCCESS! Extracted 31 features

📊 Sample features:
  Heart Rate: 73.4 bpm (change: +10.6)
  Blood Pressure: 153.3 mmHg (change: +2.2)
  Respiration: 20.8 breaths/min (change: -2.4)
  PLETH Quality: 0.33
  Shock Index: 0.48


## Step 20: Extract Features from All 100 Complete Segments

Now we process all 100 segments that have complete vital signs (ECG, ABP, RESP, PLETH).

**What this code does:**
1. Loops through all 100 complete segments
2. For each segment, calls `extract_complete_features()` to extract 31 features
3. Handles errors gracefully (some segments may have poor signal quality)
4. Shows progress every 10 segments
5. Converts results to DataFrame and saves to CSV

**Processing time:** 5-10 minutes

**Expected results:**
- Successfully extract features from ~90-95 segments
- 5-10 segments may fail due to:
  - Too few respiratory cycles detected (most common error)
  - Too few heartbeats or BP peaks detected
  - Poor signal quality or sensor disconnections

**Output file:** `complete_features_all_vitals.csv`
- Contains: 31 features × ~91 segments
- Includes: HR, BP, Respiration, PLETH/SpO2, duration, and clinical metrics (shock index)

**Why some segments fail:**
ICU waveform data is noisy. Some recordings have gaps, artifacts, or sensor disconnections. This is normal and expected in real-world medical data. We still get 91 high-quality segments, which is excellent for our analysis.

**What happens next:**
After this completes, we'll label each segment as "stable" or "deteriorating" based on clinical patterns in the vital signs.

In [24]:
print(f"🚀 Extracting features from all {len(complete_df)} complete segments...")
print("This will take 5-10 minutes...\n")

all_complete_features = []
errors = []

for i, row in complete_df.iterrows():
    try:
        features = extract_complete_features(row['patient'], row['segment'])
        all_complete_features.append(features)
        
        if (i + 1) % 10 == 0:
            print(f"✅ Processed {i + 1}/{len(complete_df)} segments...")
            
    except Exception as e:
        errors.append({'patient': row['patient'], 'segment': row['segment'], 'error': str(e)})
        print(f"❌ Error: {row['patient']}/{row['segment']}: {str(e)}")

print(f"\n🎉 COMPLETE!")
print(f"   Successfully extracted: {len(all_complete_features)}/{len(complete_df)} segments")
if errors:
    print(f"   Errors: {len(errors)} segments failed")

# Convert to DataFrame
complete_features_df = pd.DataFrame(all_complete_features)

# Save
complete_features_df.to_csv('complete_features_all_vitals.csv', index=False)
print(f"\n💾 Saved to: complete_features_all_vitals.csv")
print(f"   Dataset: {len(complete_features_df)} segments × {len(complete_features_df.columns)} features")

print(f"\n📊 Preview:")
print(complete_features_df.head())

🚀 Extracting features from all 100 complete segments...
This will take 5-10 minutes...

✅ Processed 10/100 segments...
❌ Error: 30/3000989/3000989_0006: Too few respiratory cycles
✅ Processed 20/100 segments...
✅ Processed 30/100 segments...
✅ Processed 40/100 segments...
❌ Error: 31/3101412/3101412_0001: Too few respiratory cycles
❌ Error: 31/3101412/3101412_0003: Too few respiratory cycles
✅ Processed 50/100 segments...
✅ Processed 60/100 segments...
❌ Error: 31/3102651/3102651_0015: Too few BP peaks
❌ Error: 31/3102779/3102779_0012: Too few respiratory cycles
❌ Error: 31/3102912/3102912_0016: Too few respiratory cycles
✅ Processed 70/100 segments...
❌ Error: 31/3103105/3103105_0002: Too few respiratory cycles
❌ Error: 31/3103807/3103807_0014: Too few respiratory cycles
❌ Error: 31/3103807/3103807_0021: Too few respiratory cycles
✅ Processed 100/100 segments...

🎉 COMPLETE!
   Successfully extracted: 91/100 segments
   Errors: 9 segments failed
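
Because the loop stores each failure in `errors`, the failure reasons can be tallied directly instead of read off the log. The sketch below rebuilds that list from the nine error lines printed above and counts the reasons with `collections.Counter`. The variable name `reason_counts` is an illustrative choice.

```python
from collections import Counter

# Same shape as the `errors` list built in the loop; entries copied from the log above.
errors = [
    {"patient": "30/3000989", "segment": "3000989_0006", "error": "Too few respiratory cycles"},
    {"patient": "31/3101412", "segment": "3101412_0001", "error": "Too few respiratory cycles"},
    {"patient": "31/3101412", "segment": "3101412_0003", "error": "Too few respiratory cycles"},
    {"patient": "31/3102651", "segment": "3102651_0015", "error": "Too few BP peaks"},
    {"patient": "31/3102779", "segment": "3102779_0012", "error": "Too few respiratory cycles"},
    {"patient": "31/3102912", "segment": "3102912_0016", "error": "Too few respiratory cycles"},
    {"patient": "31/3103105", "segment": "3103105_0002", "error": "Too few respiratory cycles"},
    {"patient": "31/3103807", "segment": "3103807_0014", "error": "Too few respiratory cycles"},
    {"patient": "31/3103807", "segment": "3103807_0021", "error": "Too few respiratory cycles"},
]

reason_counts = Counter(entry["error"] for entry in errors)
for reason, count in reason_counts.most_common():
    print(f"{count:2d}  {reason}")
```

Eight of the nine failures come from the respiration step, which suggests its peak-detection thresholds are the first place to look if more segments need to be recovered.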

## Step 21: Label Segments as Stable or Deteriorating

Now we create labels for our ML model by analyzing vital sign patterns to identify which segments show deterioration.

**What is labeling?**
We need to tell our ML model which segments represent "deteriorating" patients and which represent "stable" ones. No outcome labels are used here; instead, we define rule-based patterns on the extracted vital-sign features, modeled on common ICU early warning criteria.

**Deterioration Patterns (label = 1):**
1. **Classic compensatory shock:** HR rising by >10 bpm while systolic BP falls by >5 mmHg
2. **Tachycardia worsening:** Mean HR >100 bpm and rising by >5 bpm
3. **Hypotension worsening:** Mean systolic BP <100 mmHg and falling by >5 mmHg
4. **High shock index:** HR / systolic BP ratio >0.9, indicating hemodynamic instability
5. **Large HR increase:** >20 bpm rise from the first half of the segment to the second
6. **Large BP drop:** >15 mmHg decrease from the first half of the segment to the second
7. **Respiratory distress (NEW!):** Tachypnea (mean >25 br/min) and rising by >3 br/min
8. **Bradypnea (NEW!):** Dangerously slow breathing (mean <8 br/min)

**Stable Pattern (label = 0):**
- No concerning patterns detected
- Vital signs remain within normal ranges

**What this code does:**
1. Loads the extracted features from CSV
2. Defines the `label_complete_segment()` function with 8 deterioration patterns
3. Applies labeling to all 91 segments
4. Calculates deterioration rate
5. Saves labeled dataset to `FINAL_COMPLETE_DATASET.csv`
6. Shows example segments from each category

**Expected results:**
- **Stable (0):** ~70 segments (77%)
- **Deteriorating (1):** ~21 segments (23%)
- **Deterioration rate:** ~23% (clinically realistic for ICU populations)

**Why 23% is reasonable:**
We assume that roughly 20-30% of ICU patients show signs of deterioration, so a 23% positive rate falls in a plausible range. Keep in mind that the rate follows directly from the thresholds above: tightening or loosening any rule changes it.

**Output file:** `FINAL_COMPLETE_DATASET.csv`
- All 31 features + deterioration label
- Ready for ML training
- 91 rows × 32 columns
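
Since the project aims for explainable predictions, it can help to record which pattern fired, not only the 0/1 label. The sketch below expresses the eight rules as a table, using the thresholds from the `label_complete_segment` cell in this step. The names `RULES` and `deterioration_reasons` are hypothetical, and the respiration values in the example row are made up.

```python
# Sketch: the labeling thresholds as (name, rule) pairs, so the label
# can be reported together with the patterns that triggered it.
RULES = [
    ("HR up and BP down", lambda r: r["hr_change"] > 10 and r["bp_change"] < -5),
    ("Tachycardia worsening", lambda r: r["hr_mean"] > 100 and r["hr_change"] > 5),
    ("Hypotension worsening", lambda r: r["bp_systolic_mean"] < 100 and r["bp_change"] < -5),
    ("High shock index", lambda r: r["shock_index"] > 0.9),
    ("Large HR increase", lambda r: r["hr_change"] > 20),
    ("Large BP drop", lambda r: r["bp_change"] < -15),
    ("Tachypnea worsening", lambda r: r["resp_rate_mean"] > 25 and r["resp_change"] > 3),
    ("Bradypnea", lambda r: r["resp_rate_mean"] < 8),
]

def deterioration_reasons(row):
    """Return the names of all patterns that fire for one feature row."""
    return [name for name, rule in RULES if rule(row)]

# Illustrative row: HR and BP values are rounded from one labeled segment in
# this notebook; the respiration values are made up for the example.
row = {
    "hr_mean": 82.8, "hr_change": 12.9,
    "bp_systolic_mean": 119.9, "bp_change": -10.6,
    "resp_rate_mean": 18.0, "resp_change": 0.0,
    "shock_index": 82.8 / 119.9,
}

reasons = deterioration_reasons(row)
label = int(bool(reasons))
print(label, reasons)  # 1 ['HR up and BP down']
```

A row can match several patterns at once. Keeping the full list gives the later LLM explanation step something concrete to cite.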

In [25]:
# Load the complete features
complete_features_df = pd.read_csv('complete_features_all_vitals.csv')

print(f"📊 Dataset: {len(complete_features_df)} segments with {len(complete_features_df.columns)} features")

# Enhanced labeling function with respiratory rate
def label_complete_segment(row):
    """
    Label segment with ALL vital signs
    Deterioration indicators:
    - HR increasing + BP decreasing
    - High HR + rising
    - Low BP + falling
    - High shock index
    - Abnormal respiration patterns
    """
    
    # Pattern 1: Classic deterioration (HR up, BP down)
    if row['hr_change'] > 10 and row['bp_change'] < -5:
        return 1
    
    # Pattern 2: Tachycardia worsening
    if row['hr_mean'] > 100 and row['hr_change'] > 5:
        return 1
    
    # Pattern 3: Hypotension worsening
    if row['bp_systolic_mean'] < 100 and row['bp_change'] < -5:
        return 1
    
    # Pattern 4: High shock index
    if row['shock_index'] > 0.9:
        return 1
    
    # Pattern 5: Large HR increase
    if row['hr_change'] > 20:
        return 1
    
    # Pattern 6: Large BP drop
    if row['bp_change'] < -15:
        return 1
    
    # Pattern 7 (NEW!): Respiratory distress (tachypnea worsening)
    if row['resp_rate_mean'] > 25 and row['resp_change'] > 3:
        return 1
    
    # Pattern 8 (NEW!): Bradypnea (dangerously slow breathing)
    if row['resp_rate_mean'] < 8:
        return 1
    
    # Otherwise: Stable
    return 0

# Apply labeling
complete_features_df['deterioration'] = complete_features_df.apply(label_complete_segment, axis=1)

# Check distribution
print(f"\n📊 Label Distribution:")
print(f"   Stable (0): {(complete_features_df['deterioration']==0).sum()} segments")
print(f"   Deteriorating (1): {(complete_features_df['deterioration']==1).sum()} segments")
print(f"   Deterioration rate: {complete_features_df['deterioration'].mean()*100:.1f}%")

# Save labeled dataset
complete_features_df.to_csv('FINAL_COMPLETE_DATASET.csv', index=False)
print(f"\n💾 Saved to: FINAL_COMPLETE_DATASET.csv")

# Show examples
print(f"\n✅ Example STABLE segments:")
print(complete_features_df[complete_features_df['deterioration']==0][
    ['patient_id', 'hr_mean', 'hr_change', 'bp_systolic_mean', 'bp_change', 'resp_rate_mean', 'resp_change']
].head(3))

print(f"\n⚠️  Example DETERIORATING segments:")
print(complete_features_df[complete_features_df['deterioration']==1][
    ['patient_id', 'hr_mean', 'hr_change', 'bp_systolic_mean', 'bp_change', 'resp_rate_mean', 'resp_change']
].head(3))

📊 Dataset: 91 segments with 31 features

📊 Label Distribution:
   Stable (0): 70 segments
   Deteriorating (1): 21 segments
   Deterioration rate: 23.1%

💾 Saved to: FINAL_COMPLETE_DATASET.csv

✅ Example STABLE segments:
   patient_id    hr_mean  hr_change  bp_systolic_mean  bp_change  \
0  30/3000393  73.357148  10.594376        153.341455   2.170418   
2  30/3000393  70.761289   6.802265        133.634807  17.892585   
3  30/3000480  87.323933  -8.663530        111.273350   3.352389   

   resp_rate_mean  resp_change  
0       20.776781    -2.350273  
2       17.879977     0.110679  
3       15.331300    -1.048867  

⚠️  Example DETERIORATING segments:
    patient_id    hr_mean  hr_change  bp_systolic_mean  bp_change  \
1   30/3000393  74.753314   0.166996        142.456722 -16.690726   
7   30/3000860  82.754216  12.898548        119.935149 -10.600357   
17  30/3001203  95.974713   5.298301        102.974386  -5.038004   

    resp_rate_mean  resp_change  
1        19

## Step 22: Train Machine Learning Model

Now we train a Random Forest classifier to predict patient deterioration based on vital sign features.

**What is Random Forest?**
An ensemble machine learning algorithm that builds many decision trees and combines their predictions. It is fairly robust to noisy features, can compensate for class imbalance through class weighting, and provides feature importance rankings.

**Model Configuration:**
- **Algorithm:** Random Forest Classifier
- **Trees:** 100 decision trees
- **Max depth:** 8 (prevents overfitting)
- **Class weight:** Balanced (handles our 77% stable / 23% deteriorating imbalance)
- **Random state:** 42 (for reproducibility)

**What this code does:**
1. Loads the labeled dataset (91 segments, 31 feature columns + label)
2. Separates the model inputs (X: the 29 numeric columns left after dropping `patient_id` and `segment`) from labels (y)
3. Splits data into train (80%) and test (20%) sets
4. Trains Random Forest model on training data
5. Makes predictions on test set
6. Evaluates performance with multiple metrics
7. Identifies most important features
8. Saves trained model for later use

**Data Split:**
- **Train set:** 72 segments (55 stable, 17 deteriorating)
- **Test set:** 19 segments (15 stable, 4 deteriorating)
- Uses stratified split to maintain class balance in both sets

**Performance Metrics:**
- **Accuracy:** Overall correctness (% of correct predictions)
- **Precision:** Of patients flagged as deteriorating, how many actually are? (avoids false alarms)
- **Recall:** Of actual deteriorating patients, how many did we catch? (sensitivity)
- **Confusion Matrix:** Shows true positives, false positives, true negatives, false negatives

**Expected Results:**
- Accuracy: ~85-90%
- Precision: ~80-100% (low false alarm rate)
- Recall: ~50-75% (catches roughly half to three-quarters of deteriorating patients)
- Trade-off: Conservative model that avoids false alarms

**Feature Importance:**
The model ranks which features are most predictive. Expected top features:
1. Blood pressure changes (BP dropping = strong deterioration signal)
2. Heart rate trends
3. Shock index
4. Respiratory changes

**Output file:** `final_model_all_vitals.pkl`
- Saved trained model
- Can be loaded later to make predictions on new patients
- Contains all learned patterns from training data

**Clinical interpretation:**
A high-precision model is often preferred in healthcare to limit "alarm fatigue" from false alarms. The cost is lower recall: some deteriorating patients are missed, so a model like this should supplement routine monitoring rather than replace it.
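
To make the metric definitions concrete, the arithmetic can be reproduced by hand from the confusion matrix this notebook reports for its 19-segment test set (15 true negatives, 0 false positives, 2 false negatives, 2 true positives). This is a worked example of the formulas only, not a replacement for the scikit-learn calls in the cell below.

```python
# Confusion matrix reported for the 19-segment test set in this notebook:
# rows are actual (stable, deteriorating), columns are predicted.
tn, fp = 15, 0
fn, tp = 2, 2

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 17 / 19
precision = tp / (tp + fp)                   # 2 / 2
recall = tp / (tp + fn)                      # 2 / 4
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
# accuracy=0.895 precision=1.000 recall=0.500 f1=0.667
```

With only four deteriorating segments in the test set, each missed or caught case moves recall by 25 percentage points, so these figures carry wide uncertainty.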

In [26]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix, classification_report

# Load final dataset
final_data = pd.read_csv('FINAL_COMPLETE_DATASET.csv')

print(f"📊 Training ML Model with Complete Vital Signs")
print(f"   Total segments: {len(final_data)}")
print(f"   Stable: {(final_data['deterioration']==0).sum()}")
print(f"   Deteriorating: {(final_data['deterioration']==1).sum()}")

# Select feature columns (exclude patient_id, segment, deterioration)
feature_columns = [col for col in final_data.columns 
                   if col not in ['patient_id', 'segment', 'deterioration']]

print(f"\n📋 Using {len(feature_columns)} features:")
print(f"   Heart Rate: 8 features")
print(f"   Blood Pressure: 8 features")
print(f"   Respiration: 8 features")
print(f"   PLETH/SpO2: 3 features")
print(f"   Clinical: 1 feature (shock index)")
print(f"   Duration: 1 feature")

X = final_data[feature_columns]
y = final_data['deterioration']

# Split (80/20)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"\n📚 Train set: {len(X_train)} (Stable: {(y_train==0).sum()}, Deteriorating: {(y_train==1).sum()})")
print(f"🧪 Test set: {len(X_test)} (Stable: {(y_test==0).sum()}, Deteriorating: {(y_test==1).sum()})")

# Train Random Forest
print(f"\n🤖 Training Random Forest...")
model = RandomForestClassifier(
    n_estimators=100,
    max_depth=8,
    class_weight='balanced',
    random_state=42
)

model.fit(X_train, y_train)
print(f"✅ Model trained!")

# Predictions
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

# Evaluation
print(f"\n📈 MODEL PERFORMANCE:")
print(f"   Accuracy: {accuracy_score(y_test, y_pred):.1%}")
print(f"   Precision: {precision_score(y_test, y_pred):.1%}")
print(f"   Recall: {recall_score(y_test, y_pred):.1%}")

print(f"\n📊 Confusion Matrix:")
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(f"\nTrue Negatives: {cm[0,0]}, False Positives: {cm[0,1]}")
print(f"False Negatives: {cm[1,0]}, True Positives: {cm[1,1]}")

print(f"\n📋 Detailed Classification Report:")
print(classification_report(y_test, y_pred, target_names=['Stable', 'Deteriorating']))

# Feature importance
feature_importance = pd.DataFrame({
    'feature': feature_columns,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)

print(f"\n🔍 Top 10 Most Important Features:")
print(feature_importance.head(10).to_string(index=False))

# Save model
import pickle
with open('final_model_all_vitals.pkl', 'wb') as f:
    pickle.dump(model, f)
    
print(f"\n💾 Model saved to: final_model_all_vitals.pkl")

📊 Training ML Model with Complete Vital Signs
   Total segments: 91
   Stable: 70
   Deteriorating: 21

📋 Using 29 features:
   Heart Rate: 8 features
   Blood Pressure: 8 features
   Respiration: 8 features
   PLETH/SpO2: 3 features
   Clinical: 1 feature (shock index)
   Duration: 1 feature

📚 Train set: 72 (Stable: 55, Deteriorating: 17)
🧪 Test set: 19 (Stable: 15, Deteriorating: 4)

🤖 Training Random Forest...
✅ Model trained!

📈 MODEL PERFORMANCE:
   Accuracy: 89.5%
   Precision: 100.0%
   Recall: 50.0%

📊 Confusion Matrix:
[[15  0]
 [ 2  2]]

True Negatives: 15, False Positives: 0
False Negatives: 2, True Positives: 2

📋 Detailed Classification Report:
               precision    recall  f1-score   support

       Stable       0.88      1.00      0.94        15
Deteriorating       1.00      0.50      0.67         4

     accuracy                           0.89        19
    macro avg       0.94      0.75      0.80        19
 weighted avg       0.91      

## Step 23: Test Complete System - ML Prediction on Sample Patient

Before integrating the LLM, we test the trained ML model on one patient to verify it works correctly and see what data we'll send to the LLM.

**What this code does:**
1. Selects the first patient from our dataset as a test case
2. Prepares their features for the ML model
3. Gets ML prediction (stable vs deteriorating) and probability
4. Displays comprehensive vital signs analysis
5. Shows what information will be sent to the LLM for explanation

**ML Model Output:**
- **Classification:** STABLE or DETERIORATING (binary prediction)
- **Probability:** Risk percentage (0-100%)
- Higher probability = higher confidence in deterioration prediction

**Vital Signs Display:**
Shows the complete picture for clinical interpretation:
- **Heart Rate:** Early vs Late values, absolute change, percent change
- **Blood Pressure:** Early vs Late values, absolute change, percent change  
- **Respiration Rate:** Early vs Late values, absolute change, percent change
- **Shock Index:** Heart rate divided by systolic blood pressure (HR/SBP), with risk level indicator
  - Normal: <0.7 ✓
  - Elevated: 0.7-0.9 ⚡
  - High: >0.9 ⚠️ (concerning!)
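
The shock index bands above can be sketched as two small helpers. This assumes the index is computed from mean heart rate and mean systolic pressure; the function names are illustrative, not the notebook's own:

```python
def shock_index(heart_rate_bpm, systolic_bp_mmhg):
    """Shock index = heart rate / systolic blood pressure."""
    if systolic_bp_mmhg <= 0:
        raise ValueError("systolic blood pressure must be positive")
    return heart_rate_bpm / systolic_bp_mmhg

def shock_index_level(si):
    """Bucket a shock index using the thresholds listed above."""
    if si > 0.9:
        return "High"
    if si < 0.7:
        return "Normal"
    return "Elevated"

# Average HR (73.4 bpm) and systolic BP (153.3 mmHg) of the test segment used in this notebook
si = shock_index(73.4, 153.3)
print(f"{si:.2f} -> {shock_index_level(si)}")
```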

**Why we test on one patient first:**
- Verify ML model makes reasonable predictions
- See actual vital sign values and trends
- Understand what context the LLM will receive
- Debug before processing all 91 segments

**What happens next:**
This vital signs summary will be formatted into a clinical prompt and sent to the Claude API, which will generate a plain-English explanation of the patient's condition and deterioration risk.

**Expected output:**
- Patient identification
- ML risk assessment  
- Detailed vital signs breakdown
- Confirmation that system is ready for LLM integration

In [27]:
# Test the complete system: ML + LLM explanation

# Select a test case
test_idx = 0
test_patient = final_data.iloc[test_idx]
test_features_input = test_patient[feature_columns].values.reshape(1, -1)

# ML Prediction
prediction = model.predict(test_features_input)[0]
probability = model.predict_proba(test_features_input)[0, 1]

print("🔬 COMPLETE SYSTEM TEST: ML + LLM")
print("="*70)
print(f"\n👤 Patient: {test_patient['patient_id']}")
print(f"📋 Segment: {test_patient['segment']}")
print(f"⏱️  Duration: {test_patient['duration_hours']:.1f} hours")

print(f"\n🤖 ML MODEL PREDICTION:")
print(f"   Risk Level: {'⚠️ DETERIORATING' if prediction == 1 else '✅ STABLE'}")
print(f"   Probability: {probability:.1%}")

print(f"\n📊 VITAL SIGNS ANALYSIS:")
print(f"\n  💓 Heart Rate:")
print(f"     Early: {test_patient['hr_early_mean']:.1f} bpm → Late: {test_patient['hr_late_mean']:.1f} bpm")
print(f"     Change: {test_patient['hr_change']:+.1f} bpm ({test_patient['hr_percent_change']:+.1f}%)")

print(f"\n  🩺 Blood Pressure:")
print(f"     Early: {test_patient['bp_early_mean']:.1f} mmHg → Late: {test_patient['bp_late_mean']:.1f} mmHg")
print(f"     Change: {test_patient['bp_change']:+.1f} mmHg ({test_patient['bp_percent_change']:+.1f}%)")

print(f"\n  🫁 Respiration Rate:")
print(f"     Early: {test_patient['resp_early_mean']:.1f} br/min → Late: {test_patient['resp_late_mean']:.1f} br/min")
print(f"     Change: {test_patient['resp_change']:+.1f} br/min ({test_patient['resp_percent_change']:+.1f}%)")

print(f"\n  💉 Clinical Metrics:")
print(f"     Shock Index: {test_patient['shock_index']:.2f} ({'⚠️ HIGH' if test_patient['shock_index'] > 0.9 else '✓ Normal' if test_patient['shock_index'] < 0.7 else '⚡ Elevated'})")

print(f"\n📝 READY FOR LLM EXPLANATION!")
print(f"Next: We'll send this data to Claude API for clinical interpretation")

🔬 COMPLETE SYSTEM TEST: ML + LLM

👤 Patient: 30/3000393
📋 Segment: 3000393_0005
⏱️  Duration: 4.8 hours

🤖 ML MODEL PREDICTION:
   Risk Level: ✅ STABLE
   Probability: 11.0%

📊 VITAL SIGNS ANALYSIS:

  💓 Heart Rate:
     Early: 68.1 bpm → Late: 78.7 bpm
     Change: +10.6 bpm (+15.6%)

  🩺 Blood Pressure:
     Early: 152.3 mmHg → Late: 154.4 mmHg
     Change: +2.2 mmHg (+1.4%)

  🫁 Respiration Rate:
     Early: 22.0 br/min → Late: 19.6 br/min
     Change: -2.4 br/min (-10.7%)

  💉 Clinical Metrics:
     Shock Index: 0.48 (✓ Normal)

📝 READY FOR LLM EXPLANATION!
Next: We'll send this data to Claude API for clinical interpretation




## Step 24: Create Clinical Prompt for LLM Explanation

Now we create a structured prompt that will be sent to Claude API to generate clinical explanations.

**What is a prompt?**
A prompt is the input text we send to the LLM (Large Language Model). It provides context and instructions for what we want the AI to generate.

**What this function does:**
The `create_clinical_prompt()` function takes:
- **Input:** Patient vital signs data, ML prediction, and probability
- **Output:** Formatted text prompt ready for Claude API

**Prompt Structure:**
1. **Role setting:** "You are a clinical expert analyzing ICU patient vital signs"
2. **Context:** Patient monitoring duration and time period
3. **Vital signs data:** 
   - Heart Rate (early, late, change, average)
   - Blood Pressure (early, late, change, average)
   - Respiration Rate (early, late, change, average)
4. **Clinical metrics:** Shock index with normal reference range
5. **ML assessment:** Risk probability and classification
6. **Instructions:** Specific request for 3-4 sentence clinical assessment

**Why this format works:**
- **Structured data:** LLM receives organized, clear information
- **Clinical context:** Includes reference ranges (e.g., shock index <0.7)
- **Specific instructions:** Tells LLM exactly what to analyze and explain
- **Professional tone:** Requests medical-grade explanation

**What we're asking Claude to do:**
1. Interpret the vital sign patterns
2. Assess deterioration risk
3. Identify key concerning or reassuring findings
4. Provide clinical reasoning

**Test output:**
This code generates a sample prompt using the test patient and displays it so we can verify:
- ✅ All vital signs are included
- ✅ Values are formatted correctly
- ✅ Instructions are clear
- ✅ Ready to send to Claude API

**Next step:**
After verifying the prompt looks good, we'll integrate with Claude API to actually generate explanations for all patients.

In [28]:
# Create LLM explanation prompt
def create_clinical_prompt(patient_data, prediction, probability):
    """Generate prompt for Claude to explain the prediction"""
    
    prompt = f"""You are a clinical expert analyzing ICU patient vital signs.

PATIENT MONITORING DATA ({patient_data['duration_hours']:.1f} hour period):

HEART RATE:
- Early period: {patient_data['hr_early_mean']:.1f} bpm
- Late period: {patient_data['hr_late_mean']:.1f} bpm
- Change: {patient_data['hr_change']:+.1f} bpm ({patient_data['hr_percent_change']:+.1f}%)
- Average: {patient_data['hr_mean']:.1f} bpm

BLOOD PRESSURE (Systolic):
- Early period: {patient_data['bp_early_mean']:.1f} mmHg
- Late period: {patient_data['bp_late_mean']:.1f} mmHg
- Change: {patient_data['bp_change']:+.1f} mmHg ({patient_data['bp_percent_change']:+.1f}%)
- Average: {patient_data['bp_systolic_mean']:.1f} mmHg

RESPIRATION RATE:
- Early period: {patient_data['resp_early_mean']:.1f} breaths/min
- Late period: {patient_data['resp_late_mean']:.1f} breaths/min
- Change: {patient_data['resp_change']:+.1f} breaths/min ({patient_data['resp_percent_change']:+.1f}%)
- Average: {patient_data['resp_rate_mean']:.1f} breaths/min

CLINICAL METRICS:
- Shock Index: {patient_data['shock_index']:.2f} (normal <0.7)

ML MODEL ASSESSMENT:
- Deterioration Risk: {probability:.1%}
- Classification: {"DETERIORATING" if prediction == 1 else "STABLE"}

Provide a clinical assessment in 3-4 sentences explaining:
1. What these vital sign patterns indicate
2. Whether this patient shows signs of deterioration
3. Key concerning or reassuring findings
4. Clinical reasoning for the assessment

Be specific, professional, and focus on the most clinically significant changes."""

    return prompt

# Test prompt generation
test_prompt = create_clinical_prompt(test_patient, prediction, probability)

print("📝 CLINICAL PROMPT FOR LLM:")
print("="*70)
print(test_prompt)
print("\n" + "="*70)
print("\n✅ Prompt ready to send to Claude API!")
print("\n💡 To actually call Claude API, you need:")
print("   1. Anthropic API key")
print("   2. anthropic Python package")
print("\nFor now, this demonstrates what we would send to Claude.")

📝 CLINICAL PROMPT FOR LLM:
You are a clinical expert analyzing ICU patient vital signs.

PATIENT MONITORING DATA (4.8 hour period):

HEART RATE:
- Early period: 68.1 bpm
- Late period: 78.7 bpm
- Change: +10.6 bpm (+15.6%)
- Average: 73.4 bpm

BLOOD PRESSURE (Systolic):
- Early period: 152.3 mmHg
- Late period: 154.4 mmHg
- Change: +2.2 mmHg (+1.4%)
- Average: 153.3 mmHg

RESPIRATION RATE:
- Early period: 22.0 breaths/min
- Late period: 19.6 breaths/min
- Change: -2.4 breaths/min (-10.7%)
- Average: 20.8 breaths/min

CLINICAL METRICS:
- Shock Index: 0.48 (normal <0.7)

ML MODEL ASSESSMENT:
- Deterioration Risk: 11.0%
- Classification: STABLE

Provide a clinical assessment in 3-4 sentences explaining:
1. What these vital sign patterns indicate
2. Whether this patient shows signs of deterioration
3. Key concerning or reassuring findings
4. Clinical reasoning for the assessment

Be specific, professional, and focus on the most clinically significant changes.


✅ Prompt ready to send 

## Step 25: Test Complete ML + LLM System

Now we test the entire end-to-end system: ML prediction → LLM explanation.

**What this code does:**
1. Loads the final labeled dataset
2. Selects one test patient
3. Gets ML model prediction (stable/deteriorating + probability)
4. Displays vital signs summary
5. Creates clinical prompt with all patient data
6. Calls Claude API to generate explanation
7. Displays the LLM's clinical interpretation

**The Complete Pipeline:**
```
Patient Data → ML Model → Prediction + Probability
                              ↓
                    Format Clinical Prompt
                              ↓
                         Claude API
                              ↓
                    Clinical Explanation
```
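
The pipeline above can be sketched as a single function that chains the three stages. This is an illustration under stated assumptions: `explain_segment` and its callable parameters are hypothetical names, and the stand-ins below replace the trained model and the Claude API so the sketch runs on its own.

```python
def explain_segment(features, predict_proba, build_prompt, ask_llm, threshold=0.5):
    """Run one segment through the pipeline: model -> prompt -> LLM explanation.

    All three stages are passed in as callables, so stand-ins can be used for testing.
    """
    probability = predict_proba(features)
    prediction = 1 if probability > threshold else 0
    prompt = build_prompt(features, prediction, probability)
    return {
        "prediction": "DETERIORATING" if prediction == 1 else "STABLE",
        "probability": probability,
        "explanation": ask_llm(prompt),
    }

# Stand-ins so the sketch runs without a trained model or an API key
result = explain_segment(
    features={"shock_index": 0.48},
    predict_proba=lambda f: 0.11,
    build_prompt=lambda f, pred, prob: f"Risk {prob:.1%}, shock index {f['shock_index']:.2f}",
    ask_llm=lambda prompt: f"(stub reply to: {prompt})",
)
print(result)
```

Passing the stages in as arguments keeps the ML step testable even when the API is unavailable.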

**What you should see:**
1. **Patient identification:** ID and segment number
2. **ML prediction:** Classification (stable/deteriorating) and risk %
3. **Vital signs summary:** HR, BP, RR changes with arrows showing trends
4. **Claude's explanation:** 3-4 sentences of clinical reasoning including:
   - Interpretation of vital sign patterns
   - Assessment of deterioration risk
   - Key concerning or reassuring findings
   - Clinical reasoning and recommendations

**Why this test is important:**
- Verifies ML model works correctly
- Tests Claude API connection and authentication
- Validates prompt format produces good explanations
- Confirms entire system integrates properly
- Shows what the final demo will look like

**Expected Claude response:**
For a **stable** patient, Claude might say:
> "This patient demonstrates stable hemodynamics with reassuring vital sign trends. The heart rate increase is moderate and accompanied by stable blood pressure, indicating adequate cardiovascular compensation. The shock index remains well within normal limits, and the absence of concurrent hypotension rules out compensatory shock."

For a **deteriorating** patient, Claude might say:
> "⚠️ This patient exhibits concerning signs of early compensatory shock requiring immediate attention. The combination of rising heart rate and falling blood pressure represents classic hemodynamic decompensation. The elevated shock index exceeds critical thresholds, indicating significant hemodynamic instability that typically precedes cardiovascular collapse by 4-6 hours."

**Troubleshooting:**
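- If you see "Error calling Claude": Check API key and credits
- If you see `NameError: name 'get_llm_explanation' is not defined`: Run the cells that set up the client and define `get_llm_explanation()` (Steps 26-28) before this one
- If explanation is too short/generic: Adjust prompt instructions
- If formatting is off: Check that all patient features exist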
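
For the last troubleshooting item, a small check can list which fields the prompt needs but the record lacks. This is a sketch: the field list mirrors the columns referenced by `create_clinical_prompt()` in Step 24, while `missing_prompt_fields` itself is a hypothetical helper.

```python
REQUIRED_PROMPT_FIELDS = [
    "duration_hours",
    "hr_early_mean", "hr_late_mean", "hr_change", "hr_percent_change", "hr_mean",
    "bp_early_mean", "bp_late_mean", "bp_change", "bp_percent_change", "bp_systolic_mean",
    "resp_early_mean", "resp_late_mean", "resp_change", "resp_percent_change", "resp_rate_mean",
    "shock_index",
]

def missing_prompt_fields(patient_data):
    """Return the prompt fields that are absent or NaN in a dict-like record."""
    missing = []
    for name in REQUIRED_PROMPT_FIELDS:
        value = patient_data.get(name)
        if value is None or value != value:  # NaN is the only value not equal to itself
            missing.append(name)
    return missing

# Example with a deliberately incomplete record (illustrative values)
incomplete = {"duration_hours": 4.8, "shock_index": float("nan")}
print(missing_prompt_fields(incomplete))
```

Because a pandas row also supports `.get()`, the same check should work on `test_patient` before the prompt is built.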

**Success indicator:** ✅ You should see a detailed, clinically accurate explanation that makes sense given the vital signs!

In [29]:
# Test LLM explanation on ONE patient

# Load your final dataset
final_data = pd.read_csv('FINAL_COMPLETE_DATASET.csv')

# Select first patient for testing
test_idx = 0
test_patient = final_data.iloc[test_idx]

# Prepare features for ML model
feature_columns = [col for col in final_data.columns 
                   if col not in ['patient_id', 'segment', 'deterioration']]
test_features_input = test_patient[feature_columns].values.reshape(1, -1)

# Get ML prediction
prediction = model.predict(test_features_input)[0]
probability = model.predict_proba(test_features_input)[0, 1]

# Display test case
print("🔬 TESTING COMPLETE SYSTEM: ML + LLM")
print("="*70)
print(f"\n👤 Patient: {test_patient['patient_id']}")
print(f"📋 Segment: {test_patient['segment']}")
print(f"⏱️  Duration: {test_patient['duration_hours']:.1f} hours")

print(f"\n🤖 ML MODEL PREDICTION:")
print(f"   Classification: {'⚠️ DETERIORATING' if prediction == 1 else '✅ STABLE'}")
print(f"   Risk Probability: {probability:.1%}")

print(f"\n📊 VITAL SIGNS SUMMARY:")
print(f"   💓 HR:  {test_patient['hr_early_mean']:.1f} → {test_patient['hr_late_mean']:.1f} bpm ({test_patient['hr_change']:+.1f})")
print(f"   🩺 BP:  {test_patient['bp_early_mean']:.1f} → {test_patient['bp_late_mean']:.1f} mmHg ({test_patient['bp_change']:+.1f})")
print(f"   🫁 RR:  {test_patient['resp_early_mean']:.1f} → {test_patient['resp_late_mean']:.1f} br/min ({test_patient['resp_change']:+.1f})")
print(f"   💉 Shock Index: {test_patient['shock_index']:.2f}")

# Generate LLM explanation
print(f"\n🤖 Generating Claude explanation...")
prompt = create_clinical_prompt(test_patient, prediction, probability)
explanation = get_llm_explanation(prompt)

print(f"\n💬 CLAUDE CLINICAL EXPLANATION:")
print("="*70)
print(explanation)
print("="*70)
print(f"\n✅ SYSTEM WORKS! ML prediction + LLM explanation complete!")

🔬 TESTING COMPLETE SYSTEM: ML + LLM

👤 Patient: 30/3000393
📋 Segment: 3000393_0005
⏱️  Duration: 4.8 hours

🤖 ML MODEL PREDICTION:
   Classification: ✅ STABLE
   Risk Probability: 11.0%

📊 VITAL SIGNS SUMMARY:
   💓 HR:  68.1 → 78.7 bpm (+10.6)
   🩺 BP:  152.3 → 154.4 mmHg (+2.2)
   🫁 RR:  22.0 → 19.6 br/min (-2.4)
   💉 Shock Index: 0.48

🤖 Generating Claude explanation...


NameError: name 'get_llm_explanation' is not defined

# LLM Integration - Claude API

Now we integrate Claude AI to generate clinical explanations for our ML predictions.

## Step 26: Install Anthropic Package

Install the official Anthropic Python SDK to communicate with Claude API.

In [None]:
# Install Anthropic package
!pip install anthropic

print("✅ Anthropic package installed!")

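## Step 27: Initialize Claude API Client

Create the Anthropic client. The API key is read from the `ANTHROPIC_API_KEY` environment variable so that it never appears in the notebook.

In [None]:
# Set up Claude API
import os
import anthropic

# Read the key from the environment instead of pasting it into the notebook.
# A key that has ever been shared or committed should be revoked and replaced.
api_key = os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
    raise RuntimeError("Set the ANTHROPIC_API_KEY environment variable before running this cell")

client = anthropic.Anthropic(api_key=api_key)

print("✅ Claude API client initialized!")
print("🔑 API key loaded successfully")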

## Step 28: Create LLM Explanation Function

Define the function that sends clinical prompts to Claude and receives explanations.

**What this function does:**
- Takes a clinical prompt as input
- Sends it to Claude API using the Sonnet 4 model
- Returns Claude's clinical explanation
- Returns an error message string instead of raising if the API call fails

**Model:** `claude-sonnet-4-20250514` (Claude Sonnet 4)

**Cost:** ~$0.003 per explanation (very cheap!)

In [None]:
# Create LLM explanation function
def get_llm_explanation(prompt):
    """Call Claude API for clinical explanation"""
    
    try:
        message = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=500,
            messages=[
                {"role": "user", "content": prompt}
            ]
        )
        
        return message.content[0].text
        
    except Exception as e:
        return f"❌ Error calling Claude: {str(e)}"

print("✅ LLM explanation function ready!")
print("🤖 Claude Sonnet 4 configured for clinical reasoning")
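
Transient failures such as rate limits or timeouts are often worth retrying before giving up. The sketch below shows one way to do that with exponential backoff. `call_with_retry` is a hypothetical helper, and it assumes the wrapped call raises on failure, which is not how `get_llm_explanation()` behaves (it returns an error string), so the raw API call would need to be wrapped instead.

```python
import time

def call_with_retry(call, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `call()` and retry with exponential backoff if it raises.

    `call` is any zero-argument function, for example a lambda wrapping
    an API request that raises on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))  # wait 1s, 2s, 4s, ...

# Usage with a stand-in request that fails twice before succeeding
attempts = {"count": 0}

def flaky_request():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "explanation text"

result = call_with_retry(flaky_request, sleep=lambda seconds: None)
print(result, attempts["count"])
```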

## Step 29: Test Complete ML + LLM System on One Patient

Test the entire end-to-end system: ML prediction → Clinical prompt → Claude explanation.

**What this code does:**
1. Loads the final labeled dataset
2. Selects one test patient
3. Gets ML model prediction (stable/deteriorating + probability)
4. Displays vital signs summary
5. Creates clinical prompt with all patient data
6. Calls Claude API to generate explanation
7. Displays Claude's clinical interpretation

**Expected result:** 
You should see ML prediction + detailed vital signs + Claude's 3-4 sentence clinical explanation.

**Cost:** ~$0.003 (less than 1 cent) for this one test

In [None]:
# Test LLM explanation on ONE patient

# Load your final dataset
final_data = pd.read_csv('FINAL_COMPLETE_DATASET.csv')

# Select first patient for testing
test_idx = 0
test_patient = final_data.iloc[test_idx]

# Prepare features for ML model
feature_columns = [col for col in final_data.columns 
                   if col not in ['patient_id', 'segment', 'deterioration']]
test_features_input = test_patient[feature_columns].values.reshape(1, -1)

# Get ML prediction
prediction = model.predict(test_features_input)[0]
probability = model.predict_proba(test_features_input)[0, 1]

# Display test case
print("🔬 TESTING COMPLETE SYSTEM: ML + LLM")
print("="*70)
print(f"\n👤 Patient: {test_patient['patient_id']}")
print(f"📋 Segment: {test_patient['segment']}")
print(f"⏱️  Duration: {test_patient['duration_hours']:.1f} hours")

print(f"\n🤖 ML MODEL PREDICTION:")
print(f"   Classification: {'⚠️ DETERIORATING' if prediction == 1 else '✅ STABLE'}")
print(f"   Risk Probability: {probability:.1%}")

print(f"\n📊 VITAL SIGNS SUMMARY:")
print(f"   💓 HR:  {test_patient['hr_early_mean']:.1f} → {test_patient['hr_late_mean']:.1f} bpm ({test_patient['hr_change']:+.1f})")
print(f"   🩺 BP:  {test_patient['bp_early_mean']:.1f} → {test_patient['bp_late_mean']:.1f} mmHg ({test_patient['bp_change']:+.1f})")
print(f"   🫁 RR:  {test_patient['resp_early_mean']:.1f} → {test_patient['resp_late_mean']:.1f} br/min ({test_patient['resp_change']:+.1f})")
print(f"   💉 Shock Index: {test_patient['shock_index']:.2f}")

# Generate Claude explanation
print(f"\n🤖 Generating Claude explanation...")
prompt = create_clinical_prompt(test_patient, prediction, probability)
explanation = get_llm_explanation(prompt)

print(f"\n💬 CLAUDE CLINICAL EXPLANATION:")
print("="*70)
print(explanation)
print("="*70)
print(f"\n✅ SYSTEM WORKS! ML prediction + LLM explanation complete!")

## Step 30: Generate Claude Explanations for All 91 Segments

Now that we've verified the system works on one segment, let's generate clinical explanations for all segments in our dataset.

**What this code does:**
1. Loops through all 91 segments
2. For each segment:
   - Gets ML prediction
   - Creates clinical prompt
   - Calls Claude API for explanation
   - Saves results
3. Shows progress every 10 segments
4. Saves complete results to CSV

**Processing time:** ~3-5 minutes (91 API calls)

**Cost:** ~$0.27 (27 cents total)

**Output file:** `ML_CLAUDE_EXPLANATIONS.csv`
- Contains: Patient ID, ML prediction, probability, Claude explanation
- Ready for analysis and presentation
- Can be used to compare stable vs deteriorating patient explanations
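
The cost and time figures above are simple multiplications. As a rough sketch, assuming a flat price of about $0.003 per call (the notebook's own estimate) and a latency of 2-3 seconds per call (an assumption, not a measured value):

```python
def estimate_batch_cost(n_calls, cost_per_call_usd=0.003):
    """Rough total cost, assuming a flat per-call price."""
    return n_calls * cost_per_call_usd

def estimate_batch_minutes(n_calls, seconds_per_call):
    """Rough sequential runtime in minutes for an assumed per-call latency."""
    return n_calls * seconds_per_call / 60

print(f"Cost for 91 calls: ${estimate_batch_cost(91):.2f}")
print(f"Minutes at 2 s per call: {estimate_batch_minutes(91, 2):.1f}")
print(f"Minutes at 3 s per call: {estimate_batch_minutes(91, 3):.1f}")
```

Actual cost depends on prompt and response length, so these helpers are only a planning aid.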

In [None]:
# Generate Claude explanations for ALL 91 segments

print("🚀 Generating Claude explanations for all 91 segments...")
print("💰 Cost: ~$0.27 (27 cents total)")
print("⏱️  Time: ~3-5 minutes\n")

all_explanations = []

for idx, row in final_data.iterrows():
    try:
        # Prepare features
        features_input = row[feature_columns].values.reshape(1, -1)
        
        # Get ML prediction
        pred = model.predict(features_input)[0]
        prob = model.predict_proba(features_input)[0, 1]
        
        # Generate prompt
        prompt = create_clinical_prompt(row, pred, prob)
        
        # Get Claude explanation
        explanation = get_llm_explanation(prompt)
        
        # Store result
        all_explanations.append({
            'patient_id': row['patient_id'],
            'segment': row['segment'],
            'ml_prediction': 'DETERIORATING' if pred == 1 else 'STABLE',
            'ml_probability': prob,
            'claude_explanation': explanation
        })
        
        if (idx + 1) % 10 == 0:
            print(f"✅ Processed {idx + 1}/{len(final_data)} segments...")
            
    except Exception as e:
        print(f"❌ Error on segment {idx}: {str(e)}")

print(f"\n🎉 COMPLETE! Generated {len(all_explanations)} explanations")

# Save to CSV
explanations_df = pd.DataFrame(all_explanations)
explanations_df.to_csv('ML_CLAUDE_EXPLANATIONS.csv', index=False)

print(f"💾 Saved to: ML_CLAUDE_EXPLANATIONS.csv")
print(f"📊 Dataset: {len(explanations_df)} segments × {len(explanations_df.columns)} columns")

# Show summary statistics
print(f"\n📈 Summary:")
print(f"   Stable predictions: {(explanations_df['ml_prediction']=='STABLE').sum()}")
print(f"   Deteriorating predictions: {(explanations_df['ml_prediction']=='DETERIORATING').sum()}")

# Show example explanations
print(f"\n📋 Example explanations:\n")
for i in range(min(3, len(explanations_df))):
    print(f"Patient {i+1}: {explanations_df.iloc[i]['patient_id']}")
    print(f"Prediction: {explanations_df.iloc[i]['ml_prediction']} ({explanations_df.iloc[i]['ml_probability']:.1%})")
    print(f"Claude: {explanations_df.iloc[i]['claude_explanation'][:200]}...")
    print("-"*70)

print("\n✅ ALL DONE! Your complete ML + LLM system is ready for presentation!")
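
One caveat with this loop: `get_llm_explanation()` returns an error message rather than raising, so a failed call is stored as if it were an explanation. A minimal sketch for separating those rows afterwards, where `split_failed` is a hypothetical helper and the sample rows are made-up values rather than real model output:

```python
def split_failed(results, error_marker="Error calling Claude"):
    """Separate successful rows from rows whose explanation is an error message.

    Assumes each row is a dict with a 'claude_explanation' key and that failures
    contain the marker text returned by get_llm_explanation().
    """
    ok, failed = [], []
    for row in results:
        (failed if error_marker in row["claude_explanation"] else ok).append(row)
    return ok, failed

# Illustrative rows only
sample = [
    {"segment": "seg_a", "claude_explanation": "Vital signs are stable."},
    {"segment": "seg_b", "claude_explanation": "Error calling Claude: timeout"},
]
ok, failed = split_failed(sample)
print(len(ok), "succeeded;", len(failed), "to retry:", [r["segment"] for r in failed])
```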

# PART 2: Corrected Methodology - Patient-Level Splitting

## Identifying and Fixing Data Leakage

**Problem discovered:** Our initial approach treated segments from the same patient as independent examples. This violates the independence assumption in machine learning and can lead to optimistic performance estimates.

**Our current code does this:**
`train_test_split(X, y, test_size=0.2, random_state=42)`

**This randomly splits SEGMENTS, so we might get:**

**TRAINING SET:**
- 3000003_0001 ✓
- 3000003_0002 ✓
- 3000031_0001 ✓
- 3000393_0005 ✓

**TESTING SET:**
- 3000003_0005 ← PROBLEM! Same patient as training!
- 3000031_0004 ← PROBLEM! Same patient as training!

**Solution:** Split data by PATIENT first, ensuring no patient appears in both training and testing sets.

**Expected impact:** More honest accuracy estimate (likely 5-15% lower, but methodologically correct)
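
The leakage in the example above can be detected mechanically by comparing patient IDs across the two sets. A small sketch, assuming segment IDs follow the `<patient>_<segment number>` pattern seen in this dataset (the helper names are illustrative):

```python
def patient_of(segment_id):
    """Extract the patient ID from a segment ID such as '3000003_0005'."""
    return segment_id.split("_")[0]

def leaked_patients(train_segments, test_segments):
    """Return patient IDs that appear in both the training and testing segments."""
    train_ids = {patient_of(s) for s in train_segments}
    test_ids = {patient_of(s) for s in test_segments}
    return train_ids & test_ids

# The illustrative split from the text above
train = ["3000003_0001", "3000003_0002", "3000031_0001", "3000393_0005"]
test = ["3000003_0005", "3000031_0004"]
print(sorted(leaked_patients(train, test)))
```

An empty result is what a correct patient-level split should produce.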

---

## Step 31: Analyze Patient Distribution

Let's first understand how many unique patients we have and how segments are distributed across patients.

**This analysis will show us:**
- How many unique patients we have
- How many segments each patient has

We then implement the patient-level train/test split and verify that no patient appears in both sets.

### Load and Analyze Patient Distribution

In [2]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import joblib

# Load your complete dataset
df = pd.read_csv('FINAL_COMPLETE_DATASET.csv')

print("Dataset Overview:")
print(f"Total segments: {len(df)}")
print(f"Unique patients: {df['patient_id'].nunique()}")
print(f"Average segments per patient: {len(df) / df['patient_id'].nunique():.1f}")

# Show distribution of segments per patient
segments_per_patient = df.groupby('patient_id').size().sort_values(ascending=False)
print(f"\nSegments per patient:")
print(f"Maximum: {segments_per_patient.max()}")
print(f"Minimum: {segments_per_patient.min()}")
print(f"Median: {segments_per_patient.median():.1f}")

Dataset Overview:
Total segments: 91
Unique patients: 27
Average segments per patient: 3.4

Segments per patient:
Maximum: 12
Minimum: 1
Median: 2.0


### Implement Patient-Level Split

In [3]:
# Get unique patient IDs
unique_patients = df['patient_id'].unique()
print(f"Total unique patients: {len(unique_patients)}")

# Split PATIENTS (not segments) into train/test
train_patients, test_patients = train_test_split(
    unique_patients, 
    test_size=0.2, 
    random_state=42
)

print(f"\nPatient-Level Split:")
print(f"Training patients: {len(train_patients)}")
print(f"Testing patients: {len(test_patients)}")

# Create train/test sets based on patient assignment
train_data = df[df['patient_id'].isin(train_patients)]
test_data = df[df['patient_id'].isin(test_patients)]

print(f"\nSegment Distribution:")
print(f"Training segments: {len(train_data)} (from {len(train_patients)} patients)")
print(f"Testing segments: {len(test_data)} (from {len(test_patients)} patients)")

# Verify no patient overlap
train_patient_check = set(train_data['patient_id'].unique())
test_patient_check = set(test_data['patient_id'].unique())
overlap = train_patient_check.intersection(test_patient_check)
print(f"\nVerification - Patient overlap: {len(overlap)} (should be 0)")

Total unique patients: 27

Patient-Level Split:
Training patients: 21
Testing patients: 6

Segment Distribution:
Training segments: 69 (from 21 patients)
Testing segments: 22 (from 6 patients)

Verification - Patient overlap: 0 (should be 0)
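
The same idea can be sketched without scikit-learn to make the mechanics explicit: shuffle the patient IDs, reserve a share of them for testing, then assign every segment according to its patient. This is an illustration only; `split_by_patient` is a hypothetical helper, the toy rows are invented, and its shuffle will not reproduce the exact split above. scikit-learn also offers group-aware splitters such as `GroupShuffleSplit` for the same purpose.

```python
import math
import random

def split_by_patient(rows, test_size=0.2, seed=42):
    """Split rows into train/test so that each patient's rows stay together.

    `rows` is a list of dicts with a 'patient_id' key (an assumed layout).
    """
    patients = sorted({row["patient_id"] for row in rows})
    random.Random(seed).shuffle(patients)
    n_test = math.ceil(len(patients) * test_size)  # e.g. 27 patients -> 6 for testing
    test_ids = set(patients[:n_test])
    train = [row for row in rows if row["patient_id"] not in test_ids]
    test = [row for row in rows if row["patient_id"] in test_ids]
    return train, test

# Toy data: 5 hypothetical patients with 1-3 segments each
rows = [{"patient_id": f"p{p}", "segment": f"p{p}_{s:04d}"}
        for p in range(5) for s in range(1 + p % 3)]
train, test = split_by_patient(rows)
overlap = {r["patient_id"] for r in train} & {r["patient_id"] for r in test}
print(len(train), len(test), "overlap:", len(overlap))
```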


### Train Corrected Model

In [5]:
# Prepare features and labels
feature_columns = [col for col in df.columns 
                  if col not in ['patient_id', 'segment', 'deterioration']]

X_train = train_data[feature_columns].values
y_train = train_data['deterioration'].values
X_test = test_data[feature_columns].values
y_test = test_data['deterioration'].values

# Train new model with correct split
print("Training Random Forest with patient-level split...")
model_fixed = RandomForestClassifier(
    n_estimators=100,
    max_depth=10,
    random_state=42,
    class_weight='balanced'
)

model_fixed.fit(X_train, y_train)

# Evaluate performance
train_pred = model_fixed.predict(X_train)
test_pred = model_fixed.predict(X_test)

print("\nCORRECTED Model Performance:")
print("="*50)
print("Training Accuracy: {:.1f}%".format(accuracy_score(y_train, train_pred) * 100))
print("Testing Accuracy: {:.1f}%".format(accuracy_score(y_test, test_pred) * 100))

print("\nTest Set Classification Report:")
print(classification_report(y_test, test_pred, 
                          target_names=['STABLE', 'DETERIORATING']))

Training Random Forest with patient-level split...

CORRECTED Model Performance:
Training Accuracy: 100.0%
Testing Accuracy: 86.4%

Test Set Classification Report:
               precision    recall  f1-score   support

       STABLE       0.84      1.00      0.91        16
DETERIORATING       1.00      0.50      0.67         6

     accuracy                           0.86        22
    macro avg       0.92      0.75      0.79        22
 weighted avg       0.89      0.86      0.85        22
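
With only 6 deteriorating segments in the test set, a recall of 0.50 (3 of 6) carries wide uncertainty. One way to see this is a Wilson score interval, a standard choice for small samples. The notebook does not compute this; the sketch below is an added illustration, and it treats segments as independent, which is optimistic when several come from the same patient.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for roughly 95%)."""
    if n == 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Recall 0.50 on 6 deteriorating test segments means 3 were detected
low, high = wilson_interval(3, 6)
print(f"Recall 95% interval: {low:.2f} to {high:.2f}")
```

The interval spans roughly 0.19 to 0.81, so the test set is too small to pin down recall with any precision.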



### Compare Methods

In [6]:
# Wrong way (for comparison): split segments at random, ignoring patient identity
X_train_wrong, X_test_wrong, y_train_wrong, y_test_wrong = train_test_split(
    df[feature_columns].values, 
    df['deterioration'].values,
    test_size=0.2, 
    random_state=42
)

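# Note: this baseline uses default hyperparameters, while model_fixed sets max_depth=10 and
# class_weight='balanced', so the two accuracies differ in more than just the split.
model_wrong = RandomForestClassifier(n_estimators=100, random_state=42)
model_wrong.fit(X_train_wrong, y_train_wrong)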

wrong_test_acc = accuracy_score(y_test_wrong, model_wrong.predict(X_test_wrong))
correct_test_acc = accuracy_score(y_test, test_pred)

print("COMPARISON WITH INCORRECT METHOD:")
print("="*50)
print(f"Old method (data leakage): {wrong_test_acc:.1%} accuracy")
print(f"Correct method (patient-split): {correct_test_acc:.1%} accuracy")
print(f"Difference: {(wrong_test_acc - correct_test_acc):.1%}")
print("\nThe correct method gives a more honest estimate of real-world performance!")

COMPARISON WITH INCORRECT METHOD:
Old method (data leakage): 89.5% accuracy
Correct method (patient-split): 86.4% accuracy
Difference: 3.1%

The correct method gives a more honest estimate of real-world performance!


In [7]:
# Save the correctly trained model
joblib.dump(model_fixed, 'final_model_patient_split_CORRECT.pkl')
print("Saved corrected model to: final_model_patient_split_CORRECT.pkl")

# Save the train/test split info for documentation
split_info = {
    'train_patients': train_patients.tolist(),
    'test_patients': test_patients.tolist(),
    'train_segments': len(train_data),
    'test_segments': len(test_data),
    'test_accuracy': correct_test_acc
}

import json
with open('patient_split_info.json', 'w') as f:
    json.dump(split_info, f, indent=2)
print("Saved split information to: patient_split_info.json")

Saved corrected model to: final_model_patient_split_CORRECT.pkl
Saved split information to: patient_split_info.json


## Generate Claude Explanations Using the Corrected Model

In [13]:
# Let's see what columns you actually have in your dataset
print("Columns in your dataframe:")
print(df.columns.tolist())
print("\n")

# Show a sample row to understand the data structure
print("Sample data from first row:")
sample = df.iloc[0]
for col in df.columns:
    print(f"{col}: {sample[col]}")

Columns in your dataframe:
['patient_id', 'segment', 'duration_hours', 'hr_mean', 'hr_std', 'hr_min', 'hr_max', 'hr_early_mean', 'hr_late_mean', 'hr_change', 'hr_percent_change', 'bp_systolic_mean', 'bp_systolic_std', 'bp_systolic_min', 'bp_systolic_max', 'bp_early_mean', 'bp_late_mean', 'bp_change', 'bp_percent_change', 'resp_rate_mean', 'resp_rate_std', 'resp_rate_min', 'resp_rate_max', 'resp_early_mean', 'resp_late_mean', 'resp_change', 'resp_percent_change', 'pleth_quality', 'pleth_mean', 'pleth_std', 'shock_index', 'deterioration']


Sample data from first row:
patient_id: 30/3000393
segment: 3000393_0005
duration_hours: 4.813055555555556
hr_mean: 73.35714780031789
hr_std: 8.054185650837764
hr_min: 51.02040816326531
hr_max: 119.04761904761904
hr_early_mean: 68.05995996401344
hr_late_mean: 78.65433563662232
hr_change: 10.594375672608876
hr_percent_change: 15.566238472973874
bp_systolic_mean: 153.3414553051391
bp_systolic_std: 16.460775730936902
bp_systolic_min: 51.48168590310307
bp

## Step 32: AI Explainability

In [14]:
import anthropic
import os

# Initialize Claude client
client = anthropic.Anthropic(api_key=os.environ.get('ANTHROPIC_API_KEY'))

def create_clinical_prompt(patient_data, ml_prediction, ml_probability):
    """
    Create a clinical prompt for Claude to explain the ML prediction
    """
    prompt = f"""You are an ICU physician reviewing patient vital signs and ML model predictions.

Patient Data:
- Segment ID: {patient_data['segment']}
- Duration: {patient_data['duration_hours']:.1f} hours
- ML Prediction: {'DETERIORATING' if ml_prediction == 1 else 'STABLE'}
- Risk Probability: {ml_probability:.1%}

Vital Sign Changes (Early → Late Period):
- Heart Rate: {patient_data['hr_early_mean']:.1f} → {patient_data['hr_late_mean']:.1f} bpm (Change: {patient_data['hr_percent_change']:.1f}%)
- Systolic BP: {patient_data['bp_early_mean']:.1f} → {patient_data['bp_late_mean']:.1f} mmHg (Change: {patient_data['bp_percent_change']:.1f}%)
- Respiratory Rate: {patient_data['resp_early_mean']:.1f} → {patient_data['resp_late_mean']:.1f} br/min (Change: {patient_data['resp_percent_change']:.1f}%)
- Shock Index: {patient_data['shock_index']:.2f} (Normal < 0.7)

Provide a brief (3-4 sentence) clinical assessment explaining:
1. Whether these vital sign trends support the ML prediction
2. The most concerning or reassuring findings
3. Clinical significance of the patterns observed

Be specific about the physiological implications."""
    
    return prompt

def get_llm_explanation(prompt):
    """
    Get explanation from Claude API
    """
    try:
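        response = client.messages.create(
            # NOTE: this model ID produces the deprecation warning shown in the output below;
            # Step 28 uses "claude-sonnet-4-20250514".
            model="claude-3-sonnet-20240229",
            max_tokens=300,
            temperature=0.3,
            messages=[
                {"role": "user", "content": prompt}
            ]
        )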
        return response.content[0].text
    except Exception as e:
        return f"Error getting explanation: {str(e)}"

In [15]:
# Generate Claude explanations for CORRECTED predictions
import time

print("🚀 Generating Claude explanations with CORRECTED model predictions...")
print("This will use the patient-split model for honest predictions\n")

# Get predictions from CORRECTED model for ALL data
all_predictions = model_fixed.predict(df[feature_columns].values)
all_probabilities = model_fixed.predict_proba(df[feature_columns].values)[:, 1]

# Store results
corrected_explanations = []

for idx, row in df.iterrows():
    try:
        # Use the CORRECTED model's predictions
        pred = all_predictions[idx]
        prob = all_probabilities[idx]
        
        # Generate prompt with corrected predictions
        prompt = create_clinical_prompt(row, pred, prob)
        
        # Get Claude explanation
        explanation = get_llm_explanation(prompt)
        
        corrected_explanations.append({
            'patient_id': row['patient_id'],
            'segment': row['segment'],
            'ml_prediction': 'DETERIORATING' if pred == 1 else 'STABLE',
            'ml_probability': prob,
            'claude_explanation': explanation,
            'is_test_set': row['patient_id'] in test_patients  # Mark if it was in test set
        })
        
        if (idx + 1) % 10 == 0:
            print(f"✅ Processed {idx + 1}/91 patients...")

        # Small delay to avoid rate limiting
        time.sleep(0.5)

    except Exception as e:
        print(f"❌ Error on patient {idx}: {str(e)}")

# Save corrected explanations
corrected_df = pd.DataFrame(corrected_explanations)
corrected_df.to_csv('ML_CLAUDE_EXPLANATIONS_CORRECTED.csv', index=False)
print(f"\n💾 Saved to: ML_CLAUDE_EXPLANATIONS_CORRECTED.csv")

# Show example from test set
test_examples = corrected_df[corrected_df['is_test_set'] == True].head(2)
for _, ex in test_examples.iterrows():
    print(f"\n📋 Test Patient: {ex['patient_id']}")
    print(f"Prediction: {ex['ml_prediction']} ({ex['ml_probability']:.1%})")
    print(f"Claude: {ex['claude_explanation'][:300]}...")

🚀 Generating Claude explanations with CORRECTED model predictions...
This will use the patient-split model for honest predictions



Please migrate to a newer model. Visit https://docs.anthropic.com/en/docs/resources/model-deprecations for more information.
  response = client.messages.create(


✅ Processed 10/91 patients...
✅ Processed 20/91 patients...
✅ Processed 30/91 patients...
✅ Processed 40/91 patients...
✅ Processed 50/91 patients...
✅ Processed 60/91 patients...
✅ Processed 70/91 patients...
✅ Processed 80/91 patients...
✅ Processed 90/91 patients...

💾 Saved to: ML_CLAUDE_EXPLANATIONS_CORRECTED.csv

📋 Test Patient: 30/3000393
Prediction: STABLE (23.0%)
Claude: Error getting explanation: "Could not resolve authentication method. Expected either api_key or auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly omitted"...

📋 Test Patient: 30/3000393
Prediction: DETERIORATING (61.0%)
Claude: Error getting explanation: "Could not resolve authentication method. Expected either api_key or auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly omitted"...
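
The authentication error above was written into the `claude_explanation` column because the client had no key when the loop ran. A minimal sketch of a pre-flight check that fails before any request is sent, assuming the key is stored in the `ANTHROPIC_API_KEY` environment variable (the variable the `anthropic` SDK reads by default). `require_api_key` is a hypothetical helper for this notebook, not an SDK function:

```python
import os

def require_api_key(env=None, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or raise before any request is made.

    Hypothetical helper for illustration; not part of the anthropic SDK.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the explanation loop")
    return key

# Demonstration with a stand-in mapping instead of the real environment
print(require_api_key({"ANTHROPIC_API_KEY": "dummy-key-for-illustration"}))
```

Calling this once before the loop would surface a missing key as a single exception instead of one error string per row.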


### API key not being picked up: pass it to the client explicitly

In [16]:
import anthropic
import os

# Read the API key from the environment instead of hard-coding it in the notebook
api_key = os.environ["ANTHROPIC_API_KEY"]

# Initialize Claude client with the key passed explicitly
client = anthropic.Anthropic(api_key=api_key)

def create_clinical_prompt(patient_data, ml_prediction, ml_probability):
    """
    Create a clinical prompt for Claude to explain the ML prediction
    """
    prompt = f"""You are an ICU physician reviewing patient vital signs and ML model predictions.

Patient Data:
- Segment ID: {patient_data['segment']}
- Duration: {patient_data['duration_hours']:.1f} hours
- ML Prediction: {'DETERIORATING' if ml_prediction == 1 else 'STABLE'}
- Risk Probability: {ml_probability:.1%}

Vital Sign Changes (Early → Late Period):
- Heart Rate: {patient_data['hr_early_mean']:.1f} → {patient_data['hr_late_mean']:.1f} bpm (Change: {patient_data['hr_percent_change']:.1f}%)
- Systolic BP: {patient_data['bp_early_mean']:.1f} → {patient_data['bp_late_mean']:.1f} mmHg (Change: {patient_data['bp_percent_change']:.1f}%)
- Respiratory Rate: {patient_data['resp_early_mean']:.1f} → {patient_data['resp_late_mean']:.1f} br/min (Change: {patient_data['resp_percent_change']:.1f}%)
- Shock Index: {patient_data['shock_index']:.2f} (Normal < 0.7)

Provide a brief (3-4 sentence) clinical assessment explaining:
1. Whether these vital sign trends support the ML prediction
2. The most concerning or reassuring findings
3. Clinical significance of the patterns observed

Be specific about the physiological implications."""
    
    return prompt

def get_llm_explanation(prompt):
    """
    Get explanation from Claude API with updated model
    """
    try:
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",  # Updated to newer model
            max_tokens=300,
            temperature=0.3,
            messages=[
                {"role": "user", "content": prompt}
            ]
        )
        return response.content[0].text
    except Exception as e:
        return f"Error: {str(e)}"

# Test the API connection
print("Testing Claude API connection...")
test_response = get_llm_explanation("Say 'API working!' if you receive this.")
print(f"Test response: {test_response}")

Testing Claude API connection...


Please migrate to a newer model. Visit https://docs.anthropic.com/en/docs/resources/model-deprecations for more information.
  response = client.messages.create(


Test response: Error: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-3-5-sonnet-20241022'}, 'request_id': 'req_011CUr3X3CAPwTaz6YpQgce1'}


### Verify which model is available

In [21]:
import anthropic

import os

# The API key is fine; only the model name needs correcting
api_key = os.environ["ANTHROPIC_API_KEY"]  # read from the environment, never hard-coded
client = anthropic.Anthropic(api_key=api_key)

# Try different model names until we find one that works
model_names_to_try = [
    "claude-3-5-sonnet-20241022",  # Latest Sonnet 3.5
    "claude-3-opus-20240229",       # Opus
    "claude-3-sonnet-20240229",     # Older Sonnet (works until July 2025)
    "claude-3-haiku-20240307",      # Haiku (fast and cheap)
]

working_model = None

for model_name in model_names_to_try:
    try:
        print(f"Trying {model_name}...")
        response = client.messages.create(
            model=model_name,
            max_tokens=50,
            messages=[{"role": "user", "content": "Say 'working'"}]
        )
        print(f"✅ SUCCESS! Model {model_name} works!")
        print(f"Response: {response.content[0].text}")
        working_model = model_name
        break
    except Exception as e:
        print(f"❌ {model_name} failed: {str(e)[:100]}")

# Report only when every candidate failed
if working_model is None:
    print("\n🤔 None of the standard models worked. Let me check...")

Trying claude-3-5-sonnet-20241022...


Please migrate to a newer model. Visit https://docs.anthropic.com/en/docs/resources/model-deprecations for more information.
  response = client.messages.create(


❌ claude-3-5-sonnet-20241022 failed: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-3
Trying claude-3-opus-20240229...
❌ claude-3-opus-20240229 failed: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-3
Trying claude-3-sonnet-20240229...


Please migrate to a newer model. Visit https://docs.anthropic.com/en/docs/resources/model-deprecations for more information.
  response = client.messages.create(


❌ claude-3-sonnet-20240229 failed: Error code: 404 - {'type': 'error', 'error': {'type': 'not_found_error', 'message': 'model: claude-3
Trying claude-3-haiku-20240307...
✅ SUCCESS! Model claude-3-haiku-20240307 works!
Response: Working
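
The probing loop can also be wrapped in a function that returns the first working model or `None`, which keeps the "nothing worked" case explicit. This is a sketch under assumptions: `first_working_model` and `fake_probe` are hypothetical names, and the stand-in probe only imitates the 404 responses seen in this run rather than calling the API.

```python
def first_working_model(candidates, probe):
    """Return the first model name for which probe(name) succeeds, else None.

    `probe` is any callable that raises on failure; in the notebook it would
    wrap a small client.messages.create(...) request.
    """
    for name in candidates:
        try:
            probe(name)
        except Exception as exc:
            print(f"{name} failed: {str(exc)[:100]}")
        else:
            return name
    return None

# Stand-in probe that mimics the 404s seen in this run (no network access)
def fake_probe(name):
    if name != "claude-3-haiku-20240307":
        raise LookupError(f"not_found_error: model: {name}")

models = ["claude-3-5-sonnet-20241022", "claude-3-haiku-20240307"]
working_model = first_working_model(models, fake_probe)
if working_model is None:
    print("None of the candidate models worked.")
else:
    print(f"Using {working_model}")
```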


In [23]:
# Set up with the working model
WORKING_MODEL = "claude-3-haiku-20240307"
client = anthropic.Anthropic(api_key=api_key)

def create_clinical_prompt(patient_data, ml_prediction, ml_probability):
    """Create clinical prompt for Claude"""
    prompt = f"""You are an ICU physician reviewing patient vital signs and ML predictions.

Patient: {patient_data['segment']}
Duration: {patient_data['duration_hours']:.1f} hours
ML Prediction: {'DETERIORATING' if ml_prediction == 1 else 'STABLE'}
Risk: {ml_probability:.1%}

Vital Changes:
- Heart Rate: {patient_data['hr_early_mean']:.1f} → {patient_data['hr_late_mean']:.1f} bpm ({patient_data['hr_percent_change']:.1f}%)
- Systolic BP: {patient_data['bp_early_mean']:.1f} → {patient_data['bp_late_mean']:.1f} mmHg ({patient_data['bp_percent_change']:.1f}%)
- Respiratory: {patient_data['resp_early_mean']:.1f} → {patient_data['resp_late_mean']:.1f} br/min ({patient_data['resp_percent_change']:.1f}%)
- Shock Index: {patient_data['shock_index']:.2f} (Normal < 0.7)

Provide a 3-4 sentence clinical assessment of whether the vital signs support the ML prediction and the clinical significance."""
    return prompt

def get_llm_explanation(prompt):
    """Get explanation from Claude Haiku"""
    try:
        response = client.messages.create(
            model=WORKING_MODEL,
            max_tokens=300,
            temperature=0.3,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.content[0].text
    except Exception as e:
        return f"Error: {str(e)}"

print("✅ Claude Haiku API is working! Generating explanations...")

✅ Claude Haiku API is working! Generating explanations...


In [24]:
import time

print(f"🚀 Generating explanations using {WORKING_MODEL}")
print("Cost estimate: ~$0.05 for all 91 segments (Haiku is 10x cheaper!)\n")

# Get corrected model predictions
all_predictions = model_fixed.predict(df[feature_columns].values)
all_probabilities = model_fixed.predict_proba(df[feature_columns].values)[:, 1]

corrected_explanations = []
successful = 0

for idx, row in df.iterrows():
    pred = all_predictions[idx]
    prob = all_probabilities[idx]
    
    prompt = create_clinical_prompt(row, pred, prob)
    explanation = get_llm_explanation(prompt)
    
    if not explanation.startswith("Error"):
        successful += 1
    
    is_test = row['patient_id'] in test_patients
    
    corrected_explanations.append({
        'patient_id': row['patient_id'],
        'segment': row['segment'],
        'duration_hours': row['duration_hours'],
        'hr_change': row['hr_percent_change'],
        'bp_change': row['bp_percent_change'],
        'resp_change': row['resp_percent_change'],
        'shock_index': row['shock_index'],
        'ml_prediction': 'DETERIORATING' if pred == 1 else 'STABLE',
        'ml_probability': prob,
        'claude_explanation': explanation,
        'is_test_set': is_test
    })
    
    if (idx + 1) % 20 == 0:
        print(f"✅ Processed {idx + 1}/91...")
        time.sleep(0.2)  # Small delay for rate limiting

# Save results
final_df = pd.DataFrame(corrected_explanations)
final_df.to_csv('FINAL_WITH_HAIKU_EXPLANATIONS.csv', index=False)

print(f"\n✅ Generated {successful}/91 explanations successfully!")
print(f"💾 Saved to: FINAL_WITH_HAIKU_EXPLANATIONS.csv")

# Show example
if successful > 0:
    ex = final_df[final_df['is_test_set'] == True].iloc[0]
    print(f"\n📋 Example Test Patient:")
    print(f"Patient: {ex['patient_id']}")
    print(f"Prediction: {ex['ml_prediction']} ({ex['ml_probability']:.1%})")
    print(f"Claude says: {ex['claude_explanation'][:300]}...")

🚀 Generating explanations using claude-3-haiku-20240307
Cost estimate: ~$0.05 for all 91 segments (Haiku is 10x cheaper!)

✅ Processed 20/91...
✅ Processed 40/91...
✅ Processed 60/91...
✅ Processed 80/91...

✅ Generated 91/91 explanations successfully!
💾 Saved to: FINAL_WITH_HAIKU_EXPLANATIONS.csv

📋 Example Test Patient:
Patient: 30/3000393
Prediction: STABLE (23.0%)
Claude says: The vital sign changes observed in this patient do not fully support the ML prediction of STABLE. The increase in heart rate by 15.6% and the normal Shock Index suggest some hemodynamic instability, which may indicate the need for closer monitoring. However, the relatively small changes in systolic ...
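
`get_llm_explanation` returns error text in the same string slot as a real explanation, so the loop has to detect failures with `startswith("Error")`, and a failed call can still end up in the CSV as if it were an assessment. A sketch of one alternative that keeps the two apart; `explain_safely` and the two stand-in callables are hypothetical and make no API requests:

```python
def explain_safely(call, prompt):
    """Run call(prompt) and return {'text': ..., 'error': ...} with exactly one field set."""
    try:
        return {"text": call(prompt), "error": None}
    except Exception as exc:
        return {"text": None, "error": str(exc)}

# Stand-ins for a successful and a failing LLM call (assumptions for illustration)
def ok_call(prompt):
    return f"Assessment for: {prompt}"

def failing_call(prompt):
    raise ConnectionError("simulated API failure")

results = [
    explain_safely(ok_call, "segment A"),
    explain_safely(failing_call, "segment B"),
]
successful = sum(r["error"] is None for r in results)
print(f"Generated {successful}/{len(results)} explanations")
```

With this shape, rows whose `error` field is set can be retried or excluded before saving, rather than filtered by string prefix afterwards.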


## Step 33: Interactive Dashboard Using Streamlit

In [26]:
pip install streamlit plotly pandas numpy

Note: you may need to restart the kernel to use updated packages.


In [27]:
import streamlit as st
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
import numpy as np
import anthropic

# Page configuration
st.set_page_config(
    page_title="ICU Deterioration Detection System",
    page_icon="🏥",
    layout="wide",
    initial_sidebar_state="expanded"
)

# Custom CSS for better styling
st.markdown("""
    <style>
    .main {padding: 0rem 0rem;}
    .stMetric {background-color: #f0f2f6; padding: 10px; border-radius: 5px;}
    </style>
    """, unsafe_allow_html=True)

# Load your data
@st.cache_data
def load_data():
    return pd.read_csv('FINAL_WITH_HAIKU_EXPLANATIONS.csv')

df = load_data()

# Initialize Claude (for live queries); the SDK reads ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()

# Header
st.title("🏥 ICU Patient Deterioration Detection System")
st.markdown("**AI-Powered Early Warning System** | MIMIC-III Waveform Database Analysis")

# Sidebar
st.sidebar.image("https://via.placeholder.com/300x100/4A90E2/FFFFFF?text=ICU+Monitor", use_column_width=True)
st.sidebar.markdown("---")
st.sidebar.header("📊 Control Panel")

# Mode selection
mode = st.sidebar.selectbox(
    "Select Mode:",
    ["👥 Patient Database", "🎮 Live Simulation", "📈 Analytics Overview"]
)

if mode == "👥 Patient Database":
    # Patient selection
    patients = df['patient_id'].unique()
    selected_patient = st.sidebar.selectbox(
        "Select Patient ID:",
        patients,
        format_func=lambda x: f"Patient {x}"
    )

    # Get patient's segments
    patient_segments = df[df['patient_id'] == selected_patient]
    if len(patient_segments) > 1:
        segment_idx = st.sidebar.selectbox(
            "Select Recording Segment:",
            range(len(patient_segments)),
            format_func=lambda x: f"Segment {x+1} - {patient_segments.iloc[x]['segment']}"
        )
        patient_data = patient_segments.iloc[segment_idx]
    else:
        patient_data = patient_segments.iloc[0]

    # Main dashboard layout
    st.markdown("---")

    # Top metrics row
    col1, col2, col3, col4, col5 = st.columns(5)

    with col1:
        status_color = "🔴" if patient_data['ml_prediction'] == "DETERIORATING" else "🟢"
        st.metric(
            label="Status",
            value=patient_data['ml_prediction'],
            delta=f"{status_color} Risk: {patient_data['ml_probability']:.1%}"
        )

    with col2:
        hr_icon = "⬆️" if patient_data['hr_change'] > 10 else "⬇️" if patient_data['hr_change'] < -10 else "➡️"
        st.metric(
            label="💓 Heart Rate",
            value=f"{patient_data['hr_change']:.1f}%",
            delta=hr_icon
        )

    with col3:
        bp_icon = "⬆️" if patient_data['bp_change'] > 10 else "⬇️" if patient_data['bp_change'] < -10 else "➡️"
        st.metric(
            label="🩺 Blood Pressure",
            value=f"{patient_data['bp_change']:.1f}%",
            delta=bp_icon
        )

    with col4:
        resp_icon = "⬆️" if patient_data['resp_change'] > 15 else "⬇️" if patient_data['resp_change'] < -15 else "➡️"
        st.metric(
            label="🫁 Respiratory",
            value=f"{patient_data['resp_change']:.1f}%",
            delta=resp_icon
        )

    with col5:
        shock_status = "⚠️ High" if patient_data['shock_index'] > 0.7 else "✅ Normal"
        st.metric(
            label="⚡ Shock Index",
            value=f"{patient_data['shock_index']:.2f}",
            delta=shock_status
        )

    st.markdown("---")

    # Risk visualization and AI analysis
    col1, col2 = st.columns([1, 2])

    with col1:
        st.subheader("⚕️ Risk Assessment")

        # Risk gauge
        fig = go.Figure(go.Indicator(
            mode = "gauge+number+delta",
            value = patient_data['ml_probability'] * 100,
            domain = {'x': [0, 1], 'y': [0, 1]},
            title = {'text': "Deterioration Risk", 'font': {'size': 20}},
            delta = {'reference': 50, 'increasing': {'color': "red"}},
            gauge = {
                'axis': {'range': [None, 100], 'tickwidth': 1, 'tickcolor': "darkblue"},
                'bar': {'color': "darkred" if patient_data['ml_probability'] > 0.5 else "darkgreen"},
                'bgcolor': "white",
                'borderwidth': 2,
                'bordercolor': "gray",
                'steps': [
                    {'range': [0, 25], 'color': '#90EE90'},
                    {'range': [25, 50], 'color': '#FFFFE0'},
                    {'range': [50, 75], 'color': '#FFD700'},
                    {'range': [75, 100], 'color': '#FF6B6B'}
                ],
                'threshold': {
                    'line': {'color': "red", 'width': 4},
                    'thickness': 0.75,
                    'value': 90
                }
            }
        ))
        fig.update_layout(height=300, font={'color': "darkblue", 'family': "Arial"})
        st.plotly_chart(fig, use_container_width=True)

        # Test/Train indicator
        if patient_data['is_test_set']:
            st.info("📝 This patient was in the TEST set")
        else:
            st.success("📚 This patient was in the TRAINING set")

    with col2:
        st.subheader("🤖 AI Clinical Assessment")

        # Tabs for different views
        tab1, tab2, tab3 = st.tabs(["💬 Analysis", "🔄 Get Fresh Opinion", "📊 Details"])

        with tab1:
            st.info(patient_data['claude_explanation'])

        with tab2:
            if st.button("🔮 Get New AI Analysis", type="primary"):
                with st.spinner("Consulting AI physician..."):
                    prompt = f"""As an ICU physician, provide a brief (3 sentence) assessment of:
                    Patient with {patient_data['hr_change']:.1f}% HR change, 
                    {patient_data['bp_change']:.1f}% BP change, 
                    {patient_data['resp_change']:.1f}% respiratory change.
                    Shock index: {patient_data['shock_index']:.2f}
                    ML predicts {patient_data['ml_probability']:.1%} deterioration risk."""

                    response = client.messages.create(
                        model="claude-3-haiku-20240307",
                        max_tokens=200,
                        messages=[{"role": "user", "content": prompt}]
                    )
                    st.success(response.content[0].text)

        with tab3:
            st.write(f"**Recording Duration:** {patient_data['duration_hours']:.1f} hours")
            st.write(f"**Patient ID:** {patient_data['patient_id']}")
            st.write(f"**Segment:** {patient_data['segment']}")
            st.write(f"**Model Confidence:** {patient_data['ml_probability']:.1%}")

elif mode == "🎮 Live Simulation":
    st.subheader("🎮 Interactive Patient Simulator")
    st.markdown("Adjust vital signs to see real-time risk assessment")
    
    col1, col2 = st.columns(2)
    
    with col1:
        st.markdown("### 📊 Vital Sign Changes")
        hr_change = st.slider("Heart Rate Change (%)", -30, 30, 0, 1)
        bp_change = st.slider("Blood Pressure Change (%)", -30, 30, 0, 1)
        resp_change = st.slider("Respiratory Change (%)", -30, 30, 0, 1)
        shock_index = st.slider("Shock Index", 0.3, 1.2, 0.6, 0.01)
        
        if st.button("🔬 Analyze Patient", type="primary"):
            # Simple risk calculation based on thresholds
            risk_score = 0
            if abs(hr_change) > 10: risk_score += 0.3
            if bp_change < -10: risk_score += 0.3
            if resp_change > 15: risk_score += 0.3
            if shock_index > 0.7: risk_score += 0.1
            
            risk_score = min(risk_score, 0.99)
            
            with col2:
                st.markdown("### 🏥 Analysis Results")

                if risk_score > 0.5:
                    st.error(f"⚠️ HIGH RISK - Deterioration likely ({risk_score:.1%})")
                else:
                    st.success(f"✅ LOW RISK - Stable patient ({risk_score:.1%})")
                
                # Get AI explanation
                with st.spinner("Getting AI assessment..."):
                    prompt = f"""Brief 2-sentence assessment: Patient with HR change {hr_change}%, 
                    BP change {bp_change}%, Resp change {resp_change}%, Shock index {shock_index:.2f}"""
                    
                    response = client.messages.create(
                        model="claude-3-haiku-20240307",
                        max_tokens=100,
                        messages=[{"role": "user", "content": prompt}]
                    )
                    st.info(response.content[0].text)

else:  # Analytics Overview
    st.subheader("📈 Model Performance Analytics")
    
    col1, col2, col3 = st.columns(3)
    
    with col1:
        total_patients = df['patient_id'].nunique()
        st.metric("Total Patients", total_patients)
    
    with col2:
        high_risk = (df['ml_probability'] > 0.5).sum()
        st.metric("High Risk Cases", f"{high_risk}/{len(df)}")
    
    with col3:
        avg_risk = df['ml_probability'].mean()
        st.metric("Average Risk", f"{avg_risk:.1%}")
    
    # Risk distribution
    fig = px.histogram(df, x='ml_probability', nbins=20, 
                       title="Risk Score Distribution",
                       labels={'ml_probability': 'Deterioration Risk', 'count': 'Number of Patients'})
    st.plotly_chart(fig, use_container_width=True)

# Footer with chat
st.markdown("---")
st.markdown("### 💬 Ask the AI Physician")
user_question = st.text_input("Ask any question about ICU patient monitoring:")

if user_question:
    with st.spinner("Thinking..."):
        response = client.messages.create(
            model="claude-3-haiku-20240307",
            max_tokens=200,
            messages=[{"role": "user", "content": f"As an ICU physician, answer: {user_question}"}]
        )
        st.write(f"**AI Physician:** {response.content[0].text}")

# Info footer
st.markdown("---")
st.caption("🏥 ICU Deterioration Detection System | Powered by ML + Claude AI | MIMIC-III Database")

2025-11-06 00:11:50.012 
  command:

    streamlit run C:\Users\river\anaconda3\Lib\site-packages\ipykernel_launcher.py [ARGUMENTS]
2025-11-06 00:11:50.012 No runtime found, using MemoryCacheStorageManager
2025-11-06 00:11:50.014 No runtime found, using MemoryCacheStorageManager
2025-11-06 00:11:50.397 Session state does not function when running a script without `streamlit run`


DeltaGenerator()
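
The Live Simulation mode scores risk with fixed thresholds rather than the trained model. Pulling those rules into a pure function makes them testable outside Streamlit. The sketch below mirrors the thresholds and weights used in the dashboard code; `rule_based_risk` is a hypothetical name, and the score is a heuristic, not a calibrated probability.

```python
def rule_based_risk(hr_change, bp_change, resp_change, shock_index):
    """Threshold score mirroring the Live Simulation rules; not the trained model."""
    score = 0.0
    if abs(hr_change) > 10:   # heart rate moved more than 10% in either direction
        score += 0.3
    if bp_change < -10:       # systolic BP dropped more than 10%
        score += 0.3
    if resp_change > 15:      # respiratory rate rose more than 15%
        score += 0.3
    if shock_index > 0.7:     # above the "Normal < 0.7" cutoff used in the prompts
        score += 0.1
    return min(score, 0.99)   # cap so the display never shows 100%

for args in [(0, 0, 0, 0.6), (15, -12, 0, 0.6), (20, -20, 20, 0.9)]:
    risk = rule_based_risk(*args)
    label = "HIGH RISK" if risk > 0.5 else "LOW RISK"
    print(f"{args} -> {risk:.1%} ({label})")
```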