<a href="https://colab.research.google.com/github/Kaazzz/TemporalSepsis/blob/main/TemporalCyclingTransformerMenstrual.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

CELL 0 — Install Required Dependencies
Purpose:
Install all Python libraries required for data loading, preprocessing, and batching.
This ensures the notebook runs correctly on a fresh Colab runtime.

In [1]:
# Install core scientific and ML libraries
!pip install -q numpy pandas scikit-learn torch tqdm


CELL 1 — Mount Google Drive

Purpose:
Mount Google Drive so the PhysioNet 2019 dataset can be accessed directly
(no re-uploading, no duplication).

In [2]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


CELL 2 — Imports & Reproducibility Setup

Purpose:

Import all required Python modules

Enforce deterministic behavior for reproducibility

In [3]:
import os
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader
from sklearn.model_selection import train_test_split
from tqdm import tqdm

# =====================
# Reproducibility
# =====================
SEED = 42
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", DEVICE)


Using device: cuda


In [8]:
# Copy PhysioNet training data from Drive to local Colab storage
!cp -r /content/drive/MyDrive/Thesis/PhysionetSepsis/physionet.org/files/challenge-2019/1.0.0/training /content/training


^C


CELL 3 — Define Dataset Path

Purpose:
Specify the directory where the PhysioNet 2019 training/ files are stored
and verify that the dataset is accessible.

In [4]:
# Root training directory (contains training_setA and training_setB)
DATA_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis/physionet.org/files/challenge-2019/1.0.0/training"

assert os.path.exists(DATA_DIR), "Training directory not found!"

SUBDIRS = ["training_setA", "training_setB"]
patient_files = []

for subdir in SUBDIRS:
    sub_path = os.path.join(DATA_DIR, subdir)
    assert os.path.exists(sub_path), f"Missing subfolder: {subdir}"

    for fname in os.listdir(sub_path):
        if fname.endswith(".psv"):
            patient_files.append(os.path.join(sub_path, fname))

print("Total patient files found:", len(patient_files))
print("Sample files:", patient_files[:3])


Total patient files found: 40317
Sample files: ['/content/drive/MyDrive/Thesis/PhysionetSepsis/physionet.org/files/challenge-2019/1.0.0/training/training_setA/p019343.psv', '/content/drive/MyDrive/Thesis/PhysionetSepsis/physionet.org/files/challenge-2019/1.0.0/training/training_setA/p019347.psv', '/content/drive/MyDrive/Thesis/PhysionetSepsis/physionet.org/files/challenge-2019/1.0.0/training/training_setA/p019351.psv']
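`os.listdir` makes no guarantee about ordering, and the sample paths printed above are not in sorted order. Since the seeded split in Cell 7 depends on the order of `patient_files`, the split may differ between runtimes even with a fixed `SEED`. A minimal sketch of a deterministic listing, using only the standard library; `list_patient_files` is a hypothetical helper and the file names in the demo are made up:

```python
import os
import tempfile


def list_patient_files(data_dir, subdirs):
    """Return all .psv paths under data_dir/subdir, in sorted order."""
    files = []
    for subdir in subdirs:
        sub_path = os.path.join(data_dir, subdir)
        # sorted() fixes the order; os.listdir makes no ordering guarantee
        for fname in sorted(os.listdir(sub_path)):
            if fname.endswith(".psv"):
                files.append(os.path.join(sub_path, fname))
    return files


# Tiny demonstration on a throwaway directory (file names are made up)
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "training_setA"))
    for name in ["p000003.psv", "p000001.psv", "notes.txt", "p000002.psv"]:
        open(os.path.join(root, "training_setA", name), "w").close()

    demo_files = [
        os.path.basename(p)
        for p in list_patient_files(root, ["training_setA"])
    ]

print(demo_files)
```

In the notebook this would be called as `list_patient_files(DATA_DIR, SUBDIRS)`.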


# **Cell 3A — Load processed dataset (RUN THIS THEN GO TO CELL 10)**

In [4]:
import pickle
import os

SAVE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis/processed"
save_path = os.path.join(SAVE_DIR, "physionet2019_processed.pkl")

with open(save_path, "rb") as f:
    data = pickle.load(f)

patients = data["patients"]
train_ids = data["train_ids"]
val_ids = data["val_ids"]
test_ids = data["test_ids"]
FEATURE_COLS = data["feature_cols"]
FEATURE_MEAN = data["feature_mean"]
FEATURE_STD = data["feature_std"]

print("✅ Processed dataset loaded")
print("Total patients:", len(patients))


✅ Processed dataset loaded
Total patients: 40317


CELL 4 — Identify Input Features

Purpose:

Automatically infer feature columns from the dataset

Exclude SepsisLabel from model inputs

Keep preprocessing robust to column ordering

In [5]:
# Use one patient file to infer column structure
sample_df = pd.read_csv(patient_files[0], sep="|")

EXCLUDE_COLS = ["SepsisLabel"]
FEATURE_COLS = [c for c in sample_df.columns if c not in EXCLUDE_COLS]
NUM_FEATURES = len(FEATURE_COLS)

print("Number of clinical features:", NUM_FEATURES)


Number of clinical features: 40


CELL 5 — Patient File Processing Function

Purpose:
Convert raw patient ICU data into a missingness-aware, causal temporal representation.

In [6]:
def process_patient_file(filepath):
    df = pd.read_csv(filepath, sep="|").reset_index(drop=True)

    # Feature values and labels
    X = df[FEATURE_COLS].values.astype(np.float32)
    Y = df["SepsisLabel"].values.astype(np.float32)

    # Missingness mask: 1 = observed, 0 = missing
    M = (~np.isnan(X)).astype(np.float32)

    # Placeholder zero fill (never interpreted as real measurement)
    X = np.nan_to_num(X, nan=0.0)

    # Time-since-last-measurement encoding (Δt)
    DeltaT = np.zeros_like(X, dtype=np.float32)
    last_seen = np.zeros(X.shape[1], dtype=np.float32)

    for t in range(len(X)):
        last_seen += 1
        observed = M[t] == 1
        last_seen[observed] = 0
        DeltaT[t] = last_seen

    return {
        "X": X,
        "M": M,
        "DeltaT": DeltaT,
        "Y": Y,
        "length": len(X)
    }
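To make the Δt encoding concrete, the sketch below repeats the same counter loop on a hand-written mask. `delta_t_from_mask` is a hypothetical name used only for this illustration. Note that a feature that has not been observed yet counts up from the start of the record, so it is 1 on the first row.

```python
import numpy as np


def delta_t_from_mask(M):
    """Hours since each feature was last observed (0 at observed steps)."""
    delta = np.zeros_like(M, dtype=np.float32)
    last_seen = np.zeros(M.shape[1], dtype=np.float32)
    for t in range(len(M)):
        last_seen += 1              # one more hour has passed for every feature
        last_seen[M[t] == 1] = 0    # reset the counter where a value was observed
        delta[t] = last_seen
    return delta


# Two features over five hourly rows (1 = observed, 0 = missing)
M_demo = np.array([[1, 0],
                   [0, 0],
                   [0, 1],
                   [1, 0],
                   [0, 0]], dtype=np.float32)

delta_demo = delta_t_from_mask(M_demo)
print(delta_demo)
```

Feature 0 gives 0, 1, 2, 0, 1 and feature 1 gives 1, 2, 0, 1, 2. Because the counter only looks backwards, the encoding at hour t never uses information from later hours.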


CELL 6 — Load All Patient Trajectories

Purpose:
Process the full dataset into patient-level temporal sequences.

In [9]:
patients = {}
sepsis_flags = []

for filepath in tqdm(patient_files):
    pid = os.path.basename(filepath).replace(".psv", "")
    data = process_patient_file(filepath)
    patients[pid] = data

    # Patient-level sepsis indicator (for stratified split)
    sepsis_flags.append(int(data["Y"].max() > 0))


100%|██████████| 40317/40317 [4:33:27<00:00,  2.46it/s]
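The progress bar above shows about 4.5 hours for 40,317 files when reading from the Drive mount. If the time is dominated by file I/O, a thread pool may help; that is an assumption and was not measured here. A minimal sketch, where `load_all` and the stand-in loader `count_rows` are hypothetical:

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def load_all(paths, loader, max_workers=8):
    """Apply loader to every path in parallel; returns {patient_id: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `paths`
        results = list(pool.map(loader, paths))
    ids = [os.path.basename(p).replace(".psv", "") for p in paths]
    return dict(zip(ids, results))


def count_rows(path):
    """Stand-in loader for the demo: number of data rows in a .psv file."""
    with open(path) as f:
        return sum(1 for _ in f) - 1  # minus the header line


with tempfile.TemporaryDirectory() as root:
    demo_paths = []
    for i, n_rows in enumerate([3, 5], start=1):
        path = os.path.join(root, f"p{i:06d}.psv")
        with open(path, "w") as f:
            f.write("HR|SepsisLabel\n" + "80|0\n" * n_rows)
        demo_paths.append(path)

    demo_result = load_all(demo_paths, count_rows, max_workers=2)

print(demo_result)
```

In the notebook, `loader` would be `process_patient_file` and `paths` would be `patient_files`.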


CELL 7 — Patient-Level Train / Val / Test Split

Purpose:
Prevent information leakage by splitting at the patient level.

In [10]:
patient_ids = list(patients.keys())
labels = np.array(sepsis_flags)

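# First split: 70% train, 30% held out (stratified by patient-level sepsis flag)
train_ids, temp_ids, _, temp_labels = train_test_split(
    patient_ids,
    labels,
    test_size=0.3,
    stratify=labels,
    random_state=SEED
)

# Second split: halve the held-out set, again stratified
val_ids, test_ids = train_test_split(
    temp_ids,
    test_size=0.5,
    stratify=temp_labels,
    random_state=SEED
)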

print(f"Train patients: {len(train_ids)}")
print(f"Validation patients: {len(val_ids)}")
print(f"Test patients: {len(test_ids)}")


Train patients: 28221
Validation patients: 6048
Test patients: 6048


CELL 8 — Compute Train-Only Normalization Statistics

Purpose:
Normalize features using training data only and observed values only.

In [11]:
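def compute_normalization_stats(ids, patients):
    # Per-feature sums over observed entries only.
    # (X[M == 1] would flatten all features into one 1-D array.)
    num_features = patients[ids[0]]["X"].shape[1]
    total = np.zeros(num_features, dtype=np.float64)
    total_sq = np.zeros(num_features, dtype=np.float64)
    count = np.zeros(num_features, dtype=np.float64)

    for pid in ids:
        X = patients[pid]["X"].astype(np.float64)
        M = patients[pid]["M"]
        total += (X * M).sum(axis=0)
        total_sq += (X ** 2 * M).sum(axis=0)
        count += M.sum(axis=0)

    count = np.maximum(count, 1.0)  # features never observed in training
    mean = total / count
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    std = np.sqrt(var) + 1e-6

    return mean.astype(np.float32), std.astype(np.float32)


FEATURE_MEAN, FEATURE_STD = compute_normalization_stats(train_ids, patients)


CELL 9 — Apply Normalization to All Patients

Purpose:
Apply training statistics while preserving missingness semantics.

In [12]:
for pid in patients:
    X = patients[pid]["X"]
    M = patients[pid]["M"]

    # Standardize each feature; missing entries stay at the 0 placeholder
    X_norm = ((X - FEATURE_MEAN) / FEATURE_STD) * M
    patients[pid]["X"] = X_norm.astype(np.float32)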
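Indexing a 2-D array with a 2-D boolean mask returns a flat 1-D array, so the feature axis is lost. Per-feature statistics over observed values therefore need masked sums or a per-column loop. A small worked example, with `masked_feature_stats` as a hypothetical helper and made-up numbers:

```python
import numpy as np


def masked_feature_stats(X, M, eps=1e-6):
    """Per-feature mean/std over entries where M == 1."""
    count = np.maximum(M.sum(axis=0), 1.0)          # avoid division by zero
    mean = (X * M).sum(axis=0) / count
    var = (((X - mean) ** 2) * M).sum(axis=0) / count
    return mean, np.sqrt(var) + eps


# Feature 0 is observed three times, feature 1 twice (zeros are placeholders)
X_demo = np.array([[1.0, 10.0],
                   [2.0,  0.0],
                   [3.0, 30.0]])
M_demo = np.array([[1.0, 1.0],
                   [1.0, 0.0],
                   [1.0, 1.0]])

flat = X_demo[M_demo == 1]            # shape (5,): the feature axis is gone
mean_demo, std_demo = masked_feature_stats(X_demo, M_demo)
print(flat.shape, mean_demo, std_demo)
```

The means come out as 2 and 20, and the placeholder zero in feature 1 does not pull its mean down.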


Cell 9.5 — Save processed dataset (RUN ONCE)

In [13]:
import pickle
import os

SAVE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis/processed"
os.makedirs(SAVE_DIR, exist_ok=True)

save_path = os.path.join(SAVE_DIR, "physionet2019_processed.pkl")

with open(save_path, "wb") as f:
    pickle.dump({
        "patients": patients,
        "train_ids": train_ids,
        "val_ids": val_ids,
        "test_ids": test_ids,
        "feature_cols": FEATURE_COLS,
        "feature_mean": FEATURE_MEAN,
        "feature_std": FEATURE_STD
    }, f, protocol=pickle.HIGHEST_PROTOCOL)

print("✅ Processed dataset saved to:", save_path)


✅ Processed dataset saved to: /content/drive/MyDrive/Thesis/PhysionetSepsis/processed/physionet2019_processed.pkl


CELL 10 — Dataset Class (Variable-Length Sequences)

Purpose:
Return one full patient trajectory per sample, without padding.

In [5]:
class SepsisDataset(Dataset):
    def __init__(self, ids, patients):
        self.ids = ids
        self.patients = patients

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, idx):
        d = self.patients[self.ids[idx]]
        return (
            torch.tensor(d["X"], dtype=torch.float32),
            torch.tensor(d["M"], dtype=torch.float32),
            torch.tensor(d["DeltaT"], dtype=torch.float32),
            torch.tensor(d["Y"], dtype=torch.float32),
            d["length"]
        )


CELL 11 — Collate Function (Padding + Attention Mask)

Purpose:
Enable masked self-attention and correct loss/utility masking.

In [6]:
def collate_fn(batch):
    Xs, Ms, DTs, Ys, lengths = zip(*batch)
    max_len = max(lengths)

    B = len(batch)
    F = Xs[0].shape[1]

    X = torch.zeros(B, max_len, F)
    M = torch.zeros(B, max_len, F)
    DT = torch.zeros(B, max_len, F)
    Y = torch.zeros(B, max_len)
    attn_mask = torch.zeros(B, max_len)

    for i, l in enumerate(lengths):
        X[i, :l] = Xs[i]
        M[i, :l] = Ms[i]
        DT[i, :l] = DTs[i]
        Y[i, :l] = Ys[i]
        attn_mask[i, :l] = 1

    return X, M, DT, Y, attn_mask


CELL 12 — DataLoaders

Purpose:
Create batched loaders for training, validation, and testing.

In [7]:
train_loader = DataLoader(
    SepsisDataset(train_ids, patients),
    batch_size=32,
    shuffle=True,
    collate_fn=collate_fn
)

val_loader = DataLoader(
    SepsisDataset(val_ids, patients),
    batch_size=32,
    shuffle=False,
    collate_fn=collate_fn
)

test_loader = DataLoader(
    SepsisDataset(test_ids, patients),
    batch_size=32,
    shuffle=False,
    collate_fn=collate_fn
)


CELL 13 — Final Sanity Check

Purpose:
Verify tensor shapes before model implementation.

In [8]:
X, M, DT, Y, mask = next(iter(train_loader))

print("X:", X.shape)
print("Missingness mask:", M.shape)
print("DeltaT:", DT.shape)
print("Labels:", Y.shape)
print("Attention mask:", mask.shape)


X: torch.Size([32, 268, 40])
Missingness mask: torch.Size([32, 268, 40])
DeltaT: torch.Size([32, 268, 40])
Labels: torch.Size([32, 268])
Attention mask: torch.Size([32, 268])


CELL 14 — Training Configuration

Purpose:
Centralize all hyperparameters. Note that the DataLoaders in Cell 12 hard-code batch_size=32 rather than reading BATCH_SIZE.

In [9]:
# =====================
# Training Configuration
# =====================
NUM_EPOCHS = 60
BATCH_SIZE = 32
LEARNING_RATE = 1e-4
WEIGHT_DECAY = 1e-2
PATIENCE = 7
MAX_GRAD_NORM = 1.0

D_MODEL = 128
N_HEADS = 4
N_LAYERS = 4
DROPOUT = 0.2

USE_AMP = True  # FP16 mixed precision
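One optional way to keep these values together is a frozen dataclass, which can also be stored next to a checkpoint. This is a sketch of an alternative layout, not what the notebook does; `TrainConfig` is a hypothetical name and the values are copied from the cell above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the configuration cell above
    num_epochs: int = 60
    batch_size: int = 32
    learning_rate: float = 1e-4
    weight_decay: float = 1e-2
    patience: int = 7
    max_grad_norm: float = 1.0
    d_model: int = 128
    n_heads: int = 4
    n_layers: int = 4
    dropout: float = 0.2
    use_amp: bool = True


cfg = TrainConfig()
# d_model must be divisible by n_heads for multi-head attention
assert cfg.d_model % cfg.n_heads == 0
cfg_dict = asdict(cfg)   # plain dict, convenient to store next to a checkpoint
print(cfg_dict)
```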


CELL 15 — Temporal Transformer Model

Purpose:
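Implements the Uncertainty-Aware Temporal Transformer with:

Padding-masked self-attention (no causal mask is applied, so each hourly logit can attend to later timesteps)

Missingness + Δt inputs, concatenated with the features before a single linear projection

Dropout layers intended for MC Dropout (the evaluation code calls model.eval(), which disables dropout, so inference as written is deterministic)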

In [10]:
import torch.nn as nn
import torch.nn.functional as F

class TemporalTransformer(nn.Module):
    def __init__(self, num_features):
        super().__init__()

        # Input projection (feature + mask + deltaT)
        self.input_proj = nn.Linear(num_features * 3, D_MODEL)

        # Positional encoding
        self.pos_embedding = nn.Parameter(torch.randn(1, 500, D_MODEL))

        encoder_layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL,
            nhead=N_HEADS,
            dim_feedforward=4 * D_MODEL,
            dropout=DROPOUT,
            batch_first=True
        )

        self.encoder = nn.TransformerEncoder(
            encoder_layer,
            num_layers=N_LAYERS
        )

        # Output head
        self.output_head = nn.Sequential(
            nn.Linear(D_MODEL, D_MODEL),
            nn.ReLU(),
            nn.Dropout(DROPOUT),
            nn.Linear(D_MODEL, 1)
        )

    def forward(self, X, M, DT, attn_mask):
        """
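        X         : (B, T, F) normalized features
        M         : (B, T, F) missingness mask (1 = observed)
        DT        : (B, T, F) hours since last observation
        attn_mask : (B, T)    1 = real timestep, 0 = padding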
        """

        # Concatenate inputs
        x = torch.cat([X, M, DT], dim=-1)
        x = self.input_proj(x)

        # Add positional encoding
        T = x.size(1)
        x = x + self.pos_embedding[:, :T]

        # Transformer expects True = masked
        key_padding_mask = attn_mask == 0

        # Encoder
        x = self.encoder(x, src_key_padding_mask=key_padding_mask)

        # Hourly logits
        logits = self.output_head(x).squeeze(-1)
        return logits
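The forward pass above passes only a key-padding mask to the encoder, so the logit at hour t can depend on later hours. If strictly causal hourly predictions are wanted, an upper-triangular mask can be passed through the `mask` argument of `nn.TransformerEncoder`. The sketch below checks this on a small standalone encoder. The sizes are arbitrary, and it illustrates the masking technique only; it is not the configuration that produced the logs in this notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

d_model, n_heads, T = 16, 4, 6
layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=n_heads, dim_feedforward=32,
    dropout=0.0, batch_first=True,
)
encoder = nn.TransformerEncoder(layer, num_layers=2, enable_nested_tensor=False)
encoder.eval()

# True above the diagonal = "position t may not attend to positions > t"
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)

x = torch.randn(1, T, d_model)
x_future_changed = x.clone()
x_future_changed[:, 3:] += 1.0   # perturb hours 3..5 only

with torch.no_grad():
    out_a = encoder(x, mask=causal_mask)
    out_b = encoder(x_future_changed, mask=causal_mask)
    free_a = encoder(x)                    # no causal mask
    free_b = encoder(x_future_changed)

# Outputs for hours 0..2 should ignore the perturbation only when masked
causal_unchanged = torch.allclose(out_a[:, :3], out_b[:, :3], atol=1e-5)
free_unchanged = torch.allclose(free_a[:, :3], free_b[:, :3], atol=1e-5)
print(causal_unchanged, free_unchanged)
```

In `TemporalTransformer.forward`, the corresponding call would pass `mask=` together with `src_key_padding_mask=key_padding_mask`.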


CELL 16 — Loss Function

Purpose:
Weighted Binary Cross-Entropy (class imbalance handling).

In [11]:
def masked_bce_loss(logits, targets, mask, pos_weight=5.0):
    """
    Computes BCE only on valid timesteps
    """
    loss_fn = nn.BCEWithLogitsLoss(
        pos_weight=torch.tensor(pos_weight).to(logits.device),
        reduction="none"
    )

    loss = loss_fn(logits, targets)
    loss = loss * mask  # ignore padded timesteps
    return loss.sum() / mask.sum()
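The default `pos_weight=5.0` is set by hand. One common heuristic, offered here as an assumption rather than the notebook's method, is the ratio of negative to positive hours in the training labels. `estimate_pos_weight` is a hypothetical helper and the labels below are made up:

```python
import numpy as np


def estimate_pos_weight(label_arrays):
    """Ratio of negative to positive hours across a list of label arrays."""
    n_pos = sum(float(y.sum()) for y in label_arrays)
    n_total = sum(len(y) for y in label_arrays)
    n_neg = n_total - n_pos
    return n_neg / max(n_pos, 1.0)   # guard against a split with no positives


# Three made-up patients: 2 positive hours out of 12
demo_labels = [
    np.array([0, 0, 0, 0], dtype=np.float32),
    np.array([0, 0, 0, 1, 1], dtype=np.float32),
    np.array([0, 0, 0], dtype=np.float32),
]
demo_weight = estimate_pos_weight(demo_labels)
print(demo_weight)
```

In the notebook this could be called as `estimate_pos_weight([patients[pid]["Y"] for pid in train_ids])`. A very large ratio may increase false alarms, so the result is better treated as a starting point for tuning.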


CELL 17 — PhysioNet Utility Scoring (Validation Metric)

Purpose:
Utility-based validation and early stopping.

In [12]:
def physionet_utility(y_true, y_pred):
    """
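    Simplified per-hour score, not the official PhysioNet 2019 utility:
    - True positive hour:  +1.0
    - Missed septic hour:  -2.0
    - False alarm hour:    -0.5
    The score does not depend on time relative to sepsis onset,
    so it does not reward early detection.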
    """
    utility = 0.0

    for t in range(len(y_true)):
        if y_true[t] == 1 and y_pred[t] == 1:
            utility += 1.0
        elif y_true[t] == 1 and y_pred[t] == 0:
            utility -= 2.0
        elif y_true[t] == 0 and y_pred[t] == 1:
            utility -= 0.5

    return utility


def evaluate_utility(model, dataloader, threshold):
    model.eval()
    total_utility = 0.0

    with torch.no_grad():
        for X, M, DT, Y, attn_mask in dataloader:
            X, M, DT, Y, attn_mask = (
                X.to(DEVICE),
                M.to(DEVICE),
                DT.to(DEVICE),
                Y.to(DEVICE),
                attn_mask.to(DEVICE),
            )

            logits = model(X, M, DT, attn_mask)
            probs = torch.sigmoid(logits)
            preds = (probs >= threshold).float()

            for i in range(X.size(0)):
                length = int(attn_mask[i].sum())
                total_utility += physionet_utility(
                    Y[i, :length].cpu().numpy(),
                    preds[i, :length].cpu().numpy()
                )

    return total_utility
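`evaluate_utility` runs the model over the whole loader once per threshold, and the score itself loops over hours in Python. Since the probabilities do not depend on the threshold, they can be computed once and then scored at each threshold. A NumPy sketch of that idea, with hypothetical helper names and made-up numbers:

```python
import numpy as np


def hourly_utility(y_true, y_pred):
    """Vectorized form of the per-hour score: TP +1, FN -2, FP -0.5, TN 0."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    return 1.0 * tp - 2.0 * fn - 0.5 * fp


def sweep_thresholds(y_true, probs, thresholds):
    """Score cached probabilities at each threshold; return (best_tau, best_utility)."""
    scores = [hourly_utility(y_true, probs >= tau) for tau in thresholds]
    best = int(np.argmax(scores))
    return float(thresholds[best]), float(scores[best])


# Made-up labels and probabilities for six valid (unpadded) hours
y_demo = np.array([0, 0, 0, 1, 1, 1])
p_demo = np.array([0.10, 0.40, 0.20, 0.35, 0.80, 0.90])

best_tau, best_u = sweep_thresholds(y_demo, p_demo, np.array([0.3, 0.5, 0.7]))
print(best_tau, best_u)
```

At τ = 0.3 there are three true positives and one false alarm, giving 3.0 - 0.5 = 2.5. At τ = 0.5 and 0.7 one septic hour is missed, giving 2.0 - 2.0 = 0.0. In the notebook, `y_true` and `probs` would be the concatenated unpadded hours collected in one pass over `val_loader`.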


CELL 18 — Training Loop with Early Stopping

Purpose:
Full training loop with:

FP16

Gradient clipping

Utility-based early stopping

Best model checkpointing

In [20]:
# ---------------------
# Reconstruct NUM_FEATURES (safety)
# ---------------------
NUM_FEATURES = len(FEATURE_COLS)
print("NUM_FEATURES:", NUM_FEATURES)

# ---------------------
# Imports
# ---------------------
from torch.amp import autocast, GradScaler
from tqdm import tqdm
import os
import csv

# =====================
# Model, Optimizer, AMP
# =====================
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=LEARNING_RATE,
    weight_decay=WEIGHT_DECAY
)

scaler = GradScaler(enabled=USE_AMP)

# =====================
# Directories (Google Drive)
# =====================
BASE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis"

MODEL_DIR = os.path.join(BASE_DIR, "models")
LOG_DIR = os.path.join(BASE_DIR, "logs")

os.makedirs(MODEL_DIR, exist_ok=True)
os.makedirs(LOG_DIR, exist_ok=True)

BEST_MODEL_PATH = os.path.join(MODEL_DIR, "best_model.pt")
LOG_PATH = os.path.join(LOG_DIR, "training_log.csv")

# =====================
# Initialize CSV log
# =====================
with open(LOG_PATH, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([
        "epoch",
        "train_loss",
        "val_utility",
        "best_threshold"
    ])

print("Model checkpoints →", BEST_MODEL_PATH)
print("Training log →", LOG_PATH)

# =====================
# Early stopping config
# =====================
best_utility = -float("inf")
epochs_no_improve = 0

thresholds = torch.linspace(0.1, 0.9, 17)

# =====================
# Training loop
# =====================
for epoch in range(NUM_EPOCHS):
    model.train()
    epoch_loss = 0.0

    # ---------- TRAINING WITH PROGRESS BAR ----------
    train_pbar = tqdm(
        train_loader,
        desc=f"Epoch {epoch+1}/{NUM_EPOCHS}",
        leave=False
    )

    for X, M, DT, Y, attn_mask in train_pbar:
        X = X.to(DEVICE)
        M = M.to(DEVICE)
        DT = DT.to(DEVICE)
        Y = Y.to(DEVICE)
        attn_mask = attn_mask.to(DEVICE)

        optimizer.zero_grad()

        with autocast(device_type="cuda", enabled=USE_AMP):
            logits = model(X, M, DT, attn_mask)
            loss = masked_bce_loss(logits, Y, attn_mask)

        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
        scaler.step(optimizer)
        scaler.update()

        epoch_loss += loss.item()

        # Update progress bar with running loss
        train_pbar.set_postfix(
            loss=f"{epoch_loss / (train_pbar.n + 1):.4f}"
        )

    # ---------- VALIDATION (UTILITY SCORE) ----------
    model.eval()
    best_epoch_utility = -float("inf")
    best_epoch_threshold = None

    with torch.no_grad():
        for t in thresholds:
            utility = evaluate_utility(model, val_loader, t.item())
            if utility > best_epoch_utility:
                best_epoch_utility = utility
                best_epoch_threshold = t.item()

    # ---------- LOGGING ----------
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([
            epoch + 1,
            epoch_loss,
            best_epoch_utility,
            best_epoch_threshold
        ])

    print(
        f"Epoch [{epoch+1}/{NUM_EPOCHS}] | "
        f"Train Loss: {epoch_loss:.2f} | "
        f"Val Utility: {best_epoch_utility:.2f} | "
        f"Best τ: {best_epoch_threshold:.2f}"
    )

    # ---------- CHECKPOINTING ----------
    if best_epoch_utility > best_utility:
        best_utility = best_epoch_utility
        epochs_no_improve = 0

        torch.save(
            {
                "epoch": epoch + 1,
                "model_state": model.state_dict(),
                "best_threshold": best_epoch_threshold,
                "val_utility": best_epoch_utility
            },
            BEST_MODEL_PATH
        )

        print("New best model saved")

    else:
        epochs_no_improve += 1
        print(f"No improvement for {epochs_no_improve} epoch(s)")

    # ---------- EARLY STOPPING ----------
    if epochs_no_improve >= PATIENCE:
        print("⏹ Early stopping triggered")
        break


NUM_FEATURES: 40
Model checkpoints → /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model.pt
Training log → /content/drive/MyDrive/Thesis/PhysionetSepsis/logs/training_log.csv




Epoch [1/60] | Train Loss: 170.39 | Val Utility: -3625.50 | Best τ: 0.55
New best model saved




Epoch [2/60] | Train Loss: 133.99 | Val Utility: -3201.50 | Best τ: 0.50
New best model saved




Epoch [3/60] | Train Loss: 121.34 | Val Utility: -3623.50 | Best τ: 0.65
No improvement for 1 epoch(s)




Epoch [4/60] | Train Loss: 114.59 | Val Utility: -2973.50 | Best τ: 0.60
New best model saved




Epoch [5/60] | Train Loss: 109.34 | Val Utility: -2843.50 | Best τ: 0.55
New best model saved




Epoch [6/60] | Train Loss: 106.35 | Val Utility: -2753.50 | Best τ: 0.55
New best model saved




Epoch [7/60] | Train Loss: 101.48 | Val Utility: -2185.50 | Best τ: 0.75
New best model saved




Epoch [8/60] | Train Loss: 99.74 | Val Utility: -2332.00 | Best τ: 0.65
No improvement for 1 epoch(s)




Epoch [9/60] | Train Loss: 97.15 | Val Utility: -1453.50 | Best τ: 0.55
New best model saved




Epoch [10/60] | Train Loss: 94.16 | Val Utility: -2356.00 | Best τ: 0.40
No improvement for 1 epoch(s)




Epoch [11/60] | Train Loss: 92.90 | Val Utility: -2018.00 | Best τ: 0.45
No improvement for 2 epoch(s)




Epoch [12/60] | Train Loss: 89.98 | Val Utility: -1680.00 | Best τ: 0.65
No improvement for 3 epoch(s)




Epoch [13/60] | Train Loss: 88.89 | Val Utility: -1822.00 | Best τ: 0.55
No improvement for 4 epoch(s)




Epoch [14/60] | Train Loss: 86.84 | Val Utility: -1430.00 | Best τ: 0.65
New best model saved




Epoch [15/60] | Train Loss: 85.86 | Val Utility: -2039.00 | Best τ: 0.50
No improvement for 1 epoch(s)




Epoch [16/60] | Train Loss: 84.77 | Val Utility: -1627.50 | Best τ: 0.70
No improvement for 2 epoch(s)




Epoch [17/60] | Train Loss: 84.33 | Val Utility: -1748.50 | Best τ: 0.55
No improvement for 3 epoch(s)




Epoch [18/60] | Train Loss: 81.68 | Val Utility: -1504.00 | Best τ: 0.55
No improvement for 4 epoch(s)




Epoch [19/60] | Train Loss: 80.14 | Val Utility: -1715.00 | Best τ: 0.70
No improvement for 5 epoch(s)




Epoch [20/60] | Train Loss: 79.27 | Val Utility: -1605.50 | Best τ: 0.60
No improvement for 6 epoch(s)




Epoch [21/60] | Train Loss: 77.80 | Val Utility: -1361.50 | Best τ: 0.70
New best model saved




Epoch [22/60] | Train Loss: 75.62 | Val Utility: -2256.50 | Best τ: 0.45
No improvement for 1 epoch(s)




Epoch [23/60] | Train Loss: 75.61 | Val Utility: -1874.00 | Best τ: 0.55
No improvement for 2 epoch(s)




Epoch [24/60] | Train Loss: 75.39 | Val Utility: -1698.50 | Best τ: 0.25
No improvement for 3 epoch(s)




Epoch [25/60] | Train Loss: 73.32 | Val Utility: -1845.50 | Best τ: 0.40
No improvement for 4 epoch(s)




Epoch [26/60] | Train Loss: 72.58 | Val Utility: -1656.00 | Best τ: 0.65
No improvement for 5 epoch(s)




Epoch [27/60] | Train Loss: 71.79 | Val Utility: -1800.00 | Best τ: 0.30
No improvement for 6 epoch(s)




Epoch [28/60] | Train Loss: 69.22 | Val Utility: -2287.00 | Best τ: 0.40
No improvement for 7 epoch(s)
⏹ Early stopping triggered
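The patience logic in the loop can be factored into a small tracker, which makes the stopping rule easy to check against the log. A sketch with a hypothetical `EarlyStopper` class, replayed on the validation utilities of epochs 21 to 28 printed above:

```python
class EarlyStopper:
    """Tracks a higher-is-better metric and signals when patience runs out."""

    def __init__(self, patience):
        self.patience = patience
        self.best = float("-inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, value):
        """Record one epoch; returns True when training should stop."""
        if value > self.best:
            self.best, self.best_epoch, self.bad_epochs = value, epoch, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Validation utilities of epochs 21-28 from the log above, with PATIENCE = 7
logged = [-1361.5, -2256.5, -1874.0, -1698.5, -1845.5, -1656.0, -1800.0, -2287.0]
stopper = EarlyStopper(patience=7)
stopped_at = None
for epoch, value in enumerate(logged, start=21):
    if stopper.step(epoch, value):
        stopped_at = epoch
        break

print(stopper.best_epoch, stopper.best, stopped_at)
```

The replay keeps epoch 21 as the best epoch and stops at epoch 28, matching the run above.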


# **v2**

In [14]:
# ---------------------
# Reconstruct NUM_FEATURES
# ---------------------
NUM_FEATURES = len(FEATURE_COLS)
print("NUM_FEATURES:", NUM_FEATURES)

# ---------------------
# Imports
# ---------------------
from torch.amp import autocast, GradScaler
from tqdm import tqdm
import os
import csv

# =====================
# Model, Optimizer, AMP
# =====================
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=LEARNING_RATE,
    weight_decay=WEIGHT_DECAY
)

scaler = GradScaler(enabled=USE_AMP)

# =====================
# Directories (Drive)
# =====================
BASE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis"

MODEL_DIR = os.path.join(BASE_DIR, "models")
LOG_DIR = os.path.join(BASE_DIR, "logs")

os.makedirs(MODEL_DIR, exist_ok=True)
os.makedirs(LOG_DIR, exist_ok=True)

BEST_MODEL_PATH = os.path.join(MODEL_DIR, "best_modelv3.pt")
LOG_PATH = os.path.join(LOG_DIR, "training_logv3.csv")

# =====================
# Initialize CSV log
# =====================
with open(LOG_PATH, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([
        "epoch",
        "train_loss",
        "val_utility",
        "best_threshold"
    ])

print("Model checkpoints →", BEST_MODEL_PATH)
print("Training log →", LOG_PATH)

# =====================
# Early stopping (IMPROVED)
# =====================
PATIENCE_IMPROVED = 10   # increased patience
best_utility = -float("inf")
epochs_no_improve = 0

# Narrower threshold range (reduces false positives)
thresholds = torch.linspace(0.3, 0.9, 13)

# =====================
# Training loop
# =====================
for epoch in range(NUM_EPOCHS):
    model.train()
    epoch_loss = 0.0

    train_pbar = tqdm(
        train_loader,
        desc=f"Epoch {epoch+1}/{NUM_EPOCHS}",
        leave=False
    )

    # ---------- TRAINING ----------
    for X, M, DT, Y, attn_mask in train_pbar:
        X = X.to(DEVICE)
        M = M.to(DEVICE)
        DT = DT.to(DEVICE)
        Y = Y.to(DEVICE)
        attn_mask = attn_mask.to(DEVICE)

        optimizer.zero_grad()

        with autocast(device_type="cuda", enabled=USE_AMP):
            logits = model(X, M, DT, attn_mask)
            loss = masked_bce_loss(
                logits,
                Y,
                attn_mask,
                pos_weight=6.0   # increased positive weight
            )

        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
        scaler.step(optimizer)
        scaler.update()

        epoch_loss += loss.item()
        train_pbar.set_postfix(
            avg_loss=f"{epoch_loss / (train_pbar.n + 1):.4f}"
        )

    # ---------- VALIDATION (UTILITY SCORE) ----------
    model.eval()
    best_epoch_utility = -float("inf")
    best_epoch_threshold = None

    with torch.no_grad():
        for t in thresholds:
            utility = evaluate_utility(model, val_loader, t.item())
            if utility > best_epoch_utility:
                best_epoch_utility = utility
                best_epoch_threshold = t.item()

    # ---------- LOGGING ----------
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([
            epoch + 1,
            epoch_loss,
            best_epoch_utility,
            best_epoch_threshold
        ])

    print(
        f"Epoch [{epoch+1}/{NUM_EPOCHS}] | "
        f"Train Loss: {epoch_loss:.2f} | "
        f"Val Utility: {best_epoch_utility:.2f} | "
        f"Best τ: {best_epoch_threshold:.2f}"
    )

    # ---------- CHECKPOINTING ----------
    if best_epoch_utility > best_utility:
        best_utility = best_epoch_utility
        epochs_no_improve = 0

        torch.save(
            {
                "epoch": epoch + 1,
                "model_state": model.state_dict(),
                "best_threshold": best_epoch_threshold,
                "val_utility": best_epoch_utility
            },
            BEST_MODEL_PATH
        )

        print("New best model saved")

    else:
        epochs_no_improve += 1
        print(f"No improvement for {epochs_no_improve} epoch(s)")

    # ---------- EARLY STOPPING ----------
    if epochs_no_improve >= PATIENCE_IMPROVED:
        print("⏹ Early stopping triggered")
        break


NUM_FEATURES: 40
Model checkpoints → /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt
Training log → /content/drive/MyDrive/Thesis/PhysionetSepsis/logs/training_logv3.csv




Epoch [1/60] | Train Loss: 188.78 | Val Utility: -3860.00 | Best τ: 0.35
New best model saved




Epoch [2/60] | Train Loss: 147.24 | Val Utility: -3689.50 | Best τ: 0.35
New best model saved




Epoch [3/60] | Train Loss: 134.02 | Val Utility: -2988.00 | Best τ: 0.50
New best model saved




Epoch [4/60] | Train Loss: 127.48 | Val Utility: -3056.00 | Best τ: 0.70
No improvement for 1 epoch(s)




Epoch [5/60] | Train Loss: 121.52 | Val Utility: -2625.50 | Best τ: 0.60
New best model saved




Epoch [6/60] | Train Loss: 117.52 | Val Utility: -2044.50 | Best τ: 0.75
New best model saved




Epoch [7/60] | Train Loss: 113.84 | Val Utility: -2160.00 | Best τ: 0.30
No improvement for 1 epoch(s)




Epoch [8/60] | Train Loss: 110.70 | Val Utility: -2853.00 | Best τ: 0.65
No improvement for 2 epoch(s)




Epoch [9/60] | Train Loss: 107.12 | Val Utility: -2100.00 | Best τ: 0.55
No improvement for 3 epoch(s)




Epoch [10/60] | Train Loss: 104.91 | Val Utility: -2035.00 | Best τ: 0.60
New best model saved




Epoch [11/60] | Train Loss: 102.64 | Val Utility: -1918.50 | Best τ: 0.60
New best model saved




Epoch [12/60] | Train Loss: 101.63 | Val Utility: -2143.50 | Best τ: 0.30
No improvement for 1 epoch(s)




Epoch [13/60] | Train Loss: 99.65 | Val Utility: -2118.00 | Best τ: 0.80
No improvement for 2 epoch(s)




Epoch [14/60] | Train Loss: 97.66 | Val Utility: -1451.50 | Best τ: 0.60
New best model saved




Epoch [15/60] | Train Loss: 95.89 | Val Utility: -1895.00 | Best τ: 0.50
No improvement for 1 epoch(s)




Epoch [16/60] | Train Loss: 94.92 | Val Utility: -2275.50 | Best τ: 0.70
No improvement for 2 epoch(s)




Epoch [17/60] | Train Loss: 94.27 | Val Utility: -1902.00 | Best τ: 0.60
No improvement for 3 epoch(s)




Epoch [18/60] | Train Loss: 91.99 | Val Utility: -2087.50 | Best τ: 0.45
No improvement for 4 epoch(s)




Epoch [19/60] | Train Loss: 91.08 | Val Utility: -1438.50 | Best τ: 0.50
New best model saved




Epoch [20/60] | Train Loss: 91.06 | Val Utility: -1691.50 | Best τ: 0.50
No improvement for 1 epoch(s)




Epoch [21/60] | Train Loss: 86.51 | Val Utility: -1800.50 | Best τ: 0.80
No improvement for 2 epoch(s)




Epoch [22/60] | Train Loss: 85.22 | Val Utility: -1351.50 | Best τ: 0.65
New best model saved




Epoch [23/60] | Train Loss: 84.89 | Val Utility: -1451.50 | Best τ: 0.60
No improvement for 1 epoch(s)




Epoch [24/60] | Train Loss: 83.56 | Val Utility: -1839.00 | Best τ: 0.75
No improvement for 2 epoch(s)




Epoch [25/60] | Train Loss: 81.42 | Val Utility: -1934.00 | Best τ: 0.60
No improvement for 3 epoch(s)




Epoch [26/60] | Train Loss: 81.21 | Val Utility: -1655.50 | Best τ: 0.70
No improvement for 4 epoch(s)




Epoch [27/60] | Train Loss: 79.54 | Val Utility: -2148.00 | Best τ: 0.50
No improvement for 5 epoch(s)




Epoch [28/60] | Train Loss: 79.96 | Val Utility: -1509.50 | Best τ: 0.55
No improvement for 6 epoch(s)




Epoch [29/60] | Train Loss: 76.88 | Val Utility: -2035.00 | Best τ: 0.40
No improvement for 7 epoch(s)




Epoch [30/60] | Train Loss: 79.44 | Val Utility: -1852.50 | Best τ: 0.70
No improvement for 8 epoch(s)




Epoch [31/60] | Train Loss: 76.88 | Val Utility: -1860.00 | Best τ: 0.35
No improvement for 9 epoch(s)




Epoch [32/60] | Train Loss: 75.61 | Val Utility: -1349.00 | Best τ: 0.50
New best model saved




Epoch [33/60] | Train Loss: 73.68 | Val Utility: -1840.00 | Best τ: 0.50
No improvement for 1 epoch(s)




Epoch [34/60] | Train Loss: 74.21 | Val Utility: -1801.50 | Best τ: 0.85
No improvement for 2 epoch(s)




Epoch [35/60] | Train Loss: 70.85 | Val Utility: -1790.00 | Best τ: 0.75
No improvement for 3 epoch(s)




Epoch [36/60] | Train Loss: 71.40 | Val Utility: -1930.00 | Best τ: 0.60
No improvement for 4 epoch(s)




Epoch [37/60] | Train Loss: 69.36 | Val Utility: -1797.00 | Best τ: 0.35
No improvement for 5 epoch(s)




Epoch [38/60] | Train Loss: 68.38 | Val Utility: -1514.50 | Best τ: 0.65
No improvement for 6 epoch(s)




Epoch [39/60] | Train Loss: 68.98 | Val Utility: -1695.00 | Best τ: 0.35
No improvement for 7 epoch(s)




Epoch [40/60] | Train Loss: 67.66 | Val Utility: -1814.50 | Best τ: 0.40
No improvement for 8 epoch(s)




Epoch [41/60] | Train Loss: 67.02 | Val Utility: -1954.50 | Best τ: 0.75
No improvement for 9 epoch(s)




Epoch [42/60] | Train Loss: 64.90 | Val Utility: -2036.00 | Best τ: 0.90
No improvement for 10 epoch(s)
⏹ Early stopping triggered
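The CSV log written during training can be read back to find the best epoch without scrolling through the output. A standard-library sketch; `best_epoch_from_log` is a hypothetical helper, and the in-memory file holds three rows taken from the output above:

```python
import csv
import io


def best_epoch_from_log(csv_file):
    """Return the row (as a dict) with the highest val_utility."""
    rows = list(csv.DictReader(csv_file))
    return max(rows, key=lambda r: float(r["val_utility"]))


# In-memory stand-in for training_logv3.csv, using three rows from the output above
demo_log = io.StringIO(
    "epoch,train_loss,val_utility,best_threshold\n"
    "22,85.22,-1351.5,0.65\n"
    "31,76.88,-1860.0,0.35\n"
    "32,75.61,-1349.0,0.5\n"
)
best_row = best_epoch_from_log(demo_log)
print(best_row["epoch"], best_row["val_utility"], best_row["best_threshold"])
```

In the notebook the call would be `with open(LOG_PATH) as f: best_epoch_from_log(f)`. The logged utility is already the maximum over the threshold grid on the same validation set, so it is an optimistic estimate of how the chosen threshold will perform on unseen patients.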


# **VALIDATION**

CELL A — Model Evaluation Utilities

Purpose:
Reusable functions for validation / test evaluation (no training logic).

In [13]:
import numpy as np
import torch

# -------------------------
# Utility-based evaluation
# -------------------------
def evaluate_model_utility(model, dataloader, threshold):
    model.eval()
    total_utility = 0.0

    with torch.no_grad():
        for X, M, DT, Y, attn_mask in dataloader:
            X = X.to(DEVICE)
            M = M.to(DEVICE)
            DT = DT.to(DEVICE)
            Y = Y.to(DEVICE)
            attn_mask = attn_mask.to(DEVICE)

            logits = model(X, M, DT, attn_mask)
            probs = torch.sigmoid(logits)
            preds = (probs >= threshold).float()

            for i in range(X.size(0)):
                length = int(attn_mask[i].sum())
                total_utility += physionet_utility(
                    Y[i, :length].cpu().numpy(),
                    preds[i, :length].cpu().numpy()
                )

    return total_utility


# -------------------------
# Accuracy (hour-level)
# -------------------------
def evaluate_model_accuracy(model, dataloader, threshold):
    model.eval()
    correct = 0
    total = 0

    with torch.no_grad():
        for X, M, DT, Y, attn_mask in dataloader:
            X = X.to(DEVICE)
            M = M.to(DEVICE)
            DT = DT.to(DEVICE)
            Y = Y.to(DEVICE)
            attn_mask = attn_mask.to(DEVICE)

            logits = model(X, M, DT, attn_mask)
            probs = torch.sigmoid(logits)
            preds = (probs >= threshold).float()

            mask = attn_mask.bool()
            correct += (preds[mask] == Y[mask]).sum().item()
            total += mask.sum().item()

    return correct / total
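
The mask in `evaluate_model_accuracy` matters because padded hours would otherwise count as trivially correct predictions. A minimal NumPy sketch on made-up arrays (the values and the `masked_accuracy` name are illustrative, not part of the notebook):

```python
import numpy as np

def masked_accuracy(preds, labels, attn_mask):
    """Hour-level accuracy over real (unpadded) time steps only."""
    mask = attn_mask.astype(bool)
    return float((preds[mask] == labels[mask]).mean())

# Two toy patients padded to length 5; 1 in attn_mask marks a real hour.
labels    = np.array([[0, 0, 1, 1, 0], [0, 1, 0, 0, 0]], dtype=float)
preds     = np.array([[0, 1, 1, 0, 0], [0, 0, 0, 0, 0]], dtype=float)
attn_mask = np.array([[1, 1, 1, 1, 0], [1, 1, 0, 0, 0]])

acc_masked = masked_accuracy(preds, labels, attn_mask)  # 3 of 6 real hours correct
acc_naive = float((preds == labels).mean())             # 7 of 10 cells, padding included
print(acc_masked, acc_naive)
```

On this toy batch the masked accuracy is 0.5, while counting padded cells inflates it to 0.7.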


CELL B — Load & Evaluate a Single Model

Purpose:
Evaluate one checkpoint on validation or test.

In [15]:
def evaluate_checkpoint(checkpoint_path, dataloader, split_name="Validation"):
    print(f"\nEvaluating model: {checkpoint_path}")

    checkpoint = torch.load(checkpoint_path, map_location=DEVICE)

    model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
    model.load_state_dict(checkpoint["model_state"])

    threshold = checkpoint["best_threshold"]

    utility = evaluate_model_utility(model, dataloader, threshold)
    accuracy = evaluate_model_accuracy(model, dataloader, threshold)

    print(f"{split_name} Utility Score : {utility:.2f}")
    print(f"{split_name} Accuracy      : {accuracy:.4f}")

    return {
        "path": checkpoint_path,
        "threshold": threshold,
        "utility": utility,
        "accuracy": accuracy
    }


CELL C — Evaluate ALL Saved Models (Model Comparison)

Purpose:
Compare multiple saved models fairly and pick the best.

In [17]:
# ---------------------
# Reconstruct NUM_FEATURES
# ---------------------
NUM_FEATURES = len(FEATURE_COLS)
print("NUM_FEATURES:", NUM_FEATURES)

import glob
import pandas as pd

MODEL_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models"

model_paths = sorted(glob.glob(f"{MODEL_DIR}/*.pt"))
print(f"Found {len(model_paths)} models")

results = []

for path in model_paths:
    result = evaluate_checkpoint(
        path,
        val_loader,     # or test_loader
        split_name="Validation"
    )
    results.append(result)

results_df = pd.DataFrame(results)
results_df = results_df.sort_values("utility", ascending=False)

results_df


NUM_FEATURES: 40
Found 4 models

Evaluating model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model1.pt


Validation Utility Score : -1477.50
Validation Accuracy      : 0.9770

Evaluating model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model2.pt
Validation Utility Score : -1361.50
Validation Accuracy      : 0.9782

Evaluating model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv2.pt
Validation Utility Score : -1627.50
Validation Accuracy      : 0.9773

Evaluating model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt
Validation Utility Score : -1349.00
Validation Accuracy      : 0.9791


Unnamed: 0,path,threshold,utility,accuracy
3,/content/drive/MyDrive/Thesis/PhysionetSepsis/...,0.5,-1349.0,0.979139
1,/content/drive/MyDrive/Thesis/PhysionetSepsis/...,0.7,-1361.5,0.978156
0,/content/drive/MyDrive/Thesis/PhysionetSepsis/...,0.65,-1477.5,0.976952
2,/content/drive/MyDrive/Thesis/PhysionetSepsis/...,0.8,-1627.5,0.977315
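
Hour-level accuracy near 0.98 for every checkpoint says little on its own when positive hours are rare. The sketch below assumes a hypothetical prevalence of 2% positive hours (an illustrative figure; the notebook does not report the actual rate) and shows that a policy that never raises an alarm lands in the same accuracy range.

```python
import numpy as np

n_hours = 100_000
n_positive = 2_000  # hypothetical: 2% of hours labelled septic

labels = np.zeros(n_hours, dtype=int)
labels[:n_positive] = 1

never_alarm = np.zeros(n_hours, dtype=int)  # policy that always predicts 0

baseline_accuracy = float((never_alarm == labels).mean())
baseline_recall = float(never_alarm[labels == 1].mean())  # share of septic hours flagged
print(f"accuracy={baseline_accuracy:.4f}, recall={baseline_recall:.1f}")
```

Under that assumption the always-negative policy scores 0.98 accuracy with zero recall, which is why the table is sorted by utility rather than accuracy.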


In [20]:
# ============================================================
# GRID SEARCH: Temporal Smoothing Decision Logic (VALIDATION)
# ============================================================

import torch
import numpy as np
import pandas as pd

MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt"
DATALOADER = val_loader
MC_SAMPLES = 10   # only for stability; no uncertainty gating yet

# Search space (SAFE + EFFECTIVE)
THRESHOLDS = [0.35, 0.40, 0.45]
MIN_CONSECUTIVES = [2, 3]

print("Evaluating model:", MODEL_PATH)

# -------------------------
# Load model
# -------------------------
checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# -------------------------
# Helpers
# -------------------------
def mc_mean_predict(model, X, M, DT, attn_mask, n_samples):
    model.train()  # enable dropout (light MC averaging)
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    probs = torch.stack(probs, dim=0)
    return probs.mean(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

# -------------------------
# Grid search
# -------------------------
results = []

for threshold in THRESHOLDS:
    for min_c in MIN_CONSECUTIVES:
        total_utility = 0.0

        for X, M, DT, Y, attn_mask in DATALOADER:
            X, M, DT, Y, attn_mask = (
                X.to(DEVICE),
                M.to(DEVICE),
                DT.to(DEVICE),
                Y.to(DEVICE),
                attn_mask.to(DEVICE),
            )

            mean_prob = mc_mean_predict(
                model, X, M, DT, attn_mask, MC_SAMPLES
            )

            for i in range(X.size(0)):
                length = int(attn_mask[i].sum())
                p = mean_prob[i, :length].cpu().numpy()
                y = Y[i, :length].cpu().numpy()

                preds = temporal_smoothing(
                    p, threshold, min_c
                )

                total_utility += physionet_utility(y, preds)

        results.append({
            "threshold": threshold,
            "min_consecutive": min_c,
            "utility": total_utility
        })

        print(
            f"τ={threshold:.2f}, min_consec={min_c} → Utility={total_utility:.1f}"
        )

# -------------------------
# Results table
# -------------------------
results_df = pd.DataFrame(results)
results_df = results_df.sort_values("utility", ascending=False)

print("\n===== GRID SEARCH RESULTS (BEST FIRST) =====")
results_df


Evaluating model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt
τ=0.35, min_consec=2 → Utility=-913.0
τ=0.35, min_consec=3 → Utility=-1267.0
τ=0.40, min_consec=2 → Utility=-940.0
τ=0.40, min_consec=3 → Utility=-1353.0
τ=0.45, min_consec=2 → Utility=-1065.0
τ=0.45, min_consec=3 → Utility=-1524.5

===== GRID SEARCH RESULTS (BEST FIRST) =====


Unnamed: 0,threshold,min_consecutive,utility
0,0.35,2,-913.0
2,0.4,2,-940.0
4,0.45,2,-1065.0
1,0.35,3,-1267.0
3,0.4,3,-1353.0
5,0.45,3,-1524.5
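
The smoothing rule is easiest to follow on a short sequence. The sketch below repeats the `temporal_smoothing` function from the cell above and applies it to made-up probabilities (the values are illustrative only).

```python
import numpy as np

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    """Alarm at hour t only once min_consecutive hours in a row reach threshold."""
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

p = np.array([0.1, 0.5, 0.2, 0.4, 0.6, 0.7, 0.3, 0.9])  # toy hourly probabilities

alarms_2 = temporal_smoothing(p, threshold=0.35, min_consecutive=2)
alarms_3 = temporal_smoothing(p, threshold=0.35, min_consecutive=3)
print(alarms_2)  # [0 0 0 0 1 1 0 0]
print(alarms_3)  # [0 0 0 0 0 1 0 0]
```

Isolated spikes (hours 1 and 7) are dropped, and each extra required hour pushes the first alarm one hour later. That delay is consistent with `min_consec=3` scoring below `min_consec=2` at every threshold in the grid.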


CELL D — Select BEST Model for Final Test Evaluation

Purpose:
Lock in the best validation model and decision policy (threshold + smoothing),
confirm the policy on the validation set, then evaluate once on the test set.

In [19]:
# ============================================================
# SAFE EVALUATION: Temporal Smoothing ONLY (no uncertainty yet)
# ============================================================

import torch
import numpy as np

MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt"
DATALOADER = val_loader
MIN_CONSECUTIVE = 2
MC_SAMPLES = 10

checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# LOWER threshold because smoothing already increases precision
threshold = max(0.35, checkpoint["best_threshold"] - 0.1)

print("Using threshold:", threshold)

def mc_dropout_predict(model, X, M, DT, attn_mask, n_samples):
    model.train()
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    probs = torch.stack(probs, dim=0)
    return probs.mean(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

total_utility = 0.0

for X, M, DT, Y, attn_mask in DATALOADER:
    X, M, DT, Y, attn_mask = (
        X.to(DEVICE),
        M.to(DEVICE),
        DT.to(DEVICE),
        Y.to(DEVICE),
        attn_mask.to(DEVICE),
    )

    mean_prob = mc_dropout_predict(model, X, M, DT, attn_mask, MC_SAMPLES)

    for i in range(X.size(0)):
        length = int(attn_mask[i].sum())
        p = mean_prob[i, :length].cpu().numpy()
        y = Y[i, :length].cpu().numpy()

        preds = temporal_smoothing(p, threshold, MIN_CONSECUTIVE)
        total_utility += physionet_utility(y, preds)

print("SMOOTHED UTILITY SCORE:", total_utility)


Using threshold: 0.4
SMOOTHED UTILITY SCORE: -917.0
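
`model.train()` switches every submodule to training mode and leaves the model in that state after the cell finishes. One narrower option, sketched below, is to re-enable only explicit `nn.Dropout` layers and restore eval mode afterwards. `TinyNet` is a hypothetical stand-in and is not the notebook's `TemporalTransformer`.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in model: per-hour logits from per-hour features."""
    def __init__(self, num_features):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, 16),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(16, 1),
        )

    def forward(self, x):               # x: (B, T, F)
        return self.net(x).squeeze(-1)  # logits: (B, T)

def mc_mean_and_var(model, x, n_samples=10):
    """Average sigmoid outputs over stochastic passes, then restore eval mode."""
    model.eval()
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()                   # re-enable only explicit dropout layers
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(x)) for _ in range(n_samples)])
    model.eval()
    return probs.mean(dim=0), probs.var(dim=0)

torch.manual_seed(0)
model = TinyNet(num_features=4)
x = torch.randn(2, 5, 4)
mean_prob, var_prob = mc_mean_and_var(model, x)
print(mean_prob.shape, var_prob.shape)
```

Attention-weight dropout inside `nn.MultiheadAttention` follows that module's own training flag rather than an `nn.Dropout` submodule, so this narrower switch would likely leave it disabled. Whether that is acceptable depends on how the real model is built.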


In [21]:
# ============================================================
# FINAL TEST EVALUATION (LOCKED CONFIGURATION)
# ============================================================

import torch
import numpy as np

# -------------------------
# LOCKED CONFIGURATION
# -------------------------
MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt"
DATALOADER = test_loader     # FINAL evaluation only
THRESHOLD = 0.35             # LOCKED from validation
MIN_CONSECUTIVE = 2          # LOCKED from validation
MC_SAMPLES = 10              # light MC averaging (no uncertainty gating)

print("FINAL TEST EVALUATION")
print("Model:", MODEL_PATH)
print("Threshold:", THRESHOLD)
print("Min consecutive hours:", MIN_CONSECUTIVE)

# -------------------------
# Load model
# -------------------------
checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)

model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# -------------------------
# Helpers
# -------------------------
def mc_mean_predict(model, X, M, DT, attn_mask, n_samples):
    model.train()  # enable dropout
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    probs = torch.stack(probs, dim=0)
    return probs.mean(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

# -------------------------
# Evaluation
# -------------------------
total_utility = 0.0
correct = 0
total = 0

for X, M, DT, Y, attn_mask in DATALOADER:
    X = X.to(DEVICE)
    M = M.to(DEVICE)
    DT = DT.to(DEVICE)
    Y = Y.to(DEVICE)
    attn_mask = attn_mask.to(DEVICE)

    mean_prob = mc_mean_predict(
        model, X, M, DT, attn_mask, MC_SAMPLES
    )

    for i in range(X.size(0)):
        length = int(attn_mask[i].sum())

        p = mean_prob[i, :length].cpu().numpy()
        y = Y[i, :length].cpu().numpy()

        preds = temporal_smoothing(
            p, THRESHOLD, MIN_CONSECUTIVE
        )

        # Utility (PRIMARY METRIC)
        total_utility += physionet_utility(y, preds)

        # Accuracy (SECONDARY)
        correct += (preds == y).sum()
        total += len(y)

# -------------------------
# FINAL RESULTS
# -------------------------
accuracy = correct / total

print("--------------------------------------------------")
print("FINAL TEST RESULTS")
print("PhysioNet Utility Score :", total_utility)
print("Hourly Accuracy         :", round(accuracy, 4))
print("--------------------------------------------------")


FINAL TEST EVALUATION
Model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt
Threshold: 0.35
Min consecutive hours: 2
--------------------------------------------------
FINAL TEST RESULTS
PhysioNet Utility Score : -1379.0
Hourly Accuracy         : 0.98
--------------------------------------------------


New Trial Test Evaluation (FAIL)

In [23]:
# ============================================================
# SAFE UPGRADE: Smoothing + Alarm Persistence + Soft Uncertainty
# FINAL TEST EVALUATION (NO RETRAINING)
# ============================================================

import torch
import numpy as np

# -------------------------
# LOCKED CONFIGURATION
# -------------------------
MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt"
DATALOADER = test_loader          # FINAL test set
THRESHOLD = 0.35                 # validated
MIN_CONSECUTIVE = 2              # validated
MC_SAMPLES = 10                  # MC Dropout samples
UNCERTAINTY_ALPHA = 10.0         # soft attenuation strength (SAFE range: 5–15)

print("SAFE UPGRADE — FINAL TEST EVALUATION")
print("Model:", MODEL_PATH)
print("Threshold:", THRESHOLD)
print("Min consecutive:", MIN_CONSECUTIVE)
print("Uncertainty alpha:", UNCERTAINTY_ALPHA)

# -------------------------
# Load model
# -------------------------
checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# -------------------------
# Helpers
# -------------------------
def mc_mean_and_var(model, X, M, DT, attn_mask, n_samples):
    model.train()  # enable dropout
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    probs = torch.stack(probs, dim=0)
    return probs.mean(dim=0), probs.var(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

def persist_alarm(preds):
    """
    Once the alarm turns ON, keep it ON for the rest of the stay.
    Returns a new array; the input is left unchanged.
    """
    if preds.sum() == 0:
        return preds
    preds = preds.copy()
    first_on = np.argmax(preds == 1)
    preds[first_on:] = 1
    return preds

# -------------------------
# Evaluation
# -------------------------
total_utility = 0.0
correct = 0
total = 0

for X, M, DT, Y, attn_mask in DATALOADER:
    X, M, DT, Y, attn_mask = (
        X.to(DEVICE),
        M.to(DEVICE),
        DT.to(DEVICE),
        Y.to(DEVICE),
        attn_mask.to(DEVICE),
    )

    mean_prob, var_prob = mc_mean_and_var(
        model, X, M, DT, attn_mask, MC_SAMPLES
    )

    for i in range(X.size(0)):
        length = int(attn_mask[i].sum())

        p = mean_prob[i, :length].cpu().numpy()
        v = var_prob[i, :length].cpu().numpy()
        y = Y[i, :length].cpu().numpy()

        # -------------------------
        # Soft uncertainty attenuation
        # -------------------------
        # High variance → probability softly reduced, not zeroed
        p = p * np.exp(-UNCERTAINTY_ALPHA * v)

        # -------------------------
        # Temporal smoothing
        # -------------------------
        preds = temporal_smoothing(
            p, THRESHOLD, MIN_CONSECUTIVE
        )

        # -------------------------
        # Alarm persistence
        # -------------------------
        preds = persist_alarm(preds)

        # -------------------------
        # Metrics
        # -------------------------
        total_utility += physionet_utility(y, preds)
        correct += (preds == y).sum()
        total += len(y)

# -------------------------
# FINAL RESULTS
# -------------------------
accuracy = correct / total

print("--------------------------------------------------")
print("SAFE UPGRADE — FINAL TEST RESULTS")
print("PhysioNet Utility Score :", total_utility)
print("Hourly Accuracy         :", round(accuracy, 4))
print("--------------------------------------------------")


SAFE UPGRADE — FINAL TEST EVALUATION
Model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt
Threshold: 0.35
Min consecutive: 2
Uncertainty alpha: 10.0
--------------------------------------------------
SAFE UPGRADE — FINAL TEST RESULTS
PhysioNet Utility Score : -2084.5
Hourly Accuracy         : 0.9836
--------------------------------------------------
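
To see how strongly the attenuation `p * exp(-UNCERTAINTY_ALPHA * v)` acts, it helps to plug in a few numbers. The variances below are made-up values, not measurements from the model.

```python
import numpy as np

UNCERTAINTY_ALPHA = 10.0
THRESHOLD = 0.35

p = np.array([0.40, 0.40, 0.40, 0.40])      # same mean probability each hour
v = np.array([0.000, 0.005, 0.010, 0.050])  # hypothetical MC-dropout variances

attenuated = p * np.exp(-UNCERTAINTY_ALPHA * v)
raw_alarm = (attenuated >= THRESHOLD).astype(int)

for var, q, a in zip(v, attenuated, raw_alarm):
    print(f"var={var:.3f} -> p'={q:.3f} alarm={a}")
```

With these inputs the first three hours stay above the threshold (0.400, 0.380, 0.362) and the last drops to about 0.243. Attenuation can only lower probabilities, so it can only delay or suppress alarms. Persistence then extends whichever alarm does fire to the end of the stay. The two changes act in opposite directions, and either may contribute to the lower utility here.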


# **RETRAIN NEW MODEL**

In [24]:
# ============================================================
# UTILITY-AWARE RETRAINING (SAME ARCH + DATA)
# Time-Weighted BCE to Encourage EARLIER Detection
# ============================================================

import os
import csv
import torch
import numpy as np
import torch.nn.functional as F
from torch.amp import autocast, GradScaler
from tqdm import tqdm

# ---------------------
# Safety: reconstruct NUM_FEATURES
# ---------------------
NUM_FEATURES = len(FEATURE_COLS)
print("NUM_FEATURES:", NUM_FEATURES)

# ---------------------
# Directories (Google Drive)
# ---------------------
BASE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis"
MODEL_DIR = os.path.join(BASE_DIR, "models")
LOG_DIR = os.path.join(BASE_DIR, "logs")
os.makedirs(MODEL_DIR, exist_ok=True)
os.makedirs(LOG_DIR, exist_ok=True)

BEST_MODEL_PATH = os.path.join(MODEL_DIR, "best_model_v4_time_weighted.pt")
LOG_PATH = os.path.join(LOG_DIR, "training_log_v4_time_weighted.csv")

print("Model →", BEST_MODEL_PATH)
print("Log   →", LOG_PATH)

# ---------------------
# Hyperparameters (SAFE, TARGETED)
# ---------------------
POS_WEIGHT = 6.5        # mild recall bias (do NOT go aggressive)
PRE_WINDOW = 6          # hours before onset emphasized
ALPHA = 2.0             # strength of early bias (start at 2.0)
PATIENCE = 10           # utility is noisy
THRESHOLDS = torch.linspace(0.3, 0.9, 13)  # narrower, FP-aware

# ---------------------
# Model / Optimizer / AMP
# ---------------------
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=LEARNING_RATE,
    weight_decay=WEIGHT_DECAY
)

scaler = GradScaler(enabled=USE_AMP)

# ---------------------
# Time-Weighted Masked BCE
# ---------------------
def time_weighted_masked_bce(
    logits, y, attn_mask, onset_idx,
    pos_weight=6.5, pre_window=6, alpha=2.0
):
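    """
    Masked BCE whose per-hour weight ramps from 1 up to 1 + alpha over the
    `pre_window` hours leading to the first positive label.

    logits: (B, T)
    y: (B, T)
    attn_mask: (B, T), 1 for real hours, 0 for padding
    onset_idx: (B,) int, index of first positive hour, -1 if non-septic
    """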
    B, T = y.shape
    device = y.device

    pw = torch.tensor(pos_weight, device=device)
    bce = F.binary_cross_entropy_with_logits(
        logits, y, reduction="none", pos_weight=pw
    )

    # Time weights (default 1)
    time_w = torch.ones_like(bce)

    for i in range(B):
        oi = onset_idx[i].item()
        if oi >= 0:
            t = torch.arange(T, device=device)
            pre = (t <= oi)
            dist = (oi - t).clamp(min=0)
            ramp = 1.0 + alpha * (1.0 - (dist / pre_window).clamp(0, 1))
            time_w[i, pre] = ramp[pre]

    loss = bce * time_w * attn_mask
    return loss.sum() / attn_mask.sum().clamp_min(1)

# ---------------------
# Init CSV log
# ---------------------
with open(LOG_PATH, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["epoch", "train_loss", "val_utility", "best_threshold"])

# ---------------------
# Early stopping
# ---------------------
best_utility = -float("inf")
epochs_no_improve = 0

# ============================================================
# TRAINING LOOP
# ============================================================
for epoch in range(NUM_EPOCHS):
    model.train()
    epoch_loss = 0.0

    pbar = tqdm(train_loader, desc=f"Epoch {epoch+1}/{NUM_EPOCHS}", leave=False)

    for X, M, DT, Y, attn_mask in pbar:
        X = X.to(DEVICE)
        M = M.to(DEVICE)
        DT = DT.to(DEVICE)
        Y = Y.to(DEVICE)
        attn_mask = attn_mask.to(DEVICE)

        # Onset index per patient (first positive hour), -1 if none
        onset_idx = torch.full((Y.size(0),), -1, device=Y.device)
        for i in range(Y.size(0)):
            pos = torch.where(Y[i] == 1)[0]
            if len(pos) > 0:
                onset_idx[i] = pos[0]

        optimizer.zero_grad(set_to_none=True)

        with autocast(device_type="cuda", enabled=USE_AMP):
            logits = model(X, M, DT, attn_mask)
            loss = time_weighted_masked_bce(
                logits, Y, attn_mask, onset_idx,
                pos_weight=POS_WEIGHT,
                pre_window=PRE_WINDOW,
                alpha=ALPHA
            )

        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
        scaler.step(optimizer)
        scaler.update()

        epoch_loss += loss.item()
        pbar.set_postfix(avg_loss=f"{epoch_loss / (pbar.n + 1):.4f}")

    # ---------------------
    # VALIDATION (UTILITY)
    # ---------------------
    model.eval()
    best_epoch_utility = -float("inf")
    best_epoch_threshold = None

    with torch.no_grad():
        for t in THRESHOLDS:
            util = evaluate_utility(model, val_loader, t.item())
            if util > best_epoch_utility:
                best_epoch_utility = util
                best_epoch_threshold = t.item()

    # ---------------------
    # LOGGING
    # ---------------------
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([
            epoch + 1,
            round(epoch_loss, 4),
            round(best_epoch_utility, 4),
            best_epoch_threshold
        ])

    print(
        f"Epoch [{epoch+1}/{NUM_EPOCHS}] | "
        f"Train Loss: {epoch_loss:.2f} | "
        f"Val Utility: {best_epoch_utility:.2f} | "
        f"Best τ: {best_epoch_threshold:.2f}"
    )

    # ---------------------
    # CHECKPOINTING
    # ---------------------
    if best_epoch_utility > best_utility:
        best_utility = best_epoch_utility
        epochs_no_improve = 0
        torch.save(
            {
                "epoch": epoch + 1,
                "model_state": model.state_dict(),
                "best_threshold": best_epoch_threshold,
                "val_utility": best_epoch_utility,
                "pos_weight": POS_WEIGHT,
                "pre_window": PRE_WINDOW,
                "alpha": ALPHA
            },
            BEST_MODEL_PATH
        )
        print("✓ New best model saved")
    else:
        epochs_no_improve += 1
        print(f"No improvement for {epochs_no_improve} epoch(s)")

    if epochs_no_improve >= PATIENCE:
        print("⏹ Early stopping triggered")
        break

print("Training complete. Best model saved to:")
print(BEST_MODEL_PATH)


NUM_FEATURES: 40
Model → /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v4_time_weighted.pt
Log   → /content/drive/MyDrive/Thesis/PhysionetSepsis/logs/training_log_v4_time_weighted.csv




Epoch [1/60] | Train Loss: 238.47 | Val Utility: -4190.50 | Best τ: 0.45
✓ New best model saved




Epoch [2/60] | Train Loss: 188.62 | Val Utility: -4334.00 | Best τ: 0.75
No improvement for 1 epoch(s)




Epoch [3/60] | Train Loss: 171.35 | Val Utility: -3497.50 | Best τ: 0.70
✓ New best model saved




Epoch [4/60] | Train Loss: 163.80 | Val Utility: -2600.50 | Best τ: 0.60
✓ New best model saved




Epoch [5/60] | Train Loss: 158.85 | Val Utility: -2952.50 | Best τ: 0.55
No improvement for 1 epoch(s)




Epoch [6/60] | Train Loss: 155.83 | Val Utility: -2363.50 | Best τ: 0.75
✓ New best model saved




Epoch [7/60] | Train Loss: 147.04 | Val Utility: -2518.50 | Best τ: 0.65
No improvement for 1 epoch(s)




Epoch [8/60] | Train Loss: 143.36 | Val Utility: -2093.00 | Best τ: 0.80
✓ New best model saved




Epoch [9/60] | Train Loss: 141.12 | Val Utility: -1609.00 | Best τ: 0.85
✓ New best model saved




Epoch [10/60] | Train Loss: 135.33 | Val Utility: -2215.00 | Best τ: 0.70
No improvement for 1 epoch(s)




Epoch [11/60] | Train Loss: 135.40 | Val Utility: -2116.00 | Best τ: 0.60
No improvement for 2 epoch(s)




Epoch [12/60] | Train Loss: 130.86 | Val Utility: -2185.00 | Best τ: 0.60
No improvement for 3 epoch(s)




Epoch [13/60] | Train Loss: 131.29 | Val Utility: -1766.00 | Best τ: 0.75
No improvement for 4 epoch(s)




Epoch [14/60] | Train Loss: 129.99 | Val Utility: -1368.00 | Best τ: 0.75
✓ New best model saved




Epoch [15/60] | Train Loss: 126.52 | Val Utility: -2282.50 | Best τ: 0.75
No improvement for 1 epoch(s)




Epoch [16/60] | Train Loss: 124.07 | Val Utility: -1790.00 | Best τ: 0.80
No improvement for 2 epoch(s)




Epoch [17/60] | Train Loss: 122.93 | Val Utility: -1696.00 | Best τ: 0.75
No improvement for 3 epoch(s)




Epoch [18/60] | Train Loss: 121.84 | Val Utility: -1757.00 | Best τ: 0.65
No improvement for 4 epoch(s)




Epoch [19/60] | Train Loss: 121.38 | Val Utility: -1752.00 | Best τ: 0.65
No improvement for 5 epoch(s)




Epoch [20/60] | Train Loss: 119.08 | Val Utility: -1830.00 | Best τ: 0.75
No improvement for 6 epoch(s)




Epoch [21/60] | Train Loss: 118.97 | Val Utility: -1618.00 | Best τ: 0.70
No improvement for 7 epoch(s)




Epoch [22/60] | Train Loss: 114.79 | Val Utility: -1498.00 | Best τ: 0.75
No improvement for 8 epoch(s)




Epoch [23/60] | Train Loss: 113.71 | Val Utility: -1861.50 | Best τ: 0.65
No improvement for 9 epoch(s)




Epoch [24/60] | Train Loss: 111.32 | Val Utility: -1600.50 | Best τ: 0.60
No improvement for 10 epoch(s)
⏹ Early stopping triggered
Training complete. Best model saved to:
/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v4_time_weighted.pt
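
The weight ramp inside `time_weighted_masked_bce` can be inspected in isolation. The sketch below is a NumPy re-expression of the per-patient loop for a single stay (`time_weights` is a hypothetical helper name; the stay length and onset index are made up).

```python
import numpy as np

def time_weights(T, onset_idx, pre_window=6, alpha=2.0):
    """NumPy mirror of the per-patient weight ramp in time_weighted_masked_bce."""
    w = np.ones(T)
    if onset_idx < 0:                       # non-septic stay: uniform weights
        return w
    t = np.arange(T)
    dist = np.clip(onset_idx - t, 0, None)  # hours until the first positive label
    ramp = 1.0 + alpha * (1.0 - np.clip(dist / pre_window, 0.0, 1.0))
    pre = t <= onset_idx
    w[pre] = ramp[pre]                      # hours after onset keep weight 1
    return w

weights = time_weights(T=12, onset_idx=8)
print(np.round(weights, 2))
# [1.   1.   1.   1.33 1.67 2.   2.33 2.67 3.   1.   1.   1.  ]
```

The weight peaks at `1 + alpha` on the first positive hour and returns to 1 for later positive hours. The hours before `onset_idx` carry label 0, so the ramp also increases the penalty for positive predictions made shortly before the first positive label. This is worth keeping in mind when judging whether the loss favours earlier alarms.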


v4 Validation

In [25]:
# ============================================================
# VALIDATION — v4 TIME-WEIGHTED MODEL
# Policy Selection (Threshold + Smoothing)
# ============================================================

import torch
import numpy as np
import pandas as pd

MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v4_time_weighted.pt"
DATALOADER = val_loader
MC_SAMPLES = 10

THRESHOLDS = [0.30, 0.35, 0.40]
MIN_CONSECUTIVES = [2, 3]

print("Validating model:", MODEL_PATH)

# -------------------------
# Load model
# -------------------------
checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# -------------------------
# Helpers
# -------------------------
def mc_mean_predict(model, X, M, DT, attn_mask, n_samples):
    model.train()  # enable dropout
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    return torch.stack(probs, dim=0).mean(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

# -------------------------
# Grid search
# -------------------------
results = []

for tau in THRESHOLDS:
    for min_c in MIN_CONSECUTIVES:
        total_utility = 0.0

        for X, M, DT, Y, attn_mask in DATALOADER:
            X, M, DT, Y, attn_mask = (
                X.to(DEVICE),
                M.to(DEVICE),
                DT.to(DEVICE),
                Y.to(DEVICE),
                attn_mask.to(DEVICE),
            )

            mean_prob = mc_mean_predict(
                model, X, M, DT, attn_mask, MC_SAMPLES
            )

            for i in range(X.size(0)):
                length = int(attn_mask[i].sum())
                p = mean_prob[i, :length].cpu().numpy()
                y = Y[i, :length].cpu().numpy()

                preds = temporal_smoothing(p, tau, min_c)
                total_utility += physionet_utility(y, preds)

        results.append({
            "threshold": tau,
            "min_consecutive": min_c,
            "utility": total_utility
        })

        print(f"τ={tau:.2f}, min_consec={min_c} → Utility={total_utility:.1f}")

# -------------------------
# Results table
# -------------------------
results_df = pd.DataFrame(results).sort_values("utility", ascending=False)
print("\n===== VALIDATION RESULTS (BEST FIRST) =====")
results_df


Validating model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v4_time_weighted.pt
τ=0.30, min_consec=2 → Utility=-1243.0
τ=0.30, min_consec=3 → Utility=-1270.5
τ=0.35, min_consec=2 → Utility=-1068.5
τ=0.35, min_consec=3 → Utility=-1243.0
τ=0.40, min_consec=2 → Utility=-967.5
τ=0.40, min_consec=3 → Utility=-1220.0

===== VALIDATION RESULTS (BEST FIRST) =====


Unnamed: 0,threshold,min_consecutive,utility
4,0.4,2,-967.5
2,0.35,2,-1068.5
5,0.4,3,-1220.0
0,0.3,2,-1243.0
3,0.35,3,-1243.0
1,0.3,3,-1270.5
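
The grid search above reruns the stochastic MC-dropout forward passes for every (τ, min_consecutive) pair, so each grid point sees slightly different probabilities. One alternative is to compute the per-patient mean probabilities once and sweep the decision policy over the cached arrays. The sketch below uses synthetic probabilities and a placeholder score (`toy_utility` is hypothetical and is not the PhysioNet utility function).

```python
import itertools
import numpy as np

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    """Alarm at hour t only once min_consecutive hours in a row reach threshold."""
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t, r in enumerate(raw):
        count = count + 1 if r == 1 else 0
        if count >= min_consecutive:
            smoothed[t] = 1
    return smoothed

def toy_utility(y, preds):
    """Placeholder score: +1 per true-positive hour, -0.5 per false-positive hour."""
    tp = ((preds == 1) & (y == 1)).sum()
    fp = ((preds == 1) & (y == 0)).sum()
    return float(tp - 0.5 * fp)

# Cached (labels, mean probabilities) per patient; filled once from the model in practice.
cache = [
    (np.array([0, 0, 1, 1, 1]), np.array([0.10, 0.38, 0.42, 0.50, 0.60])),
    (np.array([0, 0, 0, 0]),    np.array([0.36, 0.20, 0.37, 0.10])),
]

results = []
for tau, min_c in itertools.product([0.35, 0.40], [2, 3]):
    score = sum(toy_utility(y, temporal_smoothing(p, tau, min_c)) for y, p in cache)
    results.append((tau, min_c, score))

best = max(results, key=lambda r: r[2])
print(best)  # (0.35, 2, 3.0)
```

With cached probabilities, differences between grid points reflect only the policy, not sampling noise from dropout.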


In [28]:
# ============================================================
# FINAL TEST EVALUATION — v4 TIME-WEIGHTED MODEL (LOCKED)
# ============================================================

import torch
import numpy as np

MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v4_time_weighted.pt"
DATALOADER = test_loader

LOCKED_THRESHOLD = 0.40        # locked from v4 validation grid (best utility)
LOCKED_MIN_CONSECUTIVE = 2     # locked from v4 validation grid (best utility)
MC_SAMPLES = 10

print("FINAL TEST EVALUATION (v4)")
print("Model:", MODEL_PATH)
print("Threshold:", LOCKED_THRESHOLD)
print("Min consecutive:", LOCKED_MIN_CONSECUTIVE)

# -------------------------
# Load model
# -------------------------
checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# -------------------------
# Helpers
# -------------------------
def mc_mean_predict(model, X, M, DT, attn_mask, n_samples):
    model.train()
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    return torch.stack(probs, dim=0).mean(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

# -------------------------
# Evaluation
# -------------------------
total_utility = 0.0
correct = 0
total = 0

for X, M, DT, Y, attn_mask in DATALOADER:
    X, M, DT, Y, attn_mask = (
        X.to(DEVICE),
        M.to(DEVICE),
        DT.to(DEVICE),
        Y.to(DEVICE),
        attn_mask.to(DEVICE),
    )

    mean_prob = mc_mean_predict(
        model, X, M, DT, attn_mask, MC_SAMPLES
    )

    for i in range(X.size(0)):
        length = int(attn_mask[i].sum())
        p = mean_prob[i, :length].cpu().numpy()
        y = Y[i, :length].cpu().numpy()

        preds = temporal_smoothing(
            p, LOCKED_THRESHOLD, LOCKED_MIN_CONSECUTIVE
        )

        total_utility += physionet_utility(y, preds)
        correct += (preds == y).sum()
        total += len(y)

accuracy = correct / total

print("--------------------------------------------------")
print("FINAL TEST RESULTS — v4 MODEL")
print("PhysioNet Utility Score :", total_utility)
print("Hourly Accuracy         :", round(accuracy, 4))
print("--------------------------------------------------")


FINAL TEST EVALUATION (v4)
Model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v4_time_weighted.pt
Threshold: 0.4
Min consecutive: 2
--------------------------------------------------
FINAL TEST RESULTS — v4 MODEL
PhysioNet Utility Score : -1487.5
Hourly Accuracy         : 0.9801
--------------------------------------------------


TRIAL EVAL

In [31]:
# ============================================================
# NEW DECISION POLICY: Early-Window Trigger + Smoothing
# FINAL TEST EVALUATION (NO RETRAINING)
# ============================================================

import torch
import numpy as np

# -------------------------
# CONFIG (LOCKED)
# -------------------------
MODEL_PATH = "/content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt"
DATALOADER = test_loader

TAU_EARLY = 0.25          # early trigger threshold
EARLY_WINDOW = 3          # hours
TAU_MAIN = 0.35           # main threshold (the recorded output of this cell shows 0.4)
MIN_CONSECUTIVE = 2
MC_SAMPLES = 10

print("EARLY-WINDOW TEST EVALUATION")
print("Model:", MODEL_PATH)
print(f"Early τ={TAU_EARLY}, window={EARLY_WINDOW}")
print(f"Main τ={TAU_MAIN}, min_consec={MIN_CONSECUTIVE}")

# -------------------------
# Load model
# -------------------------
checkpoint = torch.load(MODEL_PATH, map_location=DEVICE)
model = TemporalTransformer(NUM_FEATURES).to(DEVICE)
model.load_state_dict(checkpoint["model_state"])

# -------------------------
# Helpers
# -------------------------
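def mc_mean_predict(model, X, M, DT, attn_mask, n_samples):
    # MC dropout: train() keeps dropout active so each forward pass is stochastic;
    # no_grad() below still disables gradient tracking.
    model.train()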
    probs = []
    with torch.no_grad():
        for _ in range(n_samples):
            logits = model(X, M, DT, attn_mask)
            probs.append(torch.sigmoid(logits))
    return torch.stack(probs, dim=0).mean(dim=0)

def temporal_smoothing(prob_seq, threshold, min_consecutive):
    raw = (prob_seq >= threshold).astype(int)
    smoothed = np.zeros_like(raw)
    count = 0
    for t in range(len(raw)):
        if raw[t] == 1:
            count += 1
            if count >= min_consecutive:
                smoothed[t] = 1
        else:
            count = 0
    return smoothed

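def early_window_trigger(prob_seq, tau_early, window):
    """
    Fires at the first hour whose trailing `window` contains a probability
    >= tau_early, then keeps the alarm on for the rest of the stay.
    Note: because the loop stops at the first trigger, the onset is always
    the first hour with prob >= tau_early, whatever `window` is.
    """
    preds = np.zeros_like(prob_seq, dtype=int)
    for t in range(len(prob_seq)):
        start = max(0, t - window + 1)
        if np.max(prob_seq[start:t+1]) >= tau_early:
            preds[t:] = 1  # persistence (includes hour t itself)
            break
    return preds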

# -------------------------
# Evaluation
# -------------------------
total_utility = 0.0
correct = 0
total = 0

for X, M, DT, Y, attn_mask in DATALOADER:
    X, M, DT, Y, attn_mask = (
        X.to(DEVICE),
        M.to(DEVICE),
        DT.to(DEVICE),
        Y.to(DEVICE),
        attn_mask.to(DEVICE),
    )

    mean_prob = mc_mean_predict(model, X, M, DT, attn_mask, MC_SAMPLES)

    for i in range(X.size(0)):
        length = int(attn_mask[i].sum())

        p = mean_prob[i, :length].cpu().numpy()
        y = Y[i, :length].cpu().numpy()

        # Stage 1: early trigger
        preds_early = early_window_trigger(
            p, TAU_EARLY, EARLY_WINDOW
        )

        # Stage 2: fallback smoothing
        if preds_early.sum() == 0:
            preds = temporal_smoothing(
                p, TAU_MAIN, MIN_CONSECUTIVE
            )
        else:
            preds = preds_early

        total_utility += physionet_utility(y, preds)
        correct += (preds == y).sum()
        total += len(y)

accuracy = correct / total

print("--------------------------------------------------")
print("EARLY-WINDOW TEST RESULTS")
print("PhysioNet Utility Score :", total_utility)
print("Hourly Accuracy         :", round(accuracy, 4))
print("--------------------------------------------------")


EARLY-WINDOW TEST EVALUATION
Model: /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_modelv3.pt
Early τ=0.25, window=3
Main τ=0.4, min_consec=2
--------------------------------------------------
EARLY-WINDOW TEST RESULTS
PhysioNet Utility Score : -1584.5
Hourly Accuracy         : 0.97
--------------------------------------------------
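
Worked example: what the early-window trigger does

A minimal sketch, using made-up probabilities, of how the Stage 1 rule behaves. It copies `early_window_trigger` from the cell above and compares it with `first_crossing_trigger`, a hypothetical helper written only for this check that latches the alarm at the first hour with probability at or above the threshold. On this toy input the two agree for every window size tried, since the loop breaks at the first hour that crosses the threshold.

```python
import numpy as np

def early_window_trigger(prob_seq, tau_early, window):
    # Copied from the notebook cell above.
    preds = np.zeros_like(prob_seq, dtype=int)
    for t in range(len(prob_seq)):
        start = max(0, t - window + 1)
        if np.max(prob_seq[start:t+1]) >= tau_early:
            preds[t] = 1
            preds[t:] = 1  # persistence
            break
    return preds

def first_crossing_trigger(prob_seq, tau_early):
    # Hypothetical helper (not in the notebook): alarm from the first
    # hour at or above tau_early until the end of the stay.
    preds = np.zeros(len(prob_seq), dtype=int)
    hits = np.flatnonzero(prob_seq >= tau_early)
    if hits.size:
        preds[hits[0]:] = 1
    return preds

# Illustrative probabilities with a single spike at hour 2
toy_probs = np.array([0.05, 0.10, 0.30, 0.08, 0.02])

for window in (1, 3, 6):
    assert np.array_equal(
        early_window_trigger(toy_probs, 0.25, window),
        first_crossing_trigger(toy_probs, 0.25),
    )

toy_early = early_window_trigger(toy_probs, 0.25, 3)
print(toy_early)  # [0 0 1 1 1]
```

A single hour above `TAU_EARLY` therefore latches the alarm for the remainder of the stay, and `temporal_smoothing` is only reached for patients whose probabilities never reach `TAU_EARLY`.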


# **V5 TEST TRIAL**

CELL V5-1 — Setup & Paths (NEW)

In [35]:
# ============================================================
# V5 EXPERIMENT — SETUP & PATHS (ALIGNED WITH EXISTING FOLDERS)
# ============================================================

import os
import torch

# Base directory (unchanged)
BASE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis"

# ------------------------------------------------------------
# Use EXISTING folders, namespace v5 inside them
# ------------------------------------------------------------
PROCESSED_DIR = os.path.join(BASE_DIR, "processed")
MODELS_DIR = os.path.join(BASE_DIR, "models")
LOGS_DIR = os.path.join(BASE_DIR, "logs")

# Create if missing (safe)
os.makedirs(PROCESSED_DIR, exist_ok=True)
os.makedirs(MODELS_DIR, exist_ok=True)
os.makedirs(LOGS_DIR, exist_ok=True)

# ------------------------------------------------------------
# V5-specific paths (NO overwrites)
# ------------------------------------------------------------
V5_PREPROCESSED_PATH = os.path.join(
    PROCESSED_DIR, "patients_v5_trends.pt"
)

V5_MODEL_PATH = os.path.join(
    MODELS_DIR, "best_model_v5_trends.pt"
)

V5_LOG_PATH = os.path.join(
    LOGS_DIR, "training_log_v5_trends.csv"
)

# ------------------------------------------------------------
# Print for verification
# ------------------------------------------------------------
print("V5 preprocessed data →", V5_PREPROCESSED_PATH)
print("V5 model checkpoint  →", V5_MODEL_PATH)
print("V5 training log      →", V5_LOG_PATH)


V5 preprocessed data → /content/drive/MyDrive/Thesis/PhysionetSepsis/processed/patients_v5_trends.pt
V5 model checkpoint  → /content/drive/MyDrive/Thesis/PhysionetSepsis/models/best_model_v5_trends.pt
V5 training log      → /content/drive/MyDrive/Thesis/PhysionetSepsis/logs/training_log_v5_trends.csv


CELL V5-2 — Trend Feature Engineering Utilities (NEW)

In [36]:
# ============================================================
# V5 — TREND FEATURE FUNCTIONS
# ============================================================

import numpy as np
import pandas as pd

def add_trend_features_v5(df, feature_cols):
    """
    Adds explicit trend features:
    - delta
    - rolling mean (3h)
    - rolling slope (3h)
    No imputation is performed.
    """
    df = df.copy()

    for col in feature_cols:
        # Delta
        df[f"{col}_delta"] = df[col].diff()

        # Rolling mean (3 hours)
        df[f"{col}_mean3"] = df[col].rolling(
            window=3, min_periods=1
        ).mean()

        # Rolling slope (3 hours)
        def slope(x):
            if len(x) < 2:
                return 0.0
            t = np.arange(len(x))
            return np.polyfit(t, x, 1)[0]

        df[f"{col}_slope3"] = (
            df[col]
            .rolling(window=3, min_periods=2)
            .apply(slope, raw=True)
        )

    return df
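
Optional sketch: a NaN-aware rolling slope

With `raw=True`, `rolling(...).apply(...)` passes each window to the function as a NumPy array with any NaNs still in place; `min_periods` only decides whether the function is called at all. If slopes over partially observed windows are wanted, one option is to fit only the observed points. The helper below, `nan_aware_slope`, is a hypothetical name and is not part of the notebook's pipeline; it still performs no imputation. The toy series is made up.

```python
import numpy as np
import pandas as pd

def nan_aware_slope(window_values):
    """Least-squares slope over the observed points of one rolling window."""
    observed = ~np.isnan(window_values)
    if observed.sum() < 2:
        return np.nan  # a slope needs at least two observed points
    # Keep the original hour offsets so gaps stretch the time axis correctly
    t = np.arange(len(window_values))[observed]
    return np.polyfit(t, window_values[observed], 1)[0]

# Illustrative series with one missing hour
toy = pd.Series([1.0, np.nan, 3.0, 5.0])
toy_slope3 = toy.rolling(window=3, min_periods=2).apply(nan_aware_slope, raw=True)
print(toy_slope3.tolist())  # approximately [nan, nan, 1.0, 2.0]
```

At hour 2 the window is `[1, NaN, 3]`, so the slope is taken across the two observed points that are two hours apart.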


CELL V5-3 — Preprocess Dataset with Trends (NEW, RUN ONCE)

In [38]:
# ============================================================
# V5 DATA PREPROCESSING (OPTIMIZED, NO WARNINGS)
# Trend features computed in batch
# Saved to: processed/patients_v5_trends.pt
# RUN ONCE ONLY
# ============================================================

import os
import glob
import time
import numpy as np
import pandas as pd
import torch
from tqdm.auto import tqdm

# ---------------------
# PATHS
# ---------------------
BASE_DIR = "/content/drive/MyDrive/Thesis/PhysionetSepsis"
DATA_DIR = os.path.join(
    BASE_DIR,
    "physionet.org/files/challenge-2019/1.0.0/training"
)

SAVE_DIR = os.path.join(BASE_DIR, "processed")
os.makedirs(SAVE_DIR, exist_ok=True)

SAVE_PATH = os.path.join(SAVE_DIR, "patients_v5_trends.pt")

print("Saving v5 processed data to:")
print(SAVE_PATH)

# ---------------------
# Raw PhysioNet features
# ---------------------
FEATURE_COLS_V5 = [
    "HR","O2Sat","Temp","SBP","MAP","DBP","Resp","EtCO2",
    "BaseExcess","HCO3","FiO2","pH","PaCO2","SaO2",
    "AST","BUN","Alkalinephos","Calcium","Chloride",
    "Creatinine","Bilirubin_direct","Glucose","Lactate",
    "Magnesium","Phosphate","Potassium","Bilirubin_total",
    "TroponinI","Hct","Hgb","PTT","WBC","Fibrinogen",
    "Platelets","Age","Gender","Unit1","Unit2","HospAdmTime"
]

# ---------------------
# Trend feature engineering (BATCHED)
# ---------------------
def _slope3(x):
    # Least-squares slope of one rolling window
    # (x is a NumPy array because apply() is called with raw=True)
    if len(x) < 2:
        return 0.0
    t = np.arange(len(x))
    return np.polyfit(t, x, 1)[0]

def add_trend_features_v5(df, feature_cols):
    """
    Computes trend features in batch and concatenates once
    to avoid DataFrame fragmentation.
    """
    trend_features = {}

    for col in feature_cols:
        series = df[col]

        # Delta
        trend_features[f"{col}_delta"] = series.diff()

        # Rolling mean (3 hours)
        trend_features[f"{col}_mean3"] = series.rolling(
            window=3, min_periods=1
        ).mean()

        # Rolling slope (3 hours)
        trend_features[f"{col}_slope3"] = (
            series
            .rolling(window=3, min_periods=2)
            .apply(_slope3, raw=True)
        )

    trend_df = pd.DataFrame(trend_features, index=df.index)

    # Concatenate ONCE → no fragmentation
    return pd.concat([df, trend_df], axis=1)

# ---------------------
# Build full feature list
# ---------------------
TREND_COLS_V5 = []
for c in FEATURE_COLS_V5:
    TREND_COLS_V5.extend([
        f"{c}_delta",
        f"{c}_mean3",
        f"{c}_slope3"
    ])

ALL_FEATURE_COLS_V5 = FEATURE_COLS_V5 + TREND_COLS_V5
NUM_FEATURES_V5 = len(ALL_FEATURE_COLS_V5)

print(f"Total v5 features: {NUM_FEATURES_V5}")

# ---------------------
# Collect patient files
# ---------------------
psv_files = sorted(
    glob.glob(os.path.join(DATA_DIR, "training_setA", "*.psv")) +
    glob.glob(os.path.join(DATA_DIR, "training_setB", "*.psv"))
)

print(f"Total patients: {len(psv_files)}")

# ---------------------
# Process patients (REAL PROGRESS BAR)
# ---------------------
patients_v5 = []

start_time = time.time()

with tqdm(
    total=len(psv_files),
    desc="V5 preprocessing",
    unit="patient",
    dynamic_ncols=True
) as pbar:

    for path in psv_files:
        df = pd.read_csv(path, sep="|")

        # Add trend features (optimized)
        df = add_trend_features_v5(df, FEATURE_COLS_V5)

        # Labels
        y = df["SepsisLabel"].values.astype(np.float32)

        # Feature matrix
        X = df[ALL_FEATURE_COLS_V5].values.astype(np.float32)

        # Observation mask (True = observed, False = missing)
        M = ~np.isnan(X)

        # Hours since last observed value (0 when observed or not yet observed)
        DT = np.zeros_like(X)
        for j in range(X.shape[1]):
            last = -1
            for t in range(len(X)):
                if not np.isnan(X[t, j]):
                    DT[t, j] = 0
                    last = t
                else:
                    DT[t, j] = (t - last) if last >= 0 else 0

        patients_v5.append({
            "X": torch.tensor(np.nan_to_num(X, nan=0.0)),
            "M": torch.tensor(M.astype(np.float32)),
            "DT": torch.tensor(DT.astype(np.float32)),
            "Y": torch.tensor(y),
            "length": len(y)
        })

        pbar.update(1)

elapsed = (time.time() - start_time) / 60

# ---------------------
# SAVE
# ---------------------
torch.save(
    {
        "patients": patients_v5,
        "feature_cols": ALL_FEATURE_COLS_V5
    },
    SAVE_PATH
)

print("===================================================")
print("V5 preprocessing COMPLETE")
print("Saved to:", SAVE_PATH)
print(f"Total time: {elapsed:.2f} minutes")
print("===================================================")


Saving v5 processed data to:
/content/drive/MyDrive/Thesis/PhysionetSepsis/processed/patients_v5_trends.pt
Total v5 features: 156
Total patients: 40317


V5 preprocessing COMPLETE
Saved to: /content/drive/MyDrive/Thesis/PhysionetSepsis/processed/patients_v5_trends.pt
Total time: 520.10 minutes
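
Optional sketch: vectorized time-since-last-measurement

The `DT` computation above runs two nested Python loops per patient (one pass per feature column and hour), which likely contributes to the long runtime. A possible NumPy alternative is sketched below under the same convention as the cell: 0 when the value is observed, 0 before the first observation, otherwise the number of hours since the last observation. `time_since_last_observed` is a hypothetical name, and the toy matrix is made up.

```python
import numpy as np

def time_since_last_observed(X):
    """X has shape (hours, features) with NaN marking missing values."""
    hours = np.arange(X.shape[0])[:, None]           # column vector of hour indices
    observed = ~np.isnan(X)
    # Hour index where each value was observed, -1 where it was missing
    last_seen = np.where(observed, hours, -1)
    # Running maximum down the time axis = index of the latest observation so far
    last_seen = np.maximum.accumulate(last_seen, axis=0)
    dt = np.where(last_seen >= 0, hours - last_seen, 0)
    return dt.astype(np.float32)

# Illustrative 4-hour, 2-feature matrix
toy_X = np.array([
    [1.0,    np.nan],
    [np.nan, np.nan],
    [np.nan, 5.0],
    [2.0,    np.nan],
], dtype=np.float32)

toy_DT = time_since_last_observed(toy_X)
print(toy_DT)
```

The first column is observed at hours 0 and 3, so its gaps are 0, 1, 2, 0; the second column is first observed at hour 2, so it stays at 0 until hour 3, where the gap is 1.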


CELL V5-4 — Load Preprocessed v5 Data (FAST)

Run this cell after every runtime restart.

In [None]:
# ============================================================
# V5 — LOAD PREPROCESSED DATA
# ============================================================

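import torch

data = torch.load(
    "/content/drive/MyDrive/Thesis/PhysionetSepsis/processed/patients_v5_trends.pt",
    map_location="cpu"
)

patients_v5 = data["patients"]
# "feature_cols" was saved from ALL_FEATURE_COLS_V5 (raw + trend columns),
# so from here on FEATURE_COLS_V5 holds the full list, not only the raw features.
FEATURE_COLS_V5 = data["feature_cols"]
NUM_FEATURES_V5 = len(FEATURE_COLS_V5)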

print("Loaded v5 patients:", len(patients_v5))
print("NUM_FEATURES_V5:", NUM_FEATURES_V5)

CELL V5-5 — Build Datasets & Loaders (NEW)

Uses same split logic as v3 (patient-level, no leakage).

In [None]:
# ============================================================
# V5 — DATASET & DATALOADERS
# ============================================================

from torch.utils.data import Dataset, DataLoader
import random

class ICUSequenceDataset(Dataset):
    def __init__(self, patients):
        self.patients = patients

    def __len__(self):
        return len(self.patients)

    def __getitem__(self, idx):
        p = self.patients[idx]
        return p["X"], p["M"], p["DT"], p["Y"], p["length"]

def collate_fn(batch):
    Xs, Ms, DTs, Ys, lengths = zip(*batch)
    max_len = max(lengths)

    def pad(seq):
        return torch.nn.functional.pad(
            seq, (0, 0, 0, max_len - seq.shape[0])
        )

    X = torch.stack([pad(x) for x in Xs])
    M = torch.stack([pad(m) for m in Ms])
    DT = torch.stack([pad(dt) for dt in DTs])
    Y = torch.stack([pad(y.unsqueeze(-1)).squeeze(-1) for y in Ys])

    attn_mask = torch.zeros(len(lengths), max_len)
    for i, l in enumerate(lengths):
        attn_mask[i, :l] = 1

    return X, M, DT, Y, attn_mask

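# ---------------------
# Patient-level split (same ratios as v3)
# ---------------------
# Shuffle a list of indices with a private RNG instead of shuffling
# patients_v5 in place, so re-running this cell reproduces the same split.
order = list(range(len(patients_v5)))
random.Random(42).shuffle(order)
shuffled_p = [patients_v5[i] for i in order]

n = len(shuffled_p)
train_p = shuffled_p[:int(0.7 * n)]
val_p   = shuffled_p[int(0.7 * n):int(0.85 * n)]
test_p  = shuffled_p[int(0.85 * n):]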

train_loader_v5 = DataLoader(
    ICUSequenceDataset(train_p),
    batch_size=8,
    shuffle=True,
    collate_fn=collate_fn
)

val_loader_v5 = DataLoader(
    ICUSequenceDataset(val_p),
    batch_size=8,
    shuffle=False,
    collate_fn=collate_fn
)

test_loader_v5 = DataLoader(
    ICUSequenceDataset(test_p),
    batch_size=8,
    shuffle=False,
    collate_fn=collate_fn
)

print("V5 loaders ready")
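
Quick check: what `collate_fn` returns

A small sketch to confirm the padding behaviour on toy tensors. `collate_fn` is copied from the cell above so the block runs on its own; the two fake patients (3 and 5 hours long, 4 features each) are arbitrary stand-ins, not real data.

```python
import torch

def collate_fn(batch):
    # Copied from the notebook cell above.
    Xs, Ms, DTs, Ys, lengths = zip(*batch)
    max_len = max(lengths)

    def pad(seq):
        # Zero-pad the time axis (dim 0) up to the longest stay in the batch
        return torch.nn.functional.pad(
            seq, (0, 0, 0, max_len - seq.shape[0])
        )

    X = torch.stack([pad(x) for x in Xs])
    M = torch.stack([pad(m) for m in Ms])
    DT = torch.stack([pad(dt) for dt in DTs])
    Y = torch.stack([pad(y.unsqueeze(-1)).squeeze(-1) for y in Ys])

    attn_mask = torch.zeros(len(lengths), max_len)
    for i, l in enumerate(lengths):
        attn_mask[i, :l] = 1

    return X, M, DT, Y, attn_mask

# Two fake patients: (X, M, DT, Y, length)
toy_batch = [
    (torch.ones(3, 4), torch.ones(3, 4), torch.zeros(3, 4),
     torch.tensor([0.0, 0.0, 1.0]), 3),
    (torch.ones(5, 4), torch.ones(5, 4), torch.zeros(5, 4),
     torch.zeros(5), 5),
]

X, M, DT, Y, attn_mask = collate_fn(toy_batch)
print(X.shape)                # torch.Size([2, 5, 4])
print(attn_mask.sum(dim=1))   # tensor([3., 5.])
```

The row sums of `attn_mask` recover each patient's true length, which is how the evaluation cells trim predictions before scoring.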


CELL V5-6 — Train v5 Model (UNCHANGED LOGIC)

You can now reuse your best training cell, with only: