# Individual risk modeling with Cox proportional hazards and Random Survival Forests

What you will learn  
- Fit a Cox model and interpret hazard ratios with confidence intervals  
- Check proportional hazards using Schoenfeld residuals  
- Fit a Random Survival Forest (RSF) for non-linear effects and interactions  
- Evaluate with the concordance index, time-dependent AUC at 7, 30, and 90 days, and the integrated Brier score  
- Calibrate 30-day survival probabilities and stratify patients into risk quintiles

Clinical lens  
- Cox yields hazard ratios that are easy to explain, which suits clinical interpretation and policy rules  
- RSF suits flexible prediction when relationships are non-linear or involve interactions
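
To make the Cox point concrete, here is the standard formulation. The symbols $h_0$, $\beta$, and $x$ are introduced here for illustration and are not defined elsewhere in this notebook:

$$
h(t \mid x) = h_0(t)\,\exp(\beta^\top x), \qquad \mathrm{HR}_j = \exp(\beta_j), \qquad 95\%\ \text{CI: } \exp\!\big(\hat\beta_j \pm 1.96\,\mathrm{SE}(\hat\beta_j)\big)
$$

A hazard ratio of 1.5 for a covariate reads as a 50% higher instantaneous risk per unit increase, holding the other covariates fixed and assuming the proportional hazards condition holds.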


In [None]:
# Why: import once and confirm environment
import warnings
# Silence library warnings to keep notebook output readable; comment this out when debugging
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.impute import SimpleImputer
from sklearn.inspection import permutation_importance

import lifelines
from lifelines import CoxPHFitter
from lifelines.statistics import proportional_hazard_test
from lifelines.calibration import survival_probability_calibration

from sksurv.util import Surv
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import (
    concordance_index_censored,
    cumulative_dynamic_auc,
    integrated_brier_score,
)

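import platform
import sksurv

# Report the versions of every library the notebook relies on
print("Python", platform.python_version())
print("pandas", pd.__version__, "numpy", np.__version__, "scikit-learn", sklearn.__version__)
print("lifelines", lifelines.__version__, "scikit-survival", sksurv.__version__)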


## Validate labels and define analysis targets

We align the time scale to in-hospital death and verify label consistency

- Duration = `Length_of_stay` in days  
- Event = inferred from (`Survival`, `Length_of_stay`) using the dataset rules and cross-checked with `In-hospital_death`  
- If the provided and inferred labels disagree by more than a small tolerance, we use the inferred labels

This teaches students to check assumptions before building models and to connect modeling targets to their clinical definitions
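
The check described above can be sketched on a toy table. This is a minimal illustration, not the notebook's actual rule set: it assumes that `Survival` holds days from admission to death with `-1` meaning no death was recorded, and that a death counts as in-hospital when `0 <= Survival <= Length_of_stay`. The `TOLERANCE` value, the `infer_event` helper, and the toy rows are all hypothetical, so replace them with the dataset's documented rules before use.

```python
import pandas as pd

# Toy rows standing in for the real table (values are made up for illustration)
toy = pd.DataFrame({
    "Survival": [-1, 5, 40, 3, -1, 12],
    "Length_of_stay": [7, 9, 10, 3, 20, 15],
    "In-hospital_death": [0, 1, 0, 1, 1, 1],
})

TOLERANCE = 0.01  # hypothetical: maximum acceptable fraction of disagreeing rows


def infer_event(df: pd.DataFrame) -> pd.Series:
    """Infer in-hospital death under the ASSUMED rule 0 <= Survival <= Length_of_stay."""
    died_in_hospital = (df["Survival"] >= 0) & (df["Survival"] <= df["Length_of_stay"])
    return died_in_hospital.astype(int)


inferred = infer_event(toy)
provided = toy["In-hospital_death"].astype(int)

# Fraction of rows where the provided label contradicts the inferred one
disagreement = float((inferred != provided).mean())

# Fall back to the inferred labels when the mismatch exceeds the tolerance
event = inferred if disagreement > TOLERANCE else provided
duration = toy["Length_of_stay"].astype(float)

print(f"disagreement rate: {disagreement:.3f}")
print(pd.crosstab(provided, inferred, rownames=["provided"], colnames=["inferred"]))
```

The cross-tabulation shows at a glance which direction the mismatches run, which helps decide whether the provided column or the inference rule is the more likely source of error. Missing values in either column are not handled in this sketch.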
