<center><h1 style="background-color: #C6F3CD; border-radius: 10px; color: #FFFFFF; padding: 5px;">
Cox Regression
</h1></center>

**Link to the article:** https://medium.com/@soulawalid/cox-regression-for-survival-analysis-fd60e2b32dd2?sk=c28320cd2dfb2c8ded15337182e18b2e

In [1]:
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

In [3]:
# Set seed for reproducibility
np.random.seed(42)

# Number of patients
n_patients = 100

# Simulate age and gender
ages = np.random.normal(60, 10, n_patients).astype(int)  # Age around 60 with SD of 10
genders = np.random.choice([0, 1], n_patients)  # 0: Male, 1: Female

# Simulate treatment type
treatments = np.random.choice([0, 1], n_patients)  # 0: Treatment A, 1: Treatment B

# Simulate survival times (days) and events
# As written, Treatment B's hazard is 1.5x Treatment A's, so B has shorter expected survival
base_hazard = 0.02
time_A = np.random.exponential(scale=1 / base_hazard, size=n_patients)  # Treatment A (mean 50 days)
time_B = np.random.exponential(scale=1 / (base_hazard * 1.5), size=n_patients)  # Treatment B (higher hazard, mean ~33 days)

# Combine times based on treatment type
survival_times = np.where(treatments == 0, time_A, time_B)

# Simulate events (1: event occurred, 0: censored)
events = np.random.binomial(1, 0.8, n_patients)  # 80% event occurrence rate

# Create DataFrame
data = pd.DataFrame({
    'age': ages,
    'gender': genders,
    'treatment': treatments,
    'time': survival_times,
    'event': events
})

data.head()

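|   | age | gender | treatment | time | event |
|---|-----|--------|-----------|------|-------|
| 0 | 64 | 0 | 1 | 14.667138 | 1 |
| 1 | 58 | 1 | 0 | 34.921766 | 1 |
| 2 | 66 | 0 | 0 | 43.007795 | 1 |
| 3 | 75 | 0 | 1 | 4.11268 | 1 |
| 4 | 57 | 0 | 0 | 10.860745 | 1 |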
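
Because the survival times are drawn from exponential distributions, the hazard implied by each `scale` argument can be read off directly: the hazard is `1 / scale`, the mean survival time equals `scale`, and the median is `ln(2) * scale`. The sketch below works through that arithmetic for the two arms using only the constants from the simulation cell. Variable names other than `base_hazard` are illustrative and not part of the original notebook.

```python
import math

# Constants taken from the simulation cell
base_hazard = 0.02        # daily hazard for Treatment A
hazard_multiplier = 1.5   # factor applied to Treatment B's hazard in the code

hazard_a = base_hazard
hazard_b = base_hazard * hazard_multiplier

# For an exponential distribution: mean = 1 / hazard, median = ln(2) / hazard
mean_a, mean_b = 1 / hazard_a, 1 / hazard_b
median_a, median_b = math.log(2) / hazard_a, math.log(2) / hazard_b

# Log hazard ratio of B relative to A implied by the simulation settings
true_log_hr = math.log(hazard_b / hazard_a)

print(f"Treatment A: mean {mean_a:.1f} days, median {median_a:.1f} days")
print(f"Treatment B: mean {mean_b:.1f} days, median {median_b:.1f} days")
print(f"Implied log hazard ratio (B vs A): {true_log_hr:.3f}")
```

Under these settings Treatment B has the higher hazard, so a Cox model fitted to this data would be expected to estimate a positive `treatment` coefficient in the neighbourhood of ln(1.5), subject to sampling noise with only 100 patients.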


In [4]:
# Initialize and fit the Cox Proportional Hazards model
cph = CoxPHFitter()
cph.fit(data, duration_col='time', event_col='event')

# Print the summary of the model
cph.print_summary()

|   |   |
|---|---|
| model | lifelines.CoxPHFitter |
| duration col | 'time' |
| event col | 'event' |
| baseline estimation | breslow |
| number of observations | 100 |
| number of events observed | 70 |
| partial log-likelihood | -247.51 |
| time fit was run | 2024-07-22 12:14:09 UTC |

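|   | coef | exp(coef) | se(coef) | coef lower 95% | coef upper 95% | exp(coef) lower 95% | exp(coef) upper 95% | cmp to | z | p | -log2(p) |
|---|------|-----------|----------|----------------|----------------|---------------------|---------------------|--------|---|---|----------|
| age | -0.02 | 0.98 | 0.01 | -0.05 | 0.01 | 0.96 | 1.01 | 0.0 | -1.56 | 0.12 | 3.07 |
| gender | -0.44 | 0.65 | 0.25 | -0.92 | 0.05 | 0.4 | 1.05 | 0.0 | -1.75 | 0.08 | 3.66 |
| treatment | 0.52 | 1.69 | 0.26 | 0.02 | 1.03 | 1.02 | 2.79 | 0.0 | 2.04 | 0.04 | 4.58 |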
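
The columns of the coefficient table are linked by the usual Wald formulas: the hazard ratio is `exp(coef)`, the 95% interval is `exp(coef ± 1.96 * se)`, and the two-sided p-value comes from the standard normal distribution. The sketch below is a hand calculation, not lifelines' internal code. `wald_summary` is a hypothetical helper, and its inputs are the two-decimal values printed in the table, so the results agree with the reported figures only up to rounding. The `age` row is left out because its coefficient and standard error are too small for two-decimal inputs to be meaningful.

```python
import math

def wald_summary(coef, se, z_crit=1.96):
    """Hazard ratio, 95% CI, z-score and two-sided p-value from coef and se."""
    z = coef / se
    return {
        "hazard_ratio": math.exp(coef),
        "ci_lower": math.exp(coef - z_crit * se),
        "ci_upper": math.exp(coef + z_crit * se),
        "z": z,
        # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2))
        "p_value": math.erfc(abs(z) / math.sqrt(2)),
    }

# (coef, se) pairs copied from the rounded summary table
rows = {"gender": (-0.44, 0.25), "treatment": (0.52, 0.26)}
results = {name: wald_summary(coef, se) for name, (coef, se) in rows.items()}

for name, r in results.items():
    print(f"{name}: HR={r['hazard_ratio']:.2f} "
          f"[{r['ci_lower']:.2f}, {r['ci_upper']:.2f}], p={r['p_value']:.3f}")
```

For `treatment`, the hazard ratio comes out at about 1.68 with an interval that stays above 1, so patients coded `treatment = 1` have a higher estimated hazard. The interval for `gender` spans 1, so that effect is not distinguishable from zero at the 5% level.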

|   |   |
|---|---|
| Concordance | 0.59 |
| Partial AIC | 501.02 |
| log-likelihood ratio test | 8.20 on 3 df |
| -log2(p) of ll-ratio test | 4.57 |
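
The goodness-of-fit figures can be cross-checked by hand from the reported partial log-likelihood and likelihood ratio statistic. The sketch below assumes the standard definitions: partial AIC is `-2 * log-likelihood + 2 * k` for `k` fitted coefficients, and the likelihood ratio statistic is compared against a chi-squared distribution with 3 degrees of freedom. The closed-form tail probability used here is valid only for 3 degrees of freedom, and the helper name `chi2_sf_3df` is illustrative.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Values copied from the model summary
partial_log_likelihood = -247.51
n_coefficients = 3          # age, gender, treatment
lr_statistic = 8.20         # log-likelihood ratio test statistic

partial_aic = -2 * partial_log_likelihood + 2 * n_coefficients
p_value = chi2_sf_3df(lr_statistic)
neg_log2_p = -math.log2(p_value)

print(f"Partial AIC: {partial_aic:.2f}")
print(f"LR test p-value: {p_value:.4f}")
print(f"-log2(p): {neg_log2_p:.2f}")
```

With these inputs the calculation reproduces the reported partial AIC of 501.02 and gives a p-value of roughly 0.04, consistent with the `-log2(p)` of 4.57 in the summary. The concordance of 0.59 is only modestly above the 0.5 expected from random ranking, so the model's discrimination is weak even though the joint test is significant at the 5% level.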
