# Cox Regression

A data set $(xi,  ti)$ for $i = 1; . . . ; 10$ is concerned with a survival regression
model, whereby

$$\large h (t \hspace{1 mm} | \hspace{1 mm} x) = h_o (t) {\rm e}^{x \beta} $$

$$where$$

$$
h_o(t) = \text{baseline hazard function} \\
h (t \hspace{1 mm} | \hspace{1 mm} x) = \text{hazard function for an individual with predictor variable x.} \\
$$

In effect, the baseline hazard is the hazard for an individual with $x = 0$.<br>
This is also known as a proportional hazards model.<br> 
Recall that if $f_o(t)$ is the baseline density function for survival time $T$ then:

$$
h_o(t) = \dfrac{f_o(t)}{S_o(t)} \hspace{3 mm} \text{where} \hspace{3 mm}  S_o(t)  = \int_{t}^{\infty}f_o(s) \hspace{2 mm} ds
$$

(1) (10 pts) If $\large f_o (t) = \theta {\rm e}^{- t \theta}$ for $t > 0$  for some $\theta > 0$, find $h_o(t)$.

\begin{align}
h_o(t) & = \dfrac{f(t)}{S(t)} \\
\\
& = \dfrac{\theta {\rm e}^{- t \theta}}{{\rm e}^{- t \theta}} \\ 
& = \theta
\end{align}

(2) Hence, write down $f(t \hspace{2 mm} | \hspace{2 mm} x, \hspace{2 mm} \theta, \hspace{2 mm} \beta)$ corresponding to the hazard function $ h (t \hspace{1 mm} | \hspace{1 mm} x)$.

$$\large h (t \hspace{1 mm} | \hspace{1 mm} x) = h_o (t) {\rm e}^{x \beta} $$

$$\large h (t \hspace{1 mm} | \hspace{1 mm} x) = \theta {\rm e}^{x \beta} $$

Recall if,

$\large \hspace{100 mm} h(t) = c   $ then $\large f(t) = c {\rm e}^{-c t}$

\begin{align}
\large
f(t \hspace{2 mm} | \hspace{2 mm} x, \hspace{2 mm} \theta, \hspace{2 mm} \beta) & \large = c {\rm e}^{-c t}
\\
&\large  = (\theta {\rm e}^{x \beta})\hspace{1 mm} {\rm e}^{- t \hspace{1 mm} (\theta {\rm e}^{x \beta})}
\end{align}

(3) Following on from part (2), write down the expression of the log-likelihood
function in terms of the data and $(\theta, \beta)$.

$$\large I(1 :n) = \prod_{i=1}^n \dfrac{h(t_i | x_i)}{\Sigma_{j\geq i}\hspace{2 mm}h(t_j | x_j)} \\
\large I(1 :n) = \prod_{i=1}^n \dfrac{h_o (t_i) {\rm e}^{x \beta}}{\Sigma_{j\geq i}\hspace{2 mm}h(t_j | x_j)}\\
\large I(1 :n) = \prod_{i=1}^n \dfrac{{\rm e}^{x_i \beta}}{\Sigma_{j\geq i}\hspace{2 mm}{\rm e}^{x_j \beta}} \\
\large L(\beta) = \Sigma_{i=1}^n \hspace{2 mm} [ x_i \beta - log \hspace{2 mm} (\Sigma_{j\geq i}^n \hspace{2 mm}{\rm e}^{x_j \beta})] $$

(4) Assuming $\theta = 1$, find the maximum likelihood estimator for $\beta$; i.e., $\hat\beta$

$$\large \dfrac{\partial L}{\partial \beta} =  \hspace{4 mm} \Sigma_{i=1}^n \hspace{2 mm}[ x_i  - \dfrac{\Sigma_{j\geq i}^n \hspace{2 mm} x_j {\rm e}^{x_j \beta}}{\Sigma_{j\geq i}^n \hspace{2 mm}{\rm e}^{x_j \beta}}]$$

$$\large \dfrac{\partial^2 L}{\partial \beta ^2} =  \hspace{4 mm} \Sigma_{i=1}^n \hspace{2 mm}
[ - \dfrac{\Sigma_{j\geq i}^n \hspace{2 mm} x_j^2 {\rm e}^{x_j \beta}}{\Sigma_{j\geq i}^n \hspace{2 mm}{\rm e}^{x_j \beta}} 
+ (\dfrac{\Sigma_{j\geq i}^n \hspace{2 mm} x_j {\rm e}^{x_j \beta}}{\Sigma_{j\geq i}^n \hspace{2 mm}{\rm e}^{x_j \beta}})^2]$$

In [1]:
import math
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd

In [2]:
df = pd.read_csv('data_2.csv')
df.sort_values('t', inplace = True)
df.reset_index(inplace = True, drop =True)
xArr = df.x.values
df

Unnamed: 0,x,t
0,1.812486,0.017334
1,-0.34168,0.146967
2,-0.026972,0.293785
3,-1.326232,0.304046
4,0.073958,0.309173
5,0.287339,0.361833
6,0.783739,0.639589
7,0.416258,0.687852
8,0.304462,1.744188
9,-1.457712,2.956782


In [3]:
def get_Lprime(x, beta):
    total = 0
    for idx, xi in enumerate(x):        
        total += (xi - np.sum(x[idx:]*np.exp(x[idx:]*beta)) /  np.sum(np.exp(x[idx:]*beta)))
    return total

def get_Lprime2(x, beta):
    total = 0
    for idx, xi in enumerate(x): 
        term_1 = np.sum((x[idx:]**2)*np.exp(x[idx:]*beta)) /  np.sum(np.exp(x[idx:]*beta))
        term2 = np.sum(x[idx:]*np.exp(x[idx:]*beta)) /  np.sum(np.exp(x[idx:]*beta))**2
        
        total += (-term_1 + term2)
    return total

In [4]:
def newton_raphson(xArr, b0, tolerance = 0.00001):
    """
    Performs Newton-Raphson root finding.
    
    Args:
        xArr (np.ndarray): Column array with x values.
        yArr (np.ndarray): Column array with y values (discrete).
        b_0 (float): Initial guess for regression parameters.
        tolerance (float): Stops iteration when difference between iterations
            is within tolerance.
    """

 
    difference = tolerance * 5
    iter = 0 
    beta_iter = [b0]
    while abs(difference) > tolerance:
        
        f = get_Lprime(xArr, b0)         
        f_prime = get_Lprime2(xArr, b0)    
        b = b0 - (f / f_prime)

        # calculate difference and update iteration state
        difference = abs(b-b0)
        b0 = b

        beta_iter.append(b0)
        iter +=1
    
    if iter == 50:
        return np.nan, np.nan
    
    return b, beta_iter

In [5]:
# Run NR with different starting points
betas = [2, 4, 0, -2 , 1]

root = [] # Container for different starting points
for b_0 in betas:
    beta_1, _ = newton_raphson(xArr, b_0)
    root.append(beta_1)

In [6]:
df_nr = pd.DataFrame(root, columns = ['beta_hat'])
df_nr.index.name = 'Trial'
df_nr

Unnamed: 0_level_0,beta_hat
Trial,Unnamed: 1_level_1
0,0.61442
1,0.614421
2,0.614396
3,0.614394
4,0.61442


In [7]:
print('Maximum Likelihood Estimator for Beta_hat =', df_nr.mean()[0])

Maximum Likelihood Estimator for Beta_hat = 0.6144101284400094


(5) Parametric bootstrap

In [8]:
beta_hat = df_nr.mean()[0]


0.6144101284400094

$$\large h (t \hspace{1 mm} | \hspace{1 mm} x) = \theta {\rm e}^{x \beta} $$

In [12]:
beta_i = []
for i in range(1000):
    
    # Obtain T samples using exponential parameter
    ti = []
    for xi in xArr:    

        ti_iter = np.random.exponential(np.exp(xi*beta_hat), size = 1)[0]        
        ti.append(ti_iter)
    
    df_temp = pd.DataFrame({'x': xArr, 't':ti})
    df_temp.sort_values('t', inplace = True)
    df_temp.reset_index(inplace = True, drop =True)

    beta_1, _ = newton_raphson(df_temp['x'].values, 0.9, tolerance = 0.001)
    beta_i.append(beta_1)
    print(i, end=',')

In [14]:
beta_i

[-0.44128565135989534,
 -1.0953093059284416,
 -0.6812944241614777,
 -4.140485072829828,
 -4.127204511778041,
 -4.451728178784738,
 -9.728527716858949,
 -0.6057888633598061,
 -0.40939509357178167,
 -0.6387302064209397,
 -0.36162500908467243,
 -0.35272818511737514,
 -0.3763892216212748,
 -0.5896760055314709,
 -0.47203898545473905,
 -0.7927331815271327,
 -0.18363215443787298,
 -4.436440859129645,
 -0.34780863224833436,
 -1.1726970664199043]

In [13]:
var_beta_hat_pb = np.array(beta_i).var(ddof = 1)
var_beta_hat_pb

5.8721542988497095

$$ \beta \sim N( 0.614, 0.1123)  $$