# Example: Deep Gaussian Processes for Survival Analysis


This example is based on the paper [Deep Multi-task Gaussian Processes for Survival Analysis with Competing Risks](https://papers.nips.cc/paper/6827-deep-multi-task-gaussian-processes-for-survival-analysis-with-competing-risks.pdf).

This tutorial shows how to use deep Gaussian processes (DGP) for survival analysis. It uses [The Veterans' Administration Lung Cancer Trial](http://lib.stat.cmu.edu/datasets/veteran) as the example dataset and is based on an example from [scikit-survival](https://github.com/sebp/scikit-survival/blob/master/examples/00-introduction.ipynb).

The dataset must first be written to a CSV file that the model can process. Events need to be encoded as integers (event 1 = 1, event 2 = 2, and so on), with 0 marking observations where no event occurred, and the event time is given in days.
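The veterans dataset records a single event type, so a 0/1 indicator is enough here. For a dataset with several competing events, the integer encoding described above could be sketched as follows. The helper `encode_events` and the labels `"cancer"`, `"cardiac"`, and `"censored"` are hypothetical and only illustrate the mapping; they are not part of `dgp.py` or scikit-survival.

```python
def encode_events(labels, censored_label="censored"):
    """Map event labels to integer codes: 0 for censored, 1..K for event types.

    Hypothetical helper for illustration; event types are numbered in
    alphabetical order so the mapping is reproducible.
    """
    event_types = sorted({label for label in labels if label != censored_label})
    codes = {label: i + 1 for i, label in enumerate(event_types)}
    codes[censored_label] = 0
    return [codes[label] for label in labels], codes


# Made-up labels for four patients.
labels = ["cancer", "censored", "cardiac", "cancer"]
encoded, mapping = encode_events(labels)
print(encoded)
print(mapping)
```

Returning the mapping together with the encoded column makes it possible to translate the integer codes back to event names afterwards.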

In [None]:
from sksurv.datasets import load_veterans_lung_cancer
from sksurv.preprocessing import OneHotEncoder
import numpy as np

data_x, data_y = load_veterans_lung_cancer()

# convert categorical variables into numeric values.
data_x_numeric = OneHotEncoder().fit_transform(data_x)

df = data_x_numeric
# Encode the event indicator as an integer: 1 = event observed, 0 = censored.
df['Status'] = np.where(data_y['Status'], 1, 0)
df['Survival_in_days'] = data_y['Survival_in_days']
df.to_csv('veterans_lung_cancer.csv', index=False)
df.head()

Run DGP and calculate the c-index:

In [None]:
event_horizon_days = 365*5  # 5 years
!python3 dgp.py -i 'veterans_lung_cancer.csv' --target 'Status' --time 'Survival_in_days' --horizon {event_horizon_days}
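For readers unfamiliar with the c-index, the following is a minimal pure-Python sketch of Harrell's concordance index for a single event type. It is not the computation performed inside `dgp.py`, which is not shown here; the function name `concordance_index` and the toy data are assumptions made for illustration. A pair of patients is treated as comparable when the one with the shorter time had an observed event, and pairs with tied times are skipped for simplicity.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index: fraction of comparable pairs ranked correctly.

    A higher risk score is expected to go with a shorter survival time.
    Ties in risk score count as half-concordant; ties in time are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair needs an observed event at the shorter time
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival times in days, event indicators (1 = event, 0 = censored),
# and made-up predicted risk scores.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk_scores = [0.9, 0.4, 0.1, 0.6]
c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking. With competing risks, a cause-specific variant would typically be computed per event type.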