In [1]:
import os
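# Move two directories up so that relative paths such as data/ and model/ resolve.
# os.chdir returns None, so its result is not assigned.
os.chdir(os.path.dirname(os.path.dirname(os.getcwd())))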

In [2]:
import numpy as np
import pandas as pd

from model import model_architecture, output_results, utils
from sksurv.util import Surv as skSurv

Using TensorFlow backend.


# I. Simulations

We first choose the type of pseudo-observation to use from the following:
- "pseudo-optim"
- "pseudo-km"
- "pseudo-continuous"
- "pseudo-discrete"
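
For readers unfamiliar with pseudo-observations, the sketch below shows the standard jackknife construction applied to a Kaplan-Meier estimate of S(t) at a single time point. It is only an illustration: the function names and the toy data are ours, and this notebook loads precomputed pseudo-observations from CSV files, so the four variants listed above are not necessarily computed this way.

```python
import numpy as np

def km_survival(durations, events, t):
    """Kaplan-Meier estimate of S(t) for right-censored data."""
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)
    surv = 1.0
    # Multiply the conditional survival factors over distinct event times <= t.
    for u in np.unique(durations[(events == 1) & (durations <= t)]):
        at_risk = np.sum(durations >= u)
        died = np.sum((durations == u) & (events == 1))
        surv *= 1.0 - died / at_risk
    return surv

def pseudo_observations(durations, events, t):
    """Jackknife pseudo-observations: n * S_hat(t) - (n - 1) * S_hat_without_i(t)."""
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)
    n = len(durations)
    full = km_survival(durations, events, t)
    pseudo = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # leave observation i out
        loo = km_survival(durations[keep], events[keep], t)
        pseudo[i] = n * full - (n - 1) * loo
    return pseudo

# Toy data, made up for illustration: 1 = event observed, 0 = censored.
durations = [2.0, 3.0, 4.0, 5.0, 7.0]
events = [1, 0, 1, 1, 0]
pseudo = pseudo_observations(durations, events, t=4.5)
```

Without censoring, each pseudo-observation reduces to the indicator that the subject is still event-free at time t, which is why these values can serve as regression targets for every subject, censored or not.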

In [3]:
name = "pseudo-optim" 

We choose the censoring rate of the simulated data (0.2, 0.4 or 0.6). The data are simulated with the random function generator introduced by Friedman et al. (2001).
The data are split into a training set and a test set (df_train and df_test are subsets of df_sim), and both sets are normalized with the mean and standard deviation of the training set. The same training and test sets are used for all the models.
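
As an illustration of normalizing with training-set statistics only, here is a minimal sketch. The function `standardize_with_train_stats` and the random toy arrays are hypothetical; the notebook itself relies on `utils.normed_data`, whose implementation is not shown here and may differ.

```python
import numpy as np

def standardize_with_train_stats(x_train, x_test):
    """Center and scale both sets with the mean and std of the training set."""
    x_train = np.asarray(x_train, dtype=float)
    x_test = np.asarray(x_test, dtype=float)
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # leave constant columns unscaled
    return (x_train - mean) / std, (x_test - mean) / std

# Toy covariates, made up for illustration.
rng = np.random.default_rng(0)
x_tr = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
x_te = rng.normal(loc=5.0, scale=2.0, size=(50, 3))
x_tr_n, x_te_n = standardize_with_train_stats(x_tr, x_te)
```

Reusing the training statistics on the test set keeps information from the test set out of the preprocessing step.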

In [4]:
rate = "0.2" 

In [5]:
dir_sim = "data/simulations/"+rate+"/"

In [6]:
df_sim = pd.read_csv(dir_sim+'simdata.csv')
df_train = pd.read_csv(dir_sim+'sim_train.csv')
y_train = pd.read_csv(dir_sim+ name + "_" + str(rate) +".csv")

df_test = pd.read_csv(dir_sim+'sim_test.csv')
y_test_all = df_test[['yy','status']]
durations_test, events_test = df_test['yy'].values, df_test['status'].values

In [7]:
x_train, x_test = utils.normed_data(df_train, df_test)
x_train_all, y_train_all, x_test_all = utils.prepare_pseudobs(x_train, y_train, df_train, x_test, df_test, name)

# II. Model construction and training

The architecture parameters are those listed in the parameters dataframe. They were selected by 5-fold cross-validation among 100 sets of parameters.
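
A selection procedure of this kind can be sketched as a random search evaluated with k-fold cross-validation. Everything below is an assumption made for illustration: the search space, the helper names and the placeholder score are ours, and the search actually used to produce the parameter files is not part of this notebook.

```python
import random
import numpy as np

# Hypothetical search space; the real candidate values are not given in this notebook.
SEARCH_SPACE = {
    "neurons": [32, 64, 128, 256],
    "drop": [0.2, 0.4, 0.6],
    "activation": ["relu", "tanh"],
    "lr_opt": [0.001, 0.0025, 0.01],
    "optimizer": ["adam", "rmsprop"],
    "n_layers": [1, 2, 3],
}

def sample_params(space, n_sets, seed=0):
    """Draw n_sets random parameter combinations from the search space."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in space.items()} for _ in range(n_sets)]

def kfold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for k in range(n_folds):
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train, folds[k]

def random_search(score_fn, n_samples, space=SEARCH_SPACE, n_sets=100, n_folds=5):
    """Return the parameter set with the lowest mean validation score."""
    best_params, best_score = None, float("inf")
    for params in sample_params(space, n_sets):
        scores = [score_fn(params, tr, va) for tr, va in kfold_indices(n_samples, n_folds)]
        mean_score = float(np.mean(scores))
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Placeholder score: a real score_fn would train the network on train_idx
# and return its validation loss on val_idx.
def dummy_score(params, train_idx, val_idx):
    return abs(params["drop"] - 0.6) + abs(params["n_layers"] - 2)

best, score = random_search(dummy_score, n_samples=500)
```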

In [8]:
param = pd.read_csv("model/param_simu_"+rate+".csv",sep=';', index_col = 0).T
param_final = param.loc[name]

In [9]:
param_final

neurons           128
drop              0.6
activation       relu
lr_opt         0.0025
optimizer     rmsprop
n_layers            2
Name: pseudo-optim, dtype: object

In [10]:
neurons = int(param_final['neurons'])
drop = float(param_final['drop'])
activation = param_final['activation']
lr_opt = float(param_final['lr_opt'])
optimizer = param_final['optimizer']
n_layers = int(param_final['n_layers'])

The objective function defines the architecture of the neural network and returns the model together with its training callbacks.

In [11]:
model, callbacks = model_architecture.objective_pseudobs(x_train_all, neurons, drop, activation, lr_opt, optimizer, n_layers)
log = model.fit(x_train_all, y_train_all, batch_size=32, epochs=100, callbacks=callbacks, verbose=2)

Epoch 1/100
 - 1s - loss: 0.2295
Epoch 2/100
 - 1s - loss: 0.1686
Epoch 3/100
 - 1s - loss: 0.1481
Epoch 4/100
 - 1s - loss: 0.1441
Epoch 5/100
 - 1s - loss: 0.1363
Epoch 6/100
 - 1s - loss: 0.1307
Epoch 7/100
 - 1s - loss: 0.1286
Epoch 8/100
 - 1s - loss: 0.1252
Epoch 9/100
 - 1s - loss: 0.1234
Epoch 10/100
 - 1s - loss: 0.1186
Epoch 11/100
 - 1s - loss: 0.1177
Epoch 12/100
 - 1s - loss: 0.1143
Epoch 13/100
 - 1s - loss: 0.1149
Epoch 14/100
 - 1s - loss: 0.1105
Epoch 15/100
 - 1s - loss: 0.1102
Epoch 16/100
 - 1s - loss: 0.1122
Epoch 17/100
 - 1s - loss: 0.1084
Epoch 18/100
 - 1s - loss: 0.1084
Epoch 19/100
 - 1s - loss: 0.1067
Epoch 20/100
 - 1s - loss: 0.1035
Epoch 21/100
 - 1s - loss: 0.1057
Epoch 22/100
 - 1s - loss: 0.1047
Epoch 23/100
 - 1s - loss: 0.1065
Epoch 24/100
 - 1s - loss: 0.1044
Epoch 25/100
 - 1s - loss: 0.1039
Epoch 26/100
 - 1s - loss: 0.1018
Epoch 27/100
 - 1s - loss: 0.1039
Epoch 28/100
 - 1s - loss: 0.0983
Epoch 29/100
 - 1s - loss: 0.1010
Epoch 30/100
 - 1s - lo

# III. Results

We present here the results for a single simulated dataset. The same procedure is then repeated on 100 simulated datasets for a given censoring rate, and the results are reported for all 100 datasets.

In [12]:
surv = output_results.make_predictions_pseudobs(model, y_train, x_test_all, x_test, name)
results_all = output_results.output_simulations(surv, df_train, x_test, df_test, name)

We output the median survival time, the AUC at that time, Uno's C-index at the median time, and the final censoring rate of the simulated dataset (in percent).
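
Two of these quantities are simple enough to sketch directly. The helpers below are hypothetical and only illustrate common definitions: the censoring rate as the percentage of observations with `status == 0`, and the median survival time as the first time at which a survival curve drops to 0.5 or below. The exact definitions used inside `output_results.output_simulations` are not shown in this notebook and may differ.

```python
import numpy as np

def censoring_rate(status):
    """Percentage of censored observations (status == 0)."""
    status = np.asarray(status)
    return 100.0 * np.mean(status == 0)

def median_survival_time(times, surv):
    """First time at which the survival curve is <= 0.5 (NaN if it never is)."""
    times = np.asarray(times, dtype=float)
    surv = np.asarray(surv, dtype=float)
    below = np.flatnonzero(surv <= 0.5)
    return times[below[0]] if below.size else np.nan

# Toy values, made up for illustration.
toy_status = [1, 1, 0, 1, 1]
toy_times = [0.5, 1.0, 1.5, 2.0]
toy_surv = [0.9, 0.7, 0.45, 0.2]

rate_pct = censoring_rate(toy_status)
t_median = median_survival_time(toy_times, toy_surv)
```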

In [13]:
results_all

      t_med  auc_med      unoc  cens_rate
0  1.203649  0.81119  0.765314  19.333333
