## Step 1: load datasets
Below we generate some toy datasets using generate_toy_datasets(), defined in utils.py. You can instead load your own survival datasets into "datasets", which should be a list of (X, time, event) tuples, where X, time, and event are the design matrix, survival times, and event indicators for a given dataset.

In [None]:
from utils import generate_toy_datasets

n_datasets = 10 # generate 10 datasets
n_min, n_max = 100, 200 # number of samples in each dataset is a random integer between 100 and 200
n_features = 10000 # number of features is 10000
datasets = generate_toy_datasets(n_datasets, n_min, n_max, n_features)
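
If you bring your own data, the "datasets" list described above could be assembled as in the following sketch. It uses synthetic NumPy arrays; the array types, the 1/0 coding of the event vector (1 = event observed, 0 = censored), and the helper name make_dataset are assumptions for illustration, not something utils.py is known to require.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_dataset(n_samples, n_features):
    """Hypothetical helper: build one (X, time, event) tuple from synthetic data."""
    # X: design matrix, one row per sample and one column per feature
    X = rng.normal(size=(n_samples, n_features))
    # time: positive follow-up time for each sample
    time = rng.exponential(scale=10.0, size=n_samples)
    # event: assumed coding, 1 = event observed, 0 = censored
    event = rng.integers(0, 2, size=n_samples)
    return X, time, event


# three datasets with different sample counts but the same features
my_datasets = [make_dataset(n, 50) for n in (120, 150, 180)]

# basic consistency check: all three arrays must agree on the sample count
for X, time, event in my_datasets:
    assert X.shape[0] == time.shape[0] == event.shape[0]
```

Every dataset in the list shares the same feature columns, since the later steps select features and train one model across all datasets.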

## Step 2 (optional): feature transformation
If necessary, we can first standardize X. preprocessing.py provides two basic feature transformation functions:
- __rank_transform()__: transform features of each sample into normalized ranks
- __zscore_transform()__: transform each feature to be zero mean and unit std across samples

We can then wrap either of them in a FeatureTransformer object, which defines the fit_transform method for our "datasets" list.

In [None]:
from preprocessing import rank_transform, FeatureTransformer

feature_transformer = FeatureTransformer(rank_transform)
datasets_transformed = feature_transformer.fit_transform(datasets)
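
To make the two transformations concrete, here is a minimal NumPy sketch of what they might compute. The function names with the _sketch suffix are hypothetical, and the details (ranks scaled to [0, 1], ties broken by position, population standard deviation) are assumptions; the actual code in preprocessing.py may differ.

```python
import numpy as np


def rank_transform_sketch(X):
    """Replace each sample's (row's) features by their ranks, scaled to [0, 1]."""
    # argsort of argsort yields the 0-based rank of every entry within its row
    ranks = np.argsort(np.argsort(X, axis=1), axis=1)
    return ranks / max(X.shape[1] - 1, 1)


def zscore_transform_sketch(X):
    """Scale each feature (column) to zero mean and unit std across samples."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant features at zero instead of dividing by 0
    return (X - mean) / std


X_demo = np.array([[3.0, 1.0, 2.0],
                   [10.0, 30.0, 20.0]])
R_demo = rank_transform_sketch(X_demo)    # works row by row
Z_demo = zscore_transform_sketch(X_demo)  # works column by column
```

The rank transform depends only on the ordering of features within a sample, which is one reason it can be attractive when datasets were measured on different scales.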

## Step 3: split into training and validation
We provide train_test_split() in utils.py to split the "datasets" list in a stratified way: each dataset in "datasets" is split separately according to test_size.

In [None]:
from utils import train_test_split

datasets_train, datasets_val = train_test_split(datasets_transformed, test_size=0.2)
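
The per-dataset split described above can be sketched as follows. This is an illustration only: split_each_dataset is a hypothetical name, and the shuffling and rounding rules are assumptions rather than the behavior of train_test_split() in utils.py.

```python
import numpy as np


def split_each_dataset(datasets, test_size=0.2, seed=0):
    """Split every (X, time, event) tuple separately into train and validation parts."""
    rng = np.random.default_rng(seed)
    train, val = [], []
    for X, time, event in datasets:
        n = X.shape[0]
        perm = rng.permutation(n)          # shuffle the sample indices
        n_val = int(round(n * test_size))  # validation size for this dataset
        val_idx, train_idx = perm[:n_val], perm[n_val:]
        train.append((X[train_idx], time[train_idx], event[train_idx]))
        val.append((X[val_idx], time[val_idx], event[val_idx]))
    return train, val


toy = [(np.zeros((n, 4)), np.ones(n), np.ones(n, dtype=int)) for n in (100, 150)]
toy_train, toy_val = split_each_dataset(toy, test_size=0.2)
```

Because the split is done inside each dataset, every dataset contributes the same fraction of its samples to the validation list.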

## Step 4 (optional): feature selection
We can additionally perform a feature selection step to reduce the number of features before model training. In feature_selection.py we provide a feature selection method based on the concordance index, which is commonly used to characterize a feature's association with survival.

Note that our feature selection for multiple datasets is based on meta-analysis. The concordance index is calculated for each dataset, and the per-dataset values are combined into a meta-score according to the size of each dataset. This is done by wrapping the score function in a SelectKBestMeta object, which defines the fit and transform methods.

In [None]:
from feature_selection import concordance_index, SelectKBestMeta

topK = 1024 # select top 1024 features

## fit on training
feature_selector = SelectKBestMeta(concordance_index, topK)
feature_selector.fit(datasets_train)

## apply to both
datasets_train_new = feature_selector.transform(datasets_train)
datasets_val_new = feature_selector.transform(datasets_val)
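
The meta-analysis idea can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration: the function names, the convention that a larger feature value means higher risk, the sample-size-weighted average, and ranking features by their distance from 0.5. The actual logic in feature_selection.py may differ.

```python
import numpy as np


def concordance_index_sketch(x, time, event):
    """Concordance between one feature x and survival (simple O(n^2) version)."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is comparable only if the earlier time is an event
        for j in range(n):
            if time[i] < time[j]:
                comparable += 1
                if x[i] > x[j]:
                    concordant += 1.0  # higher value, shorter survival
                elif x[i] == x[j]:
                    concordant += 0.5  # ties count as half
    return concordant / comparable if comparable else 0.5


def meta_scores_sketch(datasets):
    """Average per-dataset concordance indices, weighted by dataset size."""
    sizes = np.array([X.shape[0] for X, _, _ in datasets], dtype=float)
    per_dataset = np.array([
        [concordance_index_sketch(X[:, k], t, e) for k in range(X.shape[1])]
        for X, t, e in datasets
    ])
    return (sizes / sizes.sum()) @ per_dataset


def select_top_k_sketch(scores, k):
    """Indices of the k features whose meta-score is farthest from 0.5."""
    return np.argsort(np.abs(scores - 0.5))[::-1][:k]


time_demo = np.array([1.0, 2.0, 3.0, 4.0])
event_demo = np.array([1, 1, 0, 1])
X_demo = np.column_stack([[4.0, 3.0, 2.0, 1.0],   # perfectly concordant
                          [1.0, 2.0, 3.0, 4.0],   # perfectly discordant
                          [1.0, 1.0, 1.0, 1.0]])  # uninformative
demo = [(X_demo, time_demo, event_demo),
        (X_demo[:3], time_demo[:3], event_demo[:3])]
scores_demo = meta_scores_sketch(demo)
top_demo = select_top_k_sketch(scores_demo, 2)
```

Fitting the selector on the training list only, as in the cell above, keeps the validation data out of the feature ranking.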

## Step 5: user-defined Keras model
This is the core input required of the user. Below we provide a simple fully-connected network with two hidden layers of 1024 units each. Note that there is no need to apply any activation function to the output layer: we are building a survival regression model, so the network outputs a single unbounded score!

In [None]:
from keras.models import Model
from keras.layers import Input, Dense, Activation, Dropout
import keras.backend as K


def model_builder(input_shape):
    '''
    Define a callable keras model yourself
    '''
    x = Input(shape=input_shape)
    #--------START OF USER CODE-------
    a0 = Dropout(0.3)(x)
    z1 = Dense(units=1024, activation=None)(a0)
    a1 = Activation(activation='elu')(z1)
    a1 = Dropout(0.5)(a1)
    z2 = Dense(units=1024, activation=None)(a1)
    a2 = Activation(activation='elu')(z2)
    a2 = Dropout(0.5)(a2)
    y = Dense(units=1, activation=None)(a2)
    #--------END OF USER CODE-------
    
    model = Model(inputs=x, outputs=y)
    return model
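
Step 6 below trains with loss_func='cox'. As background on why a single linear output unit is enough, here is a NumPy sketch of the negative Cox partial log-likelihood, which treats the network output as a log-risk score. This is the textbook form without special handling of tied times; the function name and the averaging over observed events are assumptions, and the loss implemented in models.py may differ.

```python
import numpy as np


def cox_neg_log_partial_likelihood(score, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events."""
    score = np.asarray(score, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    # at_risk[i, j] is True when sample j is still at risk at time[i]
    at_risk = time[None, :] >= time[:, None]
    # log of the summed exp(score) over each risk set, shifted for stability
    shift = score.max()
    log_risk = np.log((np.exp(score - shift)[None, :] * at_risk).sum(axis=1)) + shift
    # only samples with an observed event contribute a term
    return -np.sum((score - log_risk) * event) / max(event.sum(), 1.0)


time_demo = np.array([1.0, 2.0, 3.0])
event_demo = np.array([1, 1, 1])
loss_flat = cox_neg_log_partial_likelihood(np.zeros(3), time_demo, event_demo)
loss_good = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], time_demo, event_demo)
loss_bad = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], time_demo, event_demo)
```

The loss depends only on differences between scores within a risk set, so the output needs no bounded range, and scores that rank early events as high risk (loss_good) give a lower loss than the reverse ordering (loss_bad).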

## Step 6: create a model and train
We provide a high-level class, SurvivalModel, to facilitate model training. SurvivalModel.fit() supports two training modes: merged and decentralized. With mode='decentralize', each dataset is treated as a mini-batch. With mode='merge', the datasets are merged into a single dataset and mini-batches are sampled from the merged dataset. If your datasets are very heterogeneous (e.g., different cancers), consider mode='decentralize'; otherwise, mode='merge' should be the choice.

In [None]:
from models import SurvivalModel

survival_model = SurvivalModel(model_builder)
survival_model.fit(datasets_train_new, datasets_val_new, loss_func='cox', epochs=1000, lr=0.001, mode='decentralize')
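
The difference between the two modes comes down to how mini-batches are formed. The sketch below illustrates that idea with plain NumPy generators; the function names and the batch_size parameter are hypothetical, and this is not the batching code used inside SurvivalModel.

```python
import numpy as np


def decentralized_batches(datasets):
    """Decentralized mode: every dataset is used as its own mini-batch."""
    for X, time, event in datasets:
        yield X, time, event


def merged_batches(datasets, batch_size, seed=0):
    """Merged mode: pool all samples, then draw shuffled mini-batches."""
    rng = np.random.default_rng(seed)
    X = np.concatenate([d[0] for d in datasets])
    time = np.concatenate([d[1] for d in datasets])
    event = np.concatenate([d[2] for d in datasets])
    perm = rng.permutation(X.shape[0])
    for start in range(0, len(perm), batch_size):
        idx = perm[start:start + batch_size]
        yield X[idx], time[idx], event[idx]


# two toy datasets; the feature values record which dataset a row came from
toy = [(np.full((n, 2), float(k)), np.ones(n), np.ones(n, dtype=int))
       for k, n in enumerate((5, 7))]
dec_sizes = [b[0].shape[0] for b in decentralized_batches(toy)]
mer_sizes = [b[0].shape[0] for b in merged_batches(toy, batch_size=4)]
```

With one batch per dataset, a loss that compares samples within a batch never compares samples across datasets, which is presumably why the decentralized mode suits heterogeneous cohorts.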

This model achieves almost perfect performance on the training datasets but not on the testing datasets. This is expected, since our simulated datasets are randomly generated and there is nothing to learn (it would be surprising if the model learned anything useful). You can provide your own datasets and check whether the model also performs well on testing data.

## Step 7: predict on testing data

In [None]:
# use a held-out dataset (index 2 of the validation list) rather than training data
X_test, time_test, event_test = datasets_val_new[2]
score_test = survival_model.predict(X_test).ravel()
_, cindex = survival_model.evaluate(X_test, time_test, event_test)
print(cindex)