# Adjutorium survival analysis

This tutorial shows how to search for the best ensemble of survival analysis pipelines for a dataset while visualizing the performance using [Tensorboard](https://www.tensorflow.org/tensorboard/).

### Setup

In [None]:
# stdlib
import json
import warnings

# third party
import pandas as pd
from lifelines.datasets import load_rossi
from sklearn.model_selection import train_test_split

# Suppress library warnings to keep the notebook output readable
warnings.filterwarnings('ignore')

### Import RiskEstimationStudy

RiskEstimationStudy is the engine that learns an ensemble of survival analysis pipelines and their hyperparameters automatically.

In [None]:
from adjutorium.studies.risk_estimation import RiskEstimationStudy


### Load the target dataset

Adjutorium expects the input dataset as a pandas.DataFrame.

For this example, we will use the [Rossi dataset](https://rdrr.io/cran/RcmdrPlugin.survival/man/Rossi.html).

In [None]:
from lifelines.datasets import load_rossi

rossi = load_rossi()

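# Covariates, event indicator (1 = arrested), and time to event in weeks
X = rossi.drop(["week", "arrest"], axis=1)
Y = rossi["arrest"]
T = rossi["week"]

# Evaluate at the quartiles of the observed event times
event_times = T[Y == 1]
eval_time_horizons = [
    int(event_times.quantile(0.25)),
    int(event_times.quantile(0.50)),
    int(event_times.quantile(0.75)),
]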
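To make the horizon selection concrete, here is a minimal sketch of the same quantile computation on a small made-up dataset. The names `T_toy`, `Y_toy`, and `toy_horizons` and all values are illustrative assumptions, not part of the Rossi data.

```python
import pandas as pd

# Toy data (made up for illustration): durations in weeks and event flags.
T_toy = pd.Series([5, 10, 20, 30, 40, 52, 52, 52])
Y_toy = pd.Series([1, 1, 1, 1, 1, 0, 0, 0])

# Keep only the durations of subjects who experienced the event;
# censored subjects (flag 0) do not contribute to the horizons.
toy_event_times = T_toy[Y_toy == 1]

# Quartiles of the observed event times, truncated to whole weeks.
toy_horizons = [int(toy_event_times.quantile(q)) for q in (0.25, 0.50, 0.75)]
print(toy_horizons)
```

Taking quantiles of the event times only keeps the horizons inside the range where events were actually observed, so each horizon has both events before it and subjects still at risk after it.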


### Create the risk estimation study

Adjutorium ships with default plugins, but you can customize which plugins the pipelines use.

You can see the supported plugins below:

In [None]:
# List the available plugins
import json
from pathlib import Path
from adjutorium.plugins import Plugins

print(json.dumps(Plugins().list(), indent=2))

We will set a few custom plugins for the pipelines and create the risk estimation study.

In [None]:
workspace = Path("workspace")
workspace.mkdir(parents=True, exist_ok=True)

study_name = "test_risk_estimation_studies"

study = RiskEstimationStudy(
    study_name=study_name,
    dataset=rossi,
    target="arrest",
    time_to_event="week",
    time_horizons=eval_time_horizons,
    num_iter=10,  # number of Bayesian optimization (BO) iterations per estimator. Default: 50
    num_study_iter=1,  # number of outer optimization iterations. Default: 5
    timeout=10,  # timeout (in seconds) for the optimization of each estimator. Default: 600
    risk_estimators=["cox_ph", "lognormal_aft", "loglogistic_aft"],
    workspace=workspace,
    score_threshold=0.4,
)
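As a rough guide to how long the search can take, the sketch below estimates the time budget implied by these settings. It assumes the timeout is spent at most once per estimator in each outer iteration, which the parameter comments suggest but do not guarantee; time spent building the ensemble is not included.

```python
# Values copied from the study configuration above.
risk_estimators = ["cox_ph", "lognormal_aft", "loglogistic_aft"]
num_study_iter = 1
timeout = 10  # seconds

# Assumption: each estimator gets at most one timeout window per outer iteration.
search_budget_seconds = num_study_iter * len(risk_estimators) * timeout
print(f"Approximate search budget: {search_budget_seconds} s")
```

With the default values (`num_study_iter=5`, `timeout=600`), the same arithmetic gives a much larger budget, which is why this tutorial lowers both.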

### Search for the best ensemble


In [None]:
study.run()

In [None]:
from adjutorium.utils.serialization import load_model_from_file
from adjutorium.utils.tester import evaluate_survival_estimator

output = workspace / study_name / "model.p"

model = load_model_from_file(output)

metrics = evaluate_survival_estimator(model, X, T, Y, eval_time_horizons)

print(f"Model {model.name()}")
print(f"score: {metrics['clf']}")

# Congratulations!

Congratulations on completing this notebook tutorial! If you enjoyed it and would like to join the movement towards machine learning and AI for medicine, here is how you can help.

### Star Adjutorium on GitHub

The easiest way to help our community is to star the repos! This helps raise awareness of the tools we're building.

- [Star Adjutorium](https://github.com/vanderschaarlab/adjutorium-framework)
