# Introduction to the xgbsurv package - Accelerated Failure Time

This notebook introduces `xgbsurv` by fitting an accelerated failure time (AFT) model to the METABRIC dataset. It is structured in the following steps:

- Load data
- Load model
- Fit model
- Predict and evaluate model

The API follows scikit-learn conventions: construct the estimator, then call `fit` and `predict`.

In [1]:
from xgbsurv.datasets import load_metabric
from xgbsurv import XGBSurv
from sklearn.model_selection import train_test_split
import numpy as np
%load_ext autoreload
%autoreload 2


## Load Data

In [2]:
data, target = load_metabric(path="/Users/JUSC/Documents/xgbsurv/xgbsurv/datasets/data/", as_frame=False)
target_sign = np.sign(target)
X_train, X_test, y_train, y_test = train_test_split(data, target, stratify=target_sign)
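
The split above stratifies on `np.sign(target)`, which suggests that each target value carries both the survival time and the event indicator in a single signed number. The sketch below illustrates that idea under the assumption that positive values mark observed events and negative values mark censored observations; the helper names `encode_signed_target` and `decode_signed_target` are hypothetical and not part of `xgbsurv`.

```python
import numpy as np


def encode_signed_target(time, event):
    """Fold (time, event) into one signed value per sample.

    Assumed convention: +time if the event was observed, -time if censored.
    Times must be strictly positive, otherwise the sign is lost.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    return np.where(event, time, -time)


def decode_signed_target(target):
    """Recover (time, event) from the signed encoding."""
    target = np.asarray(target, dtype=float)
    return np.abs(target), target > 0


# Three samples: two observed events and one censored observation.
y = encode_signed_target([5.0, 3.0, 8.0], [True, False, True])
time, event = decode_signed_target(y)

# np.sign(y) is then a two-class label, which is what `stratify` needs
# to keep the censoring rate similar in the train and test splits.
strata = np.sign(y)
```

Stratifying on the sign keeps the share of censored samples roughly equal across both splits, which matters when censoring is frequent.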

## Load Model

In [3]:
model = XGBSurv(
    n_estimators=100,
    objective="aft_objective",
    eval_metric="aft_loss",
    learning_rate=0.3,
    random_state=7,
    disable_default_eval_metric=True,
    base_score=0.0,
)

The available loss and objective functions can be listed as follows:

In [4]:
print(model.get_loss_functions().keys())
print(model.get_objective_functions().keys())

dict_keys(['breslow_loss', 'efron_loss', 'cind_loss', 'deephit_loss', 'aft_loss'])
dict_keys(['breslow_objective', 'efron_objective', 'cind_objective', 'deephit_objective', 'aft_objective'])
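
For orientation, an accelerated failure time loss is usually a negative log-likelihood over log survival times. The notebook does not show how `aft_loss` is defined inside `xgbsurv`, so the following is only a generic textbook sketch, assuming a log-normal error distribution and a fixed scale `sigma`; the function name `aft_neg_log_likelihood` is hypothetical.

```python
import math


def aft_neg_log_likelihood(time, event, pred, sigma=1.0):
    """Negative log-likelihood of one sample under a log-normal AFT model.

    Assumed model: log(T) = pred + sigma * eps, with eps ~ N(0, 1).
    This is an illustration, not the xgbsurv implementation.
    """
    z = (math.log(time) - pred) / sigma
    if event:
        # Observed event: use the density of T at `time`,
        # which is phi(z) / (sigma * time).
        log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
        return -(log_phi - math.log(sigma * time))
    # Right-censored: use the survival probability P(T > time) = 1 - Phi(z).
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    return -math.log(survival)


# A censored sample is penalised less when the model predicts a longer time.
loss_short = aft_neg_log_likelihood(1.0, False, pred=0.0)
loss_long = aft_neg_log_likelihood(1.0, False, pred=1.0)
```

In a gradient boosting setting, the objective would supply the first and second derivatives of such a loss with respect to `pred`, while the evaluation metric reports the loss itself.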


## Fit Model

In [5]:
eval_set = [(X_train, y_train)]

In [6]:
model.fit(X_train, y_train, eval_set=eval_set)

gradient [-0.00038347  0.00091862 -0.00065103 ... -0.00040878 -0.00064186
 -0.00021019]
[0]	validation_0-aft_likelihood:2.25016
gradient [-0.00038391  0.00091957 -0.00065213 ... -0.00040945 -0.0006437
 -0.00020994]
[1]	validation_0-aft_likelihood:2.24802
gradient [-0.00038436  0.0009201  -0.00065322 ... -0.00041013 -0.00064555
 -0.00020966]
[2]	validation_0-aft_likelihood:2.24602
gradient [-0.0003848   0.00091899 -0.00065423 ... -0.0004108  -0.00064736
 -0.00020941]
[3]	validation_0-aft_likelihood:2.24388
gradient [-0.00038285  0.00091923 -0.00065599 ... -0.00041106 -0.00065064
 -0.00020847]
[4]	validation_0-aft_likelihood:2.24185
gradient [-0.00038088  0.00092009 -0.00065769 ... -0.00041132 -0.00065389
 -0.00020751]
[5]	validation_0-aft_likelihood:2.24001
gradient [-0.00038142  0.00092024 -0.00065923 ... -0.00041168 -0.00065667
 -0.00020651]
[6]	validation_0-aft_likelihood:2.23827
gradient [-0.0003819   0.00091882 -0.0006608  ... -0.00041198 -0.00065953
 -0.00020554]
[7]	validation_0-

The model can be saved as shown below. Note that `objective` and `eval_metric` are not saved with the model.

In [7]:
model.save_model("aft_model.json")



## Predict

In [8]:
preds_train = model.predict(X_train, output_margin=True)
preds_test = model.predict(X_test, output_margin=True)

In [9]:
preds_test

array([-0.19664694,  0.03747117,  0.20572296, -0.5110288 ,  0.25593626,
       -0.12037577, -0.37777936,  0.4138791 ,  0.15182146, -0.18654987,
       -0.06907195,  0.15294518, -0.1494228 ,  0.39027557, -0.14986373,
        0.1207994 ,  0.05859881, -0.23284891, -0.1411149 ,  0.05613565,
        0.01947766, -0.24143383, -0.27849904,  0.211208  ,  0.21652137,
       -0.1386647 ,  0.134942  ,  0.08852444, -0.14568774,  0.15107696,
        0.15035832, -0.0099461 ,  0.36334425, -0.32113457,  0.27168718,
       -0.1500694 ,  0.12703529, -0.14256918, -0.07817414, -0.1233663 ,
        0.29236835, -0.2883798 , -0.06976104, -0.51431584,  0.07568631,
        0.0238933 ,  0.19463752, -0.16118217,  0.26836222, -0.1386647 ,
        0.18002872,  0.20518917,  0.29348204,  0.2529692 ,  0.31707793,
       -0.09532505, -0.15831205, -0.16515766, -0.44173452,  0.02593636,
        0.21801522, -0.09828568, -0.29537186, -0.14558174, -0.16273256,
        0.37342507,  0.06822326,  0.3329116 , -0.34557107,  0.16

## Evaluate

In [10]:
#from sksurv.metrics import concordance_index_censored
from xgbsurv.evaluation import cindex_censored, ibs

In [11]:
# train
cindex_censored(y_train, preds_train)

0.7712782493126619

In [12]:
# test
cindex_censored(y_test, preds_test)

0.612474707486584
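
To make the metric above concrete, here is a plain-Python sketch of a concordance index for right-censored data. It is not the `cindex_censored` implementation; it assumes separate `times` and `events` inputs and that a higher score means higher risk (shorter expected survival), and the function name `concordance_index` is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs that the risk scores order correctly.

    A pair (i, j) is comparable when subject i had an observed event and
    a strictly shorter time than subject j. Tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier one in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


times = [1.0, 2.0, 3.0]
events = [True, True, False]
perfect = concordance_index(times, events, [3.0, 2.0, 1.0])
reversed_order = concordance_index(times, events, [1.0, 2.0, 3.0])
uninformative = concordance_index(times, events, [0.0, 0.0, 0.0])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the train and test scores above should be read.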

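In [None]:
# `preds` was undefined in the original cell; full-data margins are an assumed input for `ibs`.
preds = model.predict(data, output_margin=True)
times = np.arange(356)
ibs(target, target, preds, times)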