# Introduction to the xgbsurv package - Accelerated Hazards

This notebook introduces `xgbsurv` with the accelerated hazards (AH) objective on the METABRIC dataset. It is structured in the following steps:

- Load data
- Load model
- Fit model
- Predict and evaluate model

The API follows scikit-learn conventions (`fit`, `predict`).

In [13]:
from xgbsurv.datasets import load_metabric
from xgbsurv import XGBSurv
from sklearn.model_selection import train_test_split
import numpy as np
%load_ext autoreload
%autoreload 2




## Load Data

In [14]:
data, target = load_metabric(path="/Users/JUSC/Documents/xgbsurv/xgbsurv/datasets/data/", as_frame=False)
target_sign = np.sign(target)
X_train, X_test, y_train, y_test = train_test_split(data, target, stratify=target_sign)
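
The split above stratifies on `np.sign(target)`, which suggests that the target packs time and event status into one signed number. The sketch below decodes such a target under the assumption (not confirmed in this notebook) that a positive value marks an observed event and a negative value marks a censored observation. The helper `split_signed_target` and the toy values are hypothetical and not part of `xgbsurv`.

```python
# Assumed convention: the sign of each target value carries the event
# indicator and the magnitude carries the observed time.
def split_signed_target(target):
    """Return (times, events) decoded from signed survival times."""
    times = [abs(t) for t in target]
    events = [1 if t > 0 else 0 for t in target]
    return times, events

toy_target = [5.0, -3.2, 7.1, -0.5, 2.4]  # made-up values for illustration
times, events = split_signed_target(toy_target)
print(times)   # [5.0, 3.2, 7.1, 0.5, 2.4]
print(events)  # [1, 0, 1, 0, 1]
```

Under this reading, stratifying on the sign keeps the share of censored observations roughly equal in the training and test sets.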

## Load Model

In [15]:
model = XGBSurv(
    n_estimators=100,
    objective="ah_objective",
    eval_metric="ah_loss",
    learning_rate=0.3,
    random_state=7,
    disable_default_eval_metric=True,
    base_score=0.0,
)

The available loss and objective functions can be listed as follows:

In [16]:
print(model.get_loss_functions().keys())
print(model.get_objective_functions().keys())

dict_keys(['breslow_loss', 'efron_loss', 'cind_loss', 'deephit_loss', 'aft_loss', 'ah_loss'])
dict_keys(['breslow_objective', 'efron_objective', 'cind_objective', 'deephit_objective', 'aft_objective', 'ah_objective'])


## Fit Model

In [17]:
# Monitor the loss on the training data itself (reported as validation_0)
eval_set = [(X_train, y_train)]

In [18]:
model.fit(X_train, y_train, eval_set=eval_set)

[0]	validation_0-ah_likelihood:4.27727
[1]	validation_0-ah_likelihood:4.27686
[2]	validation_0-ah_likelihood:4.27646
[3]	validation_0-ah_likelihood:4.27605
[4]	validation_0-ah_likelihood:4.27565
[5]	validation_0-ah_likelihood:4.27525
[6]	validation_0-ah_likelihood:4.27484
[7]	validation_0-ah_likelihood:4.27444
[8]	validation_0-ah_likelihood:4.27403
[9]	validation_0-ah_likelihood:4.27364
[10]	validation_0-ah_likelihood:4.27324
[11]	validation_0-ah_likelihood:4.27285
[12]	validation_0-ah_likelihood:4.27245
[13]	validation_0-ah_likelihood:4.27205
[14]	validation_0-ah_likelihood:4.27166
[15]	validation_0-ah_likelihood:4.27126
[16]	validation_0-ah_likelihood:4.27087
[17]	validation_0-ah_likelihood:4.27048
[18]	validation_0-ah_likelihood:4.27008
[19]	validation_0-ah_likelihood:4.26968
[20]	validation_0-ah_likelihood:4.26929
[21]	validation_0-ah_likelihood:4.26890
[22]	validation_0-ah_likelihood:4.26850
[23]	validation_0-ah_likelihood:4.26811
[24]	validation_0-ah_likelihood:4.26771
[25]	valid

The model can be saved as shown below. Note that `objective` and `eval_metric` are not stored in the saved file.

In [19]:
model.save_model("ah_model.json")
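
Because the objective and evaluation metric are not part of the saved file, one option is to record their names next to it so the same settings can be passed again later. The following is a minimal sketch using only the standard library. The helpers `save_settings` and `load_settings` and the file name `model_settings.json` are hypothetical and not part of `xgbsurv`.

```python
import json
import os
import tempfile

def save_settings(path, objective, eval_metric):
    """Write the objective and metric names to a small JSON sidecar file."""
    with open(path, "w") as fh:
        json.dump({"objective": objective, "eval_metric": eval_metric}, fh)

def load_settings(path):
    """Read the objective and metric names back from the sidecar file."""
    with open(path) as fh:
        return json.load(fh)

# A temporary directory keeps the example free of side effects.
with tempfile.TemporaryDirectory() as tmp:
    sidecar = os.path.join(tmp, "model_settings.json")
    save_settings(sidecar, "ah_objective", "ah_loss")
    settings = load_settings(sidecar)

print(settings)
```

The recovered names could then be supplied to the constructor when the model is re-created, assuming the constructor accepts them as in the cell that builds `model`.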



## Predict

In [20]:
preds_train = model.predict(X_train, output_margin=True)
preds_test = model.predict(X_test, output_margin=True)

In [21]:
preds_test

array([-0.03132122, -0.03036948, -0.02342741, -0.02261352, -0.02277499,
       -0.03132122, -0.03132122, -0.03036948, -0.03036948, -0.02261352,
       -0.03169414, -0.03036948, -0.02261352, -0.02914932, -0.03169414,
       -0.03132122, -0.01586759, -0.02914932, -0.03036948, -0.01586759,
       -0.03169414, -0.01586759, -0.02261352, -0.02261352, -0.01945358,
       -0.02342741, -0.02358455, -0.02914932, -0.03132122, -0.03132122,
       -0.03264588, -0.02261352, -0.01945358, -0.01586759, -0.02263372,
       -0.02358455, -0.01586759, -0.01586759, -0.03169414, -0.03132122,
       -0.0159177 , -0.03264588, -0.02342741, -0.01586759, -0.02263102,
       -0.0159177 , -0.02277499, -0.02261352, -0.02914932, -0.02261352,
       -0.02358455, -0.01945358, -0.02261352, -0.02342741, -0.02914932,
       -0.01882225, -0.025411  , -0.02342741, -0.02358455, -0.02261352,
       -0.03264588, -0.02342741, -0.02914932, -0.01586759, -0.03132122,
       -0.02261352, -0.02261352, -0.0159177 , -0.01947467, -0.02

## Evaluate

In [22]:
#from sksurv.metrics import concordance_index_censored
from xgbsurv.evaluation import cindex_censored, ibs

In [23]:
# Concordance index on the training set
cindex_censored(y_train, preds_train)

0.6719579736693919

In [24]:
# Concordance index on the test set
cindex_censored(y_test, preds_test)

0.608648873117143
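
To make the metric concrete, here is a minimal sketch of Harrell's concordance index for right-censored data. It is not the implementation behind `cindex_censored`. It assumes the signed-target convention (positive value for an observed event, negative for a censored observation), assumes that a higher score means higher risk, and does not treat tied event times specially. The function name `harrell_cindex` and the toy data are hypothetical.

```python
def harrell_cindex(signed_times, scores):
    """Harrell's C for signed targets: |t| is the time, t > 0 marks an event.

    Assumes a higher score means higher risk (shorter survival).
    """
    concordant = 0.0
    comparable = 0
    n = len(signed_times)
    for i in range(n):
        if signed_times[i] <= 0:
            continue  # a censored subject cannot anchor a comparable pair
        t_i = signed_times[i]
        for j in range(n):
            if abs(signed_times[j]) > t_i:  # j was still at risk after i's event
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / comparable if comparable else float("nan")

toy_times = [2.0, -3.0, 4.0, 5.0]   # made-up signed times
toy_scores = [0.9, 0.1, 0.5, 0.2]   # made-up risk scores
print(harrell_cindex(toy_times, toy_scores))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the gap between the training and test values above indicates some overfitting.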