# Mortality Prediction


Reference: Soenksen, L.R., Ma, Y., Zeng, C. et al. Integrated multimodal artificial intelligence framework for healthcare applications. npj Digit. Med. 5, 149 (2022). https://doi.org/10.1038/s41746-022-00689-4

In this notebook, the task is to predict 48-hour mortality from the CSV embeddings file.



## Introduction


The goal of this part of the study is to build models to predict the probability that a patient will expire during the next 48 h as a binary classification problem: expired ≤48 h (1) or otherwise (0). In the case of a patient whose hospital exit status is not expiration, the class label is set to 0. A patient can acquire different target class labels at different time points during their stay due to changes in status and proximity to the discharge or time of death. Similar to the length-of-stay modeling, each sample in this predictive task corresponds to a single patient-admission EHR time point where an X-ray image was obtained (N = 45,050).
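
The labelling rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the HAIM implementation: the function name `mortality_48h_label` and its arguments are hypothetical, and it assumes that each sample carries the time the X-ray was obtained and, for patients who died in hospital, a time of death.

```python
from datetime import datetime, timedelta
from typing import Optional

# Prediction horizon used for the binary target
HORIZON = timedelta(hours=48)


def mortality_48h_label(sample_time: datetime,
                        death_time: Optional[datetime]) -> int:
    """Return 1 if death occurs within 48 h after sample_time, else 0.

    Hypothetical helper: death_time is None when the hospital exit
    status is not expiration, in which case the label is always 0.
    """
    if death_time is None:
        return 0
    delta = death_time - sample_time
    return int(timedelta(0) <= delta <= HORIZON)


# The same patient can receive different labels at different time points
admission_xray = datetime(2022, 1, 1, 8, 0)   # 74 h before death
later_xray = datetime(2022, 1, 3, 20, 0)      # 14 h before death
death = datetime(2022, 1, 4, 10, 0)

labels = [
    mortality_48h_label(admission_xray, death),
    mortality_48h_label(later_xray, death),
    mortality_48h_label(later_xray, None),    # patient discharged alive
]
print(labels)
```

The first two samples belong to the same admission but fall on opposite sides of the 48 h horizon, which is why the label is attached to the time point rather than to the patient.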


#### Imports

In [1]:
import os
os.chdir('../')

from xgboost import XGBClassifier
from pandas import read_csv

from src.data import constants
from src.data.dataset import HAIMDataset
from src.data.sampling import Sampler
from src.evaluation.evaluating import Evaluator
from src.evaluation.tuning import SklearnTuner
from src.utils.metric_scores import *

#### Read data from local source



In [4]:
df = read_csv(constants.FILE_DF, nrows=constants.N_DATA)

#### Create a custom dataset for the HAIM experiment


Build the target column for the task at hand and configure the dataset: use the ``haim_id`` as the ``global_id`` and use all sources for prediction.

In [5]:
dataset = HAIMDataset(df,  
                      constants.ALL_PREDICTORS, 
                      constants.ALL_MODALITIES, 
                      constants.MORTALITY, 
                      constants.IMG_ID, 
                      constants.GLOBAL_ID)

#### Create the sampler


Sample the data using 5-fold cross-validation based on unique ``haim_id`` values.

In [7]:
sampler = Sampler(dataset, constants.GLOBAL_ID, 5)
_, masks = sampler()

100%|███████████████████████████████████████████| 5/5 [00:00<00:00, 2788.77it/s]
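
Splitting on unique identifiers means that all samples sharing a ``haim_id`` end up on the same side of a split. The sketch below illustrates that idea in plain Python. It is not the code of ``Sampler``: the function `grouped_kfold_masks`, the mask layout, and the toy identifiers are all hypothetical.

```python
import random


def grouped_kfold_masks(group_ids, n_splits=5, seed=42):
    """Build train/test index lists so that no group spans both sides.

    Hypothetical helper: each unique group id is assigned to exactly one
    fold, and every sample follows its group.
    """
    unique_ids = sorted(set(group_ids))
    random.Random(seed).shuffle(unique_ids)
    # Round-robin assignment of shuffled groups to folds
    fold_of_group = {gid: i % n_splits for i, gid in enumerate(unique_ids)}

    masks = {fold: {"train": [], "test": []} for fold in range(n_splits)}
    for idx, gid in enumerate(group_ids):
        for fold in range(n_splits):
            part = "test" if fold_of_group[gid] == fold else "train"
            masks[fold][part].append(idx)
    return masks


# Toy identifiers: several samples may share one id
toy_ids = [1, 1, 2, 3, 3, 3, 4, 5, 6, 6, 7, 8, 9, 10]
toy_masks = grouped_kfold_masks(toy_ids, n_splits=5)

for fold, mask in toy_masks.items():
    train_groups = {toy_ids[i] for i in mask["train"]}
    test_groups = {toy_ids[i] for i in mask["test"]}
    # No identifier appears in both the training and the test part
    assert train_groups.isdisjoint(test_groups)
```

Keeping groups intact avoids leaking information between training and test sets when one identifier contributes several samples.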


#### Select the evaluation metrics

Initialize a list containing the evaluation metrics to report.

In [8]:
# Initialization of the list containing the evaluation metrics
evaluation_metrics = [BinaryAccuracy(), 
                      BinaryBalancedAccuracy(),
                      BinaryBalancedAccuracy(Reduction.GEO_MEAN),
                      Sensitivity(), 
                      Specificity(), 
                      AUC(), 
                      BrierScore(),
                      BinaryCrossEntropy()]
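
To make the difference between the two balanced-accuracy variants concrete, the sketch below computes both from sensitivity and specificity. It assumes that ``Reduction.GEO_MEAN`` stands for the geometric mean of the two rates and that the default is their arithmetic mean; this is inferred from the names, not checked against ``src.utils.metric_scores``. The confusion-matrix counts are made up.

```python
import math


def sensitivity(tp, fn):
    """True positive rate: share of positive cases that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of negative cases that are detected."""
    return tn / (tn + fp)


def balanced_accuracy(sens, spec, geometric=False):
    """Arithmetic (default) or geometric mean of sensitivity and specificity."""
    if geometric:
        return math.sqrt(sens * spec)
    return (sens + spec) / 2


# Made-up counts for a heavily imbalanced test fold
tp, fn, tn, fp = 4, 96, 3900, 0

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
arith = balanced_accuracy(sens, spec)
geo = balanced_accuracy(sens, spec, geometric=True)

print(f"accuracy={accuracy:.3f} arithmetic={arith:.3f} geometric={geo:.3f}")
```

With rare positives, plain accuracy can stay high while sensitivity is low. The geometric mean penalizes that gap more strongly than the arithmetic mean, which is one reason to report both.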

#### Set hyper-parameters and fixed parameters

In [9]:
# Define the grid of hyper-parameters for the tuning
grid_hps = {'max_depth': [5, 6, 7, 8],
            'n_estimators': [200, 300],
            'learning_rate': [0.3, 0.1, 0.05],
            }

# Save the fixed parameters of the model
fixed_params = {'seed': 42,
                'eval_metric': 'logloss',
                'verbosity': 0
                }

### Model training and predictions using an XGBClassifier model with GridSearchCV hyperparameter optimization


The goal of this section of the notebook is to compute the following metrics:

``ACCURACY_SCORE, BALANCED_ACCURACY_SCORE, SENSITIVITY, SPECIFICITY, AUC, BRIER SCORE, BINARY CROSS-ENTROPY``


The hyperparameter combinations of individual XGBoost models were selected within each training loop using a ``fivefold cross-validated grid search`` on the training set (80%). This XGBoost ``tuning process`` selected the ``maximum depth of the trees (5–8)``, the number of ``estimators (200 or 300)``, and the ``learning rate (0.05, 0.1, 0.3)`` according to the parameter value combination leading to the highest observed AUROC within the training loop.


As mentioned previously, all XGBoost models were trained ``five times with five different data splits`` to repeat the experiments and compute average metrics.


Refer to page 8 of the study: https://doi.org/10.1038/s41746-022-00689-4
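
As a rough guide to the cost of this procedure, the sketch below enumerates the grid described above and counts the model fits it implies. It assumes an exhaustive grid search and ignores any final refit on the full training set. The names `example_grid` and `expand_grid` are illustrative and are not part of the project code.

```python
from itertools import product

# Same values as the tuning grid described above
example_grid = {
    'max_depth': [5, 6, 7, 8],
    'n_estimators': [200, 300],
    'learning_rate': [0.3, 0.1, 0.05],
}


def expand_grid(grid):
    """Return every hyper-parameter combination as a list of dicts."""
    names = list(grid)
    value_lists = (grid[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


candidates = expand_grid(example_grid)

n_inner_folds = 5    # fivefold cross-validated grid search
n_outer_splits = 5   # five different data splits

fits_per_outer_split = len(candidates) * n_inner_folds
total_tuning_fits = fits_per_outer_split * n_outer_splits

print(len(candidates), fits_per_outer_split, total_tuning_fits)
```

Under these assumptions the grid holds 24 combinations, so each outer split requires 120 tuning fits and the full experiment requires 600.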

In [2]:
evaluation = Evaluator(dataset=dataset,
                       masks=masks,
                       metrics=evaluation_metrics,
                       model=XGBClassifier,
                       tuner=SklearnTuner,
                       tuning_metric=AUC(),
                       hps=grid_hps,
                       n_tuning_splits=5,
                       fixed_params=fixed_params,
                       filepath=constants.EXPERIMENT_PATH,
                       weight='scale_pos_weight',
                       evaluation_name='48h mortality'
                       )
evaluation.evaluate()


#### Comparison with the paper results:






In [2]:
Evaluator.visualize_results('experiments/48h mortality', constants.MORTALITY)

| | Accuracy | BalancedAcc | GeoBalancedAcc | Sensitivity | Specificity | AUC | BrierScore | BCE |
|---|---|---|---|---|---|---|---|---|
| train_metrics | 1.0 ± 0.0 | 1.0 ± 0.0 | 1.0 ± 0.0 | 1.0 ± 0.0 | 1.0 ± 0.0 | 1.0 ± 0.0 | 0.0 ± 0.0 | 0.0005 ± 0.0004 |
| test_metrics | 0.9737 ± 0.002 | 0.5182 ± 0.0037 | 0.1896 ± 0.0196 | 0.0363 ± 0.0074 | 1.0 ± 0.0 | 0.9066 ± 0.0072 | 0.0195 ± 0.0011 | 0.1098 ± 0.0052 |
| HAIM | -- | -- | -- | -- | -- | 0.912 | -- | -- |
| NON_HAIM | -- | -- | -- | -- | -- | 0.889 | -- | -- |
