# Before you use this template

This is only a recommended template for the project report. It covers the general type of research in our paper pool; feel free to edit it to better fit your project. You will iteratively update the same notebook submission for your draft and your final submission. Please check the project rubrics to get a sense of what is expected in the template.

---

# FAQ and Attentions
* Copy and move this template to your Google Drive. Name your notebook with your team ID (upper-left corner). Don't edit this original file.
* This template covers most questions we want to ask about your reproduction experiment. You don't need to follow the template exactly, but you should address the questions. Please feel free to customize your report accordingly.
* Every report must have runnable code and the necessary annotations (in text and code comments).
* The notebook is like a demo and only uses small-size data (a subset of the original data or processed data). The entire runtime of the notebook, including data reading, data processing, model training, printing, figure plotting, etc., must be within 8 min; otherwise, you may receive a grade penalty.
  * If the raw dataset is too large to be loaded, you can select a subset of the data and pre-process it, then upload the subset or processed data to Google Drive and load it in this notebook.
  * If the whole training takes too long to run, you can set the number of training epochs to a small number, e.g., 3, just to show that the training is runnable.
  * For model validation, you can train the model outside this notebook in advance, then load the pretrained model and use it for validation (display the figures, print the metrics).
* Post-processing is important! For post-processing of the results, please use plots/figures. The code to summarize results and plot figures may be tedious; however, it won't be a waste of time, since these figures can be used for the presentation. When plotting in code, the figures should have titles or captions if necessary (e.g., title your figure with "Figure 1. xxxx").
* There is no page limit for your notebook report. You can also use separate notebooks for the report; just make sure your grader can access and run/test them.
* If you use outside resources, please cite them (in any format). Include links to the resources if necessary.

github: https://github.com/zhihuiw328/CS598-final-project

# Introduction
This is an introduction to your report. You should edit this text/markdown section to compose it. In this text/markdown, you should introduce:

*   Background of the problem
  * what type of problem: disease/readmission/mortality prediction, feature engineering, data processing, etc.
  * what is the importance/meaning of solving the problem
  * what is the difficulty of the problem
  * the state-of-the-art methods and their effectiveness.
*   Paper explanation
  * what did the paper propose
  * what are the innovations of the method
  * how well the proposed method works (on its own metrics)
  * what is the contribution to the research field (referring to the Background above, how important the paper is to the problem).


This paper addresses the question of which deep learning architecture is most suitable for predicting the risk of readmission within
30 days of discharge from the ICU. More accurate predictions help identify at-risk patients and can support better-informed
treatment decisions. The problem is difficult because the MIMIC-III dataset requires extensive processing,
and because several architectures must be trained and evaluated against each other. A series of advanced deep learning
architectures is compared to determine which is the most effective. The paper reports that a recurrent neural network
with time dynamics of code embeddings computed by time decay is the most effective. The innovation of this method is that it combines
several architectural components. The proposed method worked well, achieving the highest average precision, 0.331. The paper effectively
addresses the original problem, as it supports conclusions about which architecture is most appropriate for this task.

In [None]:
# code comment is used as inline annotations for your coding

# Scope of Reproducibility:

List hypotheses from the paper you will test and the corresponding experiments you will run.


1. Hypothesis: 

The most appropriate architecture for predicting the risk of readmission to the ICU within 30 days is the bidirectional recurrent neural network with time dynamics of code embeddings computed with time decay. To test this, we will train and run this specific architecture on the MIMIC-III dataset, and evaluate it using metrics including precision, AUROC, and F-Score.
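The metrics named in the hypothesis can be computed with scikit-learn. The sketch below is a minimal illustration on synthetic labels and scores, not the evaluation code of the paper or of this project; the helper name `evaluate_scores`, the 0.5 decision threshold, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score, f1_score


def evaluate_scores(y_true, y_prob, threshold=0.5):
    """Return average precision, AUROC, and F1 for binary labels and predicted probabilities."""
    # F1 needs hard predictions, so threshold the probabilities (threshold is an assumption)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    return {
        "average_precision": average_precision_score(y_true, y_prob),
        "auroc": roc_auc_score(y_true, y_prob),
        "f1": f1_score(y_true, y_pred),
    }


# Synthetic data (assumption): roughly 12% positives with mildly informative scores
rng = np.random.default_rng(0)
y_true = (rng.random(1000) < 0.12).astype(int)
logits = 2.0 * y_true - 1.0 + rng.normal(size=1000)
y_prob = 1.0 / (1.0 + np.exp(-logits))

metrics = evaluate_scores(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Average precision and AUROC are computed from the continuous scores, while the F-score depends on the chosen threshold, so the threshold should be reported alongside it.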

Although downloading the MIMIC-III dataset is complicated, we have provided some pre-trained models that will enable you to see the results of the paper without needing to download the dataset.


# Methodology

The methodology is the core of your project. It consists of runnable code with the necessary annotations to show the experiments you executed to test the hypotheses.

The methodology contains at least two subsections, **data** and **model**, for your experiment.

Environment: We use the latest Python version, which at the time of writing is 3.12.2. We also require the
packages listed in the requirements.txt included in the GitHub repository:
torch, torchvision, numpy, pandas, scipy, tqdm, scikit-learn, torchdiffeq, and matplotlib.

Data: We use the MIMIC-III dataset, which you can download by following the instructions at:
https://physionet.org/content/mimiciii/1.4/

Model: The original paper is at https://www.nature.com/articles/s41598-020-58053-z#Sec13, and the original
GitHub repository is at https://github.com/sebbarb/time_aware_attention. The model we have chosen to evaluate in this notebook is
the bidirectional recurrent neural network with ODE time decay and an attention mechanism.

In [None]:
# import  packages you need
import numpy as np
from google.colab import drive


##  Data
Data includes raw data (MIMIC III tables), descriptive statistics (our homework questions), and data processing (feature engineering).
  * Source of the data: where the data is collected from; if data is synthetic or self-generated, explain how. If possible, please provide a link to the raw datasets.
  * Statistics: include basic descriptive statistics of the dataset like size, cross validation split, label distribution, etc.
  * Data process: how you manipulate the data, e.g., changing the class labels, splitting the dataset into train/valid/test, refining the dataset.
  * Illustration: printing results, plotting figures for illustration.
  * You can upload your raw dataset to Google Drive and mount this Colab to the same directory. If your raw dataset is too large, you can upload the processed dataset and have a code to load the processed dataset.

We use MIMIC-III data, with access requested via https://eicu-crd.mit.edu/gettingstarted/access/. After completing the required courses, we can download the raw MIMIC-III data.

In [None]:
# unprocessed data we need
# MIMIC III data: PRESCRIPTIONS.csv, ADMISSIONS.csv, DIAGNOSES_ICD.csv, PROCEDURES_ICD.csv
# MIMIC III data: ICUSTAYS.csv, PATIENTS.csv, SERVICES.csv
# MIMIC III data: D_ITEMS.csv, CHARTEVENTS.csv, OUTPUTEVENTS.csv

In [1]:
# Convert the .gz files to .csv
import gzip
import os

directory = './data/MIMIC-III/gz'

csv_dir = './data/MIMIC-III/csv'

# Make sure the output directory exists before writing to it
os.makedirs(csv_dir, exist_ok=True)

# Loop through all the files in the directory
for filename in os.listdir(directory):
    # Check if the file is a GZIP file
    if filename.endswith('.gz'):
        gzip_file_path = os.path.join(directory, filename)
        # Strip the '.gz' suffix, e.g. 'ADMISSIONS.csv.gz' -> 'ADMISSIONS.csv'
        csv_file_path = os.path.join(csv_dir, os.path.splitext(filename)[0])

        # Open the GZIP file and stream its contents to a CSV file line by line
        with gzip.open(gzip_file_path, 'rt') as file_in:
            with open(csv_file_path, 'w') as file_out:
                for line in file_in:
                    file_out.write(line)


In [6]:
from hyperparameters import Hyperparameters as hp
from data_load import *
import gzip
import os

In [3]:
# preprocess the original csv file from MIMIC-III

# don't change the sequence

%run preprocessing_ICU_PAT_ADMIT.py
%run preprocessing_DIAGNOSES_PROCEDURES.py


# produce CHARTS_PRESCRIPTIONS.py
%run preprocessing_reduce_charts.py
%run preprocessing_reduce_outputs.py
%run preprocessing_merge_charts_outputs.py
%run preprocessing_CHARTS_PRESCRIPTIONS.py


%run preprocessing_create_arrays.py

Load ICU stays...
-----------------------------------------
Load patients...
-----------------------------------------
Load admissions...
-----------------------------------------
Load services...
-----------------------------------------
Link icustays and patients tables...
Compute number of recent admissions...


100%|██████████| 43126/43126 [00:57<00:00, 749.93it/s]


-----------------------------------------
Link icu_pat and admissions tables...
SUBJECT_ID                  0
HADM_ID                     0
ICUSTAY_ID                  0
INTIME                      0
OUTTIME                     0
LOS                         0
GENDER_M                    0
NUM_RECENT_ADMISSIONS       0
AGE                         0
POSITIVE                    0
ADMITTIME                   0
ADMISSION_TYPE              0
ADMISSION_LOCATION          0
INSURANCE                   0
MARITAL_STATUS           1776
ETHNICITY                   0
dtype: int64
Some data cleaning on admissions...
-----------------------------------------
Link services table...
-----------------------------------------
Total pos 5495
Total neg 39803
count    5495.000000
mean        5.090056
std         7.568803
min         0.000100
25%         1.379100
50%         2.610800
75%         5.219100
max       116.832703
Name: LOS, dtype: float64
count    39803.000000
mean         3.660028
std          5.

331it [06:43,  1.22s/it]


-----------------------------------------
Load item definitions
URINE_OUTPUT
4397             urine out other
4402     urine out straight cath
4416                 urine flush
4428             cath lab output
4436                cath lab out
                  ...           
11905              straight cath
12182               ed urine out
12293                   or urine
12297                 pacu urine
12298                   cath lab
Name: LABEL, Length: 99, dtype: object
-----------------------------------------
Loading Output Events
Remove admission and discharge days (since data on urine output is incomplete)
Load ICU stays...
Loading chart events...
-----------------------------------------
Compute BMI and GCS total...
-----------------------------------------
Loading output events...
-----------------------------------------
Create categorical variable...
-----------------------------------------
Save...
-----------------------------------------
Save data for logistic regression
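The class counts printed in the log above (5495 positive and 39803 negative stays) can be turned into a label distribution. The sketch below is plain arithmetic on those two numbers. The ratio of negatives to positives is shown because it is a common choice for the `pos_weight` argument of `nn.BCEWithLogitsLoss`; whether `get_trainloader` computes `pos_weight` this way is an assumption, not something confirmed here.

```python
# Counts taken from the preprocessing log ("Total pos" / "Total neg")
num_pos = 5495
num_neg = 39803

total = num_pos + num_neg
prevalence = num_pos / total          # fraction of ICU stays labelled positive
neg_to_pos_ratio = num_neg / num_pos  # common (assumed) choice for pos_weight

print(f"total stays:      {total}")
print(f"positive rate:    {prevalence:.4f}")
print(f"neg / pos ratio:  {neg_to_pos_ratio:.4f}")
```

With positives at roughly one stay in eight, accuracy alone is a weak metric, which is one reason average precision is reported.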

In [6]:
%run preprocessing_create_arrays.py

Loading icu_pat...
Loading diagnoses/procedures...
Loading charts/prescriptions...
-----------------------------------------
Create static array...
Create label array...
Create diagnoses/procedures and charts/prescriptions array...
max_count 552
Reindex df...
done
max_count 392
Reindex df...
done
-----------------------------------------
Split data into train/validate/test...
Get patients corresponding to test ids
-----------------------------------------
Save...
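The log above mentions a train/validate/test split and patients corresponding to the test ids. One way to keep all stays of a patient in the same subset is to split on patient IDs rather than on rows. The sketch below illustrates that idea only: the function name `split_by_patient`, the 80/10/10 fractions, and the toy IDs are assumptions, and it is not the logic of `preprocessing_create_arrays.py`.

```python
import numpy as np


def split_by_patient(patient_ids, frac_train=0.8, frac_valid=0.1, seed=0):
    """Return row indices per subset so that no patient appears in two subsets."""
    patient_ids = np.asarray(patient_ids)
    unique_patients = np.unique(patient_ids)

    # Shuffle patients, not rows, so every stay of a patient moves together
    rng = np.random.default_rng(seed)
    rng.shuffle(unique_patients)

    n_train = int(frac_train * len(unique_patients))
    n_valid = int(frac_valid * len(unique_patients))
    groups = {
        "train": unique_patients[:n_train],
        "valid": unique_patients[n_train:n_train + n_valid],
        "test": unique_patients[n_train + n_valid:],
    }
    return {name: np.flatnonzero(np.isin(patient_ids, ids)) for name, ids in groups.items()}


# Toy example (hypothetical IDs): some patients have several ICU stays
toy_ids = [1, 1, 2, 3, 3, 3, 4, 5, 6, 7, 8, 9, 10, 10]
split = split_by_patient(toy_ids)
for name, idx in split.items():
    print(name, idx.tolist())
```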


In [6]:
from hyperparameters import Hyperparameters as hp
from data_load import *
import gzip
import os
# load data

data = np.load(hp.data_dir + 'data_arrays.npz', allow_pickle=True)


# Training and validation data
if hp.all_train:
    trainloader, num_batches, pos_weight = get_trainloader(data, 'ALL')
else:
    trainloader, num_batches, pos_weight = get_trainloader(data, 'TRAIN')


# calculate statistics

num_static = num_static(data)
num_dp_codes, num_cp_codes = vocab_sizes(data)

##   Model
The model includes the model definition, which is usually a class, model training, and other necessary parts.
  * Model architecture: layer number/size/type, activation function, etc
  * Training objectives: loss function, optimizer, weight of each loss term, etc
  * Others: whether the model is pretrained, Monte Carlo simulation for uncertainty analysis, etc
  * The code of model should have classes of the model, functions of model training, model validation, etc.
  * If your model training is done outside of this notebook, please upload the trained model here and develop a function to load and test it.

Citations:
Barbieri, S., Kemp, J., Perez-Concha, O., Kotwal, S., Gallagher, M., Ritchie, A., & Jorm, L. (2020). 
Benchmarking Deep Learning Architectures for predicting readmission to the ICU and describing patients-at-risk. 
Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-58053-z
Original paper’s repo:  
https://github.com/sebbarb/time_aware_attention/tree/master

Model description:
We chose the RNN (exp time decay) + Attention model,
and we performed an ablation that removes the attention mechanism to compare the results.
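To give an intuition for exponential time decay, the sketch below carries a state from one event to the next and shrinks it by `exp(-decay_rate * delta_t)`, where `delta_t` is the time elapsed since the previous event. It is a deliberately simplified NumPy illustration: the function names, the fixed `decay_rate`, and the additive update (instead of a GRU update) are assumptions, and it is not the `GRUExponentialDecay` module from the original repository.

```python
import numpy as np


def time_deltas(abs_times):
    """Convert absolute event times to gaps between consecutive events (first gap is 0)."""
    abs_times = np.asarray(abs_times, dtype=float)
    deltas = np.zeros_like(abs_times)
    deltas[1:] = np.abs(np.diff(abs_times))
    return deltas


def decayed_states(inputs, deltas, decay_rate):
    """Accumulate inputs over time, decaying the previous state by exp(-rate * gap)."""
    state = np.zeros(inputs.shape[1])
    states = []
    for x_t, dt in zip(inputs, deltas):
        # Longer gaps shrink the carried-over state more strongly
        state = np.exp(-decay_rate * dt) * state + x_t
        states.append(state)
    return np.stack(states)


# Toy example: three events at days 0, 1, and 3, each with a 2-dimensional embedding of ones
event_times = [0.0, 1.0, 3.0]
embeddings = np.ones((3, 2))
states = decayed_states(embeddings, time_deltas(event_times), decay_rate=np.log(2.0))
print(states)
```

With a decay rate of ln 2, the carried-over state halves for every unit of elapsed time, so the two-day gap before the last event weakens the history more than the one-day gap before the second event.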

Hyperparams

learning_rate = 0.001

batch_size = 128

num_epochs = 80

dropout_rate = 0.5

In [3]:
import numpy as np
epoch_times = np.load('logdir/epoch_times.npz', allow_pickle=True)
average_time = np.average(epoch_times['epoch_times'])

In [4]:
print(average_time)

922.581482553482


Computational requirements

Report at least 3 types of requirements such as type of hardware, average runtime for each epoch, total number of trials

type of hardware = CUDA-enabled GPU (NVIDIA A100-SXM4-80GB)

average runtime for each epoch = 922.581482553482 s

total number of trials = 2
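The average epoch time gives a rough estimate of the total training time. The sketch below is simple arithmetic under the assumption that every epoch takes about the measured average. The helper name `estimate_training_hours` is hypothetical, and the two epoch counts are the ones that appear in this notebook (80 in the hyperparameter list, 10 in the training cells).

```python
def estimate_training_hours(avg_epoch_seconds, num_epochs):
    """Estimate wall-clock training time in hours from the average epoch duration."""
    return avg_epoch_seconds * num_epochs / 3600.0


avg_epoch_seconds = 922.581482553482  # measured average reported above

for num_epochs in (10, 80):
    hours = estimate_training_hours(avg_epoch_seconds, num_epochs)
    print(f"{num_epochs} epochs: about {hours:.2f} hours per trial")
```

Multiplying by the number of trials gives the overall compute budget for the experiments.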

In [1]:
import torch.optim as optim
from modules import *
from tqdm import tqdm
from time import time
from sklearn.metrics import accuracy_score, confusion_matrix, average_precision_score, roc_auc_score, f1_score

In [2]:
# CUDA for PyTorch
use_cuda = torch.cuda.is_available()
device = torch.device('cuda:0' if use_cuda else 'cpu')
torch.backends.cudnn.benchmark = True

In [3]:
import os
# Create log dir
logdir = hp.logdir + hp.net_variant + '/'
if not os.path.exists(logdir):
    os.makedirs(logdir)

In [8]:
# # import model
# model = Net(num_static, num_dp_codes, num_cp_codes).to(device)
# # Loss function and optimizer
# criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight).to(device)
# optimizer = optim.Adam(model.parameters(), lr = 0.001)  

# print('start to train')

# # Train
# num_epoch = 10
# epoch_times = []


# for epoch in tqdm(range(num_epoch)): 
#     print('-----------------------------------------')
#     print('Epoch: {}'.format(epoch))
#     model.train()
#     time_start = time()
#     for i, (stat, dp, cp, dp_t, cp_t, label) in enumerate(tqdm(trainloader), 0):
#         # move to GPU if available
#         stat  = stat.to(device)
#         dp    = dp.to(device)
#         cp    = cp.to(device)
#         dp_t  = dp_t.to(device)
#         cp_t  = cp_t.to(device)
#         label = label.to(device)

#         # zero the parameter gradients
#         optimizer.zero_grad()

#         # forward + backward + optimize
#         label_pred, _ = model(stat, dp, cp, dp_t, cp_t)
#         loss = criterion(label_pred, label)
#         loss.backward()
#         optimizer.step()
    
#     # timing
#     time_end = time()
#     epoch_times.append(time_end-time_start)

#     # Save
#     print('Saving...')
#     torch.save(model.state_dict(), logdir + 'final_model.pt')
#     np.savez(logdir + 'epoch_times', epoch_times=epoch_times)
#     print('Done')

start to train


  0%|          | 0/10 [00:00<?, ?it/s]

-----------------------------------------
Epoch: 0



  0%|          | 0/318 [00:00<?, ?it/s][A
  0%|          | 1/318 [00:01<09:56,  1.88s/it][A
  1%|          | 2/318 [00:03<09:19,  1.77s/it][A
  1%|          | 3/318 [00:05<08:52,  1.69s/it][A
  1%|▏         | 4/318 [00:06<08:33,  1.64s/it][A
  2%|▏         | 5/318 [00:08<08:42,  1.67s/it][A
  2%|▏         | 6/318 [00:10<08:30,  1.63s/it][A
  2%|▏         | 7/318 [00:11<08:16,  1.60s/it][A
  3%|▎         | 8/318 [00:12<08:00,  1.55s/it][A
  3%|▎         | 9/318 [00:14<07:49,  1.52s/it][A
  3%|▎         | 10/318 [00:15<07:40,  1.50s/it][A
  3%|▎         | 11/318 [00:17<07:54,  1.55s/it][A
  4%|▍         | 12/318 [00:19<07:53,  1.55s/it][A
  4%|▍         | 13/318 [00:20<07:51,  1.55s/it][A
  4%|▍         | 14/318 [00:22<07:48,  1.54s/it][A
  5%|▍         | 15/318 [00:23<07:56,  1.57s/it][A
  5%|▌         | 16/318 [00:25<07:48,  1.55s/it][A
  5%|▌         | 17/318 [00:26<07:32,  1.50s/it][A
  6%|▌         | 18/318 [00:28<07:25,  1.49s/it][A
  6%|▌         | 19/318 [00:2

Saving...
Done
-----------------------------------------
Epoch: 1




Saving...
Done
-----------------------------------------
Epoch: 2




Saving...
Done
-----------------------------------------
Epoch: 3




Saving...
Done
-----------------------------------------
Epoch: 4




Saving...
Done
-----------------------------------------
Epoch: 5




Saving...
Done
-----------------------------------------
Epoch: 6




Saving...
Done
-----------------------------------------
Epoch: 7




Saving...
Done
-----------------------------------------
Epoch: 8




Saving...
Done
-----------------------------------------
Epoch: 9




Saving...
Done





Below is our ablation study, with the attention mechanism removed.

In [10]:
class Net(nn.Module):
    def __init__(self, num_static, num_dp_codes, num_cp_codes):
        super(Net, self).__init__()
      
        # Embedding dimensions
        self.embed_dp_dim = int(np.ceil(num_dp_codes**0.25))+1
        self.embed_cp_dim = int(np.ceil(num_cp_codes**0.25))+1

        # Embedding layers
        self.embed_dp = nn.Embedding(num_embeddings=num_dp_codes, embedding_dim=self.embed_dp_dim, padding_idx=0)
        self.embed_cp = nn.Embedding(num_embeddings=num_cp_codes, embedding_dim=self.embed_cp_dim, padding_idx=0)

        # GRU layers
        self.gru_dp_fw = GRUExponentialDecay(input_size=self.embed_dp_dim, hidden_size=self.embed_dp_dim)
        self.gru_cp_fw = GRUExponentialDecay(input_size=self.embed_cp_dim, hidden_size=self.embed_cp_dim)
        self.gru_dp_bw = GRUExponentialDecay(input_size=self.embed_dp_dim, hidden_size=self.embed_dp_dim)
        self.gru_cp_bw = GRUExponentialDecay(input_size=self.embed_cp_dim, hidden_size=self.embed_cp_dim)

        # Fully connected output
        self.fc_dp  = nn.Linear(2*self.embed_dp_dim, 1)
        self.fc_cp  = nn.Linear(2*self.embed_cp_dim, 1)
        self.fc_all = nn.Linear(num_static + 2, 1)

        # Others
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, stat, dp, cp, dp_t, cp_t):
        # Compute time delta
        ## output dim: batch_size x seq_len
        dp_t_delta_fw = abs_time_to_delta(dp_t)
        cp_t_delta_fw = abs_time_to_delta(cp_t)
        dp_t_delta_bw = abs_time_to_delta(torch.flip(dp_t, [1]))
        cp_t_delta_bw = abs_time_to_delta(torch.flip(cp_t, [1]))    

        # Embedding
        ## output dim: batch_size x seq_len x embedding_dim
        embedded_dp_fw = self.embed_dp(dp)
        embedded_cp_fw = self.embed_cp(cp)
        embedded_dp_bw = torch.flip(embedded_dp_fw, [1])
        embedded_cp_bw = torch.flip(embedded_cp_fw, [1])
        ## Dropout
        embedded_dp_fw = self.dropout(embedded_dp_fw)
        embedded_cp_fw = self.dropout(embedded_cp_fw)
        embedded_dp_bw = self.dropout(embedded_dp_bw)
        embedded_cp_bw = self.dropout(embedded_cp_bw)

        # GRU
        ## output dim rnn:        batch_size x seq_len x embedding_dim
        rnn_dp_fw = self.gru_dp_fw(embedded_dp_fw, dp_t_delta_fw)
        rnn_cp_fw = self.gru_cp_fw(embedded_cp_fw, cp_t_delta_fw)
        rnn_dp_bw = self.gru_dp_bw(embedded_dp_bw, dp_t_delta_bw)
        rnn_cp_bw = self.gru_cp_bw(embedded_cp_bw, cp_t_delta_bw)      
        ## output dim rnn_hidden: batch_size x embedding_dim
        rnn_dp_fw = rnn_dp_fw[:,-1,:]
        rnn_cp_fw = rnn_cp_fw[:,-1,:]
        rnn_dp_bw = rnn_dp_bw[:,-1,:]
        rnn_cp_bw = rnn_cp_bw[:,-1,:]
        ## concatenate forward and backward: batch_size x 2*embedding_dim
        rnn_dp = torch.cat((rnn_dp_fw, rnn_dp_bw), dim=-1)
        rnn_cp = torch.cat((rnn_cp_fw, rnn_cp_bw), dim=-1)

        # Scores
        score_dp = self.fc_dp(self.dropout(rnn_dp))
        score_cp = self.fc_cp(self.dropout(rnn_cp))

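        # Concatenate to variable collection (avoid shadowing the built-in `all`)
        all_features = torch.cat((stat, score_dp, score_cp), dim=1)

        # Final linear projection; squeeze only the last dimension so a batch of size 1 keeps its batch axis
        out = self.fc_all(self.dropout(all_features)).squeeze(-1)

        # No attention weights in this ablation, so return an empty list in their place
        return out, []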
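For contrast with the ablation above, which summarizes each sequence by its last hidden state, an attention step would instead take a weighted average over all time steps. The sketch below shows one generic form of attention pooling in PyTorch. The class name `AttentionPoolingSketch`, the single linear scoring layer, and the padding mask are assumptions for illustration, and this is not the attention module from the original repository.

```python
import torch
import torch.nn as nn


class AttentionPoolingSketch(nn.Module):
    """Generic attention pooling: softmax-weighted average of RNN outputs over time."""

    def __init__(self, hidden_dim):
        super().__init__()
        # One scalar relevance score per time step
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, rnn_out, mask):
        # rnn_out: batch_size x seq_len x hidden_dim
        # mask:    batch_size x seq_len, True for real events and False for padding
        #          (assumes every sequence has at least one real event)
        logits = self.score(rnn_out).squeeze(-1)
        logits = logits.masked_fill(~mask, float('-inf'))
        weights = torch.softmax(logits, dim=1)
        # Weighted sum over the time axis: batch_size x hidden_dim
        context = torch.bmm(weights.unsqueeze(1), rnn_out).squeeze(1)
        return context, weights


# Toy example: 2 sequences, 4 time steps, hidden size 6, with padding at the end
torch.manual_seed(0)
rnn_out = torch.randn(2, 4, 6)
mask = torch.tensor([[True, True, True, False],
                     [True, True, False, False]])
pool = AttentionPoolingSketch(hidden_dim=6)
context, weights = pool(rnn_out, mask)
print(context.shape, weights.sum(dim=1))
```

The returned weights are what makes such a model inspectable: they indicate which events contributed most to the pooled representation, which is the information the ablation gives up.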

In [None]:
# model_without_attention = Net(num_static, num_dp_codes, num_cp_codes).to(device)
# # Loss function and optimizer
# criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight).to(device)
# optimizer = optim.Adam(model_without_attention.parameters(), lr = 0.001)  

# print('start to train')

# # Train
# num_epoch = 10
# epoch_times = []


# for epoch in tqdm(range(num_epoch)): 
#     print('-----------------------------------------')
#     print('Epoch: {}'.format(epoch))
#     model_without_attention.train()
#     time_start = time()
#     for i, (stat, dp, cp, dp_t, cp_t, label) in enumerate(tqdm(trainloader), 0):
#         # move to GPU if available
#         stat  = stat.to(device)
#         dp    = dp.to(device)
#         cp    = cp.to(device)
#         dp_t  = dp_t.to(device)
#         cp_t  = cp_t.to(device)
#         label = label.to(device)

#         # zero the parameter gradients
#         optimizer.zero_grad()

#         # forward + backward + optimize
#         label_pred, _ = model_without_attention(stat, dp, cp, dp_t, cp_t)
#         loss = criterion(label_pred, label)
#         loss.backward()
#         optimizer.step()
    
#     # timing
#     time_end = time()
#     epoch_times.append(time_end-time_start)

#     # Save
#     print('Saving...')
#     torch.save(model_without_attention.state_dict(), logdir + 'model_without_attention.pt')
#     np.savez(logdir + 'epoch_times_without_attention', epoch_times=epoch_times)
#     print('Done')

start to train


  0%|          | 0/10 [00:00<?, ?it/s]

-----------------------------------------
Epoch: 0




# Results
In this section, you should finish training your model or load your trained model. That is a great experiment! You should share the results with others, including the necessary metrics and figures.

Please test and report results for all experiments that you run with:

*   specific numbers (accuracy, AUC, RMSE, etc)
*   figures (loss shrinkage, outputs from GAN, annotation or label of sample pictures, etc)


In [8]:
from __future__ import print_function
import torch
import numpy as np
import pandas as pd
import pickle
import scipy.stats as st
from hyperparameters import Hyperparameters as hp
from data_load import *
from modules import *
import os
from tqdm import tqdm
# from train import Net
#import matplotlib.pyplot as plt
from sklearn.metrics import *
from sklearn.calibration import calibration_curve
from pdb import set_trace as bp

The cell below reports the results of the recurrent neural network with the attention mechanism. The cell after it evaluates the same model with the attention mechanism removed, which is our ablation.

In [9]:
from data_load import *
# Load data
print('Load data...')
data = np.load('data/data_arrays.npz')
test_ids_patients = pd.read_pickle('data/test_ids_patients.pkl')

# Patients in test data
patients = test_ids_patients.drop_duplicates()
num_patients = patients.shape[0]
row_ids = pd.DataFrame({'ROW_IDX': test_ids_patients.index}, index=test_ids_patients)

# Vocabulary sizes
num_static = num_static(data)
num_dp_codes, num_cp_codes = vocab_sizes(data)

# CUDA for PyTorch
use_cuda = torch.cuda.is_available()
device = torch.device('cuda:0' if use_cuda else 'cpu')
torch.backends.cudnn.benchmark = True

# Network
net = Net(num_static, num_dp_codes, num_cp_codes).to(device)

print('Evaluate...')
# Set log dir to read trained model from
logdir = 'log/'

# Restore variables from disk
net.load_state_dict(torch.load(logdir + 'final_model.pt', map_location=device), strict=False)

# Bootstrapping
np.random.seed(hp.np_seed)
hp.bootstrap_samples = 2
hp.net_variant = 'birnn_time_decay_attention'
avpre_vec = np.zeros(hp.bootstrap_samples)
auroc_vec = np.zeros(hp.bootstrap_samples)
f1_vec    = np.zeros(hp.bootstrap_samples)
sensitivity_vec = np.zeros(hp.bootstrap_samples)
specificity_vec = np.zeros(hp.bootstrap_samples)
ppv_vec = np.zeros(hp.bootstrap_samples)
npv_vec = np.zeros(hp.bootstrap_samples)
for sample in range(hp.bootstrap_samples):
    print('Bootstrap sample {}'.format(sample))

    # Test data
    sample_patients = patients.sample(n=num_patients, replace=True)
    idx = np.squeeze(row_ids.loc[sample_patients].values)
    testloader, _, _ = get_trainloader(data, 'TEST', shuffle=False, idx=idx)
    
    # evaluate on test data
    net.eval()
    label_pred = torch.Tensor([])
    label_test = torch.Tensor([])
    with torch.no_grad():
      for i, (stat, dp, cp, dp_t, cp_t, label_batch) in enumerate(tqdm(testloader), 0):
        # move to GPU if available
        stat  = stat.to(device)
        dp    = dp.to(device)
        cp    = cp.to(device)
        dp_t  = dp_t.to(device)
        cp_t  = cp_t.to(device)
    
        label_pred_batch, _ = net(stat, dp, cp, dp_t, cp_t)
        label_pred = torch.cat((label_pred, label_pred_batch.cpu()))
        label_test = torch.cat((label_test, label_batch))
        
    label_sigmoids = torch.sigmoid(label_pred).cpu().numpy()
    
    # Average precision
    avpre = average_precision_score(label_test, label_sigmoids)
    
    # Determine AUROC score
    auroc = roc_auc_score(label_test, label_sigmoids)
    
    # Sensitivity, specificity
    fpr, tpr, thresholds = roc_curve(label_test, label_sigmoids)
    youden_idx = np.argmax(tpr - fpr)
    sensitivity = tpr[youden_idx]
    specificity = 1-fpr[youden_idx]
    
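    # Best F1 score over all ROC thresholds (PPV and NPV are not computed here)
    f1 = 0
    for t in thresholds:
      # binarize into a separate variable so the logits in label_pred are not overwritten
      label_bin = (np.array(label_sigmoids) >= t).astype(int)
      f1_temp = f1_score(label_test, label_bin)
      if f1_temp > f1:
        f1 = f1_temp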
    
    # Store in vectors
    avpre_vec[sample] = avpre
    auroc_vec[sample] = auroc
    f1_vec[sample]    = f1
    sensitivity_vec[sample]  = sensitivity
    specificity_vec[sample]  = specificity

avpre_mean = np.mean(avpre_vec)
avpre_lci, avpre_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=avpre_mean, scale=st.sem(avpre_vec))
auroc_mean = np.mean(auroc_vec)
auroc_lci, auroc_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=auroc_mean, scale=st.sem(auroc_vec))
f1_mean = np.mean(f1_vec)
f1_lci, f1_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=f1_mean, scale=st.sem(f1_vec))
sensitivity_mean = np.mean(sensitivity_vec)
sensitivity_lci, sensitivity_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=sensitivity_mean, scale=st.sem(sensitivity_vec))
specificity_mean = np.mean(specificity_vec)
specificity_lci, specificity_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=specificity_mean, scale=st.sem(specificity_vec))

epoch_times = np.load('log/epoch_times.npz')['epoch_times']
times_mean = np.mean(epoch_times)
times_lci, times_uci = st.t.interval(0.95, len(epoch_times)-1, loc=np.mean(epoch_times), scale=st.sem(epoch_times))
times_std = np.std(epoch_times)

print('------------------------------------------------')
print('Net variant: {}'.format(hp.net_variant))
print('Average Precision: {} [{},{}]'.format(round(avpre_mean, 4), round(avpre_lci, 4), round(avpre_uci, 4)))
print('AUROC: {} [{},{}]'.format(round(auroc_mean, 4), round(auroc_lci, 4), round(auroc_uci, 4)))
print('F1: {} [{},{}]'.format(round(f1_mean, 4), round(f1_lci, 4), round(f1_uci, 4)))
print('Sensitivity: {} [{},{}]'.format(round(sensitivity_mean, 4), round(sensitivity_lci, 4), round(sensitivity_uci, 4)))
print('Specificity: {} [{},{}]'.format(round(specificity_mean, 4), round(specificity_lci, 4), round(specificity_uci, 4)))
print('Time: {} [{},{}] std: {}'.format(round(times_mean, 4), round(times_lci, 4), round(times_uci, 4), round(times_std, 4)))
print('Done')

Load data...
Evaluate...
Bootstrap sample 0


100%|██████████| 35/35 [00:17<00:00,  2.03it/s]


Bootstrap sample 1


100%|██████████| 36/36 [00:17<00:00,  2.05it/s]


------------------------------------------------
Net variant: birnn_time_decay_attention
Average Precision: 0.3187 [0.173,0.4645]
AUROC: 0.7369 [0.7069,0.7668]
F1: 0.3828 [0.364,0.4016]
Sensitivity: 0.6341 [-0.053,1.3213]
Specificity: 0.7279 [0.1308,1.325]
Time: 432.8146 [418.6816,446.9475] std: 18.7426
Done
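
With `hp.bootstrap_samples = 2`, the Student-t interval used above has a single degree of freedom, so its critical value is roughly 12.7 and the bounds can fall outside [0, 1], as the sensitivity and specificity intervals do. One common alternative is the percentile bootstrap, which reads the bounds directly off the sorted bootstrap values. The sketch below is a library-free illustration under stated assumptions: `percentile_ci` is a hypothetical helper, and `fake_auroc_vec` is synthetic data standing in for a real `auroc_vec` from many more bootstrap samples.

```python
import random


def percentile_ci(values, alpha=0.05):
    """Return (mean, lower, upper) using the empirical percentile method."""
    ordered = sorted(values)
    n = len(ordered)
    # Simple index-based quantiles for alpha/2 and 1 - alpha/2.
    lo_idx = int((alpha / 2) * (n - 1))
    hi_idx = int(round((1 - alpha / 2) * (n - 1)))
    return sum(ordered) / n, ordered[lo_idx], ordered[hi_idx]


# Synthetic stand-in for per-sample AUROC values (NOT results from this project).
rng = random.Random(0)
fake_auroc_vec = [min(1.0, max(0.0, rng.gauss(0.73, 0.02))) for _ in range(1000)]

mean, lci, uci = percentile_ci(fake_auroc_vec)
print('AUROC: {:.4f} [{:.4f},{:.4f}]'.format(mean, lci, uci))
```

Because the bounds are themselves observed metric values, they always stay inside the valid range of the metric. The method only becomes meaningful with a larger number of bootstrap samples than the two used in this demo.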


The cell below evaluates the ablation model, in which the attention mechanism is removed.

In [11]:
from data_load import *
# Load data
print('Load data...')
data = np.load('data/data_arrays.npz')
test_ids_patients = pd.read_pickle('data/test_ids_patients.pkl')

# Patients in test data
patients = test_ids_patients.drop_duplicates()
num_patients = patients.shape[0]
row_ids = pd.DataFrame({'ROW_IDX': test_ids_patients.index}, index=test_ids_patients)

# Vocabulary sizes
num_static = num_static(data)
num_dp_codes, num_cp_codes = vocab_sizes(data)

# CUDA for PyTorch
use_cuda = torch.cuda.is_available()
device = torch.device('cuda:0' if use_cuda else 'cpu')
torch.backends.cudnn.benchmark = True

# Network
net = Net(num_static, num_dp_codes, num_cp_codes).to(device)

print('Evaluate...')
# Set log dir to read trained model from
logdir = 'log/'

# Restore variables from disk
net.load_state_dict(torch.load(logdir + 'model_without_attention.pt', map_location=device), strict=False)

# Bootstrapping
np.random.seed(hp.np_seed)
hp.bootstrap_samples = 2
hp.net_variant = 'birnn_time_decay'
avpre_vec = np.zeros(hp.bootstrap_samples)
auroc_vec = np.zeros(hp.bootstrap_samples)
f1_vec    = np.zeros(hp.bootstrap_samples)
sensitivity_vec = np.zeros(hp.bootstrap_samples)
specificity_vec = np.zeros(hp.bootstrap_samples)
ppv_vec = np.zeros(hp.bootstrap_samples)
npv_vec = np.zeros(hp.bootstrap_samples)
for sample in range(hp.bootstrap_samples):
    print('Bootstrap sample {}'.format(sample))

    # Test data
    sample_patients = patients.sample(n=num_patients, replace=True)
    idx = np.squeeze(row_ids.loc[sample_patients].values)
    testloader, _, _ = get_trainloader(data, 'TEST', shuffle=False, idx=idx)
    
    # evaluate on test data
    net.eval()
    label_pred = torch.Tensor([])
    label_test = torch.Tensor([])
    with torch.no_grad():
      for i, (stat, dp, cp, dp_t, cp_t, label_batch) in enumerate(tqdm(testloader), 0):
        # move to GPU if available
        stat  = stat.to(device)
        dp    = dp.to(device)
        cp    = cp.to(device)
        dp_t  = dp_t.to(device)
        cp_t  = cp_t.to(device)
    
        label_pred_batch, _ = net(stat, dp, cp, dp_t, cp_t)
        label_pred = torch.cat((label_pred, label_pred_batch.cpu()))
        label_test = torch.cat((label_test, label_batch))
        
    label_sigmoids = torch.sigmoid(label_pred).cpu().numpy()
    
    # Average precision
    avpre = average_precision_score(label_test, label_sigmoids)
    
    # Determine AUROC score
    auroc = roc_auc_score(label_test, label_sigmoids)
    
    # Sensitivity, specificity
    fpr, tpr, thresholds = roc_curve(label_test, label_sigmoids)
    youden_idx = np.argmax(tpr - fpr)
    sensitivity = tpr[youden_idx]
    specificity = 1-fpr[youden_idx]
    
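    # Best F1 score over all ROC thresholds (PPV and NPV are not computed here)
    f1 = 0
    for t in thresholds:
      # binarize into a separate variable so the logits in label_pred are not overwritten
      label_bin = (np.array(label_sigmoids) >= t).astype(int)
      f1_temp = f1_score(label_test, label_bin)
      if f1_temp > f1:
        f1 = f1_temp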
    
    # Store in vectors
    avpre_vec[sample] = avpre
    auroc_vec[sample] = auroc
    f1_vec[sample]    = f1
    sensitivity_vec[sample]  = sensitivity
    specificity_vec[sample]  = specificity

avpre_mean = np.mean(avpre_vec)
avpre_lci, avpre_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=avpre_mean, scale=st.sem(avpre_vec))
auroc_mean = np.mean(auroc_vec)
auroc_lci, auroc_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=auroc_mean, scale=st.sem(auroc_vec))
f1_mean = np.mean(f1_vec)
f1_lci, f1_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=f1_mean, scale=st.sem(f1_vec))
sensitivity_mean = np.mean(sensitivity_vec)
sensitivity_lci, sensitivity_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=sensitivity_mean, scale=st.sem(sensitivity_vec))
specificity_mean = np.mean(specificity_vec)
specificity_lci, specificity_uci = st.t.interval(0.95, hp.bootstrap_samples-1, loc=specificity_mean, scale=st.sem(specificity_vec))

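# This reads the attention model's timing log, so the Time line printed for this variant repeats that model's value;
# the ablation training loop saves its own timings under 'epoch_times_without_attention'.
epoch_times = np.load('log/epoch_times.npz')['epoch_times']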
times_mean = np.mean(epoch_times)
times_lci, times_uci = st.t.interval(0.95, len(epoch_times)-1, loc=np.mean(epoch_times), scale=st.sem(epoch_times))
times_std = np.std(epoch_times)

print('------------------------------------------------')
print('Net variant: {}'.format(hp.net_variant))
print('Average Precision: {} [{},{}]'.format(round(avpre_mean, 4), round(avpre_lci, 4), round(avpre_uci, 4)))
print('AUROC: {} [{},{}]'.format(round(auroc_mean, 4), round(auroc_lci, 4), round(auroc_uci, 4)))
print('F1: {} [{},{}]'.format(round(f1_mean, 4), round(f1_lci, 4), round(f1_uci, 4)))
print('Sensitivity: {} [{},{}]'.format(round(sensitivity_mean, 4), round(sensitivity_lci, 4), round(sensitivity_uci, 4)))
print('Specificity: {} [{},{}]'.format(round(specificity_mean, 4), round(specificity_lci, 4), round(specificity_uci, 4)))
print('Time: {} [{},{}] std: {}'.format(round(times_mean, 4), round(times_lci, 4), round(times_uci, 4), round(times_std, 4)))
print('Done')

Load data...
Evaluate...
Bootstrap sample 0


100%|██████████| 35/35 [00:17<00:00,  2.04it/s]


Bootstrap sample 1


100%|██████████| 36/36 [00:17<00:00,  2.08it/s]


------------------------------------------------
Net variant: birnn_time_decay
Average Precision: 0.2965 [0.2492,0.3437]
AUROC: 0.7028 [0.6876,0.7179]
F1: 0.3398 [0.2863,0.3934]
Sensitivity: 0.6755 [0.0789,1.272]
Specificity: 0.6443 [0.0109,1.2777]
Time: 432.8146 [418.6816,446.9475] std: 18.7426
Done


## Model comparison

Comparing the time-decay BIRNN with an attention layer against the same model without the attention layer, the attention variant scores higher on Average Precision (0.3187 vs. 0.2965), AUROC (0.7369 vs. 0.7028), F1 (0.3828 vs. 0.3398), and Specificity (0.7279 vs. 0.6443), while its Sensitivity is lower (0.6341 vs. 0.6755). With only two bootstrap samples, the confidence intervals of all five metrics overlap, so these differences suggest, but do not establish, that the attention layer improves the model.
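
To make the comparison easier to read, the printed summaries can be lined up side by side. The snippet below is only a convenience sketch: the numbers are copied by hand from the two outputs above, and the dictionary names `with_attention` and `without_attention` are introduced here for illustration.

```python
# Mean values copied from the two printed summaries above.
with_attention = {
    'Average Precision': 0.3187,
    'AUROC': 0.7369,
    'F1': 0.3828,
    'Sensitivity': 0.6341,
    'Specificity': 0.7279,
}
without_attention = {
    'Average Precision': 0.2965,
    'AUROC': 0.7028,
    'F1': 0.3398,
    'Sensitivity': 0.6755,
    'Specificity': 0.6443,
}

# Positive delta means the attention variant scored higher.
deltas = {
    name: round(with_attention[name] - without_attention[name], 4)
    for name in with_attention
}

print('{:<18} {:>10} {:>10} {:>8}'.format('Metric', 'Attention', 'Ablation', 'Delta'))
for name, delta in deltas.items():
    print('{:<18} {:>10.4f} {:>10.4f} {:>+8.4f}'.format(
        name, with_attention[name], without_attention[name], delta))
```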

# Discussion

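We evaluate the models with the following metrics:

* Average Precision: a summary of the precision-recall curve, where precision is the proportion of positive predictions that are actually correct.
* AUROC: the area under the ROC curve.
* F1 score: the harmonic mean of precision and sensitivity (recall); we report the best value over all thresholds.
* Sensitivity: the proportion of actual positives that the model identified correctly.
* Specificity: the proportion of actual negatives that the model identified correctly.
* Time: the average epoch time for training the model.

Sensitivity and specificity are reported at the threshold that maximizes `tpr - fpr` (Youden's index).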
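
As a rough illustration of the threshold-based definitions above, the following sketch computes sensitivity, specificity, precision, and F1 from confusion-matrix counts. The toy labels and the helper names `confusion_counts` and `threshold_metrics` are made up for this example; the notebook itself relies on `sklearn.metrics` for these quantities.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def threshold_metrics(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = safe_div(tp, tp + fn)   # actual positives found
    specificity = safe_div(tn, tn + fp)   # actual negatives found
    precision = safe_div(tp, tp + fp)     # positive predictions that are correct
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {'sensitivity': sensitivity, 'specificity': specificity,
            'precision': precision, 'f1': f1}


# Toy example: 3 positives, 5 negatives, already binarized predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = threshold_metrics(y_true, y_pred)
for name, value in metrics.items():
    print('{}: {:.4f}'.format(name, value))
```

Average Precision and AUROC are not covered here because they summarize performance across all thresholds instead of at a single one.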

In this section, you should discuss your work and make future plans. The discussion should address the following questions:
  * Assess whether the paper is reproducible or not.

This paper is reproducible. The original authors provide GitHub repositories with the trained models, stored weights, and source code, which served as guidance for our reproduction.

  * Explain why it is not reproducible if your results are kind of negative.
  
Our results were not negative. Here is an example with an ODE BIRNN with attention:

Net variant: birnn_time_decay_attention\
Average Precision: 0.2854 [0.2656,0.3052]\
AUROC: 0.7028 [0.5969,0.8088]\
F1: 0.3295 [0.224,0.4351]\
Sensitivity: 0.7282 [0.5413,0.9152]\
Specificity: 0.5893 [0.4974,0.6811]\
Time: 1106.2515 [1103.6993,1108.8037] std: 11.3967

These results are positive and roughly in line with what is expected.

  * Describe “What was easy” and “What was difficult” during the reproduction.
  
The easiest part was training and evaluating the model. The hard part was understanding the code released with the paper and debugging the places where it had problems. Handling the data was also tricky.

  * Make suggestions to the author or other reproducers on how to improve the reproducibility.
  
We suggest that the authors add documentation to the repository.

# References

1.   Barbieri, S., Kemp, J., Perez-Concha, O., Kotwal, S., Gallagher, M., Ritchie, A., & Jorm, L. (2020). Benchmarking Deep Learning Architectures for predicting readmission to the ICU and describing patients-at-risk. Scientific Reports, 10(1). https://doi.org/10.1038/s41598-020-58053-z 



# Feel free to add new sections