# Before you use this template

This is a recommended template for the project report. It covers only the general type of research in our paper pool, so feel free to edit it to better fit your project. You will iteratively update the same notebook submission for your draft and final submission. Please check the project rubrics to get a sense of what is expected in the template.

---

# FAQ and Notes
* Copy this template to your Google Drive and rename the notebook with your team ID (upper-left corner). Don't edit the original file.
* This template covers most of the questions we want to ask about your reproduction experiment. You don't need to follow it exactly, but you should address the questions. Feel free to customize your report accordingly.
* Every report must have runnable code and the necessary annotations (in text and code comments).
* The notebook is like a demo and only uses small data (a subset of the original data or processed data). The entire runtime of the notebook, including data reading, data processing, model training, printing, figure plotting, etc., must be within 8 min; otherwise, you may be penalized on the grade.
  * If the raw dataset is too large to load, you can select a subset and pre-process it, then upload the subset or processed data to Google Drive and load it in this notebook.
  * If the full training takes too long to run, you can set the number of training epochs to a small number, e.g., 3, just to show that the training is runnable.
  * For model validation results, you can train the model outside this notebook in advance, then load the pretrained model and use it for validation (display the figures, print the metrics).
* Post-processing is important! For post-processing of the results, please use plots/figures. The code to summarize results and plot figures may be tedious, but it won't be a waste of time, since these figures can be used for the presentation. When plotting in code, the figures should have titles or captions if necessary (e.g., title your figure with "Figure 1. xxxx").
* There is no page limit for your notebook report. You can also use separate notebooks for the report; just make sure your grader can access and run/test them.
* If you use outside resources, please cite them (in any format). Include links to the resources if necessary.

# Mount Notebook to Google Drive
Upload the data, pretrained model, figures, etc. to your Google Drive, then mount Google Drive in this notebook. After that, you can access the resources freely.

Instruction: https://colab.research.google.com/notebooks/io.ipynb

Example: https://colab.research.google.com/drive/1srw_HFWQ2SMgmWIawucXfusGzrj1_U0q

Video: https://www.youtube.com/watch?v=zc8g8lGcwQU

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Introduction



The analysis of clinical notes to predict patient outcomes is a critical area in healthcare informatics. Current methodologies often struggle with the inherent multi-level sequential structure and temporal spacing of clinical notes. This project aims to reproduce the study conducted by Zhang et al. [1], which addresses these challenges using a novel transformer-based hierarchical architecture, termed FTL-Trans, for predicting patient health states from clinical notes.

# Scope of Reproducibility:

List hypotheses from the paper you will test and the corresponding experiments you will run.


1.   Hypothesis 1: The FTL-Trans model will outperform baseline model BERT [2] in predicting patient outcomes using clinical notes.
2.   Hypothesis 2: Incorporating both sequential and temporal information will enhance the model's predictive accuracy and AUROC scores.
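
Both hypotheses are compared on accuracy and AUROC. As a minimal sketch of how the two metrics can be computed from patient-level labels and scores, the helpers below use only the standard library. The function names and the toy numbers are assumptions made for illustration; they are not results from this project or from the paper.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented purely for illustration (not results).
toy_labels = [1, 0, 1, 0]
toy_model_a = [0.9, 0.4, 0.3, 0.2]
toy_model_b = [0.8, 0.1, 0.7, 0.3]
print(accuracy(toy_labels, toy_model_a), auroc(toy_labels, toy_model_a))
print(accuracy(toy_labels, toy_model_b), auroc(toy_labels, toy_model_b))
```

The pairwise definition is quadratic in the number of examples, which is fine for a small demo; for larger evaluations a library routine would be the more practical choice.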



In [None]:
# no code is required for this section
'''
If you want to use an image outside this notebook for explanation,
you can upload it to your Google Drive and show it with OpenCV or matplotlib.
'''
# mount this notebook to your Google Drive
drive.mount('/content/gdrive')

# define the path to the image in your workspace
img_dir = '/content/gdrive/My Drive/Colab Notebooks/<path-to-your-image>'

import cv2
# cv2.imshow is not supported in Colab; use the patched helper instead
from google.colab.patches import cv2_imshow

img = cv2.imread(img_dir)
cv2_imshow(img)


# Methodology

The methodology is the core of your project. It consists of runnable code with the necessary annotations to show the experiments you executed to test the hypotheses.

The methodology at least contains two subsections **data** and **model** in your experiment.

In [None]:
# import the packages you need
import argparse
import os
import numpy as np
import pandas as pd
from google.colab import drive


##  Data
Data includes raw data (MIMIC III tables), descriptive statistics (our homework questions), and data processing (feature engineering).
  * Source of the data: where the data is collected from; if data is synthetic or self-generated, explain how. If possible, please provide a link to the raw datasets.
  * Statistics: include basic descriptive statistics of the dataset like size, cross validation split, label distribution, etc.
  * Data process: how you manipulate the data, e.g., changing the class labels, splitting the dataset into train/valid/test, refining the dataset.
  * Illustration: printing results, plotting figures for illustration.
  * You can upload your raw dataset to Google Drive and mount this Colab to the same directory. If your raw dataset is too large, you can upload the processed dataset and have a code to load the processed dataset.

Source of the data: The data is extracted from the MIMIC III database [3]. This database contains de-identified health data associated with over 40,000 patients who stayed in intensive care units of the Beth Israel Deaconess Medical Center in Boston, MA, between 2001 and 2012.

Statistics: The database includes 2,083,180 notes across 15 categories. The data covers more than 40,000 patients.

Data process: I use the MicrobiologyEvents table and label tests with organism code 80002 (Escherichia coli) as positive and all others as negative. I then look up the notes of those patients in the NoteEvents table. The clinical notes undergo punctuation removal and conversion to lowercase to standardize the text. Any names or private information relating to medical staff or patients are removed. Notes missing a 'charttime' are assigned a time of 23:59:59 on their respective 'chartdate'. The notes are tokenized with WordPiece and divided into segments of 128 tokens each.
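
The cleaning steps described above can be sketched with the standard library alone. This is an illustrative outline, not the project's actual `preprocessing` function: the helper names are hypothetical, whitespace splitting stands in for WordPiece tokenization, and the timestamp format (`YYYY-MM-DD HH:MM:SS`) is an assumption about how the dates are stored.

```python
import re
import string
from datetime import datetime

CHUNK_SIZE = 128  # tokens per segment, as stated in the text


def clean_text(text):
    """Lowercase the note, drop punctuation, and collapse repeated whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


def fill_charttime(chartdate, charttime):
    """Return the note timestamp, defaulting to 23:59:59 on the chart date.

    A missing charttime is represented here by None or an empty string.
    """
    if charttime:
        return datetime.strptime(charttime, "%Y-%m-%d %H:%M:%S")
    day = datetime.strptime(chartdate[:10], "%Y-%m-%d")
    return day.replace(hour=23, minute=59, second=59)


def chunk_tokens(tokens, size=CHUNK_SIZE):
    """Split a token list into consecutive segments of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


# Whitespace tokens are only a stand-in for WordPiece tokens in this sketch.
tokens = clean_text("Pt. STABLE,  no fever!").split()
segments = chunk_tokens(tokens)
```

The last segment of a note is usually shorter than 128 tokens, so a later step has to pad it to the fixed sequence length.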

In [None]:
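def label(row):
    # Positive (1) when the organism is Escherichia coli (ORG_ITEMID 80002)
    if row['ORG_ITEMID'] == 80002:
        return 1
    return 0

def main():
    data1 = pd.read_csv('./MICROBIOLOGYEVENTS.csv')
    data2 = pd.read_csv('./NOTEEVENTS.csv')

    # Keep only the join key and organism code from the microbiology table so the
    # shared columns (ROW_ID, HADM_ID, CHARTDATE, CHARTTIME) are taken from
    # NOTEEVENTS without _x/_y suffixes after the merge.
    output1 = pd.merge(data1[['SUBJECT_ID', 'ORG_ITEMID']], data2,
                   on='SUBJECT_ID',
                   how='inner')

    output1['Label'] = output1.apply(label, axis=1)
    df1 = output1[['HADM_ID','ROW_ID','CHARTDATE','CHARTTIME','TEXT','Label']]
    df1.to_csv('./newdata.csv', index=False)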

    parser = argparse.ArgumentParser()

    parser.add_argument("--original_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input data file path."
                             " Should be the .tsv file (or other data file) for the task.")
    parser.add_argument("--output_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The output directory where the processed data will be written.")
    parser.add_argument("--temp_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The output directory where the intermediate processed data will be written.")
    parser.add_argument("--task_name",
                        default=None,
                        type=str,
                        required=True,
                        help="The name of the task.")
    parser.add_argument("--log_path",
                        default=None,
                        type=str,
                        required=True,
                        help="The log file path.")
    parser.add_argument("--id_num_neg",
                        default=None,
                        type=int,
                        required=True,
                        help="The number of admission ids that we want to use for negative category.")
    parser.add_argument("--id_num_pos",
                        default=None,
                        type=int,
                        required=True,
                        help="The number of admission ids that we want to use for positive category.")
    parser.add_argument("--random_seed",
                        default=1,
                        type=int,
                        required=True,
                        help="The random_seed for train/val/test split.")
    parser.add_argument("--bert_model",
                        default="bert-base-uncased",
                        type=str,
                        required=True,
                        help="Bert pre-trained model selected in the list: bert-base-uncased, "
                             "bert-large-uncased, bert-base-cased, bert-base-multilingual, bert-base-chinese.")

    ## Other parameters
    parser.add_argument("--Kfold",
                        default=None,
                        type=int,
                        required=False,
                        help="The number of folds that we want to use for cross validation. "
                             "Default is not doing cross validation")

    args = parser.parse_args()
    RANDOM_SEED = args.random_seed
    LOG_PATH = args.log_path
    TEMP_DIR = args.temp_dir

    if os.path.exists(TEMP_DIR) and os.listdir(TEMP_DIR):
        raise ValueError("Temp Output directory ({}) already exists and is not empty.".format(TEMP_DIR))
    os.makedirs(TEMP_DIR, exist_ok=True)

    if os.path.exists(args.output_dir) and os.listdir(args.output_dir):
        raise ValueError("Output directory ({}) already exists and is not empty.".format(args.output_dir))
    os.makedirs(args.output_dir, exist_ok=True)

    original_df = pd.read_csv(args.original_data, header=None)
    original_df.rename(columns={0: "Adm_ID",
                                1: "Note_ID",
                                2: "chartdate",
                                3: "charttime",
                                4: "TEXT",
                                5: "Label"}, inplace=True)

    tokenizer = BertTokenizer.from_pretrained(args.bert_model, do_lower_case=True)

    write_log(("New Pre-processing Job Start! \n"
               "original_data: {}, output_dir: {}, temp_dir: {} \n"
               "task_name: {}, log_path: {}\n"
               "id_num_neg: {}, id_num_pos: {}\n"
               "random_seed: {}, bert_model: {}").format(args.original_data, args.output_dir, args.temp_dir,
                                                         args.task_name, args.log_path,
                                                         args.id_num_neg, args.id_num_pos,
                                                         args.random_seed, args.bert_model), LOG_PATH)

    for i in range(int(np.ceil(len(original_df) / 10000))):
        write_log("chunk {} tokenize start!".format(i), LOG_PATH)
        df_chunk = original_df.iloc[i * 10000:(i + 1) * 10000].copy()
        df_processed_chunk = preprocessing(df_chunk, tokenizer)
        df_processed_chunk = df_processed_chunk.astype({'Adm_ID': 'int64', 'Note_ID': 'int64', 'Label': 'int64'}, errors='ignore')
        temp_file_dir = os.path.join(TEMP_DIR, 'Processed_{}.csv'.format(i))
        df_processed_chunk.to_csv(temp_file_dir, index=False)

    df = pd.DataFrame({'Adm_ID': [], 'Note_ID': [], 'TEXT': [], 'Input_ID': [],
                       'Label': [], 'chartdate': [], 'charttime': []})
    for i in range(int(np.ceil(len(original_df) / 10000))):
        temp_file_dir = os.path.join(TEMP_DIR, 'Processed_{}.csv'.format(i))
        df_chunk = pd.read_csv(temp_file_dir, header=0)
        write_log("chunk {} has {} notes".format(i, len(df_chunk)), LOG_PATH)
        df_tmp = pd.concat([df, df_chunk], ignore_index=True)
        df = df_tmp
        del df_tmp


    result = df.Label.value_counts()
    write_log(
        "In the full dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}".format(result[1],
                                                                                          result[0]),
        LOG_PATH)

    dead_ID = pd.Series(df[df.Label == 1].Adm_ID.unique())
    not_dead_ID = pd.Series(df[df.Label == 0].Adm_ID.unique())
    write_log("Total Positive Patients' ids: {}, Total Negative Patients' ids: {}".format(len(dead_ID), len(not_dead_ID)), LOG_PATH)

    not_dead_ID_use = not_dead_ID.sample(n=args.id_num_neg, random_state=RANDOM_SEED)
    dead_ID_use = dead_ID.sample(n=args.id_num_pos, random_state=RANDOM_SEED)

    if args.Kfold is None:
        id_val_test_t = dead_ID_use.sample(frac=0.2, random_state=RANDOM_SEED)
        id_val_test_f = not_dead_ID_use.sample(frac=0.2, random_state=RANDOM_SEED)

        id_train_t = dead_ID_use.drop(id_val_test_t.index)
        id_train_f = not_dead_ID_use.drop(id_val_test_f.index)

        id_val_t = id_val_test_t.sample(frac=0.5, random_state=RANDOM_SEED)
        id_test_t = id_val_test_t.drop(id_val_t.index)
        id_val_f = id_val_test_f.sample(frac=0.5, random_state=RANDOM_SEED)
        id_test_f = id_val_test_f.drop(id_val_f.index)

        id_test = pd.concat([id_test_t, id_test_f])
        test_id_label = pd.DataFrame(data=list(zip(id_test, [1] * len(id_test_t) + [0] * len(id_test_f))),
                                     columns=['id', 'label'])

        id_val = pd.concat([id_val_t, id_val_f])
        val_id_label = pd.DataFrame(data=list(zip(id_val, [1] * len(id_val_t) + [0] * len(id_val_f))),
                                    columns=['id', 'label'])

        id_train = pd.concat([id_train_t, id_train_f])
        train_id_label = pd.DataFrame(data=list(zip(id_train, [1] * len(id_train_t) + [0] * len(id_train_f))),
                                      columns=['id', 'label'])

        mortality_train = df[df.Adm_ID.isin(train_id_label.id)]
        mortality_val = df[df.Adm_ID.isin(val_id_label.id)]
        mortality_test = df[df.Adm_ID.isin(test_id_label.id)]
        mortality_not_use = df[
            (~df.Adm_ID.isin(train_id_label.id)) & (~df.Adm_ID.isin(val_id_label.id) & (~df.Adm_ID.isin(test_id_label.id)))]

        train_result = mortality_train.Label.value_counts()

        val_result = mortality_val.Label.value_counts()

        test_result = mortality_test.Label.value_counts()

        no_result = mortality_not_use.Label.value_counts()

        mortality_train.to_csv(os.path.join(args.output_dir, 'train.csv'), index=False)
        mortality_val.to_csv(os.path.join(args.output_dir, 'val.csv'), index=False)
        mortality_test.to_csv(os.path.join(args.output_dir, 'test.csv'), index=False)
        mortality_not_use.to_csv(os.path.join(args.output_dir, 'not_use.csv'), index=False)
        df.to_csv(os.path.join(args.output_dir, 'full.csv'), index=False)

        if len(no_result) == 2:
            write_log(("In the train dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                       "In the validation dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                       "In the test dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                       "In the not use dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}").format(
                train_result[1],
                train_result[0],
                val_result[1],
                val_result[0],
                test_result[1],
                test_result[0],
                no_result[1],
                no_result[0]),
                LOG_PATH)
        else:
            try:
                write_log(("In the train dataset Positive Patients' Notes: {}, Negative  Patients' Notes: {}\n"
                           "In the validation dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the test dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the not use dataset Negative Patients' Notes: {}").format(train_result[1],
                                                                                          train_result[0],
                                                                                          val_result[1],
                                                                                          val_result[0],
                                                                                          test_result[1],
                                                                                          test_result[0],
                                                                                          no_result[0]),
                          LOG_PATH)
            except KeyError:
                write_log(("In the train dataset Positive Patients' Notes: {}, Negative  Patients' Notes: {}\n"
                           "In the validation dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the test dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the not use dataset Positive Patients' Notes: {}").format(train_result[1],
                                                                                          train_result[0],
                                                                                          val_result[1],
                                                                                          val_result[0],
                                                                                          test_result[1],
                                                                                          test_result[0],
                                                                                          no_result[1]),
                          LOG_PATH)

        write_log("Data saved in the {}".format(args.output_dir), LOG_PATH)
    else:
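        # Recent scikit-learn requires keyword arguments and rejects random_state when shuffle=False
        folds_t = KFold(n_splits=args.Kfold, shuffle=False)
        folds_f = KFold(n_splits=args.Kfold, shuffle=False)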
        dead_ID_use.reset_index(inplace=True, drop=True)
        not_dead_ID_use.reset_index(inplace=True, drop=True)
        for num, ((train_t, test_t), (train_f, test_f)) in enumerate(zip(folds_t.split(dead_ID_use),
                                                                         folds_f.split(not_dead_ID_use))):
            id_train_t = dead_ID_use[train_t]
            id_val_test_t = dead_ID_use[test_t]
            id_train_f = not_dead_ID_use[train_f]
            id_val_test_f = not_dead_ID_use[test_f]
            id_val_t = id_val_test_t.sample(frac=0.5, random_state=RANDOM_SEED)
            id_test_t = id_val_test_t.drop(id_val_t.index)
            id_val_f = id_val_test_f.sample(frac=0.5, random_state=RANDOM_SEED)
            id_test_f = id_val_test_f.drop(id_val_f.index)

            id_test = pd.concat([id_test_t, id_test_f])
            test_id_label = pd.DataFrame(data=list(zip(id_test, [1] * len(id_test_t) + [0] * len(id_test_f))),
                                         columns=['id', 'label'])

            id_val = pd.concat([id_val_t, id_val_f])
            val_id_label = pd.DataFrame(data=list(zip(id_val, [1] * len(id_val_t) + [0] * len(id_val_f))),
                                        columns=['id', 'label'])

            id_train = pd.concat([id_train_t, id_train_f])
            train_id_label = pd.DataFrame(data=list(zip(id_train, [1] * len(id_train_t) + [0] * len(id_train_f))),
                                          columns=['id', 'label'])

            mortality_train = df[df.Adm_ID.isin(train_id_label.id)]
            mortality_val = df[df.Adm_ID.isin(val_id_label.id)]
            mortality_test = df[df.Adm_ID.isin(test_id_label.id)]
            mortality_not_use = df[
                (~df.Adm_ID.isin(train_id_label.id)) & (
                            ~df.Adm_ID.isin(val_id_label.id) & (~df.Adm_ID.isin(test_id_label.id)))]

            train_result = mortality_train.Label.value_counts()

            val_result = mortality_val.Label.value_counts()

            test_result = mortality_test.Label.value_counts()

            no_result = mortality_not_use.Label.value_counts()

            os.makedirs(os.path.join(args.output_dir, str(num)))
            mortality_train.to_csv(os.path.join(args.output_dir, str(num), 'train.csv'), index=False)
            mortality_val.to_csv(os.path.join(args.output_dir, str(num), 'val.csv'), index=False)
            mortality_test.to_csv(os.path.join(args.output_dir, str(num), 'test.csv'), index=False)
            mortality_not_use.to_csv(os.path.join(args.output_dir, str(num), 'not_use.csv'), index=False)
            df.to_csv(os.path.join(args.output_dir, str(num), 'full.csv'), index=False)

            if len(no_result) == 2:
                write_log(("In the {}th split of {} folds\n"
                           "In the train dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the validation dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the test dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                           "In the not use dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}").format(
                    num,
                    args.Kfold,
                    train_result[1],
                    train_result[0],
                    val_result[1],
                    val_result[0],
                    test_result[1],
                    test_result[0],
                    no_result[1],
                    no_result[0]),
                    LOG_PATH)
            else:
                try:
                    write_log(("In the {}th split of {} folds\n"
                               "In the train dataset Positive Patients' Notes: {}, Negative  Patients' Notes: {}\n"
                               "In the validation dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                               "In the test dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                               "In the not use dataset Negative Patients' Notes: {}").format(num,
                                                                                              args.Kfold,
                                                                                              train_result[1],
                                                                                              train_result[0],
                                                                                              val_result[1],
                                                                                              val_result[0],
                                                                                              test_result[1],
                                                                                              test_result[0],
                                                                                              no_result[0]),
                              LOG_PATH)
                except KeyError:
                    write_log(("In the {}th split of {} folds\n"
                               "In the train dataset Positive Patients' Notes: {}, Negative  Patients' Notes: {}\n"
                               "In the validation dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                               "In the test dataset Positive Patients' Notes: {}, Negative Patients' Notes: {}\n"
                               "In the not use dataset Positive Patients' Notes: {}").format(num,
                                                                                             args.Kfold,
                                                                                             train_result[1],
                                                                                             train_result[0],
                                                                                             val_result[1],
                                                                                             val_result[0],
                                                                                             test_result[1],
                                                                                             test_result[0],
                                                                                             no_result[1]),
                              LOG_PATH)

            write_log("Data saved in the {}".format(os.path.join(args.output_dir, str(num))), LOG_PATH)
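
When `--Kfold` is not set, the cell above splits at the admission level: 20% of the sampled IDs in each class are held out and then halved into validation and test, which gives roughly an 80/10/10 split per class. The sketch below mirrors that logic with the standard library. The function name and the toy ID ranges are hypothetical, and `random.Random` will not reproduce the exact draws of `pandas.Series.sample`.

```python
import random


def split_ids(ids, seed=1, holdout_frac=0.2):
    """Shuffle ids and split them into train/val/test (80/10/10 by default)."""
    ids = list(ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_holdout = round(len(ids) * holdout_frac)
    holdout, train = ids[:n_holdout], ids[n_holdout:]
    n_val = round(len(holdout) * 0.5)  # half of the held-out ids go to validation
    return train, holdout[:n_val], holdout[n_val:]


# Toy admission ids, invented for illustration only.
pos_ids = range(100, 120)  # 20 positive admissions
neg_ids = range(200, 280)  # 80 negative admissions

# Split each class separately so every subset keeps the class ratio.
train_p, val_p, test_p = split_ids(pos_ids)
train_n, val_n, test_n = split_ids(neg_ids)
train, val, test = train_p + train_n, val_p + val_n, test_p + test_n
```

Splitting by admission ID rather than by note keeps all notes of one admission in the same subset, so no admission contributes notes to both training and evaluation.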


##   Model
The model section includes the model definition (usually a class), model training, and other necessary parts.
  * Model architecture: layer number/size/type, activation function, etc
  * Training objectives: loss function, optimizer, weight of each loss term, etc
  * Others: whether the model is pretrained, Monte Carlo simulation for uncertainty analysis, etc
  * The code of model should have classes of the model, functions of model training, model validation, etc.
  * If your model training is done outside of this notebook, please upload the trained model here and develop a function to load and test it.

The base model is BERT [2].
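
Because each note is split into chunks, chunk-level predictions have to be combined into one score per admission. The `get_patient_score` function in the next cell does this with a scaled adjusted mean controlled by the parameter `c`: score = (max + sum / c) / (1 + n / c), where n is the number of chunks. The sketch below restates that formula for a single admission in plain Python. The function name is hypothetical, and treating the inputs as probabilities (rather than raw logits) is an assumption based on the 0.5 threshold used in the notebook code.

```python
def patient_score(chunk_probs, c):
    """Combine chunk-level probabilities into one patient-level score.

    score = (max + sum / c) / (1 + n / c), where n is the number of chunks.
    A small c pulls the score toward the mean; a large c pulls it toward the max.
    """
    n = len(chunk_probs)
    if n == 0:
        raise ValueError("at least one chunk probability is required")
    return (max(chunk_probs) + sum(chunk_probs) / c) / (1 + n / c)


# Toy chunk probabilities, invented for illustration only.
score = patient_score([0.2, 0.8, 0.6], c=2)
predicted_label = int(score >= 0.5)
```

With a single chunk the formula reduces to that chunk's probability, so short admissions are not penalized by the aggregation.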

In [None]:
def get_patient_score(df, c):
    df_sort = df.sort_values(by=['Adm_ID'])
    # score
    temp = (df_sort.groupby(['Adm_ID'])['logits'].agg(max) + df_sort.groupby(['Adm_ID'])['logits'].agg(
        sum) / c) / (1 + df_sort.groupby(['Adm_ID'])['logits'].agg(len) / c)
    x = df_sort.groupby(['Adm_ID'])['label'].agg(np.min).values
    predictions = (temp.values >= 0.5).astype(int)  # np.int was removed in NumPy 1.24
    ids = df_sort['Adm_ID'].unique()
    df_out = pd.DataFrame({'logits': temp.values, 'pred_label': predictions, 'label': x, 'Adm_ID': ids})
    return df_out


def test_func(sublist):
    if sublist.shape == ():  # compare by value; `is ()` relies on object identity
        return [sublist.tolist()]
    else:
        return sublist


def main():
    parser = argparse.ArgumentParser()
    ## Required parameters
    parser.add_argument("--data_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The input data dir. Should contain the .tsv files (or other data files) for the task.")

    parser.add_argument("--train_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input training data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--val_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input validation data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--test_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input test data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--log_path",
                        default=None,
                        type=str,
                        required=True,
                        help="The log file path.")

    parser.add_argument("--output_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The output directory where the model checkpoints will be written.")

    parser.add_argument("--save_model",
                        default=False,
                        action='store_true',
                        help="Whether to save the model.")

    parser.add_argument("--bert_model",
                        default="bert-base-uncased",
                        type=str,
                        required=True,
                        help="Bert pre-trained model selected in the list: bert-base-uncased, "
                             "bert-large-uncased, bert-base-cased, bert-base-multilingual, bert-base-chinese.")

    parser.add_argument("--embed_mode",
                        default=None,
                        type=str,
                        required=True,
                        help="The embedding type selected in the list: all, note, chunk, no.")

    parser.add_argument("--c",
                        type=float,
                        required=True,
                        help="The parameter c for scaled adjusted mean method")

    parser.add_argument("--task_name",
                        default="BERT_mortality_am",
                        type=str,
                        required=True,
                        help="The name of the task.")

    ## Other parameters
    parser.add_argument("--max_seq_length",
                        default=128,
                        type=int,
                        help="The maximum total input sequence length after WordPiece tokenization. \n"
                             "Sequences longer than this will be truncated, and sequences shorter \n"
                             "than this will be padded.")
    parser.add_argument("--max_chunk_num",
                        default=64,
                        type=int,
                        help="The maximum total input chunk numbers after WordPiece tokenization.")
    parser.add_argument("--train_batch_size",
                        default=1,
                        type=int,
                        help="Total batch size for training.")
    parser.add_argument("--eval_batch_size",
                        default=1,
                        type=int,
                        help="Total batch size for eval.")
    parser.add_argument("--learning_rate",
                        default=2e-5,
                        type=float,
                        help="The initial learning rate for Adam.")
    parser.add_argument("--warmup_proportion",
                        default=0.0,
                        type=float,
                        help="Proportion of training to perform linear learning rate warmup for. "
                             "E.g., 0.1 = 10%% of training.")
    parser.add_argument("--num_train_epochs",
                        default=3,
                        type=int,
                        help="Total number of training epochs to perform.")
    parser.add_argument('--seed',
                        type=int,
                        default=42,
                        help="random seed for initialization")
    parser.add_argument('--gradient_accumulation_steps',
                        type=int,
                        default=1,
                        help="Number of update steps to accumulate before performing a backward/update pass.")

    args = parser.parse_args()

    if os.path.exists(args.output_dir) and os.listdir(args.output_dir) and args.save_model:
        raise ValueError("Output directory ({}) already exists and is not empty.".format(args.output_dir))
    os.makedirs(args.output_dir, exist_ok=True)

    LOG_PATH = args.log_path
    MAX_LEN = args.max_seq_length

    config = DotMap()
    config.hidden_dropout_prob = 0.1
    config.layer_norm_eps = 1e-12
    config.initializer_range = 0.02
    config.max_note_position_embedding = 1000
    config.max_chunk_position_embedding = 1000
    config.embed_mode = args.embed_mode
    config.hidden_size = 768

    config.task_name = args.task_name

    write_log(("New Job Start! \n"
               "Data directory: {}, Directory Code: {}, Save Model: {}\n"
               "Output_dir: {}, Task Name: {}, embed_mode: {}\n"
               "max_seq_length: {},  max_chunk_num: {}\n"
               "train_batch_size: {}, eval_batch_size: {}\n"
               "learning_rate: {}, warmup_proportion: {}\n"
               "num_train_epochs: {}, seed: {}, gradient_accumulation_steps: {}").format(args.data_dir,
                                                                                         args.data_dir.split('_')[-1],
                                                                                         args.save_model,
                                                                                         args.output_dir,
                                                                                         config.task_name,
                                                                                         config.embed_mode,
                                                                                         args.max_seq_length,
                                                                                         args.max_chunk_num,
                                                                                         args.train_batch_size,
                                                                                         args.eval_batch_size,
                                                                                         args.learning_rate,
                                                                                         args.warmup_proportion,
                                                                                         args.num_train_epochs,
                                                                                         args.seed,
                                                                                         args.gradient_accumulation_steps),
              LOG_PATH)

    content = "config setting: \n"
    for k, v in config.items():
        content += "{}: {} \n".format(k, v)
    write_log(content, LOG_PATH)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    n_gpu = torch.cuda.device_count()
    write_log("Number of GPU is {}".format(n_gpu), LOG_PATH)
    for i in range(n_gpu):
        write_log(("Device Name: {}, "
                   "Device Capability: {}").format(torch.cuda.get_device_name(i),
                                                   torch.cuda.get_device_capability(i)), LOG_PATH)

    train_file_path = os.path.join(args.data_dir, args.train_data)
    val_file_path = os.path.join(args.data_dir, args.val_data)
    test_file_path = os.path.join(args.data_dir, args.test_data)
    train_df = pd.read_csv(train_file_path)
    val_df = pd.read_csv(val_file_path)
    test_df = pd.read_csv(test_file_path)

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    if n_gpu > 0:
        torch.cuda.manual_seed_all(args.seed)

    tokenizer = BertTokenizer.from_pretrained(args.bert_model, do_lower_case=True)

    write_log("Tokenize Start!", LOG_PATH)
    train_labels, train_inputs, train_masks, train_note_ids = Tokenize_with_note_id(train_df, MAX_LEN, tokenizer)
    validation_labels, validation_inputs, validation_masks, validation_note_ids = Tokenize_with_note_id(val_df, MAX_LEN,
                                                                                                        tokenizer)
    test_labels, test_inputs, test_masks, test_note_ids = Tokenize_with_note_id(test_df, MAX_LEN, tokenizer)
    write_log("Tokenize Finished!", LOG_PATH)
    train_inputs = torch.tensor(train_inputs)
    validation_inputs = torch.tensor(validation_inputs)
    test_inputs = torch.tensor(test_inputs)
    train_labels = torch.tensor(train_labels)
    validation_labels = torch.tensor(validation_labels)
    test_labels = torch.tensor(test_labels)
    train_masks = torch.tensor(train_masks)
    validation_masks = torch.tensor(validation_masks)
    test_masks = torch.tensor(test_masks)
    write_log(("train dataset size is %d,\n"
               "validation dataset size is %d,\n"
               "test dataset size is %d") % (len(train_inputs), len(validation_inputs), len(test_inputs)), LOG_PATH)

    (train_labels, train_inputs,
     train_masks, train_ids,
     train_note_ids, train_chunk_ids) = concat_by_id_list_with_note_chunk_id(train_df, train_labels,
                                                                             train_inputs, train_masks,
                                                                             train_note_ids, MAX_LEN)
    (validation_labels, validation_inputs,
     validation_masks, validation_ids,
     validation_note_ids, validation_chunk_ids) = concat_by_id_list_with_note_chunk_id(val_df, validation_labels,
                                                                                       validation_inputs,
                                                                                       validation_masks,
                                                                                       validation_note_ids, MAX_LEN)
    (test_labels, test_inputs,
     test_masks, test_ids,
     test_note_ids, test_chunk_ids) = concat_by_id_list_with_note_chunk_id(test_df, test_labels,
                                                                           test_inputs, test_masks,
                                                                           test_note_ids, MAX_LEN)

    model = BertForSequenceClassification.from_pretrained(args.bert_model, num_labels=2)
    model.to(device)
    if n_gpu > 1:
        model = torch.nn.DataParallel(model)
    param_optimizer = list(model.named_parameters())
    no_decay = ['bias', 'gamma', 'beta']
    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
         'weight_decay_rate': 0.01},
        {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
         'weight_decay_rate': 0.0}
    ]
    num_train_steps = int(
        len(train_df) / args.train_batch_size / args.gradient_accumulation_steps * args.num_train_epochs)

    optimizer = BertAdam(optimizer_grouped_parameters,
                         lr=args.learning_rate,
                         warmup=args.warmup_proportion,
                         t_total=num_train_steps)

    m = torch.nn.Softmax(dim=1)

    start = time.time()
    # Store our loss and accuracy for plotting
    train_loss_set = []

    # Number of training epochs (authors recommend between 2 and 4)
    epochs = args.num_train_epochs

    train_batch_generator = mask_batch_generator(args.max_chunk_num, train_inputs, train_labels, train_masks)
    validation_batch_generator = mask_batch_generator(args.max_chunk_num, validation_inputs, validation_labels,
                                                      validation_masks)

    write_log("Training start!", LOG_PATH)
    # trange is a tqdm wrapper around the normal python range
    with torch.autograd.set_detect_anomaly(True):
        for epoch in trange(epochs, desc="Epoch"):
            # Training

            # Set our model to training mode (as opposed to evaluation mode)
            model.train()

            # Tracking variables
            tr_loss = 0
            nb_tr_examples, nb_tr_steps = 0, 0

            # Train the data for one epoch
            tr_ids_num = len(train_ids)
            tr_batch_loss = []
            for step in range(tr_ids_num):
                b_input_ids, b_labels, b_input_mask = next(train_batch_generator)
                b_input_ids = b_input_ids.to(device)
                b_input_mask = b_input_mask.to(device)
                b_labels = b_labels.repeat(b_input_ids.shape[0]).to(device)
                # Forward pass
                outputs = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask, labels=b_labels)
                loss, logits = outputs[:2]
                if n_gpu > 1:
                    loss = loss.mean()  # mean() to average on multi-gpu.
                # Collect this step's loss; the mean over each update batch is recorded below
                tr_batch_loss.append(loss.item())
                # Backward pass
                loss.backward()
                # Update parameters and take a step using the computed gradient
                if (step + 1) % args.train_batch_size == 0:
                    optimizer.step()
                    optimizer.zero_grad()
                    train_loss_set.append(np.mean(tr_batch_loss))
                    tr_batch_loss = []

                # Update tracking variables
                tr_loss += loss.item()
                nb_tr_examples += b_input_ids.size(0)
                nb_tr_steps += 1


            write_log("Train loss: {}".format(tr_loss / nb_tr_steps), LOG_PATH)

            # Validation

            # Put model in evaluation mode to evaluate loss on the validation set
            model.eval()

            # Tracking variables
            eval_loss, eval_accuracy = 0, 0
            nb_eval_steps, nb_eval_examples = 0, 0
            # Evaluate data for one epoch
            ev_ids_num = len(validation_ids)
            for step in range(ev_ids_num):
                with torch.no_grad():
                    b_input_ids, b_labels, b_input_mask = next(validation_batch_generator)
                    b_input_ids = b_input_ids.to(device)
                    b_input_mask = b_input_mask.to(device)
                    b_labels = b_labels.repeat(b_input_ids.shape[0])
                    outputs = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
                    # Move logits and labels to CPU
                    logits = outputs[-1]
                    logits = m(logits).detach().cpu().numpy()[:, 1]
                    label_ids = b_labels.numpy()

                    tmp_eval_accuracy = flat_accuracy(logits, label_ids)

                    eval_accuracy += tmp_eval_accuracy
                    nb_eval_steps += 1

            write_log("Validation Accuracy: {}".format(eval_accuracy / nb_eval_steps), LOG_PATH)
            output_checkpoints_path = os.path.join(args.output_dir,
                                                   "bert_fine_tuned_with_note_checkpoint_%d.pt" % epoch)
            if args.save_model:
                if n_gpu > 1:
                    torch.save({
                        'epoch': epoch,
                        'model_state_dict': model.module.state_dict(),
                        'optimizer_state_dict': optimizer.state_dict(),
                        'loss': loss,
                    },
                        output_checkpoints_path)

                else:
                    torch.save({
                        'epoch': epoch,
                        'model_state_dict': model.state_dict(),
                        'optimizer_state_dict': optimizer.state_dict(),
                        'loss': loss,
                    },
                        output_checkpoints_path)
    end = time.time()

    write_log("total training time is: {}s".format(end - start), LOG_PATH)

    fig1 = plt.figure(figsize=(15, 8))
    plt.title("Training loss")
    plt.xlabel("Chunk Batch")
    plt.ylabel("Loss")
    plt.plot(train_loss_set)
    if args.save_model:
        output_fig_path = os.path.join(args.output_dir, "bert_fine_tuned_with_note_training_loss.png")
        plt.savefig(output_fig_path, dpi=fig1.dpi)
        output_model_state_dict_path = os.path.join(args.output_dir,
                                                    "bert_fine_tuned_with_note_state_dict.pt")
        if n_gpu > 1:
            torch.save(model.module.state_dict(), output_model_state_dict_path)
        else:
            torch.save(model.state_dict(), output_model_state_dict_path)
        write_log("Model saved!", LOG_PATH)
    else:
        output_fig_path = os.path.join(args.output_dir,
                                       "bert_fine_tuned_with_note_training_loss_{}_{}.png".format(
                                           args.seed,
                                           args.data_dir.split(
                                               '_')[-1]))
        plt.savefig(output_fig_path, dpi=fig1.dpi)
        write_log("Model not saved as required", LOG_PATH)

    # Prediction on test set

    # Put model in evaluation mode
    model.eval()

    # Tracking variables
    predictions, true_labels, test_adm_ids = [], [], []

    # Predict
    te_ids_num = len(test_ids)
    for step in range(te_ids_num):
        b_input_ids = test_inputs[step][-args.max_chunk_num:, :].to(device)
        b_input_mask = test_masks[step][-args.max_chunk_num:, :].to(device)
        b_labels = test_labels[step].repeat(b_input_ids.shape[0])
        # Telling the model not to compute or store gradients, saving memory and speeding up prediction
        with torch.no_grad():
            # Forward pass, calculate logit predictions
            outputs = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)

        # Move logits and labels to CPU
        logits = outputs[-1]
        logits = m(logits).detach().cpu().numpy()[:, 1]
        label_ids = b_labels.numpy()
        adm_ids = test_ids[step].repeat(b_input_ids.shape[0])

        # Store predictions and true labels
        predictions.append(logits)
        true_labels.append(label_ids)
        test_adm_ids.append(adm_ids)

    try:
        flat_logits = [item for sublist in predictions for item in sublist]
    except TypeError:
        flat_logits = [item for sublist in predictions for item in test_func(sublist)]
    flat_predictions = (np.array(flat_logits) >= 0.5).astype(int)
    try:
        flat_true_labels = [item for sublist in true_labels for item in sublist]
    except TypeError:
        flat_true_labels = [item for sublist in true_labels for item in test_func(sublist)]
    try:
        flat_adm_ids = [item for sublist in test_adm_ids for item in sublist]
    except TypeError:
        flat_adm_ids = [item for sublist in test_adm_ids for item in test_func(sublist)]

    output_chunk_df = pd.DataFrame({'logits': flat_logits,
                                    'pred_label': flat_predictions,
                                    'label': flat_true_labels,
                                    'Adm_ID': flat_adm_ids})

    if args.save_model:
        output_chunk_df.to_csv(os.path.join(args.output_dir, 'test_chunk_predictions.csv'), index=False)
    else:
        output_chunk_df.to_csv(os.path.join(args.output_dir,
                                            'test_chunk_predictions_{}_{}.csv'.format(args.seed,
                                                                                      args.data_dir.split('_')[-1])),
                               index=False)

    output_df = get_patient_score(output_chunk_df, args.c)
    if args.save_model:
        output_df.to_csv(os.path.join(args.output_dir, 'test_predictions.csv'), index=False)
    else:
        output_df.to_csv(os.path.join(args.output_dir,
                                      'test_predictions_{}_{}.csv'.format(args.seed,
                                                                          args.data_dir.split('_')[-1])),
                         index=False)
    write_performance(output_df['label'].values, output_df['pred_label'].values,
                      output_df['logits'].values, config, args)


The model without time information, used to test hypothesis H2.
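
Both this script and the previous one process one patient per step and call `optimizer.step()` only when `(step + 1) % args.train_batch_size == 0`, so `--train_batch_size` effectively sets how many patients contribute gradients to each update. The sketch below, using a hypothetical helper `update_steps` that is not part of the notebook, lists the steps at which an update would fire and how many patients are left over at the end of an epoch.

```python
def update_steps(num_patients, train_batch_size):
    """Return the 0-based steps at which the loop would call optimizer.step()."""
    return [step for step in range(num_patients)
            if (step + 1) % train_batch_size == 0]


num_patients = 10          # hypothetical epoch size
steps = update_steps(num_patients, train_batch_size=4)
# Patients processed after the last update of the epoch
leftover = num_patients - (steps[-1] + 1) if steps else num_patients

print(steps)     # [3, 7]
print(leftover)  # 2
```

As far as the visible loop shows, gradients from leftover patients are not zeroed at the end of an epoch, so they stay accumulated and are applied with the first update of the next epoch. Choosing a `--train_batch_size` that divides the number of training patients avoids this.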

In [None]:
def main():
    parser = argparse.ArgumentParser()
    ## Required parameters
    parser.add_argument("--data_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The input data dir. Should contain the .tsv files (or other data files) for the task.")

    parser.add_argument("--train_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input training data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--val_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input validation data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--test_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input test data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--log_path",
                        default=None,
                        type=str,
                        required=True,
                        help="The log file path.")

    parser.add_argument("--output_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The output directory where the model checkpoints will be written.")

    parser.add_argument("--save_model",
                        default=False,
                        action='store_true',
                        help="Whether to save the model.")

    parser.add_argument("--bert_model",
                        default="bert-base-uncased",
                        type=str,
                        required=True,
                        help="Bert pre-trained model selected in the list: bert-base-uncased, "
                             "bert-large-uncased, bert-base-cased, bert-base-multilingual, bert-base-chinese.")

    parser.add_argument("--embed_mode",
                        default=None,
                        type=str,
                        required=True,
                        help="The embedding type selected in the list: all, note, chunk, no.")

    parser.add_argument("--task_name",
                        default="Patient_Transformer_with_ClBERT_mortality",
                        type=str,
                        required=True,
                        help="The name of the task.")


    ## Other parameters
    parser.add_argument("--max_seq_length",
                        default=128,
                        type=int,
                        help="The maximum total input sequence length after WordPiece tokenization. \n"
                             "Sequences longer than this will be truncated, and sequences shorter \n"
                             "than this will be padded.")
    parser.add_argument("--max_chunk_num",
                        default=64,
                        type=int,
                        help="The maximum number of input chunks per patient after WordPiece tokenization.")
    parser.add_argument("--train_batch_size",
                        default=1,
                        type=int,
                        help="Total batch size for training.")
    parser.add_argument("--eval_batch_size",
                        default=1,
                        type=int,
                        help="Total batch size for eval.")
    parser.add_argument("--learning_rate",
                        default=2e-5,
                        type=float,
                        help="The initial learning rate for Adam.")
    parser.add_argument("--warmup_proportion",
                        default=0.0,
                        type=float,
                        help="Proportion of training to perform linear learning rate warmup for. "
                             "E.g., 0.1 = 10%% of training.")
    parser.add_argument("--num_train_epochs",
                        default=3,
                        type=int,
                        help="Total number of training epochs to perform.")
    parser.add_argument('--seed',
                        type=int,
                        default=42,
                        help="random seed for initialization")
    parser.add_argument('--gradient_accumulation_steps',
                        type=int,
                        default=1,
                        help="Number of update steps to accumulate before performing a backward/update pass.")
    parser.add_argument('--num_hidden_layers',
                        type=int,
                        default=2,
                        help="Number of hidden layers in the patient model.")
    parser.add_argument('--num_attention_heads',
                        type=int,
                        default=12,
                        help="Number of attention heads in the patient model.")

    args = parser.parse_args()

    if os.path.exists(args.output_dir) and os.listdir(args.output_dir) and args.save_model:
        raise ValueError("Output directory ({}) already exists and is not empty.".format(args.output_dir))
    os.makedirs(args.output_dir, exist_ok=True)

    LOG_PATH = args.log_path
    MAX_LEN = args.max_seq_length

    config = DotMap()
    config.hidden_dropout_prob = 0.1
    config.attention_probs_dropout_prob = 0.1
    config.initializer_range = 0.02
    config.num_hidden_layers = args.num_hidden_layers
    config.num_attention_heads = args.num_attention_heads
    config.max_note_position_embedding = 1000
    config.max_chunk_position_embedding = 1000
    config.embed_mode = args.embed_mode
    config.layer_norm_eps = 1e-12
    config.hidden_act = "gelu"
    config.hidden_size = 768
    config.intermediate_size = 3072

    config.task_name = args.task_name

    write_log(("New Job Start! \n"
               "Data directory: {}, Directory Code: {}, Save Model: {}\n"
               "Output_dir: {}, Task Name: {}, embed_mode: {}\n"
               "max_seq_length: {},  max_chunk_num: {}\n"
               "train_batch_size: {}, eval_batch_size: {}\n"
               "learning_rate: {}, warmup_proportion: {} \n"
               "num_train_epochs: {}, seed: {}, gradient_accumulation_steps: {} \n"
               "Patient Model's num_hidden_layers: {}, num_attention_heads: {}").format(args.data_dir,
                                                                                        args.data_dir.split('_')[-1],
                                                                                        args.save_model,
                                                                                        args.output_dir,
                                                                                        config.task_name,
                                                                                        config.embed_mode,
                                                                                        args.max_seq_length,
                                                                                        args.max_chunk_num,
                                                                                        args.train_batch_size,
                                                                                        args.eval_batch_size,
                                                                                        args.learning_rate,
                                                                                        args.warmup_proportion,
                                                                                        args.num_train_epochs,
                                                                                        args.seed,
                                                                                        args.gradient_accumulation_steps,
                                                                                        config.num_hidden_layers,
                                                                                        config.num_attention_heads),
              LOG_PATH)

    content = "config setting: \n"
    for k, v in config.items():
        content += "{}: {} \n".format(k, v)
    write_log(content, LOG_PATH)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    n_gpu = torch.cuda.device_count()
    write_log("Number of GPU is {}".format(n_gpu), LOG_PATH)
    for i in range(n_gpu):
        write_log(("Device Name: {}, "
                   "Device Capability: {}").format(torch.cuda.get_device_name(i),
                                                   torch.cuda.get_device_capability(i)), LOG_PATH)

    train_file_path = os.path.join(args.data_dir, args.train_data)
    val_file_path = os.path.join(args.data_dir, args.val_data)
    test_file_path = os.path.join(args.data_dir, args.test_data)
    train_df = pd.read_csv(train_file_path)
    val_df = pd.read_csv(val_file_path)
    test_df = pd.read_csv(test_file_path)

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    if n_gpu > 0:
        torch.cuda.manual_seed_all(args.seed)

    tokenizer = BertTokenizer.from_pretrained(args.bert_model, do_lower_case=True)

    write_log("Tokenize Start!", LOG_PATH)
    train_df = reorder_by_time(train_df)
    val_df = reorder_by_time(val_df)
    test_df = reorder_by_time(test_df)
    train_labels, train_inputs, train_masks, train_note_ids = Tokenize_with_note_id(train_df, MAX_LEN, tokenizer)
    validation_labels, validation_inputs, validation_masks, validation_note_ids = Tokenize_with_note_id(val_df, MAX_LEN, tokenizer)
    test_labels, test_inputs, test_masks, test_note_ids = Tokenize_with_note_id(test_df, MAX_LEN, tokenizer)
    write_log("Tokenize Finished!", LOG_PATH)
    train_inputs = torch.tensor(train_inputs)
    validation_inputs = torch.tensor(validation_inputs)
    test_inputs = torch.tensor(test_inputs)
    train_labels = torch.tensor(train_labels)
    validation_labels = torch.tensor(validation_labels)
    test_labels = torch.tensor(test_labels)
    train_masks = torch.tensor(train_masks)
    validation_masks = torch.tensor(validation_masks)
    test_masks = torch.tensor(test_masks)
    write_log(("train dataset size is %d,\n"
               "validation dataset size is %d,\n"
               "test dataset size is %d") % (len(train_inputs), len(validation_inputs), len(test_inputs)), LOG_PATH)

    (train_labels, train_inputs,
     train_masks, train_ids,
     train_note_ids, train_chunk_ids) = concat_by_id_list_with_note_chunk_id(train_df, train_labels,
                                                                             train_inputs, train_masks,
                                                                             train_note_ids, MAX_LEN)
    (validation_labels, validation_inputs,
     validation_masks, validation_ids,
     validation_note_ids, validation_chunk_ids) = concat_by_id_list_with_note_chunk_id(val_df, validation_labels,
                                                                                       validation_inputs,
                                                                                       validation_masks,
                                                                                       validation_note_ids, MAX_LEN)
    (test_labels, test_inputs,
     test_masks, test_ids,
     test_note_ids, test_chunk_ids) = concat_by_id_list_with_note_chunk_id(test_df, test_labels,
                                                                           test_inputs, test_masks,
                                                                           test_note_ids, MAX_LEN)

    model = BertModel.from_pretrained(args.bert_model).to(device)
    patient_model = PatientLevelBertForSequenceClassification(config=config, num_labels=1).to(device)

    if n_gpu > 1:
        model = torch.nn.DataParallel(model)
        patient_model = torch.nn.DataParallel(patient_model)
    param_optimizer = list(model.named_parameters()) + list(patient_model.named_parameters())
    no_decay = ['bias', 'gamma', 'beta']
    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
         'weight_decay_rate': 0.01},
        {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
         'weight_decay_rate': 0.0}
    ]
    num_train_steps = int(
        len(train_labels) / args.gradient_accumulation_steps * args.num_train_epochs)

    optimizer = BertAdam(optimizer_grouped_parameters,
                         lr=args.learning_rate,
                         warmup=args.warmup_proportion,
                         t_total=num_train_steps)

    start = time.time()
    # Store our loss and accuracy for plotting
    train_loss_set = []

    # Number of training epochs (authors recommend between 2 and 4)
    epochs = args.num_train_epochs

    train_batch_generator = time_batch_generator(args.max_chunk_num, train_inputs, train_labels, train_masks,
                                                 train_note_ids, train_chunk_ids)
    validation_batch_generator = time_batch_generator(args.max_chunk_num, validation_inputs, validation_labels,
                                                      validation_masks, validation_note_ids, validation_chunk_ids)
    write_log("Training start!", LOG_PATH)
    # trange is a tqdm wrapper around the normal python range
    with torch.autograd.set_detect_anomaly(True):
        for epoch in trange(epochs, desc="Epoch"):

            # Training

            # Set our model to training mode (as opposed to evaluation mode)
            model.train()
            patient_model.train()

            # Tracking variables
            tr_loss = 0
            nb_tr_examples, nb_tr_steps = 0, 0

            # Train the data for one epoch
            tr_ids_num = len(train_ids)
            for step in range(tr_ids_num):
                b_input_ids, b_labels, b_input_mask, b_note_ids, b_chunk_ids = next(train_batch_generator)
                b_input_ids = b_input_ids.to(device)
                b_input_mask = b_input_mask.to(device)
                b_new_note_ids = convert_note_ids(b_note_ids).to(device)
                b_chunk_ids = b_chunk_ids.unsqueeze(0).to(device)
                b_labels = b_labels.to(device)
                b_labels.resize_((1))
                _, whole_output = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
                whole_input = whole_output.unsqueeze(0)
                b_new_note_ids = b_new_note_ids.unsqueeze(0)
                loss, pred = patient_model(whole_input, b_new_note_ids, b_chunk_ids, b_labels)

                if n_gpu > 1:
                    loss = loss.mean()  # mean() to average on multi-gpu.
                train_loss_set.append(loss.item())
                # Backward pass
                loss.backward()
                # Update parameters and take a step using the computed gradient
                if (step + 1) % args.train_batch_size == 0:
                    optimizer.step()
                    optimizer.zero_grad()

                # Update tracking variables
                tr_loss += loss.item()
                nb_tr_examples += b_input_ids.size(0)
                nb_tr_steps += 1

            write_log("Train loss: {}".format(tr_loss / nb_tr_steps), LOG_PATH)

            # Validation

            # Put model in evaluation mode to evaluate loss on the validation set
            model.eval()
            patient_model.eval()

            # Tracking variables
            eval_loss, eval_accuracy = 0, 0
            nb_eval_steps, nb_eval_examples = 0, 0
            # Evaluate data for one epoch
            ev_ids_num = len(validation_ids)
            for step in range(ev_ids_num):
                with torch.no_grad():
                    b_input_ids, b_labels, b_input_mask, b_note_ids, b_chunk_ids = next(validation_batch_generator)
                    b_input_ids = b_input_ids.to(device)
                    b_input_mask = b_input_mask.to(device)
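                    b_new_note_ids = convert_note_ids(b_note_ids).to(device)
                    # Match the training and test loops: add a batch dimension and move to the device
                    b_chunk_ids = b_chunk_ids.unsqueeze(0).to(device)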
                    b_labels.resize_((1))
                    _, whole_output = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
                    whole_input = whole_output.unsqueeze(0)
                    b_new_note_ids = b_new_note_ids.unsqueeze(0)
                    pred = patient_model(whole_input, b_new_note_ids, b_chunk_ids).detach().cpu().numpy()
                label_ids = b_labels.numpy()
                tmp_eval_accuracy = flat_accuracy(pred, label_ids)
                eval_accuracy += tmp_eval_accuracy
                nb_eval_steps += 1

            write_log("Validation Accuracy: {}".format(eval_accuracy / nb_eval_steps), LOG_PATH)
            output_checkpoints_path = os.path.join(args.output_dir,
                                                   "bert_fine_tuned_with_note_checkpoint_%d.pt" % epoch)

            if args.save_model:
                if n_gpu > 1:
                    torch.save({
                        'epoch': epoch,
                        'model_state_dict': model.module.state_dict(),
                        'patient_model_state_dict': patient_model.module.state_dict(),
                        'optimizer_state_dict': optimizer.state_dict(),
                        'loss': loss,
                    },
                        output_checkpoints_path)
                else:
                    torch.save({
                        'epoch': epoch,
                        'model_state_dict': model.state_dict(),
                        'patient_model_state_dict': patient_model.state_dict(),
                        'optimizer_state_dict': optimizer.state_dict(),
                        'loss': loss,
                    },
                        output_checkpoints_path)
    end = time.time()
    write_log("total training time is: {}s".format(end - start), LOG_PATH)

    fig1 = plt.figure(figsize=(15, 8))
    plt.title("Training loss")
    plt.xlabel("Patient Batch")
    plt.ylabel("Loss")
    plt.plot(train_loss_set)
    if args.save_model:
        output_fig_path = os.path.join(args.output_dir, "bert_fine_tuned_with_note_training_loss.png")
        plt.savefig(output_fig_path, dpi=fig1.dpi)

        output_model_state_dict_path = os.path.join(args.output_dir, "bert_fine_tuned_with_note_state_dict.pt")
        if n_gpu > 1:
            torch.save({
                'model_state_dict': model.module.state_dict(),
                'patient_model_state_dict': patient_model.module.state_dict(),
            },
                output_model_state_dict_path)
        else:
            torch.save({
                'model_state_dict': model.state_dict(),
                'patient_model_state_dict': patient_model.state_dict(),
            },
                output_model_state_dict_path)
        write_log("Model saved!", LOG_PATH)
    else:
        output_fig_path = os.path.join(args.output_dir,
                                       "bert_fine_tuned_with_note_training_loss_{}_{}.png".format(args.seed,
                                                                                                  args.data_dir.split('_')[-1]))
        plt.savefig(output_fig_path, dpi=fig1.dpi)
        write_log("Model not saved as required", LOG_PATH)

    # Prediction on test set

    # Put model in evaluation mode
    model.eval()
    patient_model.eval()

    # Tracking variables
    predictions, true_labels = [], []

    # Predict
    te_ids_num = len(test_ids)
    for step in range(te_ids_num):
        b_input_ids = test_inputs[step][-args.max_chunk_num:, :].to(device)
        b_input_mask = test_masks[step][-args.max_chunk_num:, :].to(device)
        b_note_ids = test_note_ids[step][-args.max_chunk_num:]
        b_new_note_ids = convert_note_ids(b_note_ids).to(device)
        b_chunk_ids = test_chunk_ids[step][-args.max_chunk_num:].unsqueeze(0).to(device)
        b_labels = test_labels[step]
        b_labels.resize_((1))
        with torch.no_grad():
            _, whole_output = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
            whole_input = whole_output.unsqueeze(0)
            b_new_note_ids = b_new_note_ids.unsqueeze(0)
            pred = patient_model(whole_input, b_new_note_ids, b_chunk_ids).detach().cpu().numpy()
        label_ids = b_labels.numpy()[0]
        predictions.append(pred)
        true_labels.append(label_ids)

    # Flatten the per-patient predictions and true labels for evaluation on the whole test set
    flat_logits = [item for sublist in predictions for item in sublist]
    flat_predictions = np.asarray([1 if i else 0 for i in (np.array(flat_logits) >= 0.5)])
    flat_true_labels = np.asarray(true_labels)

    output_df = pd.DataFrame({'pred_prob': flat_logits,
                              'pred_label': flat_predictions,
                              'label': flat_true_labels,
                              'Adm_ID': test_ids})

    if args.save_model:
        output_df.to_csv(os.path.join(args.output_dir, 'test_predictions.csv'), index=False)
    else:
        output_df.to_csv(os.path.join(args.output_dir,
                                      'test_predictions_{}_{}.csv'.format(args.seed,
                                                                          args.data_dir.split('_')[-1])), index=False)

    write_performance(flat_true_labels, flat_predictions, flat_logits, config, args)

The cell below trains and evaluates the FTL-Trans model proposed in this study.

In [None]:
def main():
    parser = argparse.ArgumentParser()
    ## Required parameters
    parser.add_argument("--data_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The input data dir. Should contain the .tsv files (or other data files) for the task.")

    parser.add_argument("--train_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input training data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--val_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input validation data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--test_data",
                        default=None,
                        type=str,
                        required=True,
                        help="The input test data file name."
                             " Should be the .tsv file (or other data file) for the task.")

    parser.add_argument("--log_path",
                        default=None,
                        type=str,
                        required=True,
                        help="The log file path.")

    parser.add_argument("--output_dir",
                        default=None,
                        type=str,
                        required=True,
                        help="The output directory where the model checkpoints will be written.")

    parser.add_argument("--save_model",
                        default=False,
                        action='store_true',
                        help="Whether to save the model.")

    parser.add_argument("--bert_model",
                        default="bert-base-uncased",
                        type=str,
                        required=True,
                        help="Bert pre-trained model selected in the list: bert-base-uncased, "
                             "bert-large-uncased, bert-base-cased, bert-base-multilingual, bert-base-chinese.")

    parser.add_argument("--embed_mode",
                        default=None,
                        type=str,
                        required=True,
                        help="The embedding type selected in the list: all, note, chunk, no.")

    parser.add_argument("--task_name",
                        default="FTLSTM_with_ClBERT_mortality",
                        type=str,
                        required=True,
                        help="The name of the task.")

    ## Other parameters
    parser.add_argument("--max_seq_length",
                        default=128,
                        type=int,
                        help="The maximum total input sequence length after WordPiece tokenization. \n"
                             "Sequences longer than this will be truncated, and sequences shorter \n"
                             "than this will be padded.")
    parser.add_argument("--max_chunk_num",
                        default=64,
                        type=int,
                        help="The maximum number of input chunks kept per patient.")
    parser.add_argument("--train_batch_size",
                        default=1,
                        type=int,
                        help="Number of patients whose gradients are accumulated before each optimizer step.")
    parser.add_argument("--eval_batch_size",
                        default=1,
                        type=int,
                        help="Total batch size for eval.")
    parser.add_argument("--learning_rate",
                        default=2e-5,
                        type=float,
                        help="The initial learning rate for Adam.")
    parser.add_argument("--warmup_proportion",
                        default=0.0,
                        type=float,
                        help="Proportion of training to perform linear learning rate warmup for. "
                             "E.g., 0.1 = 10%% of training.")
    parser.add_argument("--num_train_epochs",
                        default=3,
                        type=int,
                        help="Total number of training epochs to perform.")
    parser.add_argument('--seed',
                        type=int,
                        default=42,
                        help="random seed for initialization")
    parser.add_argument('--gradient_accumulation_steps',
                        type=int,
                        default=1,
                        help="Number of update steps to accumulate before performing a backward/update pass.")

    args = parser.parse_args()

    if os.path.exists(args.output_dir) and os.listdir(args.output_dir) and args.save_model:
        raise ValueError("Output directory ({}) already exists and is not empty.".format(args.output_dir))
    os.makedirs(args.output_dir, exist_ok=True)

    LOG_PATH = args.log_path
    MAX_LEN = args.max_seq_length

    config = DotMap()
    config.hidden_dropout_prob = 0.1
    config.layer_norm_eps = 1e-12
    config.initializer_range = 0.02
    config.max_note_position_embedding = 1000
    config.max_chunk_position_embedding = 1000
    config.embed_mode = args.embed_mode
    config.hidden_size = 768
    config.lstm_layers = 1

    config.task_name = args.task_name

    write_log(("New Job Start! \n"
               "Data directory: {}, Directory Code: {}, Save Model: {}\n"
               "Output_dir: {}, Task Name: {}, embed_mode: {}\n"
               "max_seq_length: {},  max_chunk_num: {}\n"
               "train_batch_size: {}, eval_batch_size: {}\n"
               "learning_rate: {}, warmup_proportion: {}\n"
               "num_train_epochs: {}, seed: {}, gradient_accumulation_steps: {}\n"
               "FTLSTM Model's lstm_layers: {}").format(args.data_dir,
                                                        args.data_dir.split('_')[-1],
                                                        args.save_model,
                                                        args.output_dir,
                                                        config.task_name,
                                                        config.embed_mode,
                                                        args.max_seq_length,
                                                        args.max_chunk_num,
                                                        args.train_batch_size,
                                                        args.eval_batch_size,
                                                        args.learning_rate,
                                                        args.warmup_proportion,
                                                        args.num_train_epochs,
                                                        args.seed,
                                                        args.gradient_accumulation_steps,
                                                        config.lstm_layers),
              LOG_PATH)

    content = "config setting: \n"
    for k, v in config.items():
        content += "{}: {} \n".format(k, v)
    write_log(content, LOG_PATH)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    n_gpu = torch.cuda.device_count()
    write_log("Number of GPU is {}".format(n_gpu), LOG_PATH)
    for i in range(n_gpu):
        write_log(("Device Name: {},"
                   "Device Capability: {}").format(torch.cuda.get_device_name(i),
                                                   torch.cuda.get_device_capability(i)), LOG_PATH)

    train_file_path = os.path.join(args.data_dir, args.train_data)
    val_file_path = os.path.join(args.data_dir, args.val_data)
    test_file_path = os.path.join(args.data_dir, args.test_data)
    train_df = pd.read_csv(train_file_path)
    val_df = pd.read_csv(val_file_path)
    test_df = pd.read_csv(test_file_path)

    random.seed(args.seed)
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    if n_gpu > 0:
        torch.cuda.manual_seed_all(args.seed)

    tokenizer = BertTokenizer.from_pretrained(args.bert_model, do_lower_case=True)

    write_log("Tokenize Start!", LOG_PATH)
    train_df = reorder_by_time(train_df)
    val_df = reorder_by_time(val_df)
    test_df = reorder_by_time(test_df)
    train_labels, train_inputs, train_masks, train_note_ids, train_times = Tokenize_with_note_id_hour(train_df, MAX_LEN,
                                                                                                      tokenizer)
    validation_labels, validation_inputs, validation_masks, validation_note_ids, validation_times = Tokenize_with_note_id_hour(
        val_df, MAX_LEN, tokenizer)
    test_labels, test_inputs, test_masks, test_note_ids, test_times = Tokenize_with_note_id_hour(test_df, MAX_LEN,
                                                                                                 tokenizer)
    write_log("Tokenize Finished!", LOG_PATH)
    train_inputs = torch.tensor(train_inputs)
    validation_inputs = torch.tensor(validation_inputs)
    test_inputs = torch.tensor(test_inputs)
    train_labels = torch.tensor(train_labels)
    validation_labels = torch.tensor(validation_labels)
    test_labels = torch.tensor(test_labels)
    train_masks = torch.tensor(train_masks)
    validation_masks = torch.tensor(validation_masks)
    test_masks = torch.tensor(test_masks)
    train_times = torch.tensor(train_times)
    validation_times = torch.tensor(validation_times)
    test_times = torch.tensor(test_times)
    write_log(("train dataset size is %d,\n"
               "validation dataset size is %d,\n"
               "test dataset size is %d") % (len(train_inputs), len(validation_inputs), len(test_inputs)), LOG_PATH)

    (train_labels, train_inputs,
     train_masks, train_ids,
     train_note_ids, train_chunk_ids, train_times) = concat_by_id_list_with_note_chunk_id_time(train_df, train_labels,
                                                                                               train_inputs,
                                                                                               train_masks,
                                                                                               train_note_ids,
                                                                                               train_times, MAX_LEN)
    (validation_labels, validation_inputs,
     validation_masks, validation_ids,
     validation_note_ids, validation_chunk_ids,
     validation_times) = concat_by_id_list_with_note_chunk_id_time(val_df, validation_labels,
                                                                   validation_inputs, validation_masks,
                                                                   validation_note_ids, validation_times,
                                                                   MAX_LEN)
    (test_labels, test_inputs,
     test_masks, test_ids,
     test_note_ids, test_chunk_ids, test_times) = concat_by_id_list_with_note_chunk_id_time(test_df, test_labels,
                                                                                            test_inputs, test_masks,
                                                                                            test_note_ids, test_times,
                                                                                            MAX_LEN)

    model = BertModel.from_pretrained(args.bert_model).to(device)
    lstm_layer = FTLSTMLayer(config=config, num_labels=1)
    lstm_layer.to(device)

    if n_gpu > 1:
        model = torch.nn.DataParallel(model)
        lstm_layer = torch.nn.DataParallel(lstm_layer)
    param_optimizer = list(model.named_parameters()) + list(lstm_layer.named_parameters())
    no_decay = ['bias', 'gamma', 'beta']
    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
         'weight_decay_rate': 0.01},
        {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
         'weight_decay_rate': 0.0}
    ]

    num_train_steps = int(
        len(train_labels) / args.gradient_accumulation_steps * args.num_train_epochs)

    optimizer = BertAdam(optimizer_grouped_parameters,
                         lr=args.learning_rate,
                         warmup=args.warmup_proportion,
                         t_total=num_train_steps)
    start = time.time()
    # Store our loss and accuracy for plotting
    train_loss_set = []

    # Number of training epochs (the BERT authors recommend between 2 and 4)
    epochs = args.num_train_epochs

    train_batch_generator = time_batch_generator(args.max_chunk_num, train_inputs, train_labels, train_masks,
                                                 train_note_ids, train_chunk_ids, train_times)
    validation_batch_generator = time_batch_generator(args.max_chunk_num, validation_inputs, validation_labels,
                                                      validation_masks, validation_note_ids, validation_chunk_ids,
                                                      validation_times)

    write_log("Training start!", LOG_PATH)
    # trange is a tqdm wrapper around the normal python range
    with torch.autograd.set_detect_anomaly(False):
        for epoch in trange(epochs, desc="Epoch"):

            # Training

            # Set our model to training mode (as opposed to evaluation mode)
            model.train()
            lstm_layer.train()

            # Tracking variables
            tr_loss = 0
            nb_tr_examples, nb_tr_steps = 0, 0

            # Train the data for one epoch
            tr_ids_num = len(train_ids)
            tr_batch_loss = []
            for step in range(tr_ids_num):
                b_input_ids, b_labels, b_input_mask, b_note_ids, b_chunk_ids, b_times = next(train_batch_generator)
                b_input_ids = b_input_ids.to(device)
                b_input_mask = b_input_mask.to(device)
                b_new_note_ids = convert_note_ids(b_note_ids).to(device)
                b_chunk_ids = b_chunk_ids.unsqueeze(0).to(device)
                b_labels = b_labels.to(device)
                b_labels.resize_((1))
                _, whole_output = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
                whole_input = whole_output.unsqueeze(0)
                b_new_note_ids = b_new_note_ids.unsqueeze(0)
                b_times = b_times.unsqueeze(0).to(device)
                loss, pred = lstm_layer(whole_input, b_times, b_new_note_ids, b_chunk_ids, b_labels)

                if n_gpu > 1:
                    loss = loss.mean()  # mean() to average on multi-gpu.
                tr_batch_loss.append(loss.item())

                # Backward pass
                loss.backward()
                # Update parameters and take a step using the computed gradient
                if (step + 1) % args.train_batch_size == 0:
                    optimizer.step()
                    optimizer.zero_grad()
                    train_loss_set.append(np.mean(tr_batch_loss))
                    tr_batch_loss = []
                # Update tracking variables
                tr_loss += loss.item()
                nb_tr_examples += b_input_ids.size(0)
                nb_tr_steps += 1

            write_log("Train loss: {}".format(tr_loss / nb_tr_steps), LOG_PATH)
            # Validation

            # Put model in evaluation mode to evaluate loss on the validation set
            model.eval()
            lstm_layer.eval()

            # Tracking variables
            eval_loss, eval_accuracy = 0, 0
            nb_eval_steps, nb_eval_examples = 0, 0
            # Evaluate data for one epoch
            ev_ids_num = len(validation_ids)
            for step in range(ev_ids_num):
                with torch.no_grad():
                    b_input_ids, b_labels, b_input_mask, b_note_ids, b_chunk_ids, b_times = next(
                        validation_batch_generator)
                    b_input_ids = b_input_ids.to(device)
                    b_input_mask = b_input_mask.to(device)
                    b_new_note_ids = convert_note_ids(b_note_ids).to(device)
                    b_chunk_ids = b_chunk_ids.unsqueeze(0).to(device)
                    b_labels.resize_((1))
                    _, whole_output = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
                    whole_input = whole_output.unsqueeze(0)
                    b_new_note_ids = b_new_note_ids.unsqueeze(0)
                    b_times = b_times.unsqueeze(0).to(device)
                    pred = lstm_layer(whole_input, b_times, b_new_note_ids, b_chunk_ids).detach().cpu().numpy()
                label_ids = b_labels.numpy()
                tmp_eval_accuracy = flat_accuracy(pred, label_ids)
                eval_accuracy += tmp_eval_accuracy
                nb_eval_steps += 1

            write_log("Validation Accuracy: {}".format(eval_accuracy / nb_eval_steps), LOG_PATH)
            output_checkpoints_path = os.path.join(args.output_dir,
                                                   "bert_fine_tuned_with_note_checkpoint_%d.pt" % epoch)

            if args.save_model:
                if n_gpu > 1:
                    torch.save({
                        'epoch': epoch,
                        'model_state_dict': model.module.state_dict(),
                        'lstm_layer_state_dict': lstm_layer.module.state_dict(),
                        'optimizer_state_dict': optimizer.state_dict(),
                        'loss': loss,
                    },
                        output_checkpoints_path)
                else:
                    torch.save({
                        'epoch': epoch,
                        'model_state_dict': model.state_dict(),
                        'lstm_layer_state_dict': lstm_layer.state_dict(),
                        'optimizer_state_dict': optimizer.state_dict(),
                        'loss': loss,
                    },
                        output_checkpoints_path)
    end = time.time()
    write_log("total training time is: {}s".format(end - start), LOG_PATH)

    fig1 = plt.figure(figsize=(15, 8))
    plt.title("Training loss")
    plt.xlabel("Patient Batch")
    plt.ylabel("Loss")
    plt.plot(train_loss_set)
    if args.save_model:
        output_fig_path = os.path.join(args.output_dir, "bert_fine_tuned_with_note_training_loss.png")
        plt.savefig(output_fig_path, dpi=fig1.dpi)
        output_model_state_dict_path = os.path.join(args.output_dir, "bert_fine_tuned_with_note_state_dict.pt")
        if n_gpu > 1:
            torch.save({
                'model_state_dict': model.module.state_dict(),
                'lstm_layer_state_dict': lstm_layer.module.state_dict(),
            },
                output_model_state_dict_path)
        else:
            torch.save({
                'model_state_dict': model.state_dict(),
                'lstm_layer_state_dict': lstm_layer.state_dict(),
            },
                output_model_state_dict_path)
        write_log("Model saved!", LOG_PATH)
    else:
        output_fig_path = os.path.join(args.output_dir,
                                       "bert_fine_tuned_with_note_training_loss_{}_{}.png".format(args.seed,
                                                                                                  args.data_dir.split(
                                                                                                      '_')[-1]))
        plt.savefig(output_fig_path, dpi=fig1.dpi)
        write_log("Model not saved as required", LOG_PATH)

    # Prediction on test set

    # Put model in evaluation mode
    model.eval()
    lstm_layer.eval()

    # Tracking variables
    predictions, true_labels = [], []

    # Predict
    te_ids_num = len(test_ids)
    for step in range(te_ids_num):
        b_input_ids = test_inputs[step][-args.max_chunk_num:, :].to(device)
        b_input_mask = test_masks[step][-args.max_chunk_num:, :].to(device)
        b_note_ids = test_note_ids[step][-args.max_chunk_num:]
        b_new_note_ids = convert_note_ids(b_note_ids).to(device)
        b_chunk_ids = test_chunk_ids[step][-args.max_chunk_num:].unsqueeze(0).to(device)
        b_labels = test_labels[step]
        b_labels.resize_((1))
        with torch.no_grad():
            _, whole_output = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask)
            whole_input = whole_output.unsqueeze(0)
            b_new_note_ids = b_new_note_ids.unsqueeze(0)
            b_times = test_times[step][-args.max_chunk_num:].unsqueeze(0).to(device)
            pred = lstm_layer(whole_input, b_times, b_new_note_ids, b_chunk_ids).detach().cpu().numpy()
        label_ids = b_labels.numpy()[0]
        predictions.append(pred)
        true_labels.append(label_ids)

    # Flatten the per-patient predictions and true labels for evaluation on the whole test set
    flat_logits = [item for sublist in predictions for item in sublist]
    flat_predictions = np.asarray([1 if i else 0 for i in (np.array(flat_logits) >= 0.5)])
    flat_true_labels = np.asarray(true_labels)

    output_df = pd.DataFrame({'pred_prob': flat_logits,
                              'pred_label': flat_predictions,
                              'label': flat_true_labels,
                              'Adm_ID': test_ids})

    if args.save_model:
        output_df.to_csv(os.path.join(args.output_dir, 'test_predictions.csv'), index=False)
    else:
        output_df.to_csv(os.path.join(args.output_dir,
                                      'test_predictions_{}_{}.csv'.format(args.seed,
                                                                          args.data_dir.split('_')[-1])), index=False)

    write_performance(flat_true_labels, flat_predictions, flat_logits, config, args)
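
In both training loops above, `loss.backward()` runs for every patient, but `optimizer.step()` and `optimizer.zero_grad()` only run when `(step + 1) % args.train_batch_size == 0`. In effect, `--train_batch_size` acts as a gradient-accumulation count. The sketch below only illustrates that schedule; the helper names `optimizer_step_indices` and `leftover_patients` are hypothetical and are not part of the original code.

```python
def optimizer_step_indices(num_patients, accumulate_every):
    """Return the 0-based loop steps after which optimizer.step() would run."""
    if accumulate_every < 1:
        raise ValueError("accumulate_every must be at least 1")
    return [step for step in range(num_patients)
            if (step + 1) % accumulate_every == 0]


def leftover_patients(num_patients, accumulate_every):
    """Count the patients at the end of an epoch that do not trigger an update."""
    return num_patients % accumulate_every


# Example: 10 patients in an epoch, update every 4 patients.
print(optimizer_step_indices(10, 4))  # [3, 7]
print(leftover_patients(10, 4))       # 2
```

When the number of patients is not a multiple of `train_batch_size`, the gradients of the leftover patients stay in `.grad` at the end of the epoch, because `zero_grad()` is only called after a step. With the default `train_batch_size=1` this situation does not arise.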

# Results
In this section, you should have finished training your model or loading your trained model. Share the results of your experiments, with the necessary metrics and figures.

Please test and report results for all experiments that you run with:

*   specific numbers (accuracy, AUC, RMSE, etc)
*   figures (loss shrinkage, outputs from GAN, annotation or label of sample pictures, etc)
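
As a minimal sketch of the kind of numbers asked for above, the following computes accuracy and AUROC from predicted probabilities and binary labels, such as the `pred_prob` and `label` columns that the code above writes to `test_predictions.csv`. The function names and the toy values are illustrative assumptions, not results from the model. The AUROC here is the plain pairwise (Mann-Whitney) definition; a library routine such as `sklearn.metrics.roc_auc_score` could be used instead.

```python
def accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def auroc(labels, probs):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not model output).
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
print("Accuracy:", accuracy(toy_labels, toy_probs))  # 0.75
print("AUROC:", auroc(toy_labels, toy_probs))        # 0.75
```

The 0.5 threshold matches the one used to build `flat_predictions` in the test loop above.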


In [None]:
# metrics to evaluate my model

# plot figures to better show the results

# it is better to save the numbers and figures for your presentation.

## Model comparison

In [None]:
# compare your model with others
# you don't need to re-run all the other experiments; instead, you can refer directly to the metrics/numbers in the paper
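
One possible way to lay out the comparison is a small Markdown table with one row per model. The sketch below is only a formatting helper: `format_comparison` is a hypothetical name, the metric names are placeholders, and every value is left as `None` so that it must be filled in from the paper and from your own runs.

```python
def format_comparison(rows, metrics):
    """Render rows of {'model': str, metric: float or None} as a Markdown table."""
    header = "| Model | " + " | ".join(metrics) + " |"
    divider = "|" + " --- |" * (len(metrics) + 1)
    lines = [header, divider]
    for row in rows:
        # Missing or None values are shown as "n/a" rather than guessed.
        cells = ["n/a" if row.get(m) is None else "{:.3f}".format(row[m])
                 for m in metrics]
        lines.append("| " + row["model"] + " | " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Placeholders: replace None with numbers reported in the paper and measured here.
rows = [
    {"model": "FTL-Trans (as reported in the paper)", "AUROC": None, "Accuracy": None},
    {"model": "FTL-Trans (this notebook)", "AUROC": None, "Accuracy": None},
]
print(format_comparison(rows, ["AUROC", "Accuracy"]))
```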

# Discussion

In this section, you should discuss your work and outline your future plans. The discussion should address the following questions:
  * Assess whether or not the paper is reproducible.
  * If your results are negative, explain why the paper is not reproducible.
  * Describe "what was easy" and "what was difficult" during the reproduction.
  * Make suggestions to the authors or other reproducers on how to improve reproducibility.
  * Describe what you will do in the next phase.



In [None]:
# no code is required for this section
'''
if you want to use an image from outside this notebook for explanation,
you can read and plot it here, as in the Scope of Reproducibility section
'''

# References

[1] Zhang, D., Thadajarassiri, J., Sen, C., & Rundensteiner, E. (2020). Time-Aware Transformer-based Network for Clinical Notes Series Prediction. *Proceedings of Machine Learning Research*, 126, 1-22.

[2] Huang, K., Altosaar, J., & Ranganath, R. (2019). ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. *arXiv preprint* arXiv:1904.05342.

[3] Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., ... & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. *Scientific Data*, 3(1), 1-9.



# Feel free to add new sections