Code for the EACL 2021 Paper: Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration

bvanaken/clinical-outcome-prediction


Clinical Outcome Prediction from Admission Notes

This repository contains source code for the task creation and experiments from our paper Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration, EACL 2021.

Use the CORe Model

To apply the CORe model, which is pre-trained on clinical outcomes, to downstream tasks, load it from the Hugging Face model hub.

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
model = AutoModel.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
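
As an illustration of downstream use, the sketch below encodes a batch of admission notes and returns one vector per note. It is a minimal example, not the authors' pipeline: the helper name `embed_notes`, the 512-token truncation limit, and the choice of the first-token ([CLS]) hidden state as the note representation are assumptions made here.

```python
MODEL_NAME = "bvanaken/CORe-clinical-outcome-biobert-v1"


def embed_notes(notes, model_name=MODEL_NAME, max_length=512):
    """Return one embedding per note (hypothetical helper, not part of the repo)."""
    # Imported lazily so the module can be loaded without torch/transformers.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # Pad to the longest note in the batch and truncate overly long notes.
    inputs = tokenizer(
        notes,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: use the hidden state of the first token as the note vector.
    return outputs.last_hidden_state[:, 0, :]
```

Calling `embed_notes(["CHIEF COMPLAINT: chest pain ..."])` downloads the model on first use and returns a tensor with one row per input note.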

Create Admission Notes for Outcome Prediction from MIMIC-III

Install Requirements:

pip install -r tasks/requirements.txt

Create the train/val/test splits, e.g. for Mortality Prediction:

# All three arguments are required.
python tasks/mp/mp.py \
 --mimic_dir {MIMIC_DIR} \
 --save_dir {DIR_TO_SAVE_DATA} \
 --admission_only True

mimic_dir: Directory that contains unpacked NOTEEVENTS.csv, ADMISSIONS.csv, DIAGNOSES_ICD.csv and PROCEDURES_ICD.csv

save_dir: Any directory to save the data

admission_only: True=Create simulated Admission Notes, False=Keep complete Discharge Summaries
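
Before running a task script, it can help to confirm that mimic_dir holds the four files listed above. The following is a minimal sketch using only the standard library; the helper name `missing_mimic_files` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names taken from the mimic_dir description above.
REQUIRED_MIMIC_FILES = (
    "NOTEEVENTS.csv",
    "ADMISSIONS.csv",
    "DIAGNOSES_ICD.csv",
    "PROCEDURES_ICD.csv",
)


def missing_mimic_files(mimic_dir):
    """Return the required MIMIC-III files that are not present in mimic_dir."""
    root = Path(mimic_dir)
    return [name for name in REQUIRED_MIMIC_FILES if not (root / name).is_file()]
```

An empty list means all four files were found; otherwise the list names the files that still need to be unpacked into the directory.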

Apply these scripts accordingly for the other outcome tasks:

Length-of-Stay (los/los.py),

Diagnoses (dia/dia.py),

Diagnoses + ICD+ (dia/dia_plus.py),

Procedures (pro/pro.py) and

Procedures + ICD+ (pro/pro_plus.py)
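
To create the data for several tasks in one go, the commands can be assembled programmatically. This sketch assumes that every script lives under tasks/ (as in the tasks/mp/mp.py example) and accepts the same three arguments as mp.py; the dictionary keys and the helper name `build_task_command` are made up for illustration.

```python
# Script paths as listed above; the "tasks/" prefix is an assumption
# based on the tasks/mp/mp.py example.
TASK_SCRIPTS = {
    "mortality": "tasks/mp/mp.py",
    "length_of_stay": "tasks/los/los.py",
    "diagnoses": "tasks/dia/dia.py",
    "diagnoses_plus": "tasks/dia/dia_plus.py",
    "procedures": "tasks/pro/pro.py",
    "procedures_plus": "tasks/pro/pro_plus.py",
}


def build_task_command(task, mimic_dir, save_dir, admission_only=True):
    """Build the argument list for one task-creation script (does not run it)."""
    return [
        "python", TASK_SCRIPTS[task],
        "--mimic_dir", str(mimic_dir),
        "--save_dir", str(save_dir),
        "--admission_only", str(admission_only),
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.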

Train Outcome Prediction Tasks

1 - Build using Docker: Dockerfile

2 - Create Config File. See Example for Mortality Prediction: MP Example Config

3 - Run Training with Arguments

# All three arguments are required.
python doc_classification.py \
 --task_config {PATH_TO_TASK_CONFIG.yaml} \
 --model_name_or_path {PATH_TO_MODEL_OR_TRANSFORMERS_MODEL_HUB_NAME} \
 --cache_dir {CACHE_DIR}

See doc_classification.py for optional parameters.

4 - Run Training with Hyperparameter Optimization

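# Takes the same arguments as doc_classification.py, plus --hpo_samples and --hpo_gpus (both required).
python hpo_doc_classification.py \
 --task_config {PATH_TO_TASK_CONFIG.yaml} \
 --model_name_or_path {PATH_TO_MODEL_OR_TRANSFORMERS_MODEL_HUB_NAME} \
 --cache_dir {CACHE_DIR} \
 --hpo_samples {NO_OF_SAMPLES} \
 --hpo_gpus {NO_OF_GPUS}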
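
When training runs are launched from another script, the two command lines can be built by one small helper. The sketch below only assembles the argument list from the required flags named in this section; the function name `build_training_command` is hypothetical, and optional parameters of doc_classification.py are not covered.

```python
def build_training_command(task_config, model_name_or_path, cache_dir,
                           hpo_samples=None, hpo_gpus=None):
    """Build the argument list for a plain or HPO training run (does not run it)."""
    use_hpo = hpo_samples is not None or hpo_gpus is not None
    # Both HPO flags are marked as required, so they must be given together.
    if use_hpo and (hpo_samples is None or hpo_gpus is None):
        raise ValueError("hpo_samples and hpo_gpus must be given together")

    script = "hpo_doc_classification.py" if use_hpo else "doc_classification.py"
    cmd = [
        "python", script,
        "--task_config", str(task_config),
        "--model_name_or_path", str(model_name_or_path),
        "--cache_dir", str(cache_dir),
    ]
    if use_hpo:
        cmd += ["--hpo_samples", str(hpo_samples), "--hpo_gpus", str(hpo_gpus)]
    return cmd
```

Leaving both HPO arguments unset yields the plain training command; setting both switches to hpo_doc_classification.py.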

Cite

@inproceedings{vanAken2021,
  author    = {Betty van Aken and
               Jens-Michalis Papaioannou and
               Manuel Mayrdorfer and
               Klemens Budde and
               Felix A. Gers and
               Alexander Löser},
  title     = {Clinical Outcome Prediction from Admission Notes using Self-Supervised
               Knowledge Integration},
  booktitle = {Proceedings of the 16th Conference of the European Chapter of the
               Association for Computational Linguistics: Main Volume, {EACL} 2021,
               Online, April 19 - 23, 2021},
  pages     = {881--893},
  publisher = {Association for Computational Linguistics},
  year      = {2021},
  url       = {https://www.aclweb.org/anthology/2021.eacl-main.75/}
}
