IDANN Triage uses interpretable deep neural networks with attention to predict critical patient outcomes and resource utilization, helping emergency departments prioritize patients.
This repository contains refactored code (first round) from the original repository used during the MIDS w210 capstone course:
MIDS-Capstone-EHR-ED-Care (UCB capstone instructors have access to that repository). The code was refactored for easier reading and replication while keeping the model's architecture and performance levels (a separate project is being created to evolve the initial model).
The tables below list the notebooks, which can be executed in the listed order to replicate the results.

N. | Type | Task and its Notebook | Files |
---|---|---|---|
1a | Pre-Processing | Transforms CDC fixed-format files to CSV format. Notebook: transform_fixed_to_csv.ipynb | Input: data/raw/ED[year], data/external/format[year].txt; Output: data/interim/ED[year].csv |
1b | Pre-Processing | Consolidates files from different years into one and applies exclusion criteria. Notebook: consolidating_files.ipynb | Input: data/interim/ED[year].csv; Output: data/processed/ED_TOTAL_2009_2009.csv, data/processed/ED_TOTAL_2009_2015.csv |
2 | EDA | Exploratory data analysis of CDC data from 2009 to 2015. Notebook: CDC_eda_ESI_CO.ipynb | Input: data/processed/ED_TOTAL_2009_2015.csv |
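
The fixed-format to CSV step (1a) can be illustrated with a minimal, standard-library sketch. It is not the code in transform_fixed_to_csv.ipynb: the column names and character positions in `LAYOUT` are hypothetical, and in practice they would come from the CDC format files.

```python
import csv
import io

# Hypothetical layout: (column name, start, end) with 0-based, end-exclusive
# character positions. The real positions come from the CDC format files.
LAYOUT = [("month", 0, 2), ("age", 2, 5), ("sex", 5, 6)]


def fixed_to_rows(lines, layout):
    """Slice each fixed-width line into a dict keyed by column name."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        yield {name: line[start:end].strip() for name, start, end in layout}


def fixed_to_csv(lines, layout, out):
    """Write the sliced rows to `out` as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=[name for name, _, _ in layout])
    writer.writeheader()
    for row in fixed_to_rows(lines, layout):
        writer.writerow(row)


# Two made-up fixed-width records, written to an in-memory buffer.
sample = ["01 451\n", "12 072\n"]
buffer = io.StringIO()
fixed_to_csv(sample, LAYOUT, buffer)
csv_text = buffer.getvalue()
```

Values are kept as strings so that coded fields with leading zeros survive the conversion unchanged.
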

N. | Type | Task and its Notebook | Files |
---|---|---|---|
3 | LR_BAS: Baseline Model | Logistic Regression replication model that predicts critical outcomes. Notebook: CDC_LR_2009_baseline.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv |
4a | LR_RMH: Improving LR Model | Logistic Regression with additional features, such as Reason for Visit (RFV) codes encoded as vectors to capture their hierarchical semantics. Notebook: CDC_LR_2009_more_features.ipyn | Input: data/processed/ED_TOTAL_2009_2009.csv |
4b | RF: Random Forest | Random Forest (RF) model that provides a list of feature importances, which is relevant for model interpretation. However, its performance was lower than that of the LR model. Notebook: CDC_RF_2009.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv |
4c | FNN: Feed-Forward Neural Network | FNN with the same features as RF and LR_RMH. Notebook: CDC_NN_2009_modeling.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv |
4d | FNN_TE: FNN and Embeddings | Feed-Forward Neural Network (FNN) with embeddings for the Reason for Visit (RFV) codes. Notebook: CDC_NN_Embedding_2009_modeling.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv |
5 | FNN_TE_ATT: FNN Attention for Critical Outcomes | FNN_TE with an attention layer that predicts critical outcomes. Notebook: CDC_ATT_NN_2009_Text_Emb.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv |
6 | FNN_TEA_RS: FNN Attention for Resource Estimation | FNN_TE with an attention layer and a multiclass outcome for resource utilization. Notebook: CDC_ATT_RSS_NN_2009_Text_Emb.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv |
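
To give a rough idea of what an attention layer over embedded inputs computes, here is a small pure-Python sketch of dot-product attention pooling. It is an illustration only and does not reproduce the architecture of the notebooks above; the embeddings and the `query` vector are made-up values standing in for learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, query):
    """Score each embedding against `query`, then return (weights, context).

    weights: one softmax weight per input embedding (they sum to 1).
    context: the weighted sum of the embeddings.
    """
    scores = [sum(e * q for e, q in zip(emb, query)) for emb in embeddings]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * emb[d] for w, emb in zip(weights, embeddings)) for d in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional embeddings for three input features.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]  # stands in for a learned attention vector
weights, context = attention_pool(embeddings, query)
```

Because the weights sum to 1, they can be read as the relative contribution of each input to the pooled representation, which is what makes this kind of layer useful for interpretation.
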

N. | Type | Task and its Notebook | Files |
---|---|---|---|
7 | Prediction | Prediction with 2009 data to determine ESI thresholds. Notebook: Batch_2009_Prediction_for_Thresholds_Optimization.ipynb | Input: data/processed/ED_TOTAL_2009_2009.csv; Output: data/result/Predictions_2009_DataForThresholds.json |
8 | Thresholds | Determines thresholds for ESI values. Notebook: ESI_threshold_optimization.ipynb | Input: data/result/Predictions_2009_DataForThresholds.json; Output: data/result/thresholds.json |
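
One plausible way to apply thresholds to model output is to bin a predicted probability into levels, sketched below. The schema of data/result/thresholds.json is not documented here, so the `cut_points` key, its values, and the `probability_to_esi` helper are all assumptions made for illustration.

```python
import bisect
import json

# Hypothetical content of a thresholds file: ascending probability cut points.
# The real schema of data/result/thresholds.json is not shown in this README.
thresholds_json = '{"cut_points": [0.05, 0.20, 0.50, 0.80]}'
cut_points = json.loads(thresholds_json)["cut_points"]


def probability_to_esi(probability, cut_points):
    """Map a predicted critical-outcome probability to an ESI-like level.

    Level 1 is the most urgent, so higher probabilities map to lower levels.
    """
    # bisect_right counts how many cut points are <= probability.
    band = bisect.bisect_right(cut_points, probability)
    return len(cut_points) + 1 - band


levels = [probability_to_esi(p, cut_points) for p in (0.01, 0.10, 0.30, 0.60, 0.95)]
```

With four cut points the helper yields five levels, matching the five-level ESI scale.
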

N. | Type | Task and its Notebook | Files |
---|---|---|---|
8 | Prediction | Prediction with 2010 data for evaluation. Notebook: Batch_Prediction_2010.ipynb | Input: data/processed/ED_TOTAL_2010_2010.csv; Output: data/result/Predictions_2010_DataForThresholds.json |
9 | Results Viz | Several visualizations comparing the new ESI with the original ESI and showing the performance improvement. Notebooks: ESI_vs_pred_ESI_Viz.ipynb, test_evaluation/ESI_sankeys_viz.ipynb | Input: data/result/thresholds.json, data/result/Predictions_2010_DataForThresholds.json; Output: viz images and report |
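
A comparison between original and predicted ESI levels usually starts from flow counts between the two, which is also the data a Sankey diagram is drawn from. The sketch below uses made-up labels and is not taken from the visualization notebooks.

```python
from collections import Counter

# Hypothetical paired labels: original ESI and predicted ESI for each visit.
original_esi = [3, 3, 2, 4, 3, 5]
predicted_esi = [3, 2, 2, 4, 4, 5]

# Count how many visits move from each original level to each predicted level.
flows = Counter(zip(original_esi, predicted_esi))

# Share of visits whose level is unchanged.
unchanged = sum(n for (orig, pred), n in flows.items() if orig == pred)
agreement = unchanged / len(original_esi)
```

Each `(original, predicted)` key in `flows` corresponds to one link in a Sankey diagram, with the count as the link width.
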

N. | Type | Task and its Notebook | Files |
---|---|---|---|
10 | Prediction | Example of predicting for one record and pulling the relative attention weights from the model that made the prediction. Notebook: CO_ATT_prediction_interpretability_sample.ipynb | Input: data/processed/ED_TOTAL_2010_2010.csv Output: |
11 | Prediction | Example of how to predict for one record (includes a call to the Python method used in the cloud API service). Notebook: ATTNN_predict_for_one_record.ipynb | Input: data/processed/ED_TOTAL_2010_2010.csv Output: |

N. | Type | Task and its Notebook | Files |
---|---|---|---|
12 | API Service | Example call to the REST API. Notebook: API_Service_Call_example.ipynb | Input: Output: |
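
As a rough sketch of what a call to a JSON REST endpoint looks like with the Python standard library, the example below builds a POST request without sending it. The URL, the payload fields, and the `build_request` helper are hypothetical; the real endpoint and request schema are defined by the API service and shown in API_Service_Call_example.ipynb.

```python
import json
import urllib.request

# Hypothetical endpoint and payload; the real URL and field names are not
# given in this README.
API_URL = "https://example.invalid/predict"
record = {"age": 45, "rfv1": "10501"}


def build_request(url, payload):
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(API_URL, record)

# To send it against a real service:
# with urllib.request.urlopen(request, timeout=10) as response:
#     prediction = json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the payload easy to inspect before any network traffic occurs.
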
Project structure based on: https://drivendata.github.io/cookiecutter-data-science/