meghna-diwan/datamining_fp

 
 


Predicting Patient Survival in the Intensive Care Unit (ICU)

This is the repository for the final project of the MSCA Data Mining course, where we use patient information to predict their probability of death in the first 24 hours at the ICU.

Getting Started

This project requires Python 3.7 and the following Python libraries installed:

  • NumPy 1.16.5
  • Pandas 1.3.1
  • matplotlib 3.1.1
  • scikit-learn 0.22
  • seaborn
  • lightgbm
  • imbalanced-learn (imported as imblearn)
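As a quick way to confirm an environment matches the list above, the following is a minimal sketch that compares installed package versions against the stated pins. The `PINS` mapping and the `check_pins` helper are hypothetical names introduced here for illustration. The sketch assumes that the packages listed without a version only need to be present, and that `imblearn` is installed under its distribution name `imbalanced-learn`.

```python
# Sketch only: PINS and check_pins are illustrative names, not part of this repository.
try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:
    # Python 3.7 needs the importlib_metadata backport package instead.
    from importlib_metadata import version, PackageNotFoundError

# Versions copied from the list above; None means "no version stated".
PINS = {
    "numpy": "1.16.5",
    "pandas": "1.3.1",
    "matplotlib": "3.1.1",
    "scikit-learn": "0.22",
    "seaborn": None,
    "lightgbm": None,
    "imbalanced-learn": None,  # distribution that provides the imblearn module
}


def check_pins(pins, get_version=version):
    """Return a list of readable problems; an empty list means every pin matched."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        # Accept an exact match or a more specific release, e.g. 0.22.1 for 0.22.
        if wanted is not None and not (found == wanted or found.startswith(wanted + ".")):
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_pins(PINS) or ["all listed packages match"]:
        print(line)
```

Treating "0.22" as a prefix is a design choice of this sketch, since the list does not say whether a patch release such as 0.22.1 is acceptable.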

Running the Code

All our finalized notebooks are in the "notebooks/Finalized" folder. Please run the Jupyter notebooks in the order of the step numbers in their titles.

  1. Run Step1_WIDS+EDA.ipynb - This is our EDA process.

  2. Run Step2_WIDS_Feature_Engineer_0309.ipynb - We created additional features.

  3. Run Step3_lightgbm_baselinemodels.ipynb - We built a baseline model using only the raw, unimputed dataset.

  4. Run Step4_Model_Iterations.ipynb - We ran three models plus resampling methods on our imputed dataset.

  5. Run Step5_LGB_U_Hyperpara_Tuning.ipynb - We tuned the hyperparameters of our finalized model. NOTE: Please DO NOT RUN the notebook from the "Submit to Kaggle" portion onward, as that part is only for our own validation. Thanks!

  6. Run Step6_Probability Calibration.ipynb - We tried calibrating our probability threshold.
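The run order above can be derived from the file names themselves. The sketch below parses the number after "Step" and sorts on it numerically. The helpers `step_number` and `run_order` are hypothetical names introduced for illustration, and the sketch only computes the order; it does not execute anything, since Step 5 should be stopped by hand before its "Submit to Kaggle" portion.

```python
import re

# File names as they appear in the steps above, deliberately shuffled.
NOTEBOOKS = [
    "Step6_Probability Calibration.ipynb",
    "Step3_lightgbm_baselinemodels.ipynb",
    "Step1_WIDS+EDA.ipynb",
    "Step5_LGB_U_Hyperpara_Tuning.ipynb",
    "Step2_WIDS_Feature_Engineer_0309.ipynb",
    "Step4_Model_Iterations.ipynb",
]


def step_number(filename):
    """Extract the integer that follows 'Step' at the start of a notebook name."""
    match = re.match(r"Step(\d+)_", filename)
    if match is None:
        raise ValueError(f"no step number in {filename!r}")
    return int(match.group(1))


def run_order(filenames):
    """Sort notebooks by numeric step so that Step10 would follow Step9, not Step1."""
    return sorted(filenames, key=step_number)


if __name__ == "__main__":
    for name in run_order(NOTEBOOKS):
        print(name)
```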

Data

The original data can be found in the "data" folder. This includes the following files:

  1. trainingv2.csv : Our training data with 91,713 observations and 185 features. The target variable "hospital_death" is a binary variable where 1 indicates death and 0 indicates survival.

  2. icu-apache-codes.csv : Map to match APACHE III diagnosis codes with their parent chronic condition.

  3. unlabeled.csv : The unlabeled holdout data used to predict survival.
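Because the target is binary, a first sanity check on the training file is to count how often each value of "hospital_death" occurs. The sketch below does this with the standard library only. The helpers `target_counts` and `death_rate` are hypothetical names, and the in-memory sample, including its `example_feature` column, is made up for illustration and is not taken from the real data.

```python
import csv
import io
from collections import Counter


def target_counts(rows, target="hospital_death"):
    """Count occurrences of each target value in an iterable of dict rows."""
    counts = Counter()
    for row in rows:
        counts[row[target]] += 1
    return counts


def death_rate(counts):
    """Share of rows labelled 1 (death) among rows labelled 0 or 1."""
    total = counts["0"] + counts["1"]
    return counts["1"] / total if total else float("nan")


# Made-up rows standing in for the real file.
sample = io.StringIO("example_feature,hospital_death\n0.3,0\n1.7,1\n0.9,0\n2.2,0\n")
counts = target_counts(csv.DictReader(sample))
print(dict(counts), death_rate(counts))
```

To apply this to the real data, the `io.StringIO` sample could be swapped for `open("data/trainingv2.csv", newline="")`, assuming the file sits at that path relative to the repository root.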

About

Data Mining 2020 Final Project


Languages

  • Jupyter Notebook 100.0%