# Prediction of Hospital Readmissions

This notebook attempts to reproduce the claims in Zebin and Chaussalet's paper, 'Design and implementation of a deep recurrent model for prediction of readmission in urgent care using electronic health records'.



## Claims

When predicting ICU readmissions:

1. LSTM+CNN produced higher accuracy than logistic regression, random forest, and SVM.
2. LSTM+CNN produced higher precision than logistic regression, random forest, and SVM.
3. LSTM+CNN produced higher recall than logistic regression and SVM.
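
The three claims compare accuracy, precision, and recall. As a reminder of what each metric measures, here is a minimal sketch on made-up labels. The names `binary_metrics`, `y_true`, and `y_pred` are illustrative only and come from neither the paper nor this notebook's data.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for 0/1 labels, with 1 = readmitted."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical labels for eight ICU stays.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
accuracy, precision, recall = binary_metrics(y_true, y_pred)
```

Because readmissions are usually the minority class, accuracy alone can look high for a model that rarely predicts a readmission, which is presumably why precision and recall are reported alongside it.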

In [1]:
import numpy as np
import pandas as pd

In [2]:
patients = pd.read_csv('./mimic-iii/PATIENTS.csv')
patients.head(1)

Unnamed: 0,ROW_ID,SUBJECT_ID,GENDER,DOB,DOD,DOD_HOSP,DOD_SSN,EXPIRE_FLAG
0,234,249,F,2075-03-13 00:00:00,,,,0


In [3]:
admissions = pd.read_csv('./mimic-iii/ADMISSIONS.csv')
admissions.head(1)

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ADMITTIME,DISCHTIME,DEATHTIME,ADMISSION_TYPE,ADMISSION_LOCATION,DISCHARGE_LOCATION,INSURANCE,LANGUAGE,RELIGION,MARITAL_STATUS,ETHNICITY,EDREGTIME,EDOUTTIME,DIAGNOSIS,HOSPITAL_EXPIRE_FLAG,HAS_CHARTEVENTS_DATA
0,21,22,165315,2196-04-09 12:26:00,2196-04-10 15:54:00,,EMERGENCY,EMERGENCY ROOM ADMIT,DISC-TRAN CANCER/CHLDRN H,Private,,UNOBTAINABLE,MARRIED,WHITE,2196-04-09 10:06:00,2196-04-09 13:24:00,BENZODIAZEPINE OVERDOSE,0,1


In [4]:
transfers = pd.read_csv('./mimic-iii/TRANSFERS.csv')
transfers.head(1)

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,DBSOURCE,EVENTTYPE,PREV_CAREUNIT,CURR_CAREUNIT,PREV_WARDID,CURR_WARDID,INTIME,OUTTIME,LOS
0,657,111,192123,254245.0,carevue,transfer,CCU,MICU,7.0,23.0,2142-04-29 15:27:11,2142-05-04 20:38:33,125.19


In [5]:
icustays = pd.read_csv('./mimic-iii/ICUSTAYS.csv')
icustays.head(1)


Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,DBSOURCE,FIRST_CAREUNIT,LAST_CAREUNIT,FIRST_WARDID,LAST_WARDID,INTIME,OUTTIME,LOS
0,365,268,110404,280836,carevue,MICU,MICU,52,52,2198-02-14 23:27:38,2198-02-18 05:26:11,3.249


In [6]:
# print(len(icustays.groupby(['HADM_ID'])))
# test = (icustays.groupby(['HADM_ID']).size() > 1).reset_index()
# test[test[0]==True]

## Example

Patient with `SUBJECT_ID = 250`

In [7]:
# patients[patients['SUBJECT_ID'] == 250]

In [8]:
# admissions[admissions['SUBJECT_ID'] == 250]

In [9]:
# transfers[transfers['SUBJECT_ID'] == 250]

## Create Dataset

SUBJECT_ID examples: 291, 283, 250

In [10]:
patients['DOB'] = pd.to_datetime(patients['DOB'], errors='coerce')
patients['DOD'] = pd.to_datetime(patients['DOD'], errors='coerce')
patients['DOD_HOSP'] = pd.to_datetime(patients['DOD_HOSP'], errors='coerce')
patients['DOD_SSN'] = pd.to_datetime(patients['DOD_SSN'], errors='coerce')
patients = patients.drop('ROW_ID', axis=1)

In [11]:
admissions['ADMITTIME'] = pd.to_datetime(admissions['ADMITTIME'], errors='coerce')
admissions['DISCHTIME'] = pd.to_datetime(admissions['DISCHTIME'], errors='coerce')
admissions['DEATHTIME'] = pd.to_datetime(admissions['DEATHTIME'], errors='coerce')
admissions = admissions.drop('ROW_ID', axis=1)

In [12]:
transfers['INTIME'] = pd.to_datetime(transfers['INTIME'], errors='coerce')
transfers['OUTTIME'] = pd.to_datetime(transfers['OUTTIME'], errors='coerce')
transfers = transfers.drop('ROW_ID', axis=1)

In [13]:
icustays['INTIME'] = pd.to_datetime(icustays['INTIME'], errors='coerce')
icustays['OUTTIME'] = pd.to_datetime(icustays['OUTTIME'], errors='coerce')
icustays = icustays.drop('ROW_ID', axis=1)
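
The conversion cells above all pass `errors='coerce'`, which turns any value that cannot be parsed into `NaT` instead of raising. A small sketch on invented strings shows the effect and one way to count how many values were lost; `raw`, `parsed`, and `n_missing` are illustrative names.

```python
import pandas as pd

# Made-up timestamps: one valid, one missing, one malformed.
raw = pd.Series(['2198-02-14 23:27:38', None, 'not a date'])
parsed = pd.to_datetime(raw, errors='coerce')

# Count the values that were missing or unparseable and are now NaT.
n_missing = int(parsed.isna().sum())
```

Comparing such a count before and after conversion is a cheap way to notice silently dropped timestamps.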

In [14]:
admissions_subset = admissions[['SUBJECT_ID', 'HADM_ID', 'ADMITTIME', 'DISCHTIME', 'DEATHTIME']]
patients_subset = patients[['SUBJECT_ID','GENDER','DOB','DOD','EXPIRE_FLAG']]
dataset = pd.merge(admissions_subset, patients_subset, how='left', on='SUBJECT_ID')
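
The left merge above keeps one row per admission and copies the patient's attributes onto each of them. The sketch below illustrates this on invented frames; `toy_admissions`, `toy_patients`, and `toy_dataset` are hypothetical stand-ins, and the `validate` argument is an optional safeguard that the notebook itself does not use.

```python
import pandas as pd

# Toy stand-ins for admissions_subset and patients_subset (values are invented).
toy_admissions = pd.DataFrame({
    'SUBJECT_ID': [1, 1, 2],
    'HADM_ID': [100, 101, 200],
})
toy_patients = pd.DataFrame({
    'SUBJECT_ID': [1, 2],
    'GENDER': ['F', 'M'],
})

# validate='many_to_one' raises MergeError if SUBJECT_ID is duplicated on the right,
# which would otherwise multiply admission rows without warning.
toy_dataset = pd.merge(toy_admissions, toy_patients, how='left',
                       on='SUBJECT_ID', validate='many_to_one')
```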

In [15]:
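# Remove patients under 18 at admission. Age in whole years is derived from
# calendar fields, which avoids the ambiguous 'Y' timedelta unit.
admit, dob = dataset['ADMITTIME'], dataset['DOB']
before_birthday = (admit.dt.month < dob.dt.month) | ((admit.dt.month == dob.dt.month) & (admit.dt.day < dob.dt.day))
dataset['AGE'] = admit.dt.year - dob.dt.year - before_birthday.astype(int)
dataset = dataset[dataset['AGE'] >= 18]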

In [16]:
icustays_subset = icustays[['SUBJECT_ID', 'HADM_ID', 'ICUSTAY_ID', 'INTIME', 'OUTTIME', 'LOS']]
dataset = pd.merge(dataset, icustays_subset, how='left', on=['SUBJECT_ID','HADM_ID'])
dataset = dataset.sort_values(by=['SUBJECT_ID', 'DISCHTIME'], ignore_index=True)
#dataset = dataset.reindex()

In [17]:
dataset.head(3)

Unnamed: 0,SUBJECT_ID,HADM_ID,ADMITTIME,DISCHTIME,DEATHTIME,GENDER,DOB,DOD,DOD_HOSP,DOD_SSN,EXPIRE_FLAG,AGE,ICUSTAY_ID,INTIME,OUTTIME,LOS
0,3,145834,2101-10-20 19:08:00,2101-10-31 13:58:00,NaT,M,2025-04-11,2102-06-14,NaT,2102-06-14,1,76,211552.0,2101-10-20 19:10:11,2101-10-26 20:43:09,6.0646
1,4,185777,2191-03-16 00:28:00,2191-03-23 18:41:00,NaT,F,2143-05-12,NaT,NaT,NaT,0,47,294638.0,2191-03-16 00:29:31,2191-03-17 16:46:31,1.6785
2,6,107064,2175-05-30 07:15:00,2175-06-15 16:00:00,NaT,F,2109-06-21,NaT,NaT,NaT,0,65,228232.0,2175-05-30 21:30:54,2175-06-03 13:39:54,3.6729


In [18]:
# Flag repeat ICU stays within the same hospital admission: every row after the
# first for a given HADM_ID is marked 1.
dataset['RETURNED_AFTER_TRANSFER'] = dataset.duplicated(subset=['HADM_ID']).astype(int)
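
`DataFrame.duplicated` marks the second and later rows of each `HADM_ID` by default, so the flag lands on the return visit rather than on the stay that preceded it, and which row counts as "first" depends on row order. The sketch below contrasts this with `keep=False` on invented rows; `toy`, `returned`, and `any_return` are illustrative names.

```python
import pandas as pd

# Invented rows: admission 100 has two ICU stays, admission 200 has one.
toy = pd.DataFrame({
    'HADM_ID': [100, 100, 200],
    'ICUSTAY_ID': [1, 2, 3],
})

# Default keep='first': only the second and later rows per HADM_ID are flagged.
returned = toy.duplicated(subset=['HADM_ID']).astype(int)

# keep=False flags every row of an admission that has more than one ICU stay.
any_return = toy.duplicated(subset=['HADM_ID'], keep=False).astype(int)
```

Which variant is appropriate depends on whether the label should describe the stay that was followed by a return or the return itself.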

In [19]:
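# Flag rows where the previous row belongs to the same patient and this row's
# DISCHTIME is at most 30 days after the previous row's ADMITTIME.
shifted = dataset.shift(1)
dataset['RETURNED_AFTER_DISCHARGE'] = ((shifted['SUBJECT_ID'] == dataset['SUBJECT_ID']) & (dataset['DISCHTIME'] - shifted['ADMITTIME'] <= np.timedelta64(30, 'D'))).astype(int)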
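
The cell above compares each row's `DISCHTIME` with the previous row's `ADMITTIME`. A common alternative definition of 30-day readmission flags an admission when the same patient's next admission begins within 30 days of this discharge. The following is a sketch of that alternative on invented data with one row per admission; `toy`, `next_admit`, `gap`, and `READMITTED_30D` are hypothetical names, and this is not necessarily the labeling used in the paper.

```python
import pandas as pd

# Invented admissions, one row per hospital admission.
toy = pd.DataFrame({
    'SUBJECT_ID': [1, 1, 1, 2],
    'ADMITTIME': pd.to_datetime(['2150-01-01', '2150-01-20', '2150-06-01', '2150-03-01']),
    'DISCHTIME': pd.to_datetime(['2150-01-10', '2150-01-25', '2150-06-05', '2150-03-04']),
}).sort_values(['SUBJECT_ID', 'ADMITTIME'], ignore_index=True)

# Next admission time for the same patient; NaT for a patient's last admission.
next_admit = toy.groupby('SUBJECT_ID')['ADMITTIME'].shift(-1)

# Time between this discharge and the next admission.
gap = next_admit - toy['DISCHTIME']

# Comparisons against NaT evaluate to False, so last admissions are labeled 0.
toy['READMITTED_30D'] = ((gap > pd.Timedelta(0)) & (gap <= pd.Timedelta(days=30))).astype(int)
```

Shifting within `groupby('SUBJECT_ID')` removes the need for a separate same-patient check, and placing the label on the earlier admission matches a prediction task where the model sees a stay and predicts whether a return follows.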

In [20]:
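# Flag in-hospital deaths (DEATHTIME set) that fall outside this ICU stay: after OUTTIME or before INTIME.
dataset['DEATH_AFTER_TRANSFER'] = ((~dataset['DEATHTIME'].isnull())  & ((dataset['DEATHTIME'] > dataset['OUTTIME']) | (dataset['DEATHTIME'] < dataset['INTIME']))).astype(int)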


In [21]:
# Flag deaths recorded after hospital discharge and no more than 30 days later.
dataset['DEATH_AFTER_DISCHARGE'] = ((~dataset['DOD'].isnull())  & (dataset['DOD'] - dataset['DISCHTIME'] <= np.timedelta64(30, 'D')) & (dataset['DOD'] - dataset['DISCHTIME'] > np.timedelta64(0, 'D'))).astype(int)
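
The 30-day window above can be checked on a few invented rows. The sketch assumes that `DOD` carries only a date (a midnight timestamp), as in the sample rows shown earlier; `toy` and `delta` are illustrative names.

```python
import pandas as pd

# Invented discharges, all at noon on the same day, with different dates of death.
toy = pd.DataFrame({
    'DISCHTIME': pd.to_datetime(['2150-01-10 12:00'] * 4),
    'DOD': pd.to_datetime(['2150-01-20', '2150-03-01', None, '2150-01-10']),
})

# Time from discharge to recorded death; NaT when the patient has no DOD.
delta = toy['DOD'] - toy['DISCHTIME']

toy['DEATH_AFTER_DISCHARGE'] = (
    toy['DOD'].notna()
    & (delta > pd.Timedelta(0))
    & (delta <= pd.Timedelta(days=30))
).astype(int)
```

Under that assumption, a death recorded on the discharge date itself yields a negative difference (midnight minus a daytime discharge) and is therefore not flagged, as the last toy row shows.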


In [22]:
print('Returned after transfer:\t', np.sum(dataset['RETURNED_AFTER_TRANSFER']))
print('Returned after discharge:\t', np.sum(dataset['RETURNED_AFTER_DISCHARGE']))
print('Died after transfer:\t\t', np.sum(dataset['DEATH_AFTER_TRANSFER']))
print('Died after discharge:\t\t', np.sum(dataset['DEATH_AFTER_DISCHARGE']))


Returned after transfer:	 3637
Returned after discharge:	 4020
Died after transfer:		 2090
Died after discharge:		 2436


In [23]:
print('Hospital admissions:\t', len(dataset['HADM_ID'].unique()))
print('Patients:\t', len(dataset['SUBJECT_ID'].unique()))

Hospital admissions:	 50766
Patients:	 38552
