## Extracting Cohort for MIMIC-III

In this notebook we extract the patient cohort from the MIMIC-III database. The original work of Roggeveen et al. used a sepsis cohort, selected by SOFA $\geq 2$ and a suspected or documented infection. Because we want to include all patients admitted to intensive care, we build a new cohort consisting of all patients admitted to the ICU.

### All ICU admissions

In [10]:
import pandas as pd

In [11]:
icustays = pd.read_csv(r"D:/mimic-iii-clinical-database-1.4/ICUSTAYS.csv", usecols=['ICUSTAY_ID', 'HADM_ID', 'INTIME', 'OUTTIME'])
icustays.head()

| | HADM_ID | ICUSTAY_ID | INTIME | OUTTIME |
|---|---|---|---|---|
| 0 | 110404 | 280836 | 2198-02-14 23:27:38 | 2198-02-18 05:26:11 |
| 1 | 106296 | 206613 | 2170-11-05 11:05:29 | 2170-11-08 17:46:57 |
| 2 | 188028 | 220345 | 2128-06-24 15:05:20 | 2128-06-27 12:32:29 |
| 3 | 173727 | 249196 | 2120-08-07 23:12:42 | 2120-08-10 00:39:04 |
| 4 | 164716 | 210407 | 2186-12-25 21:08:04 | 2186-12-27 12:01:13 |


In [12]:
# how many ICU stays (ICUSTAY_ID) per hospital admission (HADM_ID) on average?
icustays.groupby('HADM_ID').size().mean()

1.0648253902329283

The mean is slightly above 1, so some hospital admissions include ICU re-admissions, that is, two or more ICU stays during the same hospital admission.
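The mean alone does not say how many admissions actually involve a re-admission. A minimal sketch of how that could be counted is shown here on a toy frame; the IDs are made up for illustration and `toy_icustays` is a hypothetical stand-in for the `icustays` table loaded above.

```python
import pandas as pd

# Toy stand-in for ICUSTAYS.csv (IDs are invented, not real MIMIC-III values)
toy_icustays = pd.DataFrame({
    'HADM_ID':    [100, 100, 101, 102, 102, 102],
    'ICUSTAY_ID': [1, 2, 3, 4, 5, 6],
})

# number of ICU stays recorded under each hospital admission
stays_per_admission = toy_icustays.groupby('HADM_ID').size()

# mean number of ICU stays per admission (the same quantity as in the cell above)
mean_stays = stays_per_admission.mean()

# admissions with more than one ICU stay, i.e. at least one ICU re-admission
n_readmitted = int((stays_per_admission > 1).sum())
frac_readmitted = n_readmitted / len(stays_per_admission)

print(mean_stays, n_readmitted, frac_readmitted)
```

Applied to the real table, the same three lines on `icustays` would give the number and share of hospital admissions with repeated ICU stays.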

### Which ICU admissions have data?

In [13]:
admissions = pd.read_csv(r"D:/mimic-iii-clinical-database-1.4/ADMISSIONS.csv", usecols=['HADM_ID', 'ADMISSION_TYPE', 'HOSPITAL_EXPIRE_FLAG', 'HAS_CHARTEVENTS_DATA'])
print('Admissions (all): %d\n' % len(admissions))

# drop admissions that have no charted data or that are newborn admissions
admissions = admissions[admissions.HAS_CHARTEVENTS_DATA.astype(bool) & (admissions.ADMISSION_TYPE != 'NEWBORN')]
admissions = admissions[['HADM_ID', 'HOSPITAL_EXPIRE_FLAG']]
print('Admissions excluding births/no chartdata: %d\n' % len(admissions))

# merge with ICU stays
icustays = icustays.merge(admissions, on='HADM_ID')

icustays.head()

Admissions (all): 58976

Admissions excluding births/no chartdata: 49621



| | HADM_ID | ICUSTAY_ID | INTIME | OUTTIME | HOSPITAL_EXPIRE_FLAG |
|---|---|---|---|---|---|
| 0 | 110404 | 280836 | 2198-02-14 23:27:38 | 2198-02-18 05:26:11 | 1 |
| 1 | 106296 | 206613 | 2170-11-05 11:05:29 | 2170-11-08 17:46:57 | 0 |
| 2 | 188028 | 220345 | 2128-06-24 15:05:20 | 2128-06-27 12:32:29 | 0 |
| 3 | 173727 | 249196 | 2120-08-07 23:12:42 | 2120-08-10 00:39:04 | 0 |
| 4 | 164716 | 210407 | 2186-12-25 21:08:04 | 2186-12-27 12:01:13 | 0 |
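The merge above uses pandas' default inner join, so any ICU stay whose `HADM_ID` was filtered out of `admissions` is dropped without a message. One possible way to make that visible is sketched below on toy data; the frames and IDs are hypothetical, and the `validate` argument is an optional safeguard that is not part of the original cell.

```python
import pandas as pd

# Toy stand-ins for the two tables (IDs are invented for illustration)
toy_icustays = pd.DataFrame({
    'HADM_ID':    [100, 100, 101, 102],
    'ICUSTAY_ID': [1, 2, 3, 4],
})
# admission 101 is missing here, as if it had been filtered out
toy_admissions = pd.DataFrame({
    'HADM_ID':              [100, 102],
    'HOSPITAL_EXPIRE_FLAG': [0, 1],
})

# a left merge with indicator=True marks ICU stays that have no matching admission
checked = toy_icustays.merge(toy_admissions, on='HADM_ID', how='left', indicator=True)
n_dropped = int((checked['_merge'] == 'left_only').sum())

# the inner merge used in the notebook; validate raises if HADM_ID
# is not unique on the admissions side
kept = toy_icustays.merge(toy_admissions, on='HADM_ID', validate='many_to_one')

print('ICU stays dropped by the merge: %d' % n_dropped)
print('ICU stays kept: %d' % len(kept))
```

Running the inner merge separately, rather than filtering the left merge, keeps `HOSPITAL_EXPIRE_FLAG` as an integer column, since no missing values are introduced.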


In [24]:
icustays.HOSPITAL_EXPIRE_FLAG.value_counts()

0    46649
1     6533
Name: HOSPITAL_EXPIRE_FLAG, dtype: int64
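These counts sum to 53,182 rows, more than the 49,621 retained admissions, because after the merge there is one row per ICU stay and an admission with several ICU stays carries its flag on every row. The sketch below first derives the stay-level proportion from the counts shown above, then uses a toy frame (made-up IDs, hypothetical names) to illustrate how a per-admission figure could differ.

```python
import pandas as pd

# counts copied from the value_counts() output above (one row per ICU stay)
counts = pd.Series({0: 46649, 1: 6533}, name='HOSPITAL_EXPIRE_FLAG')
n_stays = int(counts.sum())
stay_level_mortality = counts.loc[1] / n_stays
print('ICU stays: %d, in-hospital deaths per stay: %.3f' % (n_stays, stay_level_mortality))

# Toy frame: admission 100 has two ICU stays and ends in death,
# so its flag is counted twice at the stay level
toy = pd.DataFrame({
    'HADM_ID':              [100, 100, 101, 102],
    'ICUSTAY_ID':           [1, 2, 3, 4],
    'HOSPITAL_EXPIRE_FLAG': [1, 1, 0, 0],
})

# proportion over ICU stays (what value_counts on the merged table reflects)
per_stay = toy['HOSPITAL_EXPIRE_FLAG'].mean()

# proportion over hospital admissions: keep one row per HADM_ID first
per_admission = toy.drop_duplicates('HADM_ID')['HOSPITAL_EXPIRE_FLAG'].mean()

print(per_stay, per_admission)
```

Which denominator is appropriate depends on whether the downstream analysis treats an ICU stay or a hospital admission as the unit of observation.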