# MIMIC-III Clinical Database

MIMIC-III (Medical Information Mart for Intensive Care III) is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012.

The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (both in and out of hospital).

MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors:

1. it is freely available to researchers worldwide
2. it encompasses a diverse and very large population of ICU patients
3. it contains high temporal resolution data including lab results, electronic documentation, and bedside monitor trends and waveforms.

Total number of patients = 46520 <br>
Total patient records (hospital admissions, one row each in ADMISSIONS) = 58976

## References

* About MIMIC: https://mimic.physionet.org/about/mimic/
* Download page: https://physionet.org/works/MIMICIIIClinicalDatabase/files/
* Table reference: https://mimic.physionet.org/mimictables/admissions/
* Table connections: https://mit-lcp.github.io/mimic-schema-spy/tables/noteevents.html
* Demo page: https://physionet.org/works/MIMICIIIClinicalDatabaseDemo/files/
* GitHub repository: https://github.com/MIT-LCP/mimic-code/
* GitHub workshop: https://github.com/MIT-LCP/mimic-workshop/
* Code issue discussions: https://github.com/MIT-LCP/mimic-code/issues

## Citations

> MIMIC-III, a freely accessible critical care database. Johnson AEW, Pollard TJ, Shen L, Lehman L, Feng M, Ghassemi M, Moody B, Szolovits P, Celi LA, and Mark RG. Scientific Data (2016). DOI: 10.1038/sdata.2016.35. Available from: http://www.nature.com/articles/sdata201635

> Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation 101(23):e215-e220 [Circulation Electronic Pages; http://circ.ahajournals.org/content/101/23/e215.full]; 2000 (June 13).

## List of tables
The MIMIC-III dataset contains 26 tables in total.

The following tables are used to define and track patient stays:
* ADMISSIONS: Every unique hospitalization for each patient in the database (defines HADM_ID)
* CALLOUT: Information regarding when a patient was cleared for ICU discharge and when the patient was actually discharged
* ICUSTAYS: Every unique ICU stay in the database (defines ICUSTAY_ID)
* PATIENTS: Every unique patient in the database (defines SUBJECT_ID)
* SERVICES: The clinical service under which a patient is registered
* TRANSFERS: Patient movement from bed to bed within the hospital, including ICU admission and discharge


Each ICUSTAY_ID corresponds to a single HADM_ID and a single SUBJECT_ID. Each HADM_ID corresponds to a single SUBJECT_ID. A single SUBJECT_ID can correspond to multiple HADM_ID (multiple hospitalizations of the same patient), and multiple ICUSTAY_ID (multiple ICU stays either within the same hospitalization, or across multiple hospitalizations, or both).
<br>
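
The identifier hierarchy described above can be checked mechanically. The following is a minimal sketch on a toy table, not on the real data: the frame `icustays_toy` and all ID values in it are invented for illustration, and only the column names come from MIMIC-III.

```python
import pandas as pd

# Toy ICU stays table; the IDs are made up for illustration only.
icustays_toy = pd.DataFrame({
    "SUBJECT_ID": [1, 1, 1, 2],
    "HADM_ID": [100, 100, 101, 200],
    "ICUSTAY_ID": [9001, 9002, 9003, 9004],
})

# Each ICUSTAY_ID should map to exactly one HADM_ID,
# and each HADM_ID to exactly one SUBJECT_ID.
hadm_per_stay = icustays_toy.groupby("ICUSTAY_ID")["HADM_ID"].nunique()
subject_per_hadm = icustays_toy.groupby("HADM_ID")["SUBJECT_ID"].nunique()
ids_are_nested = bool((hadm_per_stay == 1).all() and (subject_per_hadm == 1).all())

# A single patient may still have several admissions and several ICU stays.
stays_per_subject = icustays_toy.groupby("SUBJECT_ID")["ICUSTAY_ID"].nunique()
admissions_per_subject = icustays_toy.groupby("SUBJECT_ID")["HADM_ID"].nunique()

print(ids_are_nested)
print(stays_per_subject.to_dict())
print(admissions_per_subject.to_dict())
```

In this toy frame, patient 1 has two hospitalizations and three ICU stays, two of them within the same hospitalization, which is the "both" case mentioned in the text.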

The following tables contain data collected in the critical care unit:
* CAREGIVERS: Every caregiver who has recorded data in the database (defines CGID)
* CHARTEVENTS: All charted observations for patients
* DATETIMEEVENTS: All recorded observations which are dates, for example time of dialysis or insertion of lines.
* INPUTEVENTS_CV: Intake for patients monitored using the Philips CareVue system while in the ICU
* INPUTEVENTS_MV: Intake for patients monitored using the iMDSoft MetaVision system while in the ICU
* NOTEEVENTS: Deidentified notes, including nursing and physician notes, ECG reports, imaging reports, and discharge summaries.
* OUTPUTEVENTS: Output information for patients while in the ICU
* PROCEDUREEVENTS_MV: Patient procedures for the subset of patients who were monitored in the ICU using the iMDSoft MetaVision system.
<br>

The following tables contain data collected in the hospital record system:
    
* CPTEVENTS: Procedures recorded as Current Procedural Terminology (CPT) codes
* DIAGNOSES_ICD: Hospital assigned diagnoses, coded using the International Statistical Classification of Diseases and Related Health Problems (ICD) system
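* DRGCODES: Diagnosis Related Groups (DRG), which are used by the hospital for billing purposes.
* LABEVENTS: Laboratory measurements for patients both within the hospital and in outpatient clinics
* MICROBIOLOGYEVENTS: Microbiology measurements and sensitivities from the hospital database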
* PRESCRIPTIONS: Medications ordered, and not necessarily administered, for a given patient
* PROCEDURES_ICD: Patient procedures, coded using the International Statistical Classification of Diseases and Related Health Problems (ICD) system
<br>

The following tables are dictionaries:

* D_CPT: High-level dictionary of Current Procedural Terminology (CPT) codes
* D_ICD_DIAGNOSES: Dictionary of International Statistical Classification of Diseases and Related Health Problems (ICD) codes relating to diagnoses
* D_ICD_PROCEDURES: Dictionary of International Statistical Classification of Diseases and Related Health Problems (ICD) codes relating to procedures
* D_ITEMS: Dictionary of ITEMIDs appearing in the MIMIC database, except those that relate to laboratory tests
* D_LABITEMS: Dictionary of ITEMIDs in the laboratory database that relate to laboratory tests
<br>
<br>
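
Dictionary tables are typically used by joining them to an event table on the shared code column. The sketch below illustrates the idea with `ITEMID`; both frames are toy data, and the ITEMID values, labels, and readings are invented rather than taken from D_ITEMS.

```python
import pandas as pd

# Toy event rows; ITEMID values and readings are invented for illustration.
events_toy = pd.DataFrame({
    "SUBJECT_ID": [1, 1, 2],
    "ITEMID": [10, 11, 10],
    "VALUENUM": [80.0, 120.0, 95.0],
})

# Toy dictionary table shaped like D_ITEMS (ITEMID -> LABEL).
d_items_toy = pd.DataFrame({
    "ITEMID": [10, 11],
    "LABEL": ["Heart Rate", "Systolic BP"],
})

# A left join keeps every event, even if its ITEMID has no dictionary entry.
labeled = events_toy.merge(d_items_toy, on="ITEMID", how="left")
print(labeled)
```

The same pattern should apply to the other dictionaries, for example joining on `ICD9_CODE` for the ICD tables.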

![Image](https://raw.githubusercontent.com/ericxumit/MGH_medical_AI/master/data_table_connection.jpg)

## A glance at the data (26 tables)

In [1]:
# Load libraries that we'll use in this study
import os
from os import listdir
from os.path import isfile, join
import pandas as pd
import numpy as np

In [2]:
# Get current path for the project and data
path = os.path.realpath('')
parent_path, projectname = os.path.split(path)
data_path = os.path.join(path, 'data')

print('parent_path = ', parent_path)
print('projectname = ', projectname)
print('path = ', path)
print('data_path = ', data_path)

parent_path =  C:\Users\ericx\Jupyter Projects
projectname =  MGH_medical_AI
path =  C:\Users\ericx\Jupyter Projects\MGH_medical_AI
data_path =  C:\Users\ericx\Jupyter Projects\MGH_medical_AI\data


In [3]:
# List the files under data_path, sorted so that the positional indexes (data_files[0], ...) are stable
data_files = sorted(f for f in listdir(data_path) if isfile(join(data_path, f)))
print('Number of data files =', len(data_files))
print('Note in Mimic III download list, first 2 files are MD5 checksums')
print('So 1st Mimic III dataset starts from #3')
data_files

Number of data files = 26
Note in Mimic III download list, first 2 files are MD5 checksums
So 1st Mimic III dataset starts from #3


['ADMISSIONS.csv',
 'CALLOUT.csv',
 'CAREGIVERS.csv',
 'CHARTEVENTS.csv',
 'CPTEVENTS.csv',
 'DATETIMEEVENTS.csv',
 'DIAGNOSES_ICD.csv',
 'DRGCODES.csv',
 'D_CPT.csv',
 'D_ICD_DIAGNOSES.csv',
 'D_ICD_PROCEDURES.csv',
 'D_ITEMS.csv',
 'D_LABITEMS.csv',
 'ICUSTAYS.csv',
 'INPUTEVENTS_CV.csv',
 'INPUTEVENTS_MV.csv',
 'LABEVENTS.csv',
 'MICROBIOLOGYEVENTS.csv',
 'NOTEEVENTS.csv',
 'OUTPUTEVENTS.csv',
 'PATIENTS.csv',
 'PRESCRIPTIONS.csv',
 'PROCEDUREEVENTS_MV.csv',
 'PROCEDURES_ICD.csv',
 'SERVICES.csv',
 'TRANSFERS.csv']
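
Indexing `data_files` by position works only while the directory listing keeps this exact order. One possible alternative, sketched here with a hypothetical `table_files` dictionary and only four of the 26 file names, is to look files up by table name instead.

```python
from pathlib import Path

# A few file names from the listing above; the real list has 26 entries.
example_files = ["ADMISSIONS.csv", "CALLOUT.csv", "D_ITEMS.csv", "TRANSFERS.csv"]

# Map lower-case table name -> file name, so lookups do not depend on list order.
table_files = {Path(name).stem.lower(): name for name in example_files}

print(table_files["admissions"])
print(table_files["d_items"])
```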

In [7]:
# Total number of patients and of admission records
admissions = pd.read_csv('data/' + data_files[0])
print('Total number of patients =', len(admissions['SUBJECT_ID'].unique()))
print('Total patient records =', len(admissions['SUBJECT_ID']))

Total number of patients = 46520
Total patient records = 58976
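
The two totals differ because a patient can be admitted more than once. As a small illustration, the sketch below uses the `SUBJECT_ID` values of the first five ADMISSIONS rows previewed in this notebook, where patient 23 appears twice; the variable names are hypothetical.

```python
import pandas as pd

# SUBJECT_ID values from the first five ADMISSIONS rows previewed in this notebook.
subject_ids = pd.Series([22, 23, 23, 24, 25], name="SUBJECT_ID")

n_patients = subject_ids.nunique()   # distinct patients
n_admissions = len(subject_ids)      # one row per hospitalization

# Number of admissions for each patient, most frequent first.
admissions_per_patient = subject_ids.value_counts()

print(n_patients, n_admissions)
print(admissions_per_patient.head())
```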


In [8]:
# 3. admissions
#    ADMISSIONS: Every unique hospitalization for each patient in the database (defines HADM_ID)
#    Number of rows: 58976
# admissions = pd.read_csv('data/' + data_files[0], nrows=10)      # load only the first rows (the full table can use a lot of memory)
admissions.head()                                                # display only the first 5 rows to preview the data set

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ADMITTIME,DISCHTIME,DEATHTIME,ADMISSION_TYPE,ADMISSION_LOCATION,DISCHARGE_LOCATION,INSURANCE,LANGUAGE,RELIGION,MARITAL_STATUS,ETHNICITY,EDREGTIME,EDOUTTIME,DIAGNOSIS,HOSPITAL_EXPIRE_FLAG,HAS_CHARTEVENTS_DATA
0,21,22,165315,2196-04-09 12:26:00,2196-04-10 15:54:00,,EMERGENCY,EMERGENCY ROOM ADMIT,DISC-TRAN CANCER/CHLDRN H,Private,,UNOBTAINABLE,MARRIED,WHITE,2196-04-09 10:06:00,2196-04-09 13:24:00,BENZODIAZEPINE OVERDOSE,0,1
1,22,23,152223,2153-09-03 07:15:00,2153-09-08 19:10:00,,ELECTIVE,PHYS REFERRAL/NORMAL DELI,HOME HEALTH CARE,Medicare,,CATHOLIC,MARRIED,WHITE,,,CORONARY ARTERY DISEASE\CORONARY ARTERY BYPASS...,0,1
2,23,23,124321,2157-10-18 19:34:00,2157-10-25 14:00:00,,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,HOME HEALTH CARE,Medicare,ENGL,CATHOLIC,MARRIED,WHITE,,,BRAIN MASS,0,1
3,24,24,161859,2139-06-06 16:14:00,2139-06-09 12:48:00,,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,HOME,Private,,PROTESTANT QUAKER,SINGLE,WHITE,,,INTERIOR MYOCARDIAL INFARCTION,0,1
4,25,25,129635,2160-11-02 02:06:00,2160-11-05 14:55:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME,Private,,UNOBTAINABLE,MARRIED,WHITE,2160-11-02 01:01:00,2160-11-02 04:27:00,ACUTE CORONARY SYNDROME,0,1


In [140]:
# 4. callout
# CALLOUT: Information regarding when a patient was cleared for ICU discharge and when the patient was actually discharged
# Number of rows: 34499
callout = pd.read_csv('data/' + data_files[1], nrows=10)     
callout.head()   

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,SUBMIT_WARDID,SUBMIT_CAREUNIT,CURR_WARDID,CURR_CAREUNIT,CALLOUT_WARDID,CALLOUT_SERVICE,REQUEST_TELE,...,CALLOUT_STATUS,CALLOUT_OUTCOME,DISCHARGE_WARDID,ACKNOWLEDGE_STATUS,CREATETIME,UPDATETIME,ACKNOWLEDGETIME,OUTCOMETIME,FIRSTRESERVATIONTIME,CURRENTRESERVATIONTIME
0,402,854,175684,52,,29,MICU,1,MED,0,...,Inactive,Discharged,29.0,Acknowledged,2146-10-05 13:16:55,2146-10-05 13:16:55,2146-10-05 13:24:00,2146-10-05 18:55:22,2146-10-05 15:27:44,
1,403,864,138624,15,,55,CSRU,55,CSURG,0,...,Inactive,Discharged,55.0,Acknowledged,2114-11-28 08:31:39,2114-11-28 09:42:08,2114-11-28 09:43:08,2114-11-28 12:10:02,,
2,404,864,138624,12,,55,CSRU,55,CSURG,1,...,Inactive,Discharged,55.0,Acknowledged,2114-11-30 10:24:25,2114-12-01 09:06:18,2114-12-01 12:26:05,2114-12-01 21:55:05,,
3,405,867,184298,7,,17,CCU,17,CCU,1,...,Inactive,Discharged,17.0,Acknowledged,2136-12-29 08:45:42,2136-12-29 10:17:16,2136-12-29 10:33:51,2136-12-29 18:10:02,,
4,157,306,167129,57,,3,SICU,44,NSURG,1,...,Inactive,Discharged,3.0,Acknowledged,2199-09-18 11:47:47,2199-09-18 11:47:47,2199-09-18 11:58:33,2199-09-18 15:10:02,,


In [111]:
# 5. caregivers
#    CAREGIVERS: Every caregiver who has recorded data in the database (defines CGID)
caregivers = pd.read_csv('data/' + data_files[2], nrows=10)      
caregivers.head()  

Unnamed: 0,ROW_ID,CGID,LABEL,DESCRIPTION
0,2228,16174,RO,Read Only
1,2229,16175,RO,Read Only
2,2230,16176,Res,Resident/Fellow/PA/NP
3,2231,16177,RO,Read Only
4,2232,16178,RT,Respiratory


In [112]:
# 6. chartevents
#    CHARTEVENTS: All charted observations for patients
chartevents = pd.read_csv('data/' + data_files[3], nrows=10)      
chartevents.head()  

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,ITEMID,CHARTTIME,STORETIME,CGID,VALUE,VALUENUM,VALUEUOM,WARNING,ERROR,RESULTSTATUS,STOPPED
0,788,36,165660,241249,223834,2134-05-12 12:00:00,2134-05-12 13:56:00,17525,15.0,15.0,L/min,0,0,,
1,789,36,165660,241249,223835,2134-05-12 12:00:00,2134-05-12 13:56:00,17525,100.0,100.0,,0,0,,
2,790,36,165660,241249,224328,2134-05-12 12:00:00,2134-05-12 12:18:00,20823,0.37,0.37,,0,0,,
3,791,36,165660,241249,224329,2134-05-12 12:00:00,2134-05-12 12:19:00,20823,6.0,6.0,min,0,0,,
4,792,36,165660,241249,224330,2134-05-12 12:00:00,2134-05-12 12:19:00,20823,2.5,2.5,,0,0,,
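
Loading only `nrows=10` avoids memory problems but also hides the rest of the table. When the whole table has to be scanned, one option is to read it in chunks with the `chunksize` parameter of `pd.read_csv`. The sketch below runs on a small in-memory CSV with invented rows, as a stand-in for a large file such as CHARTEVENTS.csv; the chunk size of 4 is arbitrary.

```python
import io
from collections import Counter

import pandas as pd

# Stand-in for a large CSV file; the rows are invented for illustration.
csv_text = "SUBJECT_ID,ITEMID,VALUENUM\n" + "\n".join(
    f"{i % 3},1,{60 + i}" for i in range(10)
)

# Read four rows at a time, loading only the column that is needed,
# and accumulate a per-patient row count across chunks.
rows_per_subject = Counter()
for chunk in pd.read_csv(io.StringIO(csv_text), usecols=["SUBJECT_ID"], chunksize=4):
    rows_per_subject.update(chunk["SUBJECT_ID"].tolist())

print(dict(rows_per_subject))
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path, and a much larger chunk size would likely be appropriate.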


In [121]:
# 7. cptevents
#    CPTEVENTS: Procedures recorded as Current Procedural Terminology (CPT) codes
cptevents = pd.read_csv('data/' + data_files[4], nrows=10)      
cptevents.head()  

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,COSTCENTER,CHARTDATE,CPT_CD,CPT_NUMBER,CPT_SUFFIX,TICKET_ID_SEQ,SECTIONHEADER,SUBSECTIONHEADER,DESCRIPTION
0,317,11743,129545,ICU,,99232,99232,,6,Evaluation and management,Hospital inpatient services,
1,318,11743,129545,ICU,,99232,99232,,7,Evaluation and management,Hospital inpatient services,
2,319,11743,129545,ICU,,99232,99232,,8,Evaluation and management,Hospital inpatient services,
3,320,11743,129545,ICU,,99232,99232,,9,Evaluation and management,Hospital inpatient services,
4,321,6185,183725,ICU,,99223,99223,,1,Evaluation and management,Hospital inpatient services,


In [113]:
# 8. datetimeevents
#    DATETIMEEVENTS: All recorded observations which are dates, for example time of dialysis or insertion of lines.
datetimeevents = pd.read_csv('data/' + data_files[5], nrows=10)      
datetimeevents.head()  

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,ITEMID,CHARTTIME,STORETIME,CGID,VALUE,VALUEUOM,WARNING,ERROR,RESULTSTATUS,STOPPED
0,711,7657,121183,297945,3411,2172-03-14 11:00:00,2172-03-14 11:52:00,16446,,Date,,,,NotStopd
1,712,7657,121183,297945,3411,2172-03-14 13:00:00,2172-03-14 12:36:00,16446,,Date,,,,NotStopd
2,713,7657,121183,297945,3411,2172-03-14 15:00:00,2172-03-14 15:10:00,14957,,Date,,,,NotStopd
3,714,7657,121183,297945,3411,2172-03-14 17:00:00,2172-03-14 17:01:00,16446,,Date,,,,NotStopd
4,715,7657,121183,297945,3411,2172-03-14 19:00:00,2172-03-14 19:29:00,14815,,Date,,,,NotStopd


In [123]:
# 9. diagnoses_icd
#    DIAGNOSES_ICD: Hospital assigned diagnoses, coded using the International Statistical Classification of 
#                   Diseases and Related Health Problems (ICD) system
diagnoses_icd = pd.read_csv('data/' + data_files[6], nrows=10)      
diagnoses_icd.head()  

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,SEQ_NUM,ICD9_CODE
0,1297,109,172335,1,40301
1,1298,109,172335,2,486
2,1299,109,172335,3,58281
3,1300,109,172335,4,5855
4,1301,109,172335,5,4254


In [124]:
# 10. drgcodes
#     DRGCODES: Diagnosis Related Groups (DRG), which are used by the hospital for billing purposes.
drgcodes = pd.read_csv('data/' + data_files[7], nrows=10)      
drgcodes.head()  

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,DRG_TYPE,DRG_CODE,DESCRIPTION,DRG_SEVERITY,DRG_MORTALITY
0,342,2491,144486,HCFA,28,"TRAUMATIC STUPOR & COMA, COMA <1 HR AGE >17 WI...",,
1,343,24958,162910,HCFA,110,MAJOR CARDIOVASCULAR PROCEDURES WITH COMPLICAT...,,
2,344,18325,153751,HCFA,390,NEONATE WITH OTHER SIGNIFICANT PROBLEMS,,
3,345,17887,182692,HCFA,14,SPECIFIC CEREBROVASCULAR DISORDERS EXCEPT TRAN...,,
4,346,11113,157980,HCFA,390,NEONATE WITH OTHER SIGNIFICANT PROBLEMS,,


In [129]:
# 11. d_cpt
#     D_CPT: High-level dictionary of Current Procedural Terminology (CPT) codes
d_cpt = pd.read_csv('data/' + data_files[8], nrows=10)      
d_cpt.head() 

Unnamed: 0,ROW_ID,CATEGORY,SECTIONRANGE,SECTIONHEADER,SUBSECTIONRANGE,SUBSECTIONHEADER,CODESUFFIX,MINCODEINSUBSECTION,MAXCODEINSUBSECTION
0,1,1,99201-99499,Evaluation and management,99201-99216,Office/other outpatient services,,99201,99216
1,2,1,99201-99499,Evaluation and management,99217-99220,Hospital observation services,,99217,99220
2,3,1,99201-99499,Evaluation and management,99221-99239,Hospital inpatient services,,99221,99239
3,4,1,99201-99499,Evaluation and management,99241-99255,Consultations,,99241,99255
4,5,1,99201-99499,Evaluation and management,99261-99263,Follow-up inpatient consultations (deleted codes),,99261,99263


In [133]:
# 12. d_icd_diagnoses
#     D_ICD_DIAGNOSES: Dictionary of International Statistical Classification of 
#     Diseases and Related Health Problems (ICD) codes relating to diagnoses
d_icd_diagnoses = pd.read_csv('data/' + data_files[9], nrows=10)      
d_icd_diagnoses.head() 

Unnamed: 0,ROW_ID,ICD9_CODE,SHORT_TITLE,LONG_TITLE
0,174,1166,TB pneumonia-oth test,"Tuberculous pneumonia [any form], tubercle bac..."
1,175,1170,TB pneumothorax-unspec,"Tuberculous pneumothorax, unspecified"
2,176,1171,TB pneumothorax-no exam,"Tuberculous pneumothorax, bacteriological or h..."
3,177,1172,TB pneumothorx-exam unkn,"Tuberculous pneumothorax, bacteriological or h..."
4,178,1173,TB pneumothorax-micro dx,"Tuberculous pneumothorax, tubercle bacilli fou..."
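
The `ICD9_CODE` values in this preview look like integers (for example 1166), although ICD-9 codes are text and may start with a zero. Assuming the stored code for the first row is `01166`, default type inference would drop the leading zero. The sketch below shows this effect on a two-row in-memory CSV and one way to avoid it with the `dtype` parameter; the CSV text is a stand-in, not the actual file content.

```python
import io

import pandas as pd

# Two-row stand-in for a dictionary CSV; the codes carry a leading zero.
csv_text = (
    "ICD9_CODE,SHORT_TITLE\n"
    "01166,TB pneumonia-oth test\n"
    "01170,TB pneumothorax-unspec\n"
)

# Default type inference turns the codes into integers and drops the zero.
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing a string dtype keeps each code exactly as stored.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"ICD9_CODE": str})

print(inferred["ICD9_CODE"].tolist())
print(as_text["ICD9_CODE"].tolist())
```

Reading the code columns as strings in both the event tables and the dictionaries keeps joins on `ICD9_CODE` consistent.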


In [134]:
# 13. d_icd_procedures
#     D_ICD_PROCEDURES: Dictionary of International Statistical Classification of 
#                       Diseases and Related Health Problems (ICD) codes relating to procedures
d_icd_procedures = pd.read_csv('data/' + data_files[10], nrows=10)      
d_icd_procedures.head() 

Unnamed: 0,ROW_ID,ICD9_CODE,SHORT_TITLE,LONG_TITLE
0,264,851,Canthotomy,Canthotomy
1,265,852,Blepharorrhaphy,Blepharorrhaphy
2,266,859,Adjust lid position NEC,Other adjustment of lid position
3,267,861,Lid reconst w skin graft,Reconstruction of eyelid with skin flap or graft
4,268,862,Lid reconst w muc graft,Reconstruction of eyelid with mucous membrane ...


In [135]:
# 14. d_items
#     D_ITEMS: Dictionary of ITEMIDs appearing in the MIMIC database, except those that relate to laboratory tests
d_items = pd.read_csv('data/' + data_files[11], nrows=10)      
d_items.head() 

Unnamed: 0,ROW_ID,ITEMID,LABEL,ABBREVIATION,DBSOURCE,LINKSTO,CATEGORY,UNITNAME,PARAM_TYPE,CONCEPTID
0,457,497,Patient controlled analgesia (PCA) [Inject],,carevue,chartevents,,,,
1,458,498,PCA Lockout (Min),,carevue,chartevents,,,,
2,459,499,PCA Medication,,carevue,chartevents,,,,
3,460,500,PCA Total Dose,,carevue,chartevents,,,,
4,461,501,PCV Exh Vt (Obser),,carevue,chartevents,,,,


In [136]:
# 15. d_labitems
#     D_LABITEMS: Dictionary of ITEMIDs in the laboratory database that relate to laboratory tests
d_labitems = pd.read_csv('data/' + data_files[12], nrows=10)      
d_labitems.head() 

Unnamed: 0,ROW_ID,ITEMID,LABEL,FLUID,CATEGORY,LOINC_CODE
0,546,51346,Blasts,Cerebrospinal Fluid (CSF),Hematology,26447-3
1,547,51347,Eosinophils,Cerebrospinal Fluid (CSF),Hematology,26451-5
2,548,51348,"Hematocrit, CSF",Cerebrospinal Fluid (CSF),Hematology,30398-2
3,549,51349,Hypersegmented Neutrophils,Cerebrospinal Fluid (CSF),Hematology,26506-6
4,550,51350,Immunophenotyping,Cerebrospinal Fluid (CSF),Hematology,


In [107]:
# 16. icustays
#     ICUSTAYS: Every unique ICU stay in the database (defines ICUSTAY_ID)
icustays = pd.read_csv('data/' + data_files[13], nrows=10)      
icustays.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,DBSOURCE,FIRST_CAREUNIT,LAST_CAREUNIT,FIRST_WARDID,LAST_WARDID,INTIME,OUTTIME,LOS
0,365,268,110404,280836,carevue,MICU,MICU,52,52,2198-02-14 23:27:38,2198-02-18 05:26:11,3.249
1,366,269,106296,206613,carevue,MICU,MICU,52,52,2170-11-05 11:05:29,2170-11-08 17:46:57,3.2788
2,367,270,188028,220345,carevue,CCU,CCU,57,57,2128-06-24 15:05:20,2128-06-27 12:32:29,2.8939
3,368,271,173727,249196,carevue,MICU,SICU,52,23,2120-08-07 23:12:42,2120-08-10 00:39:04,2.06
4,369,272,164716,210407,carevue,CCU,CCU,57,57,2186-12-25 21:08:04,2186-12-27 12:01:13,1.6202
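
The `LOS` column in this preview appears to be the ICU length of stay in fractional days. A quick way to check that reading is to recompute it from `INTIME` and `OUTTIME`, as sketched here for the first two previewed rows; the `LOS_DAYS` column name is introduced only for this example.

```python
import pandas as pd

# First two ICUSTAYS rows from the preview above.
icu = pd.DataFrame({
    "INTIME": ["2198-02-14 23:27:38", "2170-11-05 11:05:29"],
    "OUTTIME": ["2198-02-18 05:26:11", "2170-11-08 17:46:57"],
    "LOS": [3.249, 3.2788],
})

# Parse the timestamps, then express the stay duration in days.
icu["INTIME"] = pd.to_datetime(icu["INTIME"])
icu["OUTTIME"] = pd.to_datetime(icu["OUTTIME"])
icu["LOS_DAYS"] = (icu["OUTTIME"] - icu["INTIME"]).dt.total_seconds() / 86400

print(icu[["LOS", "LOS_DAYS"]].round(4))
```

For both rows the recomputed value agrees with `LOS` to within rounding.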


In [114]:
# 17. inputevents_cv
#     INPUTEVENTS_CV: Intake for patients monitored using the Philips CareVue system while in the ICU
inputevents_cv = pd.read_csv('data/' + data_files[14], nrows=10)      
inputevents_cv.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,CHARTTIME,ITEMID,AMOUNT,AMOUNTUOM,RATE,RATEUOM,...,ORDERID,LINKORDERID,STOPPED,NEWBOTTLE,ORIGINALAMOUNT,ORIGINALAMOUNTUOM,ORIGINALROUTE,ORIGINALRATE,ORIGINALRATEUOM,ORIGINALSITE
0,592,24457,184834,205776,2193-09-11 09:00:00,30056,100,ml,,,...,756654,9359133,,,,ml,Oral,,,
1,593,24457,184834,205776,2193-09-11 12:00:00,30056,200,ml,,,...,3564075,9359133,,,,ml,Oral,,,
2,594,24457,184834,205776,2193-09-11 16:00:00,30056,160,ml,,,...,422646,9359133,,,,ml,Oral,,,
3,595,24457,184834,205776,2193-09-11 19:00:00,30056,240,ml,,,...,5137889,9359133,,,,ml,Oral,,,
4,596,24457,184834,205776,2193-09-11 21:00:00,30056,50,ml,,,...,8343792,9359133,,,,ml,Oral,,,


In [115]:
# 18. inputevents_mv
#     INPUTEVENTS_MV: Intake for patients monitored using the iMDSoft MetaVision system while in the ICU
inputevents_mv = pd.read_csv('data/' + data_files[15], nrows=10)      
inputevents_mv.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,STARTTIME,ENDTIME,ITEMID,AMOUNT,AMOUNTUOM,RATE,...,TOTALAMOUNTUOM,ISOPENBAG,CONTINUEINNEXTDEPT,CANCELREASON,STATUSDESCRIPTION,COMMENTS_EDITEDBY,COMMENTS_CANCELEDBY,COMMENTS_DATE,ORIGINALAMOUNT,ORIGINALRATE
0,241,27063,139787,223259,2133-02-05 06:29:00,2133-02-05 08:45:00,225166,6.774532,mEq,,...,ml,0,0,1,Rewritten,,RN,2133-02-05 12:52:00,10.0,0.05
1,242,27063,139787,223259,2133-02-05 05:34:00,2133-02-05 06:30:00,225944,28.132997,ml,30.142497,...,ml,0,0,0,FinishedRunning,,,,28.132998,30.255817
2,243,27063,139787,223259,2133-02-05 05:34:00,2133-02-05 06:30:00,225166,2.8133,mEq,,...,ml,0,0,0,FinishedRunning,,,,2.8133,0.050426
3,244,27063,139787,223259,2133-02-03 12:00:00,2133-02-03 12:01:00,225893,1.0,dose,,...,ml,0,0,2,Rewritten,RN,,2133-02-03 17:06:00,1.0,1.0
4,245,27063,139787,223259,2133-02-03 12:00:00,2133-02-03 12:01:00,220949,100.0,ml,,...,ml,0,0,2,Rewritten,RN,,2133-02-03 17:06:00,100.0,0.0


In [125]:
# 19. labevents
#     LABEVENTS: Laboratory measurements for patients both within the hospital and in outpatient clinics
labevents = pd.read_csv('data/' + data_files[16], nrows=10)      
labevents.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ITEMID,CHARTTIME,VALUE,VALUENUM,VALUEUOM,FLAG
0,281,3,,50820,2101-10-12 16:07:00,7.39,7.39,units,
1,282,3,,50800,2101-10-12 18:17:00,ART,,,
2,283,3,,50802,2101-10-12 18:17:00,-1,-1.0,mEq/L,
3,284,3,,50804,2101-10-12 18:17:00,22,22.0,mEq/L,
4,285,3,,50808,2101-10-12 18:17:00,0.93,0.93,mmol/L,abnormal
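
In this preview, `VALUE` holds the result as text (for example `ART`), while `VALUENUM` is empty whenever the result is not a number. The sketch below reproduces that split from the five previewed `VALUE` entries using `pd.to_numeric`; the `PARSED` column is a hypothetical name, and this is only one plausible way such a numeric column could be derived.

```python
import pandas as pd

# ITEMID and VALUE entries from the five LABEVENTS rows previewed above, kept as text.
lab = pd.DataFrame({
    "ITEMID": [50820, 50800, 50802, 50804, 50808],
    "VALUE": ["7.39", "ART", "-1", "22", "0.93"],
})

# Non-numeric text becomes NaN instead of raising an error.
lab["PARSED"] = pd.to_numeric(lab["VALUE"], errors="coerce")

# Keep only the rows that carry a numeric result.
numeric_only = lab.dropna(subset=["PARSED"])

print(numeric_only)
```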


In [126]:
# 20. microbiologyevents
#     MICROBIOLOGYEVENTS: Microbiology measurements and sensitivities from the hospital database
microbiologyevents = pd.read_csv('data/' + data_files[17], nrows=10)      
microbiologyevents.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,CHARTDATE,CHARTTIME,SPEC_ITEMID,SPEC_TYPE_DESC,ORG_ITEMID,ORG_NAME,ISOLATE_NUM,AB_ITEMID,AB_NAME,DILUTION_TEXT,DILUTION_COMPARISON,DILUTION_VALUE,INTERPRETATION
0,744,96,170324,2156-04-13 00:00:00,2156-04-13 14:18:00,70021,BRONCHOALVEOLAR LAVAGE,80026.0,PSEUDOMONAS AERUGINOSA,1.0,,,,,,
1,745,96,170324,2156-04-20 00:00:00,2156-04-20 13:10:00,70062,SPUTUM,,,,,,,,,
2,746,96,170324,2156-04-20 00:00:00,2156-04-20 16:00:00,70012,BLOOD CULTURE,,,,,,,,,
3,747,96,170324,2156-04-20 00:00:00,,70012,BLOOD CULTURE,,,,,,,,,
4,748,96,170324,2156-04-20 00:00:00,,70079,URINE,,,,,,,,,


In [10]:
# 21. noteevents
#     NOTEEVENTS: Deidentified notes, including nursing and physician notes, ECG reports, imaging reports, and discharge summaries.
noteevents = pd.read_csv('data/' + data_files[18], nrows=10)      
noteevents.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,CHARTDATE,CHARTTIME,STORETIME,CATEGORY,DESCRIPTION,CGID,ISERROR,TEXT
0,174,22532,167853,2151-08-04,,,Discharge summary,Report,,,Admission Date: [**2151-7-16**] Dischar...
1,175,13702,107527,2118-06-14,,,Discharge summary,Report,,,Admission Date: [**2118-6-2**] Discharg...
2,176,13702,167118,2119-05-25,,,Discharge summary,Report,,,Admission Date: [**2119-5-4**] D...
3,177,13702,196489,2124-08-18,,,Discharge summary,Report,,,Admission Date: [**2124-7-21**] ...
4,178,26880,135453,2162-03-25,,,Discharge summary,Report,,,Admission Date: [**2162-3-3**] D...
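
The `TEXT` column shows de-identified fields wrapped in `[**` and `**]`, such as `[**2151-7-16**]`. Before text processing it can be useful to extract or strip these placeholders. The sketch below applies a regular expression to a truncated snippet in the style of the preview; the pattern is an assumption based on the rows shown here, not a documented specification of the de-identification format.

```python
import re

# Truncated snippet in the style of the TEXT column previewed above.
note = "Admission Date: [**2151-7-16**] Dischar..."

# De-identified fields appear between "[**" and "**]".
placeholder = re.compile(r"\[\*\*(.*?)\*\*\]")

# Collect the placeholder contents, then remove the placeholders from the text.
found = placeholder.findall(note)
cleaned = placeholder.sub("", note)
cleaned = re.sub(r"\s+", " ", cleaned).strip()

print(found)
print(cleaned)
```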


In [117]:
# 22. outputevents
#     OUTPUTEVENTS: Output information for patients while in the ICU
outputevents = pd.read_csv('data/' + data_files[19], nrows=10)      
outputevents.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,CHARTTIME,ITEMID,VALUE,VALUEUOM,STORETIME,CGID,STOPPED,NEWBOTTLE,ISERROR
0,344,21219,177991,225765,2142-09-08 10:00:00,40055,200,ml,2142-09-08 12:08:00,17269,,,
1,345,21219,177991,225765,2142-09-08 12:00:00,40055,200,ml,2142-09-08 12:08:00,17269,,,
2,346,21219,177991,225765,2142-09-08 13:00:00,40055,120,ml,2142-09-08 13:39:00,17269,,,
3,347,21219,177991,225765,2142-09-08 14:00:00,40055,100,ml,2142-09-08 16:17:00,17269,,,
4,348,21219,177991,225765,2142-09-08 16:00:00,40055,200,ml,2142-09-08 16:17:00,17269,,,


In [108]:
# 23. patients
#     PATIENTS: Every unique patient in the database (defines SUBJECT_ID)
patients = pd.read_csv('data/' + data_files[20], nrows=10)      
patients.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,GENDER,DOB,DOD,DOD_HOSP,DOD_SSN,EXPIRE_FLAG
0,234,249,F,2075-03-13 00:00:00,,,,0
1,235,250,F,2164-12-27 00:00:00,2188-11-22 00:00:00,2188-11-22 00:00:00,,1
2,236,251,M,2090-03-15 00:00:00,,,,0
3,237,252,M,2078-03-06 00:00:00,,,,0
4,238,253,F,2089-11-26 00:00:00,,,,0


In [127]:
# 24. prescriptions
#     PRESCRIPTIONS: Medications ordered, and not necessarily administered, for a given patient
prescriptions = pd.read_csv('data/' + data_files[21], nrows=10)      
prescriptions.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,STARTDATE,ENDDATE,DRUG_TYPE,DRUG,DRUG_NAME_POE,DRUG_NAME_GENERIC,FORMULARY_DRUG_CD,GSN,NDC,PROD_STRENGTH,DOSE_VAL_RX,DOSE_UNIT_RX,FORM_VAL_DISP,FORM_UNIT_DISP,ROUTE
0,2214776,6,107064,,2175-06-11 00:00:00,2175-06-12 00:00:00,MAIN,Tacrolimus,Tacrolimus,Tacrolimus,TACR1,21796.0,469061711,1mg Capsule,2,mg,2,CAP,PO
1,2214775,6,107064,,2175-06-11 00:00:00,2175-06-12 00:00:00,MAIN,Warfarin,Warfarin,Warfarin,WARF5,6562.0,56017275,5mg Tablet,5,mg,1,TAB,PO
2,2215524,6,107064,,2175-06-11 00:00:00,2175-06-12 00:00:00,MAIN,Heparin Sodium,,,HEPAPREMIX,6522.0,338055002,"25,000 unit Premix Bag",25000,UNIT,1,BAG,IV
3,2216265,6,107064,,2175-06-11 00:00:00,2175-06-12 00:00:00,BASE,D5W,,,HEPBASE,,0,HEPARIN BASE,250,ml,250,ml,IV
4,2214773,6,107064,,2175-06-11 00:00:00,2175-06-12 00:00:00,MAIN,Furosemide,Furosemide,Furosemide,FURO20,8208.0,54829725,20mg Tablet,20,mg,1,TAB,PO


In [119]:
# 25. procedureevents_mv
#     PROCEDUREEVENTS_MV: Patient procedures for the subset of patients who were monitored in the ICU using the iMDSoft MetaVision system.
procedureevents_mv = pd.read_csv('data/' + data_files[22], nrows=10)      
procedureevents_mv.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,STARTTIME,ENDTIME,ITEMID,VALUE,VALUEUOM,LOCATION,...,ORDERCATEGORYNAME,SECONDARYORDERCATEGORYNAME,ORDERCATEGORYDESCRIPTION,ISOPENBAG,CONTINUEINNEXTDEPT,CANCELREASON,STATUSDESCRIPTION,COMMENTS_EDITEDBY,COMMENTS_CANCELEDBY,COMMENTS_DATE
0,379,29070,115071,232563,2145-03-12 23:04:00,2145-03-12 23:05:00,225401,1,,,...,Procedures,,Electrolytes,0,0,0,FinishedRunning,,,
1,380,29070,115071,232563,2145-03-12 23:04:00,2145-03-12 23:05:00,225454,1,,,...,Procedures,,Electrolytes,0,0,0,FinishedRunning,,,
2,381,29070,115071,232563,2145-03-12 23:05:00,2145-03-18 20:01:00,225792,8456,hour,,...,Ventilation,,Task,1,0,0,FinishedRunning,,,
3,382,29070,115071,232563,2145-03-12 23:36:00,2145-03-12 23:37:00,225402,1,,,...,Procedures,,Electrolytes,0,0,0,FinishedRunning,,,
4,383,29070,115071,232563,2145-03-13 01:27:00,2145-03-16 16:00:00,224560,5193,min,Right IJ,...,Invasive Lines,,Task,1,0,0,FinishedRunning,,,


In [128]:
# 26. procedures_icd
#     PROCEDURES_ICD: Patient procedures, coded using the International Statistical Classification of 
#                     Diseases and Related Health Problems (ICD) system
procedures_icd = pd.read_csv('data/' + data_files[23], nrows=10)      
procedures_icd.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,SEQ_NUM,ICD9_CODE
0,944,62641,154460,3,3404
1,945,2592,130856,1,9671
2,946,2592,130856,2,3893
3,947,55357,119355,1,9672
4,948,55357,119355,2,331


In [109]:
# 27. services
#     SERVICES: The clinical service under which a patient is registered
services = pd.read_csv('data/' + data_files[24], nrows=10)      
services.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,TRANSFERTIME,PREV_SERVICE,CURR_SERVICE
0,758,471,135879,2122-07-22 14:07:27,TSURG,MED
1,759,471,135879,2122-07-26 18:31:49,MED,TSURG
2,760,472,173064,2172-09-28 19:22:15,,CMED
3,761,473,129194,2201-01-09 20:16:45,,NB
4,762,474,194246,2181-03-23 08:24:41,,NB


In [110]:
# 28. transfers
#     TRANSFERS: Patient movement from bed to bed within the hospital, including ICU admission and discharge
transfers = pd.read_csv('data/' + data_files[25], nrows=10)      
transfers.head()

Unnamed: 0,ROW_ID,SUBJECT_ID,HADM_ID,ICUSTAY_ID,DBSOURCE,EVENTTYPE,PREV_CAREUNIT,CURR_CAREUNIT,PREV_WARDID,CURR_WARDID,INTIME,OUTTIME,LOS
0,657,111,192123,254245.0,carevue,transfer,CCU,MICU,7.0,23.0,2142-04-29 15:27:11,2142-05-04 20:38:33,125.19
1,658,111,192123,,carevue,transfer,MICU,,23.0,45.0,2142-05-04 20:38:33,2142-05-05 11:46:32,15.13
2,659,111,192123,,carevue,discharge,,,45.0,,2142-05-05 11:46:32,,
3,660,111,155897,249202.0,metavision,admit,,MICU,,52.0,2144-07-01 04:13:59,2144-07-01 05:19:39,1.09
4,661,111,155897,,metavision,transfer,MICU,,52.0,32.0,2144-07-01 05:19:39,2144-07-01 06:28:29,1.15
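
Unlike the fractional days seen in ICUSTAYS, the `LOS` values in this TRANSFERS preview appear to be expressed in hours. The sketch below recomputes them for the first two previewed rows; `LOS_HOURS` is a name introduced only for this check.

```python
import pandas as pd

# First two TRANSFERS rows from the preview above.
transfers_toy = pd.DataFrame({
    "INTIME": ["2142-04-29 15:27:11", "2142-05-04 20:38:33"],
    "OUTTIME": ["2142-05-04 20:38:33", "2142-05-05 11:46:32"],
    "LOS": [125.19, 15.13],
})

# Parse the timestamps, then express the time spent in each location in hours.
intime = pd.to_datetime(transfers_toy["INTIME"])
outtime = pd.to_datetime(transfers_toy["OUTTIME"])
transfers_toy["LOS_HOURS"] = (outtime - intime).dt.total_seconds() / 3600

print(transfers_toy[["LOS", "LOS_HOURS"]].round(2))
```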


The end