# MIMIC-III CSV Files

Before we start handling the data, let's look at what we have been given. There are 26 files, each containing data about a different kind of event, interconnected by various ID numbers.

![alt text >](./pics/file_descriptions.png)

## I. Preparing the Analysis

Let's load the necessary modules and store the names of all the CSV files.

In [135]:
# Import necessary packages
import datetime
import dateutil
from importlib import reload
import multiprocessing
import numpy as np
import pandas as pd
import psutil
import pdb
import time
import os
from pathlib import Path

# The Docker environment adds our source directory to the Python path.
# Run this notebook inside the Docker environment for the import below to work.
import utils

In [40]:
file_list = [
    "ADMISSIONS",
    "CALLOUT",
    "CAREGIVERS",
    "CHARTEVENTS",
    "CPTEVENTS",
    "D_CPT",
    "D_ICD_DIAGNOSES",
    "D_ICD_PROCEDURES",
    "D_ITEMS",
    "D_LABITEMS",
    "DATETIMEEVENTS",
    "DIAGNOSES_ICD",
    "DRGCODES",
    "ICUSTAYS",
    "INPUTEVENTS_CV",
    "INPUTEVENTS_MV",
    "OUTPUTEVENTS",
    "LABEVENTS",
    "MICROBIOLOGYEVENTS",
    "NOTEEVENTS",
    "PATIENTS",
    "PRESCRIPTIONS",
    "PROCEDUREEVENTS_MV",
    "PROCEDURES_ICD",
    "SERVICES",
    "TRANSFERS"
]

In [41]:
dataset_folder = Path(os.getenv("DATA"), "mimic-iii-demo")
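
Before reading anything, it can help to confirm that every table named in `file_list` is actually present in the dataset folder. The following is a minimal sketch; `find_missing_tables` is a hypothetical helper (not part of `utils`), and it assumes each table is stored as `<NAME>.csv`, as the cells below do. The demonstration uses a temporary folder rather than the real dataset.

```python
from pathlib import Path
import tempfile


def find_missing_tables(folder, table_names):
    """Return the table names that have no matching <NAME>.csv file in folder."""
    folder = Path(folder)
    return [name for name in table_names if not (folder / f"{name}.csv").is_file()]


# Small demonstration on a temporary folder instead of the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "ADMISSIONS.csv").write_text("row_id,subject_id\n")
    demo_missing = find_missing_tables(tmp, ["ADMISSIONS", "PATIENTS"])

print(demo_missing)
```

In the notebook itself the call would presumably be `find_missing_tables(dataset_folder, file_list)`, which should return an empty list when all 26 files are in place.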

## II. Characterising the Data

We will characterise each table with a brief description and the columns that link it to other tables. We will also flag columns that are obviously irrelevant to our prediction task.
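
One way to get an overview of these links is to read only the header of each CSV file and record which ID columns it contains. The sketch below assumes a small set of link columns (`LINK_COLUMNS`) chosen for illustration; `link_overview` is a hypothetical helper, and the demonstration runs on two tiny files in a temporary folder.

```python
from pathlib import Path
import tempfile

import pandas as pd

# Assumed set of ID columns that act as links between tables.
LINK_COLUMNS = ["subject_id", "hadm_id", "icustay_id", "itemid", "cgid"]


def link_overview(folder, table_names, link_columns=LINK_COLUMNS):
    """Build a table-by-column boolean frame from the CSV headers only."""
    rows = {}
    for name in table_names:
        # nrows=0 parses just the header, so large files stay cheap to inspect.
        header = pd.read_csv(Path(folder, f"{name}.csv"), nrows=0).columns
        rows[name] = [col in header for col in link_columns]
    return pd.DataFrame.from_dict(rows, orient="index", columns=link_columns)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "PATIENTS.csv").write_text("row_id,subject_id,gender\n")
    Path(tmp, "ICUSTAYS.csv").write_text("row_id,subject_id,hadm_id,icustay_id\n")
    overview = link_overview(tmp, ["PATIENTS", "ICUSTAYS"])

print(overview)
```

Applied to `dataset_folder` and `file_list`, this would give one row per table and one column per link, which makes it easy to see which tables can be joined on which key.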

### Admissions
Hospital admissions. Each patient is identified by subject_id and each admission by hadm_id.

Link:
- subject_id
- hadm_id
- adm-disch time

Irrelevant Columns:
- marital_status
- religion
- language
- ethnicity

In [76]:
print(file_list[0])
pd.read_csv(Path(dataset_folder, f"{file_list[0]}.csv")).head()

ADMISSIONS


Unnamed: 0,row_id,subject_id,hadm_id,admittime,dischtime,deathtime,admission_type,admission_location,discharge_location,insurance,language,religion,marital_status,ethnicity,edregtime,edouttime,diagnosis,hospital_expire_flag,has_chartevents_data
0,12258,10006,142345,2164-10-23 21:09:00,2164-11-01 17:15:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME HEALTH CARE,Medicare,,CATHOLIC,SEPARATED,BLACK/AFRICAN AMERICAN,2164-10-23 16:43:00,2164-10-23 23:00:00,SEPSIS,0,1
1,12263,10011,105331,2126-08-14 22:32:00,2126-08-28 18:59:00,2126-08-28 18:59:00,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,DEAD/EXPIRED,Private,,CATHOLIC,SINGLE,UNKNOWN/NOT SPECIFIED,,,HEPATITIS B,1,1
2,12265,10013,165520,2125-10-04 23:36:00,2125-10-07 15:13:00,2125-10-07 15:13:00,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,DEAD/EXPIRED,Medicare,,CATHOLIC,,UNKNOWN/NOT SPECIFIED,,,SEPSIS,1,1
3,12269,10017,199207,2149-05-26 17:19:00,2149-06-03 18:42:00,,EMERGENCY,EMERGENCY ROOM ADMIT,SNF,Medicare,,CATHOLIC,DIVORCED,WHITE,2149-05-26 12:08:00,2149-05-26 19:45:00,HUMERAL FRACTURE,0,1
4,12270,10019,177759,2163-05-14 20:43:00,2163-05-15 12:00:00,2163-05-15 12:00:00,EMERGENCY,TRANSFER FROM HOSP/EXTRAM,DEAD/EXPIRED,Medicare,,CATHOLIC,DIVORCED,WHITE,,,ALCOHOLIC HEPATITIS,1,1
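
The columns flagged as irrelevant above could be removed right after loading. This is a minimal sketch on a hand-built frame with a few values copied from the output above; `drop_irrelevant` is a hypothetical helper, and whether these columns are truly irrelevant depends on the prediction task.

```python
import pandas as pd

# Columns flagged as irrelevant in the list above.
IRRELEVANT_ADMISSION_COLUMNS = ["marital_status", "religion", "language", "ethnicity"]


def drop_irrelevant(df, columns):
    """Return a copy of df without the given columns; absent names are ignored."""
    return df.drop(columns=columns, errors="ignore")


admissions_demo = pd.DataFrame(
    {
        "subject_id": [10006, 10011],
        "hadm_id": [142345, 105331],
        "religion": ["CATHOLIC", "CATHOLIC"],
        "marital_status": ["SEPARATED", "SINGLE"],
        "ethnicity": ["BLACK/AFRICAN AMERICAN", "UNKNOWN/NOT SPECIFIED"],
    }
)
admissions_slim = drop_irrelevant(admissions_demo, IRRELEVANT_ADMISSION_COLUMNS)
print(list(admissions_slim.columns))
```

Using `errors="ignore"` keeps the helper usable on tables where only some of the listed columns exist.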


### Callout

 Information regarding when a patient was cleared for ICU discharge and when the patient was actually discharged.

Link:
- subject_id
- hadm_id
- various timings

In [80]:
print(file_list[1])
pd.read_csv(Path(dataset_folder, f"{file_list[1]}.csv")).head()

CALLOUT


Unnamed: 0,row_id,subject_id,hadm_id,submit_wardid,submit_careunit,curr_wardid,curr_careunit,callout_wardid,callout_service,request_tele,...,callout_status,callout_outcome,discharge_wardid,acknowledge_status,createtime,updatetime,acknowledgetime,outcometime,firstreservationtime,currentreservationtime
0,3917,10017,199207,7,,45,CCU,1,MED,1,...,Inactive,Discharged,45.0,Acknowledged,2149-05-31 10:44:34,2149-05-31 10:44:34,2149-05-31 15:08:04,2149-05-31 22:40:02,,
1,3919,10026,103770,33,,3,SICU,3,NMED,1,...,Inactive,Discharged,3.0,Revised,2195-05-18 13:56:20,2195-05-19 15:45:30,,2195-05-19 17:40:03,,
2,3920,10027,199395,12,,55,CSRU,55,CSURG,1,...,Inactive,Discharged,55.0,Acknowledged,2190-07-20 08:15:20,2190-07-20 08:15:20,2190-07-20 08:57:46,2190-07-20 17:10:02,,
3,3921,10029,132349,33,,45,SICU,1,MED,0,...,Inactive,Discharged,45.0,Acknowledged,2139-09-24 09:53:37,2139-09-24 09:53:37,2139-09-24 09:56:02,2139-09-25 19:10:01,,
4,3922,10033,157235,33,,4,SICU,1,MED,1,...,Inactive,Discharged,4.0,Revised,2132-12-06 10:16:08,2132-12-06 14:53:53,,2132-12-06 15:10:02,,


### Caregivers

Every caregiver who has recorded data in the database (defines CGID).

Link:
- cgid

In [81]:
print(file_list[2])
pd.read_csv(Path(dataset_folder, f"{file_list[2]}.csv")).head()

CAREGIVERS


Unnamed: 0,row_id,cgid,label,description
0,2228,16174,RO,Read Only
1,2229,16175,RO,Read Only
2,2230,16176,Res,Resident/Fellow/PA/NP
3,2231,16177,RO,Read Only
4,2232,16178,RT,Respiratory


### Chartevents
All charted observations for patients.

Link:
- subject_id
- hadm_id 
- icustay_id
- charttime
- cgid

Irrelevant Columns:
- storetime

In [82]:
print(file_list[3])
pd.read_csv(Path(dataset_folder, f"{file_list[3]}.csv")).head()

CHARTEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,itemid,charttime,storetime,cgid,value,valuenum,valueuom,warning,error,resultstatus,stopped
0,5279021,40124,126179,279554.0,223761,2130-02-04 04:00:00,2130-02-04 04:35:00,19085,95.9,95.9,?F,0.0,0.0,,
1,5279022,40124,126179,279554.0,224695,2130-02-04 04:25:00,2130-02-04 05:55:00,18999,2222221.7,2222221.7,cmH2O,0.0,0.0,,
2,5279023,40124,126179,279554.0,220210,2130-02-04 04:30:00,2130-02-04 04:43:00,21452,15.0,15.0,insp/min,0.0,0.0,,
3,5279024,40124,126179,279554.0,220045,2130-02-04 04:32:00,2130-02-04 04:43:00,21452,94.0,94.0,bpm,0.0,0.0,,
4,5279025,40124,126179,279554.0,220179,2130-02-04 04:32:00,2130-02-04 04:43:00,21452,163.0,163.0,mmHg,0.0,0.0,,
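
In the full MIMIC-III release this table is far larger than in the demo, so loading it in one call may not be practical. The sketch below shows one possible approach: reading the file in chunks and keeping only the column of interest. The helper name `count_rows_per_item` and the chunk size are assumptions for illustration, and the demonstration reads from an in-memory CSV instead of the real file.

```python
from collections import Counter
import io

import pandas as pd


def count_rows_per_item(csv_source, chunksize=100_000):
    """Count charted rows per itemid without loading the whole file at once."""
    counts = Counter()
    # usecols limits parsing to one column; chunksize yields DataFrames piecewise.
    reader = pd.read_csv(csv_source, usecols=["itemid"], chunksize=chunksize)
    for chunk in reader:
        counts.update(chunk["itemid"].tolist())
    return counts


# In-memory stand-in for CHARTEVENTS.csv with a reduced set of columns.
chartevents_csv = io.StringIO(
    "row_id,subject_id,itemid,valuenum\n"
    "1,40124,223761,95.9\n"
    "2,40124,220045,94.0\n"
    "3,40124,220045,96.0\n"
)
item_counts = count_rows_per_item(chartevents_csv, chunksize=2)
print(item_counts)
```

For the real file, the first argument would be the path used in the cell above, for example `Path(dataset_folder, "CHARTEVENTS.csv")`.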


### CPT Events
Procedures recorded as Current Procedural Terminology (CPT) codes.

Link:
- subject_id
- hadm_id
- cpt_number

In [83]:
print(file_list[4])
pd.read_csv(Path(dataset_folder, f"{file_list[4]}.csv")).head()

CPTEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,costcenter,chartdate,cpt_cd,cpt_number,cpt_suffix,ticket_id_seq,sectionheader,subsectionheader,description
0,4615,10117,105150,ICU,,99254,99254,,1.0,Evaluation and management,Consultations,
1,4616,10117,105150,ICU,,99231,99231,,2.0,Evaluation and management,Hospital inpatient services,
2,4617,10117,105150,ICU,,90935,90935,,3.0,Medicine,Dialysis,
3,4618,10117,105150,ICU,,99231,99231,,4.0,Evaluation and management,Hospital inpatient services,
4,7753,10111,174739,ICU,,99253,99253,,1.0,Evaluation and management,Consultations,


### Dictionary for CPT Codes
 High level dictionary of Current Procedural Terminology (CPT) codes.

In [84]:
print(file_list[5])
pd.read_csv(Path(dataset_folder, f"{file_list[5]}.csv")).head()

D_CPT


Unnamed: 0,row_id,category,sectionrange,sectionheader,subsectionrange,subsectionheader,codesuffix,mincodeinsubsection,maxcodeinsubsection
0,1,1,99201-99499,Evaluation and management,99201-99216,Office/other outpatient services,,99201,99216
1,2,1,99201-99499,Evaluation and management,99217-99220,Hospital observation services,,99217,99220
2,3,1,99201-99499,Evaluation and management,99221-99239,Hospital inpatient services,,99221,99239
3,4,1,99201-99499,Evaluation and management,99241-99255,Consultations,,99241,99255
4,5,1,99201-99499,Evaluation and management,99261-99263,Follow-up inpatient consultations (deleted codes),,99261,99263


### Dictionary for ICD Diagnoses
Dictionary of International Statistical Classification of Diseases and Related Health Problems (ICD-9) codes relating to diagnoses.

Link:
- icd9_code

Irrelevant Columns:
- long_title


In [110]:
print(file_list[6])
pd.read_csv(Path(dataset_folder, f"{file_list[6]}.csv")).head()

D_ICD_DIAGNOSES


Unnamed: 0,row_id,icd9_code,short_title,long_title
0,1,1716,Erythem nod tb-oth test,Erythema nodosum with hypersensitivity reactio...
1,2,1720,TB periph lymph-unspec,"Tuberculosis of peripheral lymph nodes, unspec..."
2,3,1721,TB periph lymph-no exam,"Tuberculosis of peripheral lymph nodes, bacter..."
3,4,1722,TB periph lymph-exam unk,"Tuberculosis of peripheral lymph nodes, bacter..."
4,5,1723,TB periph lymph-micro dx,"Tuberculosis of peripheral lymph nodes, tuberc..."


### Dictionary for ICD Procedures
Dictionary of International Statistical Classification of Diseases and Related Health Problems (ICD-9) codes relating to procedures.

Link:
- icd9_code

Irrelevant Columns:

- long_title

In [86]:
print(file_list[7])
pd.read_csv(Path(dataset_folder, f"{file_list[7]}.csv")).head()

D_ICD_PROCEDURES


Unnamed: 0,row_id,icd9_code,short_title,long_title
0,1,1423,Chorioret les xenon coag,Destruction of chorioretinal lesion by xenon a...
1,2,1424,Chorioret les laser coag,Destruction of chorioretinal lesion by laser p...
2,3,1425,Chorioret les p/coag NOS,Destruction of chorioretinal lesion by photoco...
3,4,1426,Chorioret les radiother,Destruction of chorioretinal lesion by radiati...
4,5,1427,Chorioret les rad implan,Destruction of chorioretinal lesion by implant...


### Dictionary for Items
Dictionary of local codes (’ITEMIDs’) appearing in the MIMIC database, except those that relate to laboratory tests.

Link:
- itemid

Irrelevant Columns:
- dbsource ?

In [134]:
print(file_list[8])
item_dict = pd.read_csv(Path(dataset_folder, f"{file_list[8]}.csv"))
print(f"There are {len(item_dict['label'].unique())} unique labels!")
print(item_dict["itemid"][763])
print(item_dict[item_dict["label"]=="Heart Rate"])
item_dict.head()

D_ITEMS
There are 11847 unique labels!
3074
      row_id  itemid       label abbreviation    dbsource      linksto  \
211      212     211  Heart Rate          NaN     carevue  chartevents   
9524   12712  220045  Heart Rate           HR  metavision  chartevents   

                 category unitname param_type  conceptid  
211                   NaN      NaN        NaN        NaN  
9524  Routine Vital Signs      bpm    Numeric        NaN  


Unnamed: 0,row_id,itemid,label,abbreviation,dbsource,linksto,category,unitname,param_type,conceptid
0,1,1435,Sustained Nystamus,,carevue,chartevents,,,,
1,2,1436,Tactile Disturbances,,carevue,chartevents,,,,
2,3,1437,Tremor,,carevue,chartevents,,,,
3,4,1438,Ulnar Pulse [Right],,carevue,chartevents,,,,
4,5,1439,Visual Disturbances,,carevue,chartevents,,,,
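
The lookup above shows that the label "Heart Rate" maps to two itemids, one per source system (211 for carevue, 220045 for metavision). Selecting a measurement by label therefore means collecting all matching itemids first. A minimal sketch on hand-built frames follows; the values are copied from the outputs in this notebook and `itemids_for_label` is a hypothetical helper.

```python
import pandas as pd


def itemids_for_label(item_dict, label):
    """Return every itemid in the dictionary that carries the given label."""
    return item_dict.loc[item_dict["label"] == label, "itemid"].tolist()


items_demo = pd.DataFrame(
    {
        "itemid": [211, 220045, 1435],
        "label": ["Heart Rate", "Heart Rate", "Sustained Nystamus"],
        "dbsource": ["carevue", "metavision", "carevue"],
    }
)
events_demo = pd.DataFrame(
    {
        "itemid": [223761, 220045, 220179],
        "valuenum": [95.9, 94.0, 163.0],
    }
)

heart_rate_ids = itemids_for_label(items_demo, "Heart Rate")
# Keep only the charted rows whose itemid belongs to the requested label.
heart_rate_events = events_demo[events_demo["itemid"].isin(heart_rate_ids)]
print(heart_rate_ids)
print(heart_rate_events)
```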


### Dictionary for Laboratory Items
Dictionary of local codes (’ITEMIDs’) appearing in the MIMIC database that relate to laboratory tests.

Link:

- itemid

In [89]:
print(file_list[9])
pd.read_csv(Path(dataset_folder, f"{file_list[9]}.csv")).head()

D_LABITEMS


Unnamed: 0,row_id,itemid,label,fluid,category,loinc_code
0,1,50800,SPECIMEN TYPE,BLOOD,BLOOD GAS,
1,2,50801,Alveolar-arterial Gradient,Blood,Blood Gas,19991-9
2,3,50802,Base Excess,Blood,Blood Gas,11555-0
3,4,50803,"Calculated Bicarbonate, Whole Blood",Blood,Blood Gas,1959-6
4,5,50804,Calculated Total CO2,Blood,Blood Gas,34728-6


### DateTime Events
 All recorded observations which are dates, for example time of dialysis or insertion of lines.
 
Link:
- subject_id
- hadm_id
- icustay_id
- itemid
- cgid

In [90]:
print(file_list[10])
pd.read_csv(Path(dataset_folder, f"{file_list[10]}.csv")).head()

DATETIMEEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,itemid,charttime,storetime,cgid,value,valueuom,warning,error,resultstatus,stopped
0,208474,10076,198503,201006.0,5684,2107-03-25 04:00:00,2107-03-25 04:34:00,20482,2107-03-24 00:00:00,Date,,,,NotStopd
1,208475,10076,198503,201006.0,5684,2107-03-25 07:00:00,2107-03-25 07:06:00,15004,2107-03-24 00:00:00,Date,,,,NotStopd
2,208836,10076,198503,201006.0,5684,2107-03-26 04:00:00,2107-03-26 05:31:00,20834,2107-03-24 00:00:00,Date,,,,NotStopd
3,208837,10076,198503,201006.0,5684,2107-03-26 08:00:00,2107-03-26 08:33:00,17480,2107-03-24 00:00:00,Date,,,,NotStopd
4,208838,10076,198503,201006.0,5684,2107-03-26 16:00:00,2107-03-26 16:08:00,17480,2107-03-24 00:00:00,Date,,,,NotStopd


### Diagnosis by ICD
Hospital-assigned diagnoses, coded using the International Statistical Classification of Diseases and Related Health Problems (ICD) system.

Link:
- subject_id
- hadm_id
- icd9_code

In [91]:
print(file_list[11])
pd.read_csv(Path(dataset_folder, f"{file_list[11]}.csv")).head()

DIAGNOSES_ICD


Unnamed: 0,row_id,subject_id,hadm_id,seq_num,icd9_code
0,112344,10006,142345,1,99591
1,112345,10006,142345,2,99662
2,112346,10006,142345,3,5672
3,112347,10006,142345,4,40391
4,112348,10006,142345,5,42731
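
The icd9_code column links each diagnosis to its title in D_ICD_DIAGNOSES. ICD-9 codes can start with zeros or letters (V and E codes), so one cautious option is to read the column as text on both sides before joining. The sketch below uses in-memory CSVs; the diagnosis rows are toy data made up for illustration, not taken from the dataset.

```python
import io

import pandas as pd

# Toy diagnosis rows (made up); only the column layout mirrors DIAGNOSES_ICD.
diagnoses_csv = io.StringIO(
    "row_id,subject_id,hadm_id,seq_num,icd9_code\n"
    "1,1,100,1,01716\n"
    "2,1,100,2,V1000\n"
)
dictionary_csv = io.StringIO(
    "row_id,icd9_code,short_title\n"
    "1,01716,Erythem nod tb-oth test\n"
    "2,01720,TB periph lymph-unspec\n"
)

# dtype=str keeps leading zeros and letter prefixes intact.
diagnoses = pd.read_csv(diagnoses_csv, dtype={"icd9_code": str})
dictionary = pd.read_csv(dictionary_csv, dtype={"icd9_code": str})

# A left join keeps every diagnosis, even when the dictionary has no entry for it.
labelled = diagnoses.merge(
    dictionary[["icd9_code", "short_title"]], on="icd9_code", how="left"
)
print(labelled)
```

Codes without a dictionary entry end up with a missing short_title, which makes gaps in the lookup easy to spot.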


### Diagnosis Related Groups
Diagnosis Related Groups (DRG), which are used by the hospital for billing purposes.

Link:
- subject_id
- hadm_id
- drg_code

In [93]:
print(file_list[12])
pd.read_csv(Path(dataset_folder, f"{file_list[12]}.csv")).head()

DRGCODES


Unnamed: 0,row_id,subject_id,hadm_id,drg_type,drg_code,description,drg_severity,drg_mortality
0,1338,10130,156668,HCFA,148,MAJOR SMALL & LARGE BOWEL PROCEDURES WITH COMP...,,
1,2188,10114,167957,HCFA,518,PERCUTANEOUS CARDIOVASCULAR PROCEDURES WITHOUT...,,
2,2599,10117,187023,HCFA,185,DENTAL & ORAL DIS EXCEPT EXTRACTIONS & RESTORA...,,
3,2703,10046,133110,HCFA,1,CRANIOTOMY AGE >17 EXCEPT FOR TRAUMA,,
4,3020,10011,105331,HCFA,205,"DISORDERS OF LIVER EXCEPT MALIGNANCY, CIRRHOSI...",,


### ICU Stays
Every unique ICU stay in the database (defines ICUSTAY_ID).

Link:
- subject_id
- hadm_id
- icustay_id
- intime

In [92]:
print(file_list[13])
pd.read_csv(Path(dataset_folder, f"{file_list[13]}.csv")).head()

ICUSTAYS


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,dbsource,first_careunit,last_careunit,first_wardid,last_wardid,intime,outtime,los
0,12742,10006,142345,206504,carevue,MICU,MICU,52,52,2164-10-23 21:10:15,2164-10-25 12:21:07,1.6325
1,12747,10011,105331,232110,carevue,MICU,MICU,15,15,2126-08-14 22:34:00,2126-08-28 18:59:00,13.8507
2,12749,10013,165520,264446,carevue,MICU,MICU,15,15,2125-10-04 23:38:00,2125-10-07 15:13:52,2.6499
3,12754,10017,199207,204881,carevue,CCU,CCU,7,7,2149-05-29 18:52:29,2149-05-31 22:19:17,2.1436
4,12755,10019,177759,228977,carevue,MICU,MICU,15,15,2163-05-14 20:43:56,2163-05-16 03:47:04,1.2938
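
Judging by the rows above, los appears to be the length of stay in fractional days, that is, the difference between outtime and intime. The sketch below recomputes it for two of the rows shown; the frame is hand-built from the output above and the tolerance is an arbitrary choice.

```python
import pandas as pd

icustays_demo = pd.DataFrame(
    {
        "icustay_id": [206504, 232110],
        "intime": ["2164-10-23 21:10:15", "2126-08-14 22:34:00"],
        "outtime": ["2164-10-25 12:21:07", "2126-08-28 18:59:00"],
        "los": [1.6325, 13.8507],
    }
)

# Parse the timestamp strings so that they can be subtracted.
for col in ("intime", "outtime"):
    icustays_demo[col] = pd.to_datetime(icustays_demo[col])

# Length of stay in days: elapsed seconds divided by seconds per day.
stay = icustays_demo["outtime"] - icustays_demo["intime"]
computed_los = stay.dt.total_seconds() / 86400

print(computed_los.round(4).tolist())
```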


### Input Events (Philips CareVue)
Intake for patients monitored using the Philips CareVue system while in the ICU, e.g., intravenous medications, enteral feeding, etc.

Link:
- subject_id
- hadm_id
- icustay_id
- charttime
- itemid

In [114]:
print(file_list[14])
care_vue_df = pd.read_csv(Path(dataset_folder, f"{file_list[14]}.csv"))
print(care_vue_df.columns)
care_vue_df.head()

INPUTEVENTS_CV
Index(['row_id', 'subject_id', 'hadm_id', 'icustay_id', 'charttime', 'itemid',
       'amount', 'amountuom', 'rate', 'rateuom', 'storetime', 'cgid',
       'orderid', 'linkorderid', 'stopped', 'newbottle', 'originalamount',
       'originalamountuom', 'originalroute', 'originalrate', 'originalrateuom',
       'originalsite'],
      dtype='object')


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,charttime,itemid,amount,amountuom,rate,rateuom,...,orderid,linkorderid,stopped,newbottle,originalamount,originalamountuom,originalroute,originalrate,originalrateuom,originalsite
0,1184,10114,167957,234989,2171-11-03 15:00:00,30056,400.0,ml,,,...,2557279,2557279,,,,ml,Oral,,,
1,1185,10114,167957,234989,2171-11-03 20:00:00,30056,120.0,ml,,,...,7828849,2557279,,,,ml,Oral,,,
2,1186,10114,167957,234989,2171-11-03 23:00:00,30056,120.0,ml,,,...,2744159,2557279,,,,ml,Oral,,,
3,1187,10114,167957,234989,2171-11-04 02:00:00,30056,120.0,ml,,,...,8475006,2557279,,,,ml,Oral,,,
4,1188,10114,167957,234989,2171-11-04 05:00:00,30056,120.0,ml,,,...,11183474,2557279,,,,ml,Oral,,,


### Input Events (iMDSoft MetaVision)
Intake for patients monitored using the iMDSoft MetaVision system while in the ICU, e.g., intravenous medications, enteral feeding, etc.

Link:
- subject_id
- hadm_id
- icustay_id
- start-end time
- itemid

In [115]:
print(file_list[15])
soft_meta_vision_df = pd.read_csv(Path(dataset_folder, f"{file_list[15]}.csv"))
print(soft_meta_vision_df.columns)
soft_meta_vision_df.head()

INPUTEVENTS_MV
Index(['row_id', 'subject_id', 'hadm_id', 'icustay_id', 'starttime', 'endtime',
       'itemid', 'amount', 'amountuom', 'rate', 'rateuom', 'storetime', 'cgid',
       'orderid', 'linkorderid', 'ordercategoryname',
       'secondaryordercategoryname', 'ordercomponenttypedescription',
       'ordercategorydescription', 'patientweight', 'totalamount',
       'totalamountuom', 'isopenbag', 'continueinnextdept', 'cancelreason',
       'statusdescription', 'comments_editedby', 'comments_canceledby',
       'comments_date', 'originalamount', 'originalrate'],
      dtype='object')


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,starttime,endtime,itemid,amount,amountuom,rate,...,totalamountuom,isopenbag,continueinnextdept,cancelreason,statusdescription,comments_editedby,comments_canceledby,comments_date,originalamount,originalrate
0,118897,42367,139932,250305,2147-10-29 16:45:00,2147-10-29 16:46:00,225799,60.0,ml,,...,ml,0,0,0,FinishedRunning,,,,60.0,60.0
1,118898,42367,139932,250305,2147-10-20 13:17:00,2147-10-20 13:18:00,223258,10.0,units,,...,,0,0,1,Rewritten,,RN,2147-10-20 13:18:00,10.0,10.0
2,118899,42367,139932,250305,2147-10-29 03:23:00,2147-10-29 03:53:00,226089,99.999999,ml,199.999998,...,ml,0,0,0,FinishedRunning,,,,100.0,200.0
3,118900,42367,139932,250305,2147-10-22 22:00:00,2147-10-22 22:01:00,225799,40.0,ml,,...,ml,0,0,0,FinishedRunning,,,,40.0,40.0
4,118901,42367,139932,250305,2147-10-16 06:21:00,2147-10-17 06:10:00,225936,1309.899995,ml,54.9993,...,ml,0,0,0,FinishedRunning,,,,1309.9,54.999298


### Output Events
Output information for patients while in the ICU.

Link:
- subject_id
- hadm_id
- icustay_id
- charttime
- itemid

In [97]:
print(file_list[16])
pd.read_csv(Path(dataset_folder, f"{file_list[16]}.csv")).head()

OUTPUTEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,charttime,itemid,value,valueuom,storetime,cgid,stopped,newbottle,iserror
0,6540,10114,167957,234989.0,2171-10-30 20:00:00,40055,39.0,ml,2171-10-30 20:38:00,15029,,,
1,6541,10114,167957,234989.0,2171-10-30 21:00:00,40055,35.0,ml,2171-10-30 21:18:00,15029,,,
2,6542,10114,167957,234989.0,2171-10-30 23:00:00,40055,100.0,ml,2171-10-30 23:31:00,15029,,,
3,6543,10114,167957,234989.0,2171-10-31 00:00:00,40055,45.0,ml,2171-10-31 00:24:00,15029,,,
4,6544,10114,167957,234989.0,2171-10-31 02:00:00,40055,80.0,ml,2171-10-31 02:02:00,15029,,,


### Laboratory Events
Laboratory measurements for patients both within the hospital and in outpatient clinics.

Link:
- subject_id
- hadm_id
- charttime
- itemid

In [98]:
print(file_list[17])
pd.read_csv(Path(dataset_folder, f"{file_list[17]}.csv")).head()

LABEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,itemid,charttime,value,valuenum,valueuom,flag
0,6244563,10006,,50868,2164-09-24 20:21:00,19.0,19.0,mEq/L,
1,6244564,10006,,50882,2164-09-24 20:21:00,27.0,27.0,mEq/L,
2,6244565,10006,,50893,2164-09-24 20:21:00,10.0,10.0,mg/dL,
3,6244566,10006,,50902,2164-09-24 20:21:00,97.0,97.0,mEq/L,
4,6244567,10006,,50912,2164-09-24 20:21:00,7.0,7.0,mg/dL,abnormal
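
In the rows above hadm_id is empty. This would be consistent with measurements taken outside a hospital admission, such as the outpatient clinics mentioned in the description, but that reading is an assumption worth checking against the MIMIC documentation. A minimal sketch of separating the two groups follows; the third row's hadm_id is filled in purely for illustration.

```python
import pandas as pd

labs_demo = pd.DataFrame(
    {
        "subject_id": [10006, 10006, 10006],
        # The third hadm_id is filled in for illustration only.
        "hadm_id": [None, None, 142345],
        "itemid": [50868, 50882, 50893],
        "valuenum": [19.0, 27.0, 10.0],
    }
)

# Rows with a hadm_id can be joined to ADMISSIONS; the others cannot.
has_admission = labs_demo["hadm_id"].notna()
inpatient_labs = labs_demo[has_admission]
unlinked_labs = labs_demo[~has_admission]

print(len(inpatient_labs), len(unlinked_labs))
```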


### Microbiology Events
Microbiology culture results and antibiotic sensitivities from the hospital database.

Link:
- subject_id
- hadm_id
- charttime
- spec_itemid
- org_itemid

In [99]:
print(file_list[18])
pd.read_csv(Path(dataset_folder, f"{file_list[18]}.csv")).head()

MICROBIOLOGYEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,chartdate,charttime,spec_itemid,spec_type_desc,org_itemid,org_name,isolate_num,ab_itemid,ab_name,dilution_text,dilution_comparison,dilution_value,interpretation
0,134694,10006,142345,2164-10-23 00:00:00,2164-10-23 15:30:00,70012,BLOOD CULTURE,80155.0,"STAPHYLOCOCCUS, COAGULASE NEGATIVE",2.0,,,,,,
1,134695,10006,142345,2164-10-23 00:00:00,2164-10-23 15:30:00,70012,BLOOD CULTURE,80155.0,"STAPHYLOCOCCUS, COAGULASE NEGATIVE",1.0,90015.0,VANCOMYCIN,2,=,2.0,S
2,134696,10006,142345,2164-10-23 00:00:00,2164-10-23 15:30:00,70012,BLOOD CULTURE,80155.0,"STAPHYLOCOCCUS, COAGULASE NEGATIVE",1.0,90012.0,GENTAMICIN,<=0.5,<=,1.0,S
3,134697,10006,142345,2164-10-23 00:00:00,2164-10-23 15:30:00,70012,BLOOD CULTURE,80155.0,"STAPHYLOCOCCUS, COAGULASE NEGATIVE",1.0,90025.0,LEVOFLOXACIN,4,=,4.0,I
4,134698,10006,142345,2164-10-23 00:00:00,2164-10-23 15:30:00,70012,BLOOD CULTURE,80155.0,"STAPHYLOCOCCUS, COAGULASE NEGATIVE",1.0,90016.0,OXACILLIN,=>4,=>,4.0,R


### Note Events
Deidentified notes, including nursing and physician notes, ECG reports, radiology reports, and discharge summaries.

In [100]:
print(file_list[19])
pd.read_csv(Path(dataset_folder, f"{file_list[19]}.csv")).head()

NOTEEVENTS


Unnamed: 0,row_id,subject_id,hadm_id,chartdate,charttime,storetime,category,description,cgid,iserror,text


### Patients
Every unique patient in the database (defines SUBJECT_ID).

Link:
- subject_id

In [101]:
print(file_list[20])
pd.read_csv(Path(dataset_folder, f"{file_list[20]}.csv")).head()

PATIENTS


Unnamed: 0,row_id,subject_id,gender,dob,dod,dod_hosp,dod_ssn,expire_flag
0,9467,10006,F,2094-03-05 00:00:00,2165-08-12 00:00:00,2165-08-12 00:00:00,2165-08-12 00:00:00,1
1,9472,10011,F,2090-06-05 00:00:00,2126-08-28 00:00:00,2126-08-28 00:00:00,,1
2,9474,10013,F,2038-09-03 00:00:00,2125-10-07 00:00:00,2125-10-07 00:00:00,2125-10-07 00:00:00,1
3,9478,10017,F,2075-09-21 00:00:00,2152-09-12 00:00:00,,2152-09-12 00:00:00,1
4,9479,10019,M,2114-06-20 00:00:00,2163-05-15 00:00:00,2163-05-15 00:00:00,2163-05-15 00:00:00,1


### Prescriptions
Medications ordered for a given patient.

Link:
- subject_id
- hadm_id
- icustay_id
- start-end date

In [103]:
print(file_list[21])
pd.read_csv(Path(dataset_folder, f"{file_list[21]}.csv")).head()

PRESCRIPTIONS


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,startdate,enddate,drug_type,drug,drug_name_poe,drug_name_generic,formulary_drug_cd,gsn,ndc,prod_strength,dose_val_rx,dose_unit_rx,form_val_disp,form_unit_disp,route
0,32600,42458,159647,,2146-07-21 00:00:00,2146-07-22 00:00:00,MAIN,Pneumococcal Vac Polyvalent,Pneumococcal Vac Polyvalent,PNEUMOcoccal Vac Polyvalent,PNEU25I,48548.0,6494300.0,25mcg/0.5mL Vial,0.5,mL,1,VIAL,IM
1,32601,42458,159647,,2146-07-21 00:00:00,2146-07-22 00:00:00,MAIN,Bisacodyl,Bisacodyl,Bisacodyl,BISA5,2947.0,536338101.0,5 mg Tab,10.0,mg,2,TAB,PO
2,32602,42458,159647,,2146-07-21 00:00:00,2146-07-22 00:00:00,MAIN,Bisacodyl,Bisacodyl,Bisacodyl (Rectal),BISA10R,2944.0,574705050.0,10mg Suppository,10.0,mg,1,SUPP,PR
3,32603,42458,159647,,2146-07-21 00:00:00,2146-07-22 00:00:00,MAIN,Senna,Senna,Senna,SENN187,19964.0,904516561.0,1 Tablet,1.0,TAB,1,TAB,PO
4,32604,42458,159647,,2146-07-21 00:00:00,2146-07-21 00:00:00,MAIN,Docusate Sodium (Liquid),Docusate Sodium (Liquid),Docusate Sodium (Liquid),DOCU100L,3017.0,121054410.0,100mg UD Cup,100.0,mg,1,UDCUP,PO


### Procedure Events (iMDSoft MetaVision)
Patient procedures for the subset of patients who were monitored in the ICU using the iMDSoft MetaVision system.

Link:
- subject_id
- hadm_id
- icustay_id
- start-end time
- itemid

In [116]:
print(file_list[22])
proc_events = pd.read_csv(Path(dataset_folder, f"{file_list[22]}.csv"))
print(proc_events.columns)
proc_events.head()

PROCEDUREEVENTS_MV
Index(['row_id', 'subject_id', 'hadm_id', 'icustay_id', 'starttime', 'endtime',
       'itemid', 'value', 'valueuom', 'location', 'locationcategory',
       'storetime', 'cgid', 'orderid', 'linkorderid', 'ordercategoryname',
       'secondaryordercategoryname', 'ordercategorydescription', 'isopenbag',
       'continueinnextdept', 'cancelreason', 'statusdescription',
       'comments_editedby', 'comments_canceledby', 'comments_date'],
      dtype='object')


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,starttime,endtime,itemid,value,valueuom,location,...,ordercategoryname,secondaryordercategoryname,ordercategorydescription,isopenbag,continueinnextdept,cancelreason,statusdescription,comments_editedby,comments_canceledby,comments_date
0,8641,42367,139932,250305,2147-10-03 16:40:00,2147-10-06 20:00:00,224263,4520,min,Right Femoral.,...,Invasive Lines,,Task,1,0,0,FinishedRunning,,,
1,8642,42367,139932,250305,2147-10-03 16:41:00,2147-10-12 16:38:00,225204,12957,min,Right Antecube,...,Invasive Lines,,Task,1,0,0,FinishedRunning,,,
2,8643,42367,139932,250305,2147-10-03 17:10:00,2147-10-18 15:15:00,225792,21485,min,,...,Ventilation,,Task,1,0,0,FinishedRunning,,,
3,8644,42367,139932,250305,2147-10-04 11:00:00,2147-10-04 11:01:00,221214,1,,,...,Imaging,,Electrolytes,0,0,0,FinishedRunning,,,
4,8645,42367,139932,250305,2147-10-04 14:16:00,2147-10-04 14:17:00,221223,1,,,...,Procedures,,Electrolytes,0,0,0,FinishedRunning,,,


### Procedure ICD
Patient procedures, coded using the International Statistical Classification of Diseases and Related Health Problems (ICD) system.

Link:
- subject_id
- hadm_id
- icd9_code

In [106]:
print(file_list[23])
pd.read_csv(Path(dataset_folder, f"{file_list[23]}.csv")).head()

PROCEDURES_ICD


Unnamed: 0,row_id,subject_id,hadm_id,seq_num,icd9_code
0,3994,10114,167957,1,3605
1,3995,10114,167957,2,3722
2,3996,10114,167957,3,8856
3,3997,10114,167957,4,9920
4,3998,10114,167957,5,9671


### Services
 The clinical service under which a patient is registered.

Link:
- subject_id
- hadm_id
- transfertime

In [107]:
print(file_list[24])
pd.read_csv(Path(dataset_folder, f"{file_list[24]}.csv")).head()

SERVICES


Unnamed: 0,row_id,subject_id,hadm_id,transfertime,prev_service,curr_service
0,14974,10006,142345,2164-10-23 21:10:15,,MED
1,14979,10011,105331,2126-08-14 22:34:00,,MED
2,14981,10013,165520,2125-10-04 23:38:00,,MED
3,14985,10017,199207,2149-05-26 17:21:09,,MED
4,14986,10019,177759,2163-05-14 20:43:56,,MED


### Transfers
Patient movement from bed to bed within the hospital, including ICU admission and discharge.

Link:
- subject_id
- hadm_id
- icustay_id

In [108]:
print(file_list[25])
pd.read_csv(Path(dataset_folder, f"{file_list[25]}.csv")).head()

TRANSFERS


Unnamed: 0,row_id,subject_id,hadm_id,icustay_id,dbsource,eventtype,prev_careunit,curr_careunit,prev_wardid,curr_wardid,intime,outtime,los
0,54440,10006,142345,206504.0,carevue,admit,,MICU,,52.0,2164-10-23 21:10:15,2164-10-25 12:21:07,39.18
1,54441,10006,142345,,carevue,transfer,MICU,,52.0,45.0,2164-10-25 12:21:07,2164-11-01 17:14:27,172.89
2,54442,10006,142345,,carevue,discharge,,,45.0,,2164-11-01 17:14:27,,
3,54460,10011,105331,232110.0,carevue,admit,,MICU,,15.0,2126-08-14 22:34:00,2126-08-28 18:59:00,332.42
4,54461,10011,105331,,carevue,discharge,MICU,,15.0,,2126-08-28 18:59:00,,
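
The los values here appear to be expressed in hours, whereas ICUSTAYS reports days: for icustay_id 206504 the two tables show 39.18 and 1.6325, and 39.18 / 24 = 1.6325. The sketch below checks this on two stays, using values copied from the outputs in this notebook; treating the unit as hours is an inference from these rows, not something the files state.

```python
import pandas as pd

transfers_demo = pd.DataFrame(
    {"icustay_id": [206504, 232110], "los": [39.18, 332.42]}
)
icustays_demo = pd.DataFrame(
    {"icustay_id": [206504, 232110], "los": [1.6325, 13.8507]}
)

# Join on icustay_id; the suffixes record the unit assumed for each los column.
merged = transfers_demo.merge(
    icustays_demo, on="icustay_id", suffixes=("_transfer_hours", "_icu_days")
)
merged["los_transfer_days"] = merged["los_transfer_hours"] / 24

print(merged)
```

Converting to a common unit before comparing or combining the two tables avoids mixing hours and days.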
