In [1]:
import numpy as np
import pandas as pd
import datetime
import copy
import time
import os
import re
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import operator

from tqdm.auto import trange
from tqdm.notebook import tqdm  # notebook progress bar; used by tqdm.pandas() below
from datetime import timedelta

tqdm.pandas()

In [3]:
# Edit to point to your MIMIC directory.
dataDirStr = '/Users/gmessier/data/mimic-1.4/'

In [4]:
d_items_df = pd.read_csv(dataDirStr + "D_ITEMS.csv")
d_items_df.columns = d_items_df.columns.str.lower()
d_items_df

Unnamed: 0,row_id,itemid,label,abbreviation,dbsource,linksto,category,unitname,param_type,conceptid
0,457,497,Patient controlled analgesia (PCA) [Inject],,carevue,chartevents,,,,
1,458,498,PCA Lockout (Min),,carevue,chartevents,,,,
2,459,499,PCA Medication,,carevue,chartevents,,,,
3,460,500,PCA Total Dose,,carevue,chartevents,,,,
4,461,501,PCV Exh Vt (Obser),,carevue,chartevents,,,,
...,...,...,...,...,...,...,...,...,...,...
12482,14518,226757,GCSMotorApacheIIValue,GCSMotorApacheIIValue,metavision,chartevents,Scores - APACHE II,,Text,
12483,14519,226758,GCSVerbalApacheIIValue,GCSVerbalApacheIIValue,metavision,chartevents,Scores - APACHE II,,Text,
12484,14520,226759,HCO3ApacheIIValue,HCO3ApacheIIValue,metavision,chartevents,Scores - APACHE II,,Numeric,
12485,14521,226760,HCO3Score,HCO3Score,metavision,chartevents,Scores - APACHE II,,Numeric,


`D_ITEMS` is the definition table for every item, identified by `itemid`, in the ICU databases.

It links to the following tables:

CHARTEVENTS on `itemid`

DATETIMEEVENTS on `itemid`

INPUTEVENTS_CV on `itemid`

INPUTEVENTS_MV on `itemid`

MICROBIOLOGYEVENTS on `SPEC_itemid`, `ORG_itemid`, or `AB_itemid` (for example, d_items.`itemid` = microbiologyevents.`SPEC_itemid`)

OUTPUTEVENTS on `itemid`

PROCEDUREEVENTS_MV on `itemid`
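The links listed above are typically used to attach a human-readable `label` to event rows. The following is a minimal sketch of that lookup, assuming a small hand-built stand-in for `D_ITEMS` (three rows copied from the output above) and an invented `events` frame; the `events` name and its `value` column are illustrative assumptions, not data read from MIMIC.

```python
import pandas as pd

# Hypothetical miniature of D_ITEMS, using rows shown in the output above.
d_items = pd.DataFrame({
    "itemid": [497, 498, 226759],
    "label": [
        "Patient controlled analgesia (PCA) [Inject]",
        "PCA Lockout (Min)",
        "HCO3ApacheIIValue",
    ],
})

# Invented event rows for illustration only (not real MIMIC data).
events = pd.DataFrame({
    "itemid": [498, 226759, 498],
    "value": ["10", "24", "6"],
})

# Left join keeps every event row; validate= raises if itemid is
# duplicated on the D_ITEMS side, which would silently multiply rows.
labelled = events.merge(
    d_items[["itemid", "label"]],
    on="itemid",
    how="left",
    validate="many_to_one",
)
print(labelled)
```

For `MICROBIOLOGYEVENTS` the same idea would apply with `left_on`/`right_on`, since the key there is not named `itemid`.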

The `label` column describes the concept which is represented by the `itemid`. The `abbreviation` column, only available in Metavision, lists a common abbreviation for the label.

The `dbsource` column was generated to clarify which database the given `itemid` was sourced from: `carevue` indicates the `itemid` was sourced from CareVue, while `metavision` indicates the `itemid` was sourced from Metavision. The value counts below also show a third value, `hospital`.

In [5]:
c = d_items_df.dbsource.value_counts()
p = d_items_df.dbsource.value_counts(normalize=True).mul(100).round(2)
pd.concat([c,p], axis=1, keys=['counts', '%'])

Unnamed: 0,counts,%
carevue,9059,72.55
metavision,2992,23.96
hospital,436,3.49


`linksto` provides the name of the table which the data links to. For example, a value of `chartevents` indicates that the `itemid` of the given row is contained in `CHARTEVENTS`. A single `itemid` is used in only one event table: if an `itemid` is contained in `CHARTEVENTS`, it will not be contained in any other event table (e.g. `OUTPUTEVENTS`, `DATETIMEEVENTS`, etc.).
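The one-table-per-`itemid` property can be checked directly. Below is a minimal sketch, assuming a frame with `itemid` and `linksto` columns like `d_items_df`; the helper name `items_in_multiple_tables` and the small frame are hypothetical and used only for illustration.

```python
import pandas as pd

def items_in_multiple_tables(items):
    """Return the itemids that appear with more than one distinct linksto value."""
    tables_per_item = (
        items.dropna(subset=["linksto"])   # ignore rows without a target table
             .groupby("itemid")["linksto"]
             .nunique()
    )
    return tables_per_item[tables_per_item > 1].index.tolist()

# Hypothetical stand-in for d_items_df, using itemids from the output above.
d_items = pd.DataFrame({
    "itemid": [497, 498, 226757],
    "linksto": ["chartevents", "chartevents", "chartevents"],
})

# An empty list means every itemid points at exactly one event table.
print(items_in_multiple_tables(d_items))
```

On the real table, the same call would be `items_in_multiple_tables(d_items_df)`.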

`category` provides some information on the type of data the `itemid` corresponds to. For example, a value of `IV Medication` indicates that the medication is administered through an intravenous line.

In [6]:
c = d_items_df.category.value_counts()[:5]
p = d_items_df.category.value_counts(normalize=True).mul(100).round(2)[:5]
pd.concat([c,p], axis=1, keys=['counts', '%'])

Unnamed: 0,counts,%
Free Form Intake,2420,40.01
Access Lines - Invasive,312,5.16
ORGANISM,312,5.16
Skin - Impairment,271,4.48
Labs,148,2.45
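Note that `value_counts` skips missing values by default, so the percentages above are shares of the rows that have a `category`, not of the whole table: the `dbsource` counts sum to 12,487 rows, of which 2,420 is about 19%, not 40%. A minimal sketch of the difference, using an invented six-row series rather than the real column:

```python
import numpy as np
import pandas as pd

# Invented example: three labelled rows and three rows with no category.
category = pd.Series(
    ["Labs", "Labs", "ORGANISM", np.nan, np.nan, np.nan],
    name="category",
)

# Default: NaN rows are dropped before normalising.
share_of_labelled = category.value_counts(normalize=True).mul(100).round(2)

# dropna=False: NaN is counted as its own group, so shares are of all rows.
share_of_all = category.value_counts(normalize=True, dropna=False).mul(100).round(2)

print(share_of_labelled)
print(share_of_all)
```

Passing `dropna=False` in the cell above would report each category as a share of all rows and make the amount of missing data visible.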


`unitname` specifies the unit of measurement used for the `itemid`. This column is not always populated: the unit of measurement may vary, a unit may not make sense for the given data type, or the unit may simply be missing. Note that the associated event table sometimes carries additional information on the unit of measurement, e.g. the `valueuom` column in `CHARTEVENTS`.
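One way to combine the two sources of unit information is to prefer the event-level `valueuom` and fall back to `unitname` when it is missing. This is a sketch under assumptions: the item ids, units, and the choice of which column takes priority are all illustrative, not taken from MIMIC.

```python
import numpy as np
import pandas as pd

# Hypothetical item definitions and event rows (ids and units are invented).
d_items = pd.DataFrame({
    "itemid": [1, 2, 3],
    "unitname": ["mmHg", np.nan, np.nan],
})
events = pd.DataFrame({
    "itemid": [1, 2, 3],
    "valueuom": [np.nan, "bpm", np.nan],
})

merged = events.merge(d_items, on="itemid", how="left")

# combine_first keeps valueuom where present and fills gaps from unitname.
merged["unit"] = merged["valueuom"].combine_first(merged["unitname"])
print(merged)
```

Rows where both columns are missing stay `NaN`, which keeps genuinely unitless items distinguishable from filled ones.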

`param_type` describes the type of data which is recorded, for example a date, a number, or a text field.

In [7]:
c = d_items_df.param_type.value_counts()[:5]
p = d_items_df.param_type.value_counts(normalize=True).mul(100).round(2)[:5]
pd.concat([c,p], axis=1, keys=['counts', '%'])

Unnamed: 0,counts,%
Text,1309,43.75
Numeric,647,21.62
Solution,422,14.1
Checkbox,307,10.26
Date time,142,4.75
