Exploring MIMIC-III Critical Care Database
The examples in this repository build on the information from the MIMIC-III: Getting Started repository.
PowerPoint slides are in the root directory in both .pptx and .pdf formats for a talk given at the Kansas City R Users Group on Feb. 9, 2019: Exploring MIMIMC-III Critical Care Data Using Postgres/tidyverse/dbplyr/dplyr.
Let's build a data dictionary with information about fields by table and how they can be used.
The MIMIC-III database schema can be viewed online here.
Explore the six fields in the
Why is there a pattern in the counts of records by
Let's compute age at death for patients when both
dob(date of birth) and
dod(date of death) are defined.
To protect patient anonymity dates were shifted. Patients originally >89 years old show up as 300 years old! This is an unusual approach.
PostgreSQLgive different age-at-death density plots (for now) since date computations are not always correct.
Explore the 15 fields in the
A log scale was used to view a density plot for length-of-stay computed from admit time to discharge time.
The relationship between the
diagnosisfield in this table and the
diagnoses_icdtable information is unclear.
diagnoses_icd and d_icd_diagnoses
diagnoses_icdfact table with additional information in the
A bar plot of
seq_num(diagnosis priority) shows values can range from 1 to 39, but usually are less than 10.
The dimension table has many icd 9 codes that are never referenced by a fact table record.
The fact table has over 140 icd 9 codes that cannot be found in the dimension table.
left_joinis likely more desirable than an
inner_joinwhen connecting the dimension table to the fact table.
Counts by ICD Diagnosis COde
Counts by primary vs secondary diagnosis by ICD Diagnosis Code (stored in file)
Counts by 10-year age intervals by ICD Diagnosis Code (stored in file)
charteventstable with 330 million records and over 6400 types of chart events from two different
Explore event categories, parameter types, and units of measure.
The "first look" file identified data inconsistencies with
valueuom(value unit of measure), and several fields in the lab items dimension.
The file Lab-Event-Multiple-Units.xlsx identifies
itemids with multiple units of measure.
The file Lab-Event-Item-Counts.xlsx shows lab event counts by
itemidalong with the label and unit(s) of measure.
The file Lab-Event-Fluid-Category-Counts shows lab event counts by fluid (9 groups) and category (3 groups).