Getting started with MIMIC-III Critical Care Database

MIMIC III: Getting Started

The files in this repository are intended to help data scientists get started with MIMIC-III critical care data stored in a PostgreSQL database, working from R/RStudio.

Background information

Johnson AEW, Stone DJ, et al. The MIMIC Code Repository: enabling reproducibility in critical care research. Journal of the American Medical Informatics Association, Volume 25, Issue 1, 1 January 2018, Pages 32–39.

Johnson AEW, Pollard TJ, et al. MIMIC-III, a freely accessible critical care database. Scientific Data (2016).

MIMIC GitHub Repository.

As of late 2018, the latest version of MIMIC is MIMIC-III v1.4, which comprises over 58,000 hospital admissions for 38,645 adults and 7,875 neonates. The data span June 2001 to October 2012. All examples in this repository use v1.4.

An updated version of MIMIC is expected in 2019.



Instructions on training requirements to gain access to MIMIC-III data.

The DownloadMIMIC-III.nb.html notebook shows how to download the files, unzip them, and compute md5sums.
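The checksum step can be illustrated outside R as well. The following is a minimal Python sketch, not the notebook's own code: the function names `md5sum` and `md5_table` are hypothetical, and it assumes the unzipped .csv files sit in a single local folder. It reads each file in chunks so that large files do not need to fit in memory.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_table(folder, pattern="*.csv"):
    """Map each matching file name in `folder` to its MD5 digest."""
    return {p.name: md5sum(p) for p in sorted(Path(folder).glob(pattern))}
```

The resulting dictionary can be compared, file by file, against whatever checksums were published with the download.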


Look at each of the raw .csv data files for unexpected ASCII characters. The "bad" characters in some of the files may be annoying but probably will not adversely affect most downstream processing.
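One way to run such a check is to count the byte values that fall outside an expected set. The Python sketch below is only an illustration. It assumes that "expected" means printable ASCII plus tab, line feed and carriage return, which is my assumption and not a definition taken from this repository. The name `unexpected_bytes` is hypothetical.

```python
from collections import Counter

# Assumption: "expected" bytes are printable ASCII plus tab, LF and CR.
EXPECTED = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def unexpected_bytes(path, chunk_size=1 << 20):
    """Count every byte value in the file that is not in EXPECTED."""
    counts = Counter()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            # Iterating over a bytes object yields integers 0..255.
            counts.update(b for b in chunk if b not in EXPECTED)
    return dict(counts)
```

An empty result means the file contains only the expected bytes. Otherwise the keys show which byte values to look for, and the counts show how often each occurs.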


The RStudio notebook, MIMIC-III-Will-Files-Parse.nb.html, looks at parsing the files before loading them into the Postgres database.
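A rough pre-load check along the same lines can be sketched in Python with the standard `csv` module. This is not the notebook's method. It only assumes that the first row is a header and that every later record should have the same number of fields. The name `parse_report` is hypothetical.

```python
import csv


def parse_report(path, encoding="utf-8"):
    """Parse a CSV file; report column count, record count and suspect lines."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, newline="", encoding=encoding, errors="replace") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:  # empty file
            return {"columns": 0, "rows": 0, "bad_lines": []}
        rows = 0
        bad_lines = []
        for record in reader:
            rows += 1
            if len(record) != len(header):
                # line_num is the physical line on which this record ended.
                bad_lines.append(reader.line_num)
    return {"columns": len(header), "rows": rows, "bad_lines": bad_lines}
```

A file whose `bad_lines` list is empty has at least a consistent shape. It says nothing about whether the values match the column types expected by the database.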


Information in this folder shows:

  • Installing-PostgreSQL-on-Windows-for-MIMIC-III.docx

  • Loading-MIMIC-III-into-PostgreSQL.docx

  • MIMIC-Install-on-Postgres.html

  • MIMIC-III-First-Look using several database packages

Several database drivers can be used in RStudio to access a Postgres database, including RPostgres, RPostgreSQL and odbc.

I did not use odbc much, but both RPostgres and RPostgreSQL have problems manipulating datetime fields.

For now, I switch between RPostgres and RPostgreSQL when one fails to solve a problem. You'll see this in the next section.

(A bug in the package RPostgres should be fixed in Feb. 2019, and it may be the package of choice at that time.)


RStudio notebooks with SQL and tidyverse/dplyr solutions for the Querying MIMIC-III tutorial examples from the PhysioNet site.


Jupyter notebook showing plots for an example patient, part of the online site for MIMIC-III, a freely accessible critical care database.
