This notebook explores the MIMIC-III dataset using BigQuery on Google Cloud. First, request access to the dataset on [PhysioNet](https://physionet.org/content/mimiciii/1.4/) and complete the requirements listed there. Then open [Google Cloud BigQuery](https://cloud.google.com/bigquery), go to BigQuery Studio, and create a new folder called "physionet-data". Next, return to the MIMIC-III page, scroll down, and click "Request access using Google BigQuery". This adds the dataset to your Google account under "physionet-data". You are now ready to follow along with this notebook.

## Install necessary libraries
`tableone` is a Python package for producing summary statistics tables.

In [1]:
%pip install tableone



In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import tableone

from google.colab import files
from google.colab import auth
auth.authenticate_user()
print('Authenticated')

%load_ext google.colab.data_table

Authenticated


In [3]:
# Replace PROJECT_ID with your own Google Cloud project ID
%env GOOGLE_CLOUD_PROJECT=PROJECT_ID

env: GOOGLE_CLOUD_PROJECT=sccm-test-426219


## Exploring the dataset

In [4]:
%%bigquery --project PROJECT_ID

  SELECT
      le.SUBJECT_ID,
      le.CHARTTIME,
      le.VALUENUM,
      dle.LABEL,
  FROM
      `physionet-data.mimiciii_clinical.labevents` le
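  -- Join the full item dictionary so every lab event can get its label
  -- (an unordered LIMIT here would drop arbitrary labels)
  LEFT JOIN (
      SELECT
          ITEMID,
          LABEL
      FROM
          `physionet-data.mimiciii_clinical.d_labitems`
  ) dle
      ON dle.ITEMID = le.ITEMID
  WHERE
      -- Keep only blood gas items (the filter matches both spellings)
      le.ITEMID IN (
          SELECT
              ITEMID
          FROM
              `physionet-data.mimiciii_clinical.d_labitems`
          WHERE
              CATEGORY IN ("Blood Gas", "BLOOD GAS")
      )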
  ORDER BY
      le.SUBJECT_ID ASC,
      le.CHARTTIME ASC
  LIMIT 100;

| | SUBJECT_ID | CHARTTIME | VALUENUM | LABEL |
|---|---|---|---|---|
| 0 | 3 | 2101-10-12 09:18:00 | 106.00 | Glucose |
| 1 | 3 | 2101-10-12 09:18:00 | 32.00 | Hematocrit, Calculated |
| 2 | 3 | 2101-10-12 09:18:00 | 244.00 | pO2 |
| 3 | 3 | 2101-10-12 09:18:00 | | Ventilation Rate |
| 4 | 3 | 2101-10-12 09:18:00 | 1.15 | Free Calcium |
| ... | ... | ... | ... | ... |
| 95 | 3 | 2101-10-12 18:17:00 | | SPECIMEN TYPE |
| 96 | 3 | 2101-10-12 18:17:00 | 80.00 | pO2 |
| 97 | 3 | 2101-10-12 18:17:00 | -1.00 | Base Excess |
| 98 | 3 | 2101-10-12 18:17:00 | 0.93 | Free Calcium |
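
The query above combines a `LEFT JOIN` (to attach labels) with an `IN` subquery (to keep only blood gas items). As a rough offline sketch of the same pattern, the snippet below runs it against toy tables in SQLite rather than BigQuery. Every row is invented for illustration, and SQLite string literals use single quotes.

```python
import sqlite3

# Toy stand-ins for the MIMIC-III tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE d_labitems (ITEMID INTEGER, LABEL TEXT, CATEGORY TEXT);
    CREATE TABLE labevents (SUBJECT_ID INTEGER, ITEMID INTEGER,
                            CHARTTIME TEXT, VALUENUM REAL);
    INSERT INTO d_labitems VALUES
        (1, 'pO2', 'Blood Gas'),
        (2, 'Lactate', 'BLOOD GAS'),
        (3, 'Sodium', 'Chemistry');
    INSERT INTO labevents VALUES
        (3, 1, '2101-10-12 09:18:00', 244.0),
        (3, 2, '2101-10-12 09:18:00', 1.5),
        (3, 3, '2101-10-12 09:18:00', 140.0),
        (4, 1, '2101-10-13 10:00:00', 80.0);
""")

rows = conn.execute("""
    SELECT le.SUBJECT_ID, le.CHARTTIME, le.VALUENUM, dl.LABEL
    FROM labevents le
    LEFT JOIN d_labitems dl ON dl.ITEMID = le.ITEMID
    WHERE le.ITEMID IN (
        SELECT ITEMID FROM d_labitems
        WHERE CATEGORY IN ('Blood Gas', 'BLOOD GAS')
    )
    ORDER BY le.SUBJECT_ID ASC, le.CHARTTIME ASC, dl.LABEL ASC
""").fetchall()

# The 'Chemistry' row (Sodium) is filtered out; the others keep their labels.
for row in rows:
    print(row)
```

Because the label lookup is a `LEFT JOIN`, a lab event whose item is missing from the lookup would still appear, just with a null label.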


## Store data in a Python variable

In [5]:
%%bigquery df_example --project PROJECT_ID

  SELECT
      le.SUBJECT_ID,
      le.HADM_ID,
      le.CHARTTIME,
      dle2.LABEL,
      le.VALUENUM,
  FROM
      `physionet-data.mimiciii_clinical.labevents` le
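  -- Join the full item dictionary so every lab event can get its label
  -- (an unordered LIMIT here would drop arbitrary labels)
  LEFT JOIN (
      SELECT
          ITEMID,
          LABEL
      FROM
          `physionet-data.mimiciii_clinical.d_labitems`
  ) dle2
      ON dle2.ITEMID = le.ITEMID
  WHERE
      -- Keep only blood gas items (the filter matches both spellings)
      le.ITEMID IN (
          SELECT
              ITEMID
          FROM
              `physionet-data.mimiciii_clinical.d_labitems`
          WHERE
              CATEGORY IN ("Blood Gas", "BLOOD GAS")
      )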
  ORDER BY
      le.SUBJECT_ID ASC,
      le.CHARTTIME ASC
  LIMIT 100;


In [6]:
df_example.head(20)

| | SUBJECT_ID | HADM_ID | CHARTTIME | LABEL | VALUENUM |
|---|---|---|---|---|---|
| 0 | 3 | | 2101-10-12 09:18:00 | Glucose | 106.0 |
| 1 | 3 | | 2101-10-12 09:18:00 | Calculated Total CO2 | 27.0 |
| 2 | 3 | | 2101-10-12 09:18:00 | Lactate | 1.5 |
| 3 | 3 | | 2101-10-12 09:18:00 | Calculated Bicarbonate, Whole Blood | 25.0 |
| 4 | 3 | | 2101-10-12 09:18:00 | Intubated | |
| 5 | 3 | | 2101-10-12 09:18:00 | Potassium, Whole Blood | 3.8 |
| 6 | 3 | | 2101-10-12 09:18:00 | Base Excess | 2.0 |
| 7 | 3 | | 2101-10-12 09:18:00 | pO2 | 244.0 |
| 8 | 3 | | 2101-10-12 09:18:00 | Oxygen Saturation | 98.0 |
| 9 | 3 | | 2101-10-12 09:18:00 | Tidal Volume | 620.0 |
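
The `%%bigquery df_example` magic stores the query result in a DataFrame. Outside a notebook, roughly the same thing can be done with the `google-cloud-bigquery` client library. The sketch below is one possible way, not the notebook's own method: `run_query` and `BLOOD_GAS_ITEMS_SQL` are hypothetical names introduced here, and the call itself needs valid credentials, access to the dataset, and pandas installed.

```python
def run_query(sql, project_id):
    """Run a BigQuery SQL string and return the result as a pandas DataFrame.

    Hypothetical helper; assumes google-cloud-bigquery and pandas are installed
    and that the environment is already authenticated.
    """
    # Imported lazily so the helper can be defined without the library present.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    return client.query(sql).to_dataframe()


# Example query string: the blood gas items from the lab item dictionary.
BLOOD_GAS_ITEMS_SQL = """
SELECT ITEMID, LABEL
FROM `physionet-data.mimiciii_clinical.d_labitems`
WHERE CATEGORY IN ("Blood Gas", "BLOOD GAS")
"""

# Usage (not executed here because it needs credentials and dataset access):
# df_items = run_query(BLOOD_GAS_ITEMS_SQL, "your-project-id")
```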


## tableone example

In [7]:
%%bigquery demographics --project PROJECT_ID

  SELECT *
  FROM
    `physionet-data.mimiciii_clinical.admissions` adm
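  LEFT JOIN
    `physionet-data.mimiciii_clinical.patients` pat
  ON
    adm.SUBJECT_ID = pat.SUBJECT_ID
  -- Note: joining icustays on SUBJECT_ID alone pairs each admission with every
  -- ICU stay of that patient, so the result can have more rows than admissions.
  -- Add "AND adm.HADM_ID = iu.HADM_ID" to match each stay to its own admission.
  LEFT JOIN
    `physionet-data.mimiciii_clinical.icustays` iu
  ON
    adm.SUBJECT_ID = iu.SUBJECT_ID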

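
The query above joins `icustays` to `admissions` on `SUBJECT_ID` only. For a patient with several admissions, that pairs every admission with every ICU stay. The sketch below shows the effect on toy data in SQLite rather than BigQuery. The two-admission patient is invented, and the second join condition is an assumption about what a one-row-per-stay analysis would want.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE admissions (SUBJECT_ID INTEGER, HADM_ID INTEGER);
    CREATE TABLE icustays (SUBJECT_ID INTEGER, HADM_ID INTEGER, ICUSTAY_ID INTEGER);
    -- One invented patient with two admissions and one ICU stay in each.
    INSERT INTO admissions VALUES (1, 100), (1, 101);
    INSERT INTO icustays VALUES (1, 100, 9000), (1, 101, 9001);
""")

# Join on SUBJECT_ID only: each admission matches both ICU stays.
by_subject = conn.execute("""
    SELECT COUNT(*)
    FROM admissions adm
    LEFT JOIN icustays iu ON adm.SUBJECT_ID = iu.SUBJECT_ID
""").fetchone()[0]

# Join on SUBJECT_ID and HADM_ID: each admission matches only its own stay.
by_admission = conn.execute("""
    SELECT COUNT(*)
    FROM admissions adm
    LEFT JOIN icustays iu
        ON adm.SUBJECT_ID = iu.SUBJECT_ID AND adm.HADM_ID = iu.HADM_ID
""").fetchone()[0]

print(by_subject, by_admission)  # 4 rows versus 2 rows
```

Row duplication of this kind inflates the counts in any summary built on the joined table, so it is worth checking the row count against the number of admissions.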

In [8]:
tableone.tableone(
    demographics,
    columns = [
        'GENDER',
        'LOS',
        'HOSPITAL_EXPIRE_FLAG',
        'ETHNICITY',
        ],
    categorical = [
        'ETHNICITY',
        'GENDER',
        ],
    groupby= 'HOSPITAL_EXPIRE_FLAG'
)

Grouped by HOSPITAL_EXPIRE_FLAG:

| | | Missing | Overall | 0 | 1 |
|---|---|---|---|---|---|
| n | | | 116471 | 107122 | 9349 |
| GENDER, n (%) | F | | 52801 (45.3) | 48578 (45.3) | 4223 (45.2) |
| GENDER, n (%) | M | | 63670 (54.7) | 58544 (54.7) | 5126 (54.8) |
| LOS, mean (SD) | | 57.0 | 4.6 (8.4) | 4.5 (8.4) | 5.8 (8.3) |
| ETHNICITY, n (%) | AMERICAN INDIAN/ALASKA NATIVE | | 70 (0.1) | 65 (0.1) | 5 (0.1) |
| ETHNICITY, n (%) | AMERICAN INDIAN/ALASKA NATIVE FEDERALLY RECOGNIZED TRIBE | | 5 (0.0) | 3 (0.0) | 2 (0.0) |
| ETHNICITY, n (%) | ASIAN | | 2073 (1.8) | 1906 (1.8) | 167 (1.8) |
| ETHNICITY, n (%) | ASIAN - ASIAN INDIAN | | 336 (0.3) | 329 (0.3) | 7 (0.1) |
| ETHNICITY, n (%) | ASIAN - CAMBODIAN | | 48 (0.0) | 36 (0.0) | 12 (0.1) |
| ETHNICITY, n (%) | ASIAN - CHINESE | | 410 (0.4) | 369 (0.3) | 41 (0.4) |

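The table above reports counts with percentages for categorical columns and mean (SD) for continuous ones, split by `HOSPITAL_EXPIRE_FLAG`. To make those cell formats concrete, here is a minimal standard-library sketch that computes the same kinds of numbers on a handful of invented records. It uses the sample standard deviation, which is an assumption for this sketch and not a statement about how `tableone` is configured.

```python
from collections import Counter, defaultdict
from statistics import mean, stdev

# Invented rows: (GENDER, LOS in days, HOSPITAL_EXPIRE_FLAG).
records = [
    ("F", 2.0, 0), ("M", 4.0, 0), ("M", 6.0, 0),
    ("F", 3.0, 1), ("M", 5.0, 1),
]

# Split the records by the grouping flag.
groups = defaultdict(list)
for gender, los, flag in records:
    groups[flag].append((gender, los))

summary = {}
for flag, rows in sorted(groups.items()):
    n = len(rows)
    gender_counts = Counter(g for g, _ in rows)
    los_values = [los for _, los in rows]
    summary[flag] = {
        "n": n,
        # Categorical column: percentage of each level within the group.
        "gender_pct": {g: round(100 * c / n, 1)
                       for g, c in sorted(gender_counts.items())},
        # Continuous column: mean and sample standard deviation.
        "los_mean": round(mean(los_values), 1),
        "los_sd": round(stdev(los_values), 1),
    }
    print(flag, summary[flag])
```
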

## Exploring vital signs

In [9]:
%%bigquery --project PROJECT_ID

  SELECT DISTINCT
    category,
    label,
    itemid
  FROM
    `physionet-data.mimiciii_clinical.d_items`
  WHERE
    linksto = 'chartevents'
    AND category IN ('Routine Vital Signs')
    AND label IN (
      'Heart Rate', 'Temperature Celsius', 'O2 saturation pulseoxymetry',
      'Arterial Blood Pressure diastolic', 'Arterial Blood Pressure mean', 'Arterial Blood Pressure systolic'
    )
  ORDER BY category ASC, label ASC
  LIMIT 100


| | category | label | itemid |
|---|---|---|---|
| 0 | Routine Vital Signs | Arterial Blood Pressure diastolic | 220051 |
| 1 | Routine Vital Signs | Arterial Blood Pressure mean | 220052 |
| 2 | Routine Vital Signs | Arterial Blood Pressure systolic | 220050 |
| 3 | Routine Vital Signs | Heart Rate | 220045 |
| 4 | Routine Vital Signs | Temperature Celsius | 223762 |
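
With the item IDs from the output above, a natural next step is to pull the matching measurements from `chartevents`. The sketch below only builds the SQL string and does not run it. `build_vitals_query` is a hypothetical helper, and the selected column names are an assumption about the `chartevents` schema that should be checked against the table before use.

```python
# Item IDs and labels copied from the query output above.
VITAL_ITEMIDS = {
    220045: "Heart Rate",
    220050: "Arterial Blood Pressure systolic",
    220051: "Arterial Blood Pressure diastolic",
    220052: "Arterial Blood Pressure mean",
    223762: "Temperature Celsius",
}


def build_vitals_query(itemids, limit=100):
    """Return a BigQuery SQL string selecting chart events for the given item IDs.

    Hypothetical helper; the column list is an assumed subset of chartevents.
    """
    # Cast to int so only numeric IDs can end up in the SQL text.
    id_list = ", ".join(str(int(i)) for i in sorted(itemids))
    return (
        "SELECT SUBJECT_ID, ICUSTAY_ID, ITEMID, CHARTTIME, VALUENUM\n"
        "FROM `physionet-data.mimiciii_clinical.chartevents`\n"
        f"WHERE ITEMID IN ({id_list})\n"
        f"LIMIT {int(limit)}"
    )


query = build_vitals_query(VITAL_ITEMIDS)
print(query)
```

Without an `ORDER BY`, the `LIMIT` returns an arbitrary preview of matching rows, which is usually enough for a first look at a table this large.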
