In [1]:
from pathlib import Path
from pyicu.configs.load import load_src_cfg
from pyicu.sources import MIMIC
from pyicu.concepts import ConceptDict

## Clinical concepts
Clinical concepts represent the main unit of information in pyicu. Concepts can be diverse and include static information such as demographics (e.g., age) or repeated measures such as vital signs (e.g., heart rate), laboratory values (e.g., creatinine), and medications (e.g., administration of antibiotics).

Concepts are meant to define stable, meaningful bits of information. However, this information may be stored differently across data sources. For example, while MIMIC III stores heart rate as rows with `itemid in [211, 220045]` in the table `chartevents`, eICU stores it in a dedicated column `heartrate` of the `vitalperiodic` table. How exactly the data for a concept is loaded from each data source is therefore defined by one or more Items.
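The difference between the two storage layouts can be illustrated with plain pandas. The following is a minimal sketch and not pyicu's implementation: the tables, the column names (`stay_id`, `heart_rate`, `resp_rate`), and the helper functions `load_by_item_id` and `load_by_column` are toy stand-ins invented for illustration, not the real schemas or API.

```python
import pandas as pd

# Long layout (toy data): one row per measurement, identified by an item id.
long_tbl = pd.DataFrame({
    "stay_id": [1, 1, 1, 2],
    "itemid": [211, 220045, 618, 211],
    "valuenum": [104.0, 99.0, 18.0, 83.0],
})

# Wide layout (toy data): one dedicated column per measurement type.
wide_tbl = pd.DataFrame({
    "stay_id": [3, 3],
    "heart_rate": [92.0, 95.0],
    "resp_rate": [16.0, 17.0],
})


def load_by_item_id(tbl, ids, id_col="itemid", val_col="valuenum"):
    """Hypothetical helper: keep the rows whose item id is in `ids`."""
    rows = tbl[tbl[id_col].isin(ids)]
    return (
        rows[["stay_id", val_col]]
        .rename(columns={val_col: "hr"})
        .reset_index(drop=True)
    )


def load_by_column(tbl, col):
    """Hypothetical helper: pick a dedicated column."""
    return tbl[["stay_id", col]].rename(columns={col: "hr"}).reset_index(drop=True)


# Two different loading rules yield the same target shape for one concept.
hr_long = load_by_item_id(long_tbl, [211, 220045])
hr_wide = load_by_column(wide_tbl, "heart_rate")
hr_all = pd.concat([hr_long, hr_wide], ignore_index=True)
print(hr_all)
```

Both helpers return a frame with the same columns, which is the point of separating the concept (what is measured) from the item (where and how it is stored).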

The code below demonstrates how pre-defined concepts and items in pyicu can be used to extract age and heart rate from the MIMIC III demo data. Note that this requires the demo data to be downloaded; see `01_import_data.ipynb` for instructions.

In [2]:
data_dir = Path("examples/data/physionet.org/files/mimiciii-demo/1.4")
mimic_cfg = load_src_cfg("mimic_demo")
mimic = MIMIC(mimic_cfg, data_dir)

# Double check that all tables are loaded
mimic.print_available()

'mimic_demo: 25 of 25 tables available'

As in the previous notebook, we can inspect the underlying raw data as pyarrow datasets. Age information in MIMIC III can be calculated from the date of birth (`dob`) and the admission time (`admittime`). You can already see that loading even this relatively simple information from MIMIC requires joining two tables and performing a computation. Furthermore, MIMIC stores the date of birth of patients aged >89 years as 300 years before that patient's first admission. Naively calculating ages would therefore produce a couple of very unlikely Methuselahs.

In [3]:
mimic.patients

# <SrcTbl>:  [100 x 8]
# Defaults:  `expire_flag` (val)
# Time vars: `dob`, `dod`, `dod_hosp`, `dod_ssn`
   row_id  subject_id gender        dob        dod   dod_hosp    dod_ssn  \
0    9467       10006      F 2094-03-05 2165-08-12 2165-08-12 2165-08-12   
1    9472       10011      F 2090-06-05 2126-08-28 2126-08-28        NaT   
2    9474       10013      F 2038-09-03 2125-10-07 2125-10-07 2125-10-07   
3    9478       10017      F 2075-09-21 2152-09-12        NaT 2152-09-12   
4    9479       10019      M 2114-06-20 2163-05-15 2163-05-15 2163-05-15   

   expire_flag  
0            1  
1            1  
2            1  
3            1  
4            1  
... with 95 more rows

In [4]:
mimic.admissions

# <SrcTbl>:  [129 x 19]
# Defaults:  `admission_type` (val)
# Time vars: `admittime`, `dischtime`, `deathtime`, `edregtime`, `edouttime`
   row_id  subject_id  hadm_id           admittime           dischtime  \
0   12258       10006   142345 2164-10-23 21:09:00 2164-11-01 17:15:00   
1   12263       10011   105331 2126-08-14 22:32:00 2126-08-28 18:59:00   
2   12265       10013   165520 2125-10-04 23:36:00 2125-10-07 15:13:00   
3   12269       10017   199207 2149-05-26 17:19:00 2149-06-03 18:42:00   
4   12270       10019   177759 2163-05-14 20:43:00 2163-05-15 12:00:00   

            deathtime admission_type         admission_location  \
0                 NaT      EMERGENCY       EMERGENCY ROOM ADMIT   
1 2126-08-28 18:59:00      EMERGENCY  TRANSFER FROM HOSP/EXTRAM   
2 2125-10-07 15:13:00      EMERGENCY  TRANSFER FROM HOSP/EXTRAM   
3                 NaT      EMERGENCY       EMERGENCY ROOM ADMIT   
4 2163-05-15 12:00:00      EMERGENCY  TRANSFER FROM HOSP/EXTRAM   

  discharge_loc

Note that these tables are not read into memory but are merely mapped. To read a full table into memory, call the method `.to_pandas()`. This should only be done for the demo data or for small tables of the full data.
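To make the age calculation described above concrete, here is a minimal pandas sketch on toy data. It only assumes the `subject_id`, `dob`, and `admittime` columns shown in the outputs; the rows are made up, and capping at 90 is one illustrative way to handle the shifted dates of birth, not necessarily what pyicu does. Ages are computed from calendar fields because subtracting two timestamps 300 years apart can overflow pandas' nanosecond-based timedeltas.

```python
import pandas as pd

# Toy stand-ins for the patients and admissions tables (made-up rows).
patients = pd.DataFrame({
    "subject_id": [1, 2],
    # Patient 2 mimics a shifted date of birth (300 years before admission).
    "dob": pd.to_datetime(["2094-03-05", "1864-10-23"]),
})
admissions = pd.DataFrame({
    "subject_id": [1, 2],
    "admittime": pd.to_datetime(["2164-10-23 21:09:00", "2164-10-23 08:00:00"]),
})

# Step 1: join the two tables on the patient identifier.
merged = admissions.merge(patients, on="subject_id", how="left")

# Step 2: compute age in whole years from calendar fields.
adm, dob = merged["admittime"], merged["dob"]
had_birthday = (adm.dt.month > dob.dt.month) | (
    (adm.dt.month == dob.dt.month) & (adm.dt.day >= dob.dt.day)
)
merged["raw_age"] = adm.dt.year - dob.dt.year - (~had_birthday).astype(int)

# Step 3: cap implausible ages (assumption: treat everyone above 89 as 90).
merged["age"] = merged["raw_age"].clip(upper=90)
print(merged[["subject_id", "raw_age", "age"]])
```

Without step 3, the second toy patient would be reported as 300 years old.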

## Concept dictionary
Before we can extract concepts from MIMIC, we must first load the respective concept definitions. In pyicu, these are stored in a `ConceptDict`. Expanding the definitions previously introduced in the R package ricu, pyicu already comes with a large number of pre-defined concepts, among them heart rate.

In [5]:
concepts = ConceptDict.from_defaults()

Heart rate is a numerical concept, i.e., the return value is a continuous number. It is also dynamic, so the target format is a time series represented by a `TsTbl` object, which is a pyicu-specific subtype of a pandas `DataFrame`.

In [6]:
print("Concept: ", concepts['hr'])
print("Items:   ", concepts['hr'].src_items(mimic))

Concept:  <pyicu.concepts.concept.NumConcept object at 0x127d18460>
Items:    [<SelItem:mimic> chartevents.itemid in [211, 220045]]
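The idea of a time series table as a `DataFrame` subtype that knows its ID and index variables can be sketched with pandas' documented subclassing hooks (`_metadata` and `_constructor`). The class `ToyTsTbl` and its attributes `id_var` and `index_var` are hypothetical names chosen for this illustration; pyicu's actual class may be implemented differently.

```python
import pandas as pd


class ToyTsTbl(pd.DataFrame):
    """Hypothetical DataFrame subclass that remembers its ID and index columns."""

    # Names listed here are carried over to results of pandas operations.
    _metadata = ["id_var", "index_var"]

    @property
    def _constructor(self):
        # Make pandas operations return ToyTsTbl instead of a plain DataFrame.
        return ToyTsTbl


tbl = ToyTsTbl({
    "icustay": [206504, 206504],
    "charttime": pd.to_timedelta(["0 days 01:29:45", "0 days 01:49:45"]),
    "valuenum": [104.0, 99.0],
})
tbl.id_var = "icustay"
tbl.index_var = "charttime"

# Ordinary pandas filtering keeps both the subclass and its metadata.
high = tbl[tbl["valuenum"] > 100]
print(type(high).__name__, high.id_var, high.index_var)
```

Because the subclass is still a `DataFrame`, all regular pandas methods remain available on it.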


The data for the concept can be loaded by calling the `load_concepts` function of the concept dictionary, which returns a list with the results of all the concepts requested.

In [7]:
concepts.load_concepts("hr", mimic)

[# <TSTbl>:    15490 x 3
 # ID var:     icustay
 # Index var:  charttime
        icustay       charttime  valuenum
 0       206504 0 days 01:29:45     104.0
 1       206504 0 days 01:49:45      99.0
 2       206504 0 days 02:49:45      96.0
 3       206504 0 days 03:49:45      95.0
 4       206504 0 days 04:49:45      92.0
 ...        ...             ...       ...
 15485   217992 4 days 09:18:21      83.0
 15486   217992 4 days 10:18:21      81.0
 15487   217992 4 days 11:18:21      73.0
 15488   217992 4 days 12:18:21      77.0
 15489   217992 4 days 13:18:21      77.0
 
 [15490 rows x 3 columns]]

Alternatively, the `load()` method of the concept can be called directly, which is also what happens under the hood of `load_concepts()`. In either case, the concept needs to be provided with the `Src` from which the data should be loaded.

In [8]:
concepts['hr'].load(mimic)

       icustay       charttime  valuenum
0       206504 0 days 01:29:45     104.0
1       206504 0 days 01:49:45      99.0
2       206504 0 days 02:49:45      96.0
3       206504 0 days 03:49:45      95.0
4       206504 0 days 04:49:45      92.0
...        ...             ...       ...
15485   217992 4 days 09:18:21      83.0
15486   217992 4 days 10:18:21      81.0
15487   217992 4 days 11:18:21      73.0
15488   217992 4 days 12:18:21      77.0
