MIMIC_Sepsis
=================

# 1 Preparation

To run this document the following requirements must be satisfied:

- Build the MIMIC-IV database in **PostgreSQL** and start the server. The instructions are [here](https://github.com/MIT-LCP/mimic-code/tree/main/mimic-iv/buildmimic/postgres). (The database should be named **mimiciv**.)
- Generate the derived concepts (useful abstractions of the raw MIMIC-IV data). The instructions are [here](https://github.com/MIT-LCP/mimic-code/tree/main/mimic-iv/concepts_postgres).



To install all the required libraries, uncomment the following lines and run them:

In [1]:
#!conda create --name mimic_iv python=3.10
#!conda activate mimic_iv
#!pip install -r requirements.txt

Run the following cell to connect to the database.

In [2]:
%load_ext autoreload
%autoreload 2

import psycopg2
from psycopg2 import sql
import csv
import pandas as pd
import numpy as np
import os
import shutil
from datetime import timedelta
from sklearn.impute import KNNImputer
from sklearn.neighbors import KNeighborsRegressor

# fill in the host, username, password and database name
conn = psycopg2.connect(host='', user='', password='', database='mimiciv')
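To avoid hard-coding credentials in the notebook, the connection parameters could be read from environment variables. The following is a minimal sketch; the variable names `MIMIC_DB_HOST`, `MIMIC_DB_USER`, `MIMIC_DB_PASSWORD` and `MIMIC_DB_NAME` are hypothetical and not part of this project.

```python
import os

def connection_kwargs(env=os.environ):
    """Build keyword arguments for psycopg2.connect from environment variables.

    The MIMIC_DB_* variable names are illustrative assumptions.
    """
    return {
        "host": env.get("MIMIC_DB_HOST", "localhost"),
        "user": env.get("MIMIC_DB_USER", ""),
        "password": env.get("MIMIC_DB_PASSWORD", ""),
        "database": env.get("MIMIC_DB_NAME", "mimiciv"),
    }

# Usage (requires a running PostgreSQL server):
# import psycopg2
# conn = psycopg2.connect(**connection_kwargs())
```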

# 2 Extract selected data from the original database 

We extract the `state space` and the `action space` from the mimiciv database. The table `itemid_info/mimic4 itemid.csv` lists all the required items.

***Uncomment the following cell the first time you run the code***

In [3]:
# Uncomment this cell the first time you run the code

# Read the SQL files

try:
    with open('sql/select_patients_cohort.sql', 'r') as file0:
        sql_script_select_patients_cohort = file0.read()
        
    with open('sql/state_from_chartevents.sql', 'r') as file1:
        sql_script_state = file1.read()

    with open('sql/action_from_inputevents.sql', 'r') as file2:
        sql_script_action_from_inputevents = file2.read()

    with open('sql/action_from_vasopressors_equivalent_dose.sql', 'r') as file3:
        sql_script_action_from_vasopressors_equivalent_dose = file3.read()

    # Execute the SQL script and create the tables in schema mimiciv_derived_sepsis
    cursor = conn.cursor()
    
    cursor.execute(sql.SQL(sql_script_select_patients_cohort))
    print("mimiciv_derived_sepsis.sepsis_patients_cohort is created")

    cursor.execute(sql.SQL(sql_script_state))
    print("mimiciv_derived_sepsis.sepsis_state is created")

    cursor.execute(sql.SQL(sql_script_action_from_inputevents))
    print("mimiciv_derived_sepsis.sepsis_action_inputevents is created")

    cursor.execute(sql.SQL(sql_script_action_from_vasopressors_equivalent_dose))
    print("mimiciv_derived_sepsis.sepsis_action_vasopressors_equivalent_dose is created")

    conn.commit()
    cursor.close()
    
except (Exception, psycopg2.DatabaseError) as error:
    # roll back so the connection is usable again after a failed statement
    conn.rollback()
    print("Error executing SQL statement:", error)

mimiciv_derived_sepsis.sepsis_patients_cohort is created
mimiciv_derived_sepsis.sepsis_state is created
mimiciv_derived_sepsis.sepsis_action_inputevents is created
mimiciv_derived_sepsis.sepsis_action_vasopressors_equivalent_dose is created
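
The same read-and-execute pattern can be written as a loop over script paths. The sketch below is one possible arrangement, not the project's own code: `run_sql_scripts` is a hypothetical helper that works with any DB-API connection, commits once at the end, and rolls back if a script fails.

```python
from pathlib import Path

def run_sql_scripts(conn, script_paths):
    """Execute each SQL file in order in one transaction; return the paths that ran."""
    executed = []
    cursor = conn.cursor()
    try:
        for path in script_paths:
            # read the whole file and send it to the database
            cursor.execute(Path(path).read_text())
            executed.append(str(path))
        conn.commit()
    except Exception:
        # undo partial work so the connection stays usable
        conn.rollback()
        raise
    finally:
        cursor.close()
    return executed
```

With the `conn` object created earlier, it could be called with the four script paths used in the cell above, in the same order.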


Get the number of stay_ids

In [4]:
with conn.cursor() as cursor:
    command = "SELECT DISTINCT stay_id FROM mimiciv_derived_sepsis.sepsis_patients_cohort;"
    cursor.execute(command)
    result = cursor.fetchall()
    stay_ids = [row[0] for row in result]
    num_stay_ids = len(stay_ids)
    print('Number of stay_ids: ' + str(num_stay_ids))
    # the `with` block closes the cursor automatically

Number of stay_ids: 6565


# 3 Data transfer

## 3.1 Data transfer of State Space
We export the state-space data from PostgreSQL to CSV files.

In [5]:
# output to /output/data/data_raw/state/{state_name}.csv
from python.data_preprocessing.data_transfer import data_transfer_state

itemid_list_state, label_state = data_transfer_state(conn, num_stay_ids, threshold = 1000)

output:Heartrate.csv                           	number of stay_id:6565
output:ABPs.csv                                	number of stay_id:1870
output:NBPs.csv                                	number of stay_id:6426
output:ABPd.csv                                	number of stay_id:1871
output:NBPd.csv                                	number of stay_id:6425
output:ABPm.csv                                	number of stay_id:1903
output:NBPm.csv                                	number of stay_id:6425
output:RespiratoryRate.csv                     	number of stay_id:6564
output:TemperatureF.csv                        	number of stay_id:6366
output:TemperatureC.csv                        	number of stay_id:655
output:PH_A.csv                                	number of stay_id:3396
output:PH_V.csv                                	number of stay_id:2732
output:ABE.csv                                 	number of stay_id:3378
output:Hematocrit_serum.csv                    	number of stay_id:6492
output:
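
The internals of `data_transfer_state` are not shown in this notebook. As a rough illustration of the general idea (query result in, CSV file out), here is a minimal sketch using only the standard library; `export_query_to_csv` is a hypothetical helper and works with any DB-API connection.

```python
import csv

def export_query_to_csv(conn, query, csv_path, params=None):
    """Run a query and write the result, with a header row, to csv_path.

    Returns the number of data rows written.
    """
    cursor = conn.cursor()
    try:
        if params is None:
            cursor.execute(query)
        else:
            cursor.execute(query, params)
        # column names come from the DB-API cursor description
        header = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        cursor.close()
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```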

## 3.2 Data transfer of Action Space

### 3.2.1 Data transfer of Action Space for *IV fluid bolus*

 - IV fluid bolus
   - NaCl_0.9%
   - Dextrose_5%

In [6]:
# output to /output/data/data_raw/action/IV_fluid_bolus/{IV_fluid_bolus_name}.csv
from python.data_preprocessing.data_transfer import data_transfer_action_IV_fluid_bolus

data_transfer_action_IV_fluid_bolus(conn)

output action (IV_fluid_bolus):	NaCl_0_9%.csv
output action (IV_fluid_bolus):	Dextrose_5%.csv


### 3.2.2 Data transfer of Action Space for *Vasopressors*

We obtain `vasopressors_equivalent_dose` directly from `mimiciv_derived.norepinephrine_equivalent_dose`, which is based on *"Vasopressor dose equivalence: A scoping review and suggested formula" by Goradia et al. 2020*.

In [7]:
# output to /output/data/data_raw/action/vasopressors/vasopressors_equivalent_dose.csv
from python.data_preprocessing.data_transfer import data_transfer_action_vasopressors_equivalent_dose

data_transfer_action_vasopressors_equivalent_dose(conn)

output action (vasopressors): vasopressors_equivalent_dose.csv
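
For orientation, the equivalent dose is a weighted sum of the individual vasopressor rates. The sketch below assumes the conversion factors as we understand them from the mimic-code concept (norepinephrine and epinephrine weighted 1, phenylephrine divided by 10, dopamine divided by 100, vasopressin in units/hour multiplied by 2.5/60). These factors and the function name are assumptions for illustration and should be checked against the SQL definition of `mimiciv_derived.norepinephrine_equivalent_dose` before use.

```python
def norepinephrine_equivalent_dose(norepinephrine=0.0, epinephrine=0.0,
                                   phenylephrine=0.0, dopamine=0.0,
                                   vasopressin_units_per_hour=0.0):
    """Weighted sum of vasopressor rates (assumed conversion factors).

    All rates are in mcg/kg/min except vasopressin, which is in units/hour.
    """
    total = (
        norepinephrine
        + epinephrine
        + phenylephrine / 10.0
        + dopamine / 100.0
        + vasopressin_units_per_hour * 2.5 / 60.0  # units/hour -> units/min
    )
    return round(total, 4)
```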


# 4 Hourly Sample

## 4.1 Hourly Sample on State Space

In [8]:
# output to /output/data/data_hourly_sample/state/stay_id_{selected_id}.csv
from python.data_preprocessing.hourly_sample import hourly_sample_state
import random
# if os.path.exists('./output/data/data_hourly_sample/state'):shutil.rmtree('./output/data/data_hourly_sample/state')

# selected_ids = random.sample(stay_ids, 5)
# print(f'Selected stay_id: {selected_ids}')
# for selected_id in selected_ids:
#     hourly_sample_state(selected_id, itemid_list_state, label_state, k = 5)

# the following stay_ids have ICU stays longer than 72 hours
# selected_id = 32217866
# selected_id = 32332328
# selected_id = 38362310
selected_id = 31872514
print(f'Selected stay_id: {selected_id}')
hourly_sample_state(selected_id, itemid_list_state, label_state, k = 5)

Selected stay_id: 31872514
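
The implementation of `hourly_sample_state` is not shown here. To illustrate what hourly sampling means in general, the sketch below averages irregular measurements into hourly bins counted from a start time and marks empty hours as `None`. The function name and the choice of the mean are assumptions. The `k` argument above, together with the `KNNImputer` import, suggests that gaps are later filled by k-nearest-neighbour imputation, but that is also an assumption.

```python
from collections import defaultdict
from datetime import datetime

def hourly_mean(measurements, start_time):
    """Average (charttime, value) pairs into hourly bins counted from start_time.

    Returns a list of (hour_index, mean_or_None) from hour 0 to the last observed hour.
    """
    bins = defaultdict(list)
    for charttime, value in measurements:
        hour = int((charttime - start_time).total_seconds() // 3600)
        if hour >= 0:  # ignore measurements before the start time
            bins[hour].append(value)
    if not bins:
        return []
    last_hour = max(bins)
    return [
        (hour, sum(bins[hour]) / len(bins[hour]) if hour in bins else None)
        for hour in range(last_hour + 1)
    ]

# Example: two heart-rate readings in hour 0, none in hour 1, one in hour 2
start = datetime(2020, 1, 1, 0, 0)
example = [
    (datetime(2020, 1, 1, 0, 10), 80.0),
    (datetime(2020, 1, 1, 0, 50), 90.0),
    (datetime(2020, 1, 1, 2, 5), 100.0),
]
print(hourly_mean(example, start))  # [(0, 85.0), (1, None), (2, 100.0)]
```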


## 4.2 Hourly Sample on Action Space

### 4.2.1 Hourly sample IV_fluid_bolus for both continuous and discrete action space

In [9]:
# output to /output/data/data_hourly_sample/action/IV_fluid_bolus/stay_id_{selected_id}.csv
from python.data_preprocessing.hourly_sample import hourly_sample_action_IV_fluid_bolus
# if os.path.exists('./output/data/data_hourly_sample/action/IV_fluid_bolus/'):shutil.rmtree('./output/data/data_hourly_sample/action/IV_fluid_bolus/')


# selected_id = 31872514 # more than 72 hours ICU stay 
# print(f'Selected stay_id: {selected_id}')
# hourly_sample_action_IV_fluid_bolus(selected_id)

count = 0
for selected_id in stay_ids:
    try:
        hourly_sample_action_IV_fluid_bolus(selected_id)
    except Exception:
        # print(f'Error with {selected_id}')
        count += 1
print(f'Error count: {count}') # 847 out of 6565 stay_ids (12.9%) did not have IV_fluid_bolus. 6565 - 847 = 5718 stay_ids have IV_fluid_bolus.

Error with 37111273
Error with 32358344
Error with 35682949
Error with 36047928
Error with 30420855
Error with 32273111
Error with 35434730
Error with 31604497
Error with 38236658
Error with 38108133
Error with 32446965
Error with 38063091
Error with 31716261
Error with 39450911
Error with 37816133
Error with 31570704
Error with 39816550
Error with 39989105
Error with 39672008
Error with 32263761
Error with 30397533
Error with 39665700
Error with 30632096
Error with 38553297
Error with 32508102
Error with 39524149
Error with 36863901
Error with 32593154
Error with 33011280
Error with 39765292
Error with 38549794
Error with 35489747
Error with 32289051
Error with 35648262
Error with 39369843
Error with 35027491
Error with 33430306
Error with 32361904
Error with 32100826
Error with 33875451
Error with 32660070
Error with 32862166
Error with 31268948
Error with 30774040
Error with 35269156
Error with 38373149
Error with 32451567
Error with 38093573
Error with 37312347
Error with 39999172
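
Counting failures alone does not show why a stay failed: a stay without any records and a genuine bug both increase the same counter. A possible refinement, sketched here with a hypothetical helper `run_for_all`, is to record the exception type and message per `stay_id` so the failures can be inspected afterwards.

```python
def run_for_all(stay_ids, sample_fn):
    """Call sample_fn for every stay_id; collect successes and failure reasons."""
    succeeded = []
    failed = {}  # stay_id -> "ExceptionType: message"
    for stay_id in stay_ids:
        try:
            sample_fn(stay_id)
            succeeded.append(stay_id)
        except Exception as exc:
            failed[stay_id] = f"{type(exc).__name__}: {exc}"
    return succeeded, failed
```

For example, `run_for_all(stay_ids, hourly_sample_action_IV_fluid_bolus)` would return the processed stays and a dictionary of failures, and `len(failed)` reproduces the error count printed above.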


### 4.2.2 Hourly sample vasopressors_equivalent_dose for both continuous and discrete action space

In [10]:
# output to /output/data/data_hourly_sample/action/vasopressors_equivalent_dose/stay_id_{selected_id}.csv
from python.data_preprocessing.hourly_sample import hourly_sample_action_vasopressors_equivalent_dose
# if os.path.exists('./output/data/data_hourly_sample/action/vasopressors_equivalent_dose'):shutil.rmtree('./output/data/data_hourly_sample/action/vasopressors_equivalent_dose')


# selected_id = 31872514 # more than 72 hours ICU stay 
# print(f'Selected stay_id: {selected_id}')
# hourly_sample_action_vasopressors_equivalent_dose(selected_id)

count = 0
for selected_id in stay_ids:
    try:
        hourly_sample_action_vasopressors_equivalent_dose(selected_id)
    except Exception:
        # print(f'Error with {selected_id}')
        count += 1
print(f'Error count: {count}') # 3994 out of 6565 stay_ids (60.8%) did not have vasopressors. 6565 - 3994 = 2571 stay_ids had vasopressors.

Error with 33652589
Error with 36960229
Error with 30339244
Error with 31000888
Error with 38711594
Error with 39583514
Error with 30932864
Error with 37999486
Error with 34368740
Error with 34790582
Error with 30032418
Error with 30413406
Error with 33478876
Error with 33434455
Error with 36941890
Error with 38636014
Error with 37111273
Error with 34336536
Error with 39576461
Error with 36047947
Error with 31776148
Error with 30085094
Error with 33282613
Error with 39918742
Error with 32358344
Error with 35682949
Error with 32386069
Error with 36047928
Error with 30420855
Error with 38374351
Error with 32273111
Error with 31397948
Error with 31279934
Error with 38294239


Error with 37984499
Error with 33398606
Error with 31719617
Error with 33463118
Error with 39945306
Error with 35434730
Error with 39444062
Error with 32224359
Error with 31604497
Error with 38909265
Error with 38236658
Error with 31889885
Error with 38108133
Error with 37132783
Error with 32446965
Error with 38063091
Error with 33767175
Error with 34674807
Error with 30841844
Error with 33811581
Error with 33949923
Error with 35217981
Error with 36344664
Error with 38344390
Error with 39698129
Error with 31716261
Error with 37017734
Error with 37710440
Error with 33070852
Error with 35943443
Error with 39419087
Error with 39450911
Error with 32576577
Error with 38973645
Error with 37770460
Error with 38770011
Error with 37816133
Error with 33266392
Error with 31570704
Error with 34279325
Error with 38405237
Error with 39816550
Error with 39295476
Error with 39989105
Error with 39672008
Error with 35484834
Error with 39369475
Error with 34676524
Error with 32263761
Error with 39571846
