diff --git a/README.md b/README.md index 6591642..28c97c4 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,10 @@ This is a repository of code shared by the research community. The repository is intended to be a central hub for sharing, refining, and reusing code used for analysis of the [MIMIC critical care database](https://mimic.physionet.org). To find out more about MIMIC, please see: https://mimic.physionet.org +## Acknowledgement + +[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.821872.svg)](https://doi.org/10.5281/zenodo.821872) + ## How to contribute Our team has worked hard to create and share the MIMIC dataset. We encourage you to share the code that you use for data processing and analysis. Sharing code helps to make studies reproducible and promotes collaborative research. To contribute, please: @@ -12,6 +16,10 @@ Our team has worked hard to create and share the MIMIC dataset. We encourage you We encourage users to share concepts they have extracted by writing code which generates a materialized view. These materialized views can then be used by researchers around the world to speed up data extraction. For example, ventilation durations can be acquired by creating the ventdurations view in [etc/ventilation-durations.sql](https://github.com/MIT-LCP/mimic-code/blob/master/concepts/ventilation-durations.sql). +## License + +By committing your code to the [MIMIC Code Repository](https://github.com/mit-lcp/mimic-code) you agree to release the code under the [MIT License attached to the repository](https://github.com/mit-lcp/mimic-code/blob/master/LICENSE). + ## Coding style Please refer to the [style guide](https://github.com/MIT-LCP/mimic-code/blob/master/styleguide.md) for guidelines on formatting your code for the repository. 
diff --git a/buildmimic/postgres/Makefile b/buildmimic/postgres/Makefile index 4782879..2f2f1bc 100644 --- a/buildmimic/postgres/Makefile +++ b/buildmimic/postgres/Makefile @@ -1,26 +1,18 @@ # Config PHYSIONETURL=https://physionet.org/works/MIMICIIIClinicalDatabase/files/ -# The following check whether values are passed via environment, set to defaults if not -ifeq ($(DBNAME),) +# Set the following parameters to defaults +# These will be overwritten by settings passed to the makefile DBNAME := mimic -endif - -ifeq ($(DBUSER),) DBUSER := postgres -endif - -# Specify the password here -# If you don't specify a password, then the role will not require one to login via password authentication -#DBPASS= - -# Change "mimiciii" to specify a different schema +DBPASS := postgres DBSCHEMA := mimiciii # NOTE: you do not need to specify localhost/port -# in fact, this is detrimental if you want to use peer authentication, as "localhost" is not strictly local -#DBHOST := localhost -#DBPORT := 5432 +# in fact, this is detrimental if you want to use peer authentication +# "localhost" uses a loopback, so peer authentication doesn't work with it +DBHOST := +DBPORT := # when connecting, we use a single variable: DBSTRING # **do not modify this** @@ -66,7 +58,7 @@ create-user: @echo '------------------------' @echo '' @sleep 2 - MIMIC_USER="$(DBUSER)" MIMIC_DB="$(DBNAME)" MIMIC_PASSWORD="$(DBPASS)" MIMIC_SCHEMA="$(DBSCHEMA)" ./create_mimic_user.sh + MIMICUSER="$(DBUSER)" MIMIC_DB="$(DBNAME)" MIMIC_PASSWORD="$(DBPASS)" MIMIC_SCHEMA="$(DBSCHEMA)" ./create_mimic_user.sh mimic-build-gz: @echo '------------------------' diff --git a/buildmimic/postgres/README.md b/buildmimic/postgres/README.md index d20ca26..f8d608c 100644 --- a/buildmimic/postgres/README.md +++ b/buildmimic/postgres/README.md @@ -22,16 +22,27 @@ For example, to create MIMIC from a set of zipped CSV files in the "/path/to/dat $ make mimic datadir="/path/to/data/" ``` -If default connection parameters are not correct, 
specify in Makefile header or in environment, e.g.: +By default, the Makefile uses the following parameters: + +* Database name: `mimic` +* User name: `postgres` +* Password: `postgres` +* Schema: `mimiciii` +* Host: none (psql falls back to its default local connection) +* Port: none (defaults to 5432) + +To change any of these parameters, pass them in the make call: ``` bash -$ DBNAME="my_db" DBPASS="my_pass" DBHOST="192.168.0.1" make mimic-build datadir="/path/to/data/" +$ make mimic datadir="/path/to/data/" DBNAME="my_db" DBPASS="my_pass" DBHOST="192.168.0.1" ``` -When using the database be sure to switch to the mimic namespace, +When using the database, be sure to change the default search path to the mimic schema: ```bash -$ psql mimic +# connect to database mimic +$ psql -d mimic +# set default schema to mimiciii mimic=# SET search_path TO mimiciii; ``` @@ -45,3 +56,29 @@ LINE 1: CREATE SCHEMA IF NOT EXISTS mimiciii; ``` The `IF NOT EXISTS` syntax was introduced in PostgreSQL 9.3. Make sure you have the latest PostgreSQL version. While one possible option is to modify the code here to be function under earlier versions, we highly recommend upgrading as most of the code written in this repository uses materialized views (which were introduced in PostgreSQL version 9.4). + +## NOTICE + +```sql +NOTICE: materialized view "XXXXXX" does not exist, skipping +``` + +This is normal. By default, the scripts drop each materialized view before rebuilding it. If the view does not exist yet, PostgreSQL prints this notice and continues. + +## Stuck on copy + +Many users report that the scripts get stuck at the following point: + +``` +COPY 58976 +COPY 34499 +COPY 7567 +``` + +This is expected. The 4th table is CHARTEVENTS, and this table can take many hours to load. Give it time, and ensure that the computer does not automatically hibernate during this time. + +Also note that eventually, the 4th line will read `COPY 0`. 
This is expected, see https://github.com/MIT-LCP/mimic-code/issues/182 + +## Other + +Please see the issues page to discuss other issues you may be having: https://github.com/MIT-LCP/mimic-code/issues diff --git a/buildmimic/postgres/create_mimic_user.sh b/buildmimic/postgres/create_mimic_user.sh index e6bac1c..cbc7a64 100755 --- a/buildmimic/postgres/create_mimic_user.sh +++ b/buildmimic/postgres/create_mimic_user.sh @@ -1,4 +1,5 @@ #!/bin/bash +set -e if [ -z ${MIMIC_PASSWORD+x} ]; then echo "MIMIC_PASSWORD is unset"; @@ -16,9 +17,9 @@ fi if [ -z ${MIMIC_USER+x} ]; then MIMIC_USER=postgres - echo "MIMIC_USER is unset, using default '$MIMIC_USER'"; + echo "User is unset, using default '$MIMIC_USER'"; else - echo "MIMIC_USER is set to '$MIMIC_USER'"; + echo "User is set to '$MIMIC_USER'"; fi # if hash gosu 2>/dev/null; then @@ -27,9 +28,16 @@ fi # SUDO='sudo -u postgres' # fi -$SUDO psql postgres > /dev/null <<- EOSQL - CREATE USER $MIMIC_USER WITH PASSWORD '$MIMIC_PASSWORD'; - DROP DATABASE IF EXISTS $MIMIC_DB; - CREATE DATABASE $MIMIC_DB OWNER $MIMIC_USER; - CREATE SCHEMA $MIMIC_SCHEMA AUTHORIZATION $MIMIC_USER; -EOSQL +if [ "$MIMIC_USER" != "postgres" ]; then + # create user + psql postgres -c "DROP USER IF EXISTS $MIMIC_USER;" + psql postgres -c "CREATE USER $MIMIC_USER WITH PASSWORD '$MIMIC_PASSWORD';" +fi + +# create database +psql postgres -c "DROP DATABASE IF EXISTS $MIMIC_DB;" +psql postgres -c "CREATE DATABASE $MIMIC_DB OWNER $MIMIC_USER;" + +# create schema on database +export PGPASSWORD=$MIMIC_PASSWORD +psql -U $MIMIC_USER -d ${MIMIC_DB} -c "CREATE SCHEMA $MIMIC_SCHEMA AUTHORIZATION $MIMIC_USER;" diff --git a/concepts/durations/crrt-durations.sql b/concepts/durations/crrt-durations.sql index 4ab3f3a..da9cbb3 100644 --- a/concepts/durations/crrt-durations.sql +++ b/concepts/durations/crrt-durations.sql @@ -1,5 +1,5 @@ -DROP TABLE IF EXISTS crrtdurations; -CREATE TABLE crrtdurations as +DROP MATERIALIZED VIEW IF EXISTS crrtdurations; +CREATE MATERIALIZED 
VIEW crrtdurations as with crrt_settings as ( select ce.icustay_id, ce.charttime @@ -195,6 +195,8 @@ select icustay_id , ROW_NUMBER() over (partition by icustay_id order by num) as num , min(charttime) as starttime , max(charttime) as endtime + , extract(epoch from max(charttime)-min(charttime))/60/60 AS duration_hours + -- add durations from vd2 group by icustay_id, num having min(charttime) != max(charttime) diff --git a/concepts/durations/ventilation-durations.sql b/concepts/durations/ventilation-durations.sql index a5622c7..1d336fe 100644 --- a/concepts/durations/ventilation-durations.sql +++ b/concepts/durations/ventilation-durations.sql @@ -14,8 +14,8 @@ -- First, create a temporary table to store relevant data from CHARTEVENTS. -DROP TABLE IF EXISTS ventsettings CASCADE; -CREATE TABLE ventsettings AS +DROP MATERIALIZED VIEW IF EXISTS ventsettings CASCADE; +CREATE MATERIALIZED VIEW ventsettings AS select icustay_id, charttime -- case statement determining whether it is an instance of mech vent @@ -158,8 +158,8 @@ where itemid in --DROP MATERIALIZED VIEW IF EXISTS VENTDURATIONS CASCADE; -DROP TABLE IF EXISTS VENTDURATIONS CASCADE; -create table ventdurations as +DROP MATERIALIZED VIEW IF EXISTS VENTDURATIONS CASCADE; +create MATERIALIZED VIEW ventdurations as with vd0 as ( select @@ -255,5 +255,3 @@ having min(charttime) != max(charttime) -- in these cases, ventnum=0 and max(mechvent)=0, so they are ignored and max(mechvent) = 1 order by icustay_id, ventnum; - -DROP TABLE ventsettings; diff --git a/concepts/firstday/labs-first-day.sql b/concepts/firstday/labs-first-day.sql index c69110c..a50ca72 100644 --- a/concepts/firstday/labs-first-day.sql +++ b/concepts/firstday/labs-first-day.sql @@ -153,5 +153,3 @@ FROM ) pvt GROUP BY pvt.subject_id, pvt.hadm_id, pvt.icustay_id ORDER BY pvt.subject_id, pvt.hadm_id, pvt.icustay_id; - -commit; diff --git a/concepts/make-concepts.sql b/concepts/make-concepts.sql index 0419951..2e72f27 100644 --- 
a/concepts/make-concepts.sql +++ b/concepts/make-concepts.sql @@ -1,7 +1,13 @@ -- This file makes all materialized views in this subfolder -- Note that this may take a large amount of time and hard drive space +\echo '' +\echo '===' \echo 'Beginning to create materialized views for MIMIC database.' +\echo 'Any notices of the form "NOTICE: materialized view "XXXXXX" does not exist" can be ignored.' +\echo 'The scripts drop views before creating them, and these notices indicate nothing existed prior to creating the view.' +\echo '===' +\echo '' \echo 'Top level files..' \i code-status.sql diff --git a/tests/test_mysql_build.py b/tests/test_mysql_build.py index a1434c4..b13ca99 100644 --- a/tests/test_mysql_build.py +++ b/tests/test_mysql_build.py @@ -8,7 +8,7 @@ sqluser = 'root' testdbname = 'mimic_test_db' hostname = 'localhost' -datadir = 'testdata/v1_3/' +datadir = 'testdata/v1_4/' schema = 'mimiciii' # Set paths for scripts to be tested @@ -22,23 +22,23 @@ "ADMISSIONS": 58976, "CALLOUT": 34499, "CAREGIVERS": 7567, -"CHARTEVENTS": 263201375, +"CHARTEVENTS": 330712483, "CPTEVENTS": 573146, "D_CPT": 134, "D_ICD_DIAGNOSES": 14567, "D_ICD_PROCEDURES": 3882, "D_ITEMS": 12478, -"D_LABITEMS": 755, -"DATETIMEEVENTS": 4486049, +"D_LABITEMS": 753, +"DATETIMEEVENTS": 4485937, "DIAGNOSES_ICD": 651047, "DRGCODES": 125557, "ICUSTAYS": 61532, -"INPUTEVENTS_CV": 17528894, +"INPUTEVENTS_CV": 17527935, "INPUTEVENTS_MV": 3618991, -"LABEVENTS": 27872575, -"MICROBIOLOGYEVENTS": 328446, -"NOTEEVENTS": 2078705, -"OUTPUTEVENTS": 4349339, +"LABEVENTS": 27854055, +"MICROBIOLOGYEVENTS": 631726, +"NOTEEVENTS": 2083180, +"OUTPUTEVENTS": 4349218, "PATIENTS": 46520, "PRESCRIPTIONS": 4156848, "PROCEDUREEVENTS_MV": 258066, @@ -50,10 +50,10 @@ def run_mysql_build_scripts(cur): # Create tables and loads data fn = curpath + '../buildmimic/mysql/1-define.sql' cur.execute(open(fn, "r").read()) - if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': + if os.environ.has_key('USER') and 
os.environ['USER'] == 'jenkins': # use full dataset mimic_data_dir = '/home/mimicadmin/data/mimiciii_1_3/' - else: + else: mimic_data_dir = curpath+datadir call(['mysql','-f',fn,'-d',testdbname,'-U',sqluser,'-v','mimic_data_dir='+mimic_data_dir]) # # Add constraints @@ -74,7 +74,7 @@ def setUpClass(cls): cls.con = MySQLdb.connect(host=hostname, user=sqluser) cls.cur = cls.con.cursor() # Create test database - try: + try: cls.cur.execute('DROP DATABASE ' + testdbname) except MySQLdb.OperationalError: pass diff --git a/tests/test_oracle_build.py b/tests/test_oracle_build.py index 0a83b11..59dabc4 100644 --- a/tests/test_oracle_build.py +++ b/tests/test_oracle_build.py @@ -8,7 +8,7 @@ sqluser = 'root' testdbname = 'mimic_test_db' hostname = 'localhost' -datadir = 'testdata/v1_3/' +datadir = 'testdata/v1_4/' schema = 'mimiciii' # Set paths for scripts to be tested @@ -22,23 +22,23 @@ "ADMISSIONS": 58976, "CALLOUT": 34499, "CAREGIVERS": 7567, -"CHARTEVENTS": 263201375, +"CHARTEVENTS": 330712483, "CPTEVENTS": 573146, "D_CPT": 134, "D_ICD_DIAGNOSES": 14567, "D_ICD_PROCEDURES": 3882, "D_ITEMS": 12478, -"D_LABITEMS": 755, -"DATETIMEEVENTS": 4486049, +"D_LABITEMS": 753, +"DATETIMEEVENTS": 4485937, "DIAGNOSES_ICD": 651047, "DRGCODES": 125557, "ICUSTAYS": 61532, -"INPUTEVENTS_CV": 17528894, +"INPUTEVENTS_CV": 17527935, "INPUTEVENTS_MV": 3618991, -"LABEVENTS": 27872575, -"MICROBIOLOGYEVENTS": 328446, -"NOTEEVENTS": 2078705, -"OUTPUTEVENTS": 4349339, +"LABEVENTS": 27854055, +"MICROBIOLOGYEVENTS": 631726, +"NOTEEVENTS": 2083180, +"OUTPUTEVENTS": 4349218, "PATIENTS": 46520, "PRESCRIPTIONS": 4156848, "PROCEDUREEVENTS_MV": 258066, @@ -50,10 +50,10 @@ # # Create tables and loads data # fn = curpath + '../buildmimic/mysql/1-define.sql' # cur.execute(open(fn, "r").read()) -# if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': +# if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': # # use full dataset # mimic_data_dir = '/home/mimicadmin/data/mimiciii_1_3/' 
-# else: +# else: # mimic_data_dir = curpath+datadir # call(['mysql','-f',fn,'-d',testdbname,'-U',sqluser,'-v','mimic_data_dir='+mimic_data_dir]) # # # Add constraints @@ -74,7 +74,7 @@ # cls.con = MySQLdb.connect(host=hostname, user=sqluser) # cls.cur = cls.con.cursor() # # Create test database -# try: +# try: # cls.cur.execute('DROP DATABASE ' + testdbname) # except MySQLdb.OperationalError: # pass diff --git a/tests/test_postgres_build.py b/tests/test_postgres_build.py index 0a1c1fd..626c856 100644 --- a/tests/test_postgres_build.py +++ b/tests/test_postgres_build.py @@ -14,7 +14,7 @@ psqluser = 'postgres' testdbname = 'mimic_test_db' hostname = 'localhost' -datadir = 'testdata/v1_3/' +datadir = 'testdata/v1_4/' schema = 'mimiciii' # Set paths for scripts to be tested @@ -28,23 +28,23 @@ "ADMISSIONS": 58976, "CALLOUT": 34499, "CAREGIVERS": 7567, -"CHARTEVENTS": 263201375, +"CHARTEVENTS": 330712483, "CPTEVENTS": 573146, "D_CPT": 134, "D_ICD_DIAGNOSES": 14567, "D_ICD_PROCEDURES": 3882, "D_ITEMS": 12478, -"D_LABITEMS": 755, -"DATETIMEEVENTS": 4486049, +"D_LABITEMS": 753, +"DATETIMEEVENTS": 4485937, "DIAGNOSES_ICD": 651047, "DRGCODES": 125557, "ICUSTAYS": 61532, -"INPUTEVENTS_CV": 17528894, +"INPUTEVENTS_CV": 17527935, "INPUTEVENTS_MV": 3618991, -"LABEVENTS": 27872575, -"MICROBIOLOGYEVENTS": 328446, -"NOTEEVENTS": 2078705, -"OUTPUTEVENTS": 4349339, +"LABEVENTS": 27854055, +"MICROBIOLOGYEVENTS": 631726, +"NOTEEVENTS": 2083180, +"OUTPUTEVENTS": 4349218, "PATIENTS": 46520, "PRESCRIPTIONS": 4156848, "PROCEDUREEVENTS_MV": 258066, @@ -78,10 +78,10 @@ def run_postgres_build_scripts(cur): cur.execute(open(fn, "r").read()) # Loads data fn = curpath + '../buildmimic/postgres/postgres_load_data.sql' - if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': + if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': # use full dataset - mimic_data_dir = '/home/mimicadmin/data/mimiciii_1_3/' - else: + mimic_data_dir = '/home/mimicadmin/data/mimiciii_1_4/' + 
else: mimic_data_dir = curpath+datadir call(['psql','-f',fn,'-d',testdbname,'-U',psqluser,'-v','mimic_data_dir='+mimic_data_dir]) # Add constraints @@ -98,10 +98,10 @@ def run_postgres_build_scripts(cur): # cur.execute(open(fn, "r").read()) # # Loads data # fn = curpath + '../buildmimic/mysql/mysql_load_data.sql' -# if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': +# if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': # # use full dataset # mimic_data_dir = '/home/mimicadmin/data/mimiciii_1_3/' -# else: +# else: # mimic_data_dir = curpath+datadir # call(['psql','-f',fn,'-d',testdbname,'-U',psqluser,'-v','mimic_data_dir='+mimic_data_dir]) # # Add constraints @@ -121,7 +121,7 @@ def setUpClass(cls): cls.con.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT) cls.cur = cls.con.cursor() # Create test database - try: + try: cls.cur.execute('DROP DATABASE ' + testdbname) except psycopg2.ProgrammingError: pass @@ -180,7 +180,7 @@ def test_testddl(self): # Run a series of checks to ensure ITEMIDs are valid # All checks should return 0. 
# -------------------------------------------------- - + def test_itemids_in_inputevents_cv_are_shifted(self): query = """ -- prompt Number of ITEMIDs which were erroneously left as original value @@ -189,7 +189,7 @@ def test_itemids_in_inputevents_cv_are_shifted(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_inputevents_mv_are_shifted(self): query = """ -- prompt Number of ITEMIDs which were erroneously left as original value @@ -198,7 +198,7 @@ def test_itemids_in_inputevents_mv_are_shifted(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_outputevents_are_shifted(self): query = """ -- prompt Number of ITEMIDs which were erroneously left as original value @@ -207,7 +207,7 @@ def test_itemids_in_outputevents_are_shifted(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_inputevents_cv_are_in_range(self): query = """ -- prompt Number of ITEMIDs which are above the allowable range @@ -216,7 +216,7 @@ def test_itemids_in_inputevents_cv_are_in_range(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_outputevents_are_in_range(self): query = """ -- prompt Number of ITEMIDs which are not in the allowable range @@ -225,7 +225,7 @@ def test_itemids_in_outputevents_are_in_range(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_chartevents_are_in_range(self): query = """ -- prompt Number of ITEMIDs which are not in the allowable range @@ -234,7 +234,7 @@ def test_itemids_in_chartevents_are_in_range(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_procedureevents_mv_are_in_range(self): query = """ -- prompt Number 
of ITEMIDs which are not in the allowable range @@ -243,7 +243,7 @@ def test_itemids_in_procedureevents_mv_are_in_range(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_labevents_are_in_range(self): query = """ -- prompt Number of ITEMIDs which are not in the allowable range @@ -252,7 +252,7 @@ def test_itemids_in_labevents_are_in_range(self): """ queryresult = pd.read_sql_query(query,self.con) self.assertEqual(queryresult.values[0][0],0) - + def test_itemids_in_microbiologyevents_are_in_range(self): query = """ -- prompt Number of ITEMIDs which are not in the allowable range @@ -267,7 +267,7 @@ def test_itemids_in_microbiologyevents_are_in_range(self): # ---------------------------------------------------- # RUN THE FOLLOWING TESTS ON THE FULL DATASET ONLY --- # ---------------------------------------------------- - + if os.environ.has_key('USER') and os.environ['USER'] == 'jenkins': def test_row_counts_are_as_expected(self): for tablename,expectedrows in row_dict.iteritems(): @@ -279,20 +279,20 @@ def test_age_and_los_is_expected(self): query = \ """ WITH icuadmissions as ( - SELECT a.subject_id, a.hadm_id, i.icustay_id, - a.admittime as hosp_admittime, a.dischtime as hosp_dischtime, - i.first_careunit, + SELECT a.subject_id, a.hadm_id, i.icustay_id, + a.admittime as hosp_admittime, a.dischtime as hosp_dischtime, + i.first_careunit, DENSE_RANK() over(PARTITION BY a.hadm_id ORDER BY i.intime ASC) as icu_seq, - p.dob, p.dod, i.intime as icu_intime, i.outtime as icu_outtime, + p.dob, p.dod, i.intime as icu_intime, i.outtime as icu_outtime, i.los as icu_los, - round((EXTRACT(EPOCH FROM (a.dischtime-a.admittime))/60/60/24) :: NUMERIC, 4) as hosp_los, - p.gender, + round((EXTRACT(EPOCH FROM (a.dischtime-a.admittime))/60/60/24) :: NUMERIC, 4) as hosp_los, + p.gender, round((EXTRACT(EPOCH FROM (a.admittime-p.dob))/60/60/24/365.242) :: NUMERIC, 4) as age_hosp_in, round((EXTRACT(EPOCH FROM 
(i.intime-p.dob))/60/60/24/365.242) :: NUMERIC, 4) as age_icu_in, hospital_expire_flag, - CASE WHEN p.dod IS NOT NULL + CASE WHEN p.dod IS NOT NULL AND p.dod >= i.intime - interval '6 hour' - AND p.dod <= i.outtime + interval '6 hour' THEN 1 + AND p.dod <= i.outtime + interval '6 hour' THEN 1 ELSE 0 END AS icu_expire_flag FROM admissions a INNER JOIN icustays i @@ -300,8 +300,8 @@ def test_age_and_los_is_expected(self): INNER JOIN patients p ON a.subject_id = p.subject_id ORDER BY a.subject_id, i.intime) - SELECT round(avg(age_icu_in)) as avg_age_icu, - round(avg(hosp_los)) as avg_los_hosp, + SELECT round(avg(age_icu_in)) as avg_age_icu, + round(avg(hosp_los)) as avg_los_hosp, round(avg(icu_los)) as avg_los_icu FROM icuadmissions; """ diff --git a/tests/testdata/v1_4/ADMISSIONS.csv.gz b/tests/testdata/v1_4/ADMISSIONS.csv.gz new file mode 100644 index 0000000..b156907 Binary files /dev/null and b/tests/testdata/v1_4/ADMISSIONS.csv.gz differ diff --git a/tests/testdata/v1_4/CALLOUT.csv.gz b/tests/testdata/v1_4/CALLOUT.csv.gz new file mode 100644 index 0000000..b707668 Binary files /dev/null and b/tests/testdata/v1_4/CALLOUT.csv.gz differ diff --git a/tests/testdata/v1_4/CAREGIVERS.csv.gz b/tests/testdata/v1_4/CAREGIVERS.csv.gz new file mode 100644 index 0000000..b0218db Binary files /dev/null and b/tests/testdata/v1_4/CAREGIVERS.csv.gz differ diff --git a/tests/testdata/v1_4/CHARTEVENTS.csv.gz b/tests/testdata/v1_4/CHARTEVENTS.csv.gz new file mode 100644 index 0000000..92a9815 Binary files /dev/null and b/tests/testdata/v1_4/CHARTEVENTS.csv.gz differ diff --git a/tests/testdata/v1_4/CPTEVENTS.csv.gz b/tests/testdata/v1_4/CPTEVENTS.csv.gz new file mode 100644 index 0000000..901a657 Binary files /dev/null and b/tests/testdata/v1_4/CPTEVENTS.csv.gz differ diff --git a/tests/testdata/v1_4/DATETIMEEVENTS.csv.gz b/tests/testdata/v1_4/DATETIMEEVENTS.csv.gz new file mode 100644 index 0000000..aa1b273 Binary files /dev/null and b/tests/testdata/v1_4/DATETIMEEVENTS.csv.gz 
differ diff --git a/tests/testdata/v1_4/DIAGNOSES_ICD.csv.gz b/tests/testdata/v1_4/DIAGNOSES_ICD.csv.gz new file mode 100644 index 0000000..9bf3e32 Binary files /dev/null and b/tests/testdata/v1_4/DIAGNOSES_ICD.csv.gz differ diff --git a/tests/testdata/v1_4/DRGCODES.csv.gz b/tests/testdata/v1_4/DRGCODES.csv.gz new file mode 100644 index 0000000..61f9668 Binary files /dev/null and b/tests/testdata/v1_4/DRGCODES.csv.gz differ diff --git a/tests/testdata/v1_4/D_CPT.csv.gz b/tests/testdata/v1_4/D_CPT.csv.gz new file mode 100644 index 0000000..86e4dc6 Binary files /dev/null and b/tests/testdata/v1_4/D_CPT.csv.gz differ diff --git a/tests/testdata/v1_4/D_ICD_DIAGNOSES.csv.gz b/tests/testdata/v1_4/D_ICD_DIAGNOSES.csv.gz new file mode 100644 index 0000000..3eef82f Binary files /dev/null and b/tests/testdata/v1_4/D_ICD_DIAGNOSES.csv.gz differ diff --git a/tests/testdata/v1_4/D_ICD_PROCEDURES.csv.gz b/tests/testdata/v1_4/D_ICD_PROCEDURES.csv.gz new file mode 100644 index 0000000..fdd30e8 Binary files /dev/null and b/tests/testdata/v1_4/D_ICD_PROCEDURES.csv.gz differ diff --git a/tests/testdata/v1_4/D_ITEMS.csv.gz b/tests/testdata/v1_4/D_ITEMS.csv.gz new file mode 100644 index 0000000..47628fd Binary files /dev/null and b/tests/testdata/v1_4/D_ITEMS.csv.gz differ diff --git a/tests/testdata/v1_4/D_LABITEMS.csv.gz b/tests/testdata/v1_4/D_LABITEMS.csv.gz new file mode 100644 index 0000000..468fdda Binary files /dev/null and b/tests/testdata/v1_4/D_LABITEMS.csv.gz differ diff --git a/tests/testdata/v1_4/ICUSTAYS.csv.gz b/tests/testdata/v1_4/ICUSTAYS.csv.gz new file mode 100644 index 0000000..1df5d1a Binary files /dev/null and b/tests/testdata/v1_4/ICUSTAYS.csv.gz differ diff --git a/tests/testdata/v1_4/INPUTEVENTS_CV.csv.gz b/tests/testdata/v1_4/INPUTEVENTS_CV.csv.gz new file mode 100644 index 0000000..bf324ed Binary files /dev/null and b/tests/testdata/v1_4/INPUTEVENTS_CV.csv.gz differ diff --git a/tests/testdata/v1_4/INPUTEVENTS_MV.csv.gz 
b/tests/testdata/v1_4/INPUTEVENTS_MV.csv.gz new file mode 100644 index 0000000..7eb6583 Binary files /dev/null and b/tests/testdata/v1_4/INPUTEVENTS_MV.csv.gz differ diff --git a/tests/testdata/v1_4/LABEVENTS.csv.gz b/tests/testdata/v1_4/LABEVENTS.csv.gz new file mode 100644 index 0000000..6f07e45 Binary files /dev/null and b/tests/testdata/v1_4/LABEVENTS.csv.gz differ diff --git a/tests/testdata/v1_4/MICROBIOLOGYEVENTS.csv.gz b/tests/testdata/v1_4/MICROBIOLOGYEVENTS.csv.gz new file mode 100644 index 0000000..e4a3f2b Binary files /dev/null and b/tests/testdata/v1_4/MICROBIOLOGYEVENTS.csv.gz differ diff --git a/tests/testdata/v1_4/NOTEEVENTS.csv.gz b/tests/testdata/v1_4/NOTEEVENTS.csv.gz new file mode 100644 index 0000000..474d53d Binary files /dev/null and b/tests/testdata/v1_4/NOTEEVENTS.csv.gz differ diff --git a/tests/testdata/v1_4/OUTPUTEVENTS.csv.gz b/tests/testdata/v1_4/OUTPUTEVENTS.csv.gz new file mode 100644 index 0000000..d976231 Binary files /dev/null and b/tests/testdata/v1_4/OUTPUTEVENTS.csv.gz differ diff --git a/tests/testdata/v1_4/PATIENTS.csv.gz b/tests/testdata/v1_4/PATIENTS.csv.gz new file mode 100644 index 0000000..729a8f9 Binary files /dev/null and b/tests/testdata/v1_4/PATIENTS.csv.gz differ diff --git a/tests/testdata/v1_4/PRESCRIPTIONS.csv.gz b/tests/testdata/v1_4/PRESCRIPTIONS.csv.gz new file mode 100644 index 0000000..def8e98 Binary files /dev/null and b/tests/testdata/v1_4/PRESCRIPTIONS.csv.gz differ diff --git a/tests/testdata/v1_4/PROCEDUREEVENTS_MV.csv.gz b/tests/testdata/v1_4/PROCEDUREEVENTS_MV.csv.gz new file mode 100644 index 0000000..decdf74 Binary files /dev/null and b/tests/testdata/v1_4/PROCEDUREEVENTS_MV.csv.gz differ diff --git a/tests/testdata/v1_4/PROCEDURES_ICD.csv.gz b/tests/testdata/v1_4/PROCEDURES_ICD.csv.gz new file mode 100644 index 0000000..a7ab12c Binary files /dev/null and b/tests/testdata/v1_4/PROCEDURES_ICD.csv.gz differ diff --git a/tests/testdata/v1_4/SERVICES.csv.gz b/tests/testdata/v1_4/SERVICES.csv.gz new file 
mode 100644 index 0000000..b08671b Binary files /dev/null and b/tests/testdata/v1_4/SERVICES.csv.gz differ diff --git a/tests/testdata/v1_4/TRANSFERS.csv.gz b/tests/testdata/v1_4/TRANSFERS.csv.gz new file mode 100644 index 0000000..bea5bb2 Binary files /dev/null and b/tests/testdata/v1_4/TRANSFERS.csv.gz differ