# MIMIC-III Sepsis Feature Probe

Many features and markers could indicate whether a patient with suspected sepsis will survive. In this notebook I probe and visualize the features suspected to be important, then use univariate analysis and other selection techniques to trim the less significant ones.

Note that the full dataset has been set up locally in a PostgreSQL database. All machine learning will be done remotely to save time and resources down the road.
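
As a rough illustration of the univariate selection mentioned above, the sketch below scores each feature independently against a binary outcome and keeps the best one. It assumes scikit-learn's `SelectKBest` with the ANOVA F-test (`f_classif`); the data is synthetic and purely hypothetical, not drawn from MIMIC-III, and the scoring function used later in the analysis may well differ.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic, hypothetical data: 100 samples with a balanced binary outcome
rng = np.random.RandomState(0)
y = np.array([0, 1] * 50)

# Column 0 tracks the outcome closely; columns 1-3 are pure noise
informative = y + 0.1 * rng.randn(100)
noise = rng.randn(100, 3)
X = np.column_stack([informative, noise])

# Score every feature on its own with the ANOVA F-test and keep the top one
selector = SelectKBest(score_func=f_classif, k=1)
X_reduced = selector.fit_transform(X, y)

# Indices of the columns that survived selection
kept = selector.get_support(indices=True)
print(kept, selector.scores_)
```

For categorical admission fields such as `admission_type`, a test like `chi2` on encoded values might be a better fit than the F-test.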

In [1]:
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session
from sqlalchemy import create_engine
import pandas as pd
from sklearn import preprocessing

In [2]:
# Set up SQL Alchemy engine and session
Base = automap_base()

# Doing basic probes on data locally
engine = create_engine("postgresql://mimic_user@localhost:5432/mimic")

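# Reflect the tables in the mimiciii schema
# (on SQLAlchemy 1.4+, reflect=True is deprecated; use
#  Base.prepare(autoload_with=engine, schema='mimiciii') instead)
Base.prepare(engine, reflect=True, schema='mimiciii')

# Mapped classes are now created, named after their tables by default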
Admission = Base.classes.admissions
Patient = Base.classes.patients

session = Session(engine)

In [9]:
# Select admissions whose recorded diagnosis is exactly 'SEPSIS'
admission_query = session.query(Admission).filter_by(diagnosis='SEPSIS')
df = pd.read_sql(admission_query.statement, admission_query.session.bind)

# Columns to look at: admission_type, hospital_expire_flag
df.head()

Unnamed: 0,row_id,subject_id,hadm_id,admittime,dischtime,deathtime,admission_type,admission_location,discharge_location,insurance,language,religion,marital_status,ethnicity,edregtime,edouttime,diagnosis,hospital_expire_flag,has_chartevents_data
0,458,357,122609,2198-11-01 22:36:00,2198-11-14 14:20:00,,EMERGENCY,EMERGENCY ROOM ADMIT,REHAB/DISTINCT PART HOSP,Private,ENGL,NOT SPECIFIED,MARRIED,WHITE,2198-11-01 18:01:00,2198-11-01 23:06:00,SEPSIS,0,1
1,471,366,134462,2164-11-18 20:27:00,2164-11-22 15:18:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME HEALTH CARE,Medicare,ENGL,CATHOLIC,SINGLE,HISPANIC OR LATINO,2164-11-18 10:52:00,2164-11-18 21:31:00,SEPSIS,0,1
2,96,94,183686,2176-02-25 16:49:00,2176-02-29 17:45:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME HEALTH CARE,Medicare,CANT,NOT SPECIFIED,MARRIED,ASIAN,2176-02-25 10:35:00,2176-02-25 18:14:00,SEPSIS,0,1
3,20,21,111970,2135-01-30 20:50:00,2135-02-08 02:08:00,2135-02-08 02:08:00,EMERGENCY,EMERGENCY ROOM ADMIT,DEAD/EXPIRED,Medicare,,JEWISH,MARRIED,WHITE,2135-01-30 18:46:00,2135-01-30 22:05:00,SEPSIS,1,1
4,448,353,108923,2151-03-28 16:01:00,2151-04-13 16:10:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME,Medicare,PTUN,JEWISH,SINGLE,WHITE,2151-03-28 13:02:00,2151-03-28 17:46:00,SEPSIS,0,1


In [39]:
person = df.iloc[3]
print(person)

row_id                                    20
subject_id                                21
hadm_id                               111970
admittime                2135-01-30 20:50:00
dischtime                2135-02-08 02:08:00
deathtime                2135-02-08 02:08:00
admission_type                     EMERGENCY
admission_location      EMERGENCY ROOM ADMIT
discharge_location              DEAD/EXPIRED
insurance                           Medicare
language                                None
religion                              JEWISH
marital_status                       MARRIED
ethnicity                              WHITE
edregtime                2135-01-30 18:46:00
edouttime                2135-01-30 22:05:00
diagnosis                             SEPSIS
hospital_expire_flag                       1
has_chartevents_data                       1
Name: 3, dtype: object


In [37]:
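def period(row, window):
    """Return 1 if the patient died within `window` of admission, else 0."""
    # pd.isnull covers both None and NaT, so a missing death time counts as survival
    if pd.isnull(row['deathtime']):
        return 0
    elif row['deathtime'] - row['admittime'] > window:
        return 0
    else:
        return 1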

In [40]:
df['death_period'] = df.apply(lambda row: period(row, pd.Timedelta('30 days')), axis=1)

In [41]:
df.head()

Unnamed: 0,row_id,subject_id,hadm_id,admittime,dischtime,deathtime,admission_type,admission_location,discharge_location,insurance,language,religion,marital_status,ethnicity,edregtime,edouttime,diagnosis,hospital_expire_flag,has_chartevents_data,death_period
0,458,357,122609,2198-11-01 22:36:00,2198-11-14 14:20:00,,EMERGENCY,EMERGENCY ROOM ADMIT,REHAB/DISTINCT PART HOSP,Private,ENGL,NOT SPECIFIED,MARRIED,WHITE,2198-11-01 18:01:00,2198-11-01 23:06:00,SEPSIS,0,1,0
1,471,366,134462,2164-11-18 20:27:00,2164-11-22 15:18:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME HEALTH CARE,Medicare,ENGL,CATHOLIC,SINGLE,HISPANIC OR LATINO,2164-11-18 10:52:00,2164-11-18 21:31:00,SEPSIS,0,1,0
2,96,94,183686,2176-02-25 16:49:00,2176-02-29 17:45:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME HEALTH CARE,Medicare,CANT,NOT SPECIFIED,MARRIED,ASIAN,2176-02-25 10:35:00,2176-02-25 18:14:00,SEPSIS,0,1,0
3,20,21,111970,2135-01-30 20:50:00,2135-02-08 02:08:00,2135-02-08 02:08:00,EMERGENCY,EMERGENCY ROOM ADMIT,DEAD/EXPIRED,Medicare,,JEWISH,MARRIED,WHITE,2135-01-30 18:46:00,2135-01-30 22:05:00,SEPSIS,1,1,1
4,448,353,108923,2151-03-28 16:01:00,2151-04-13 16:10:00,,EMERGENCY,EMERGENCY ROOM ADMIT,HOME,Medicare,PTUN,JEWISH,SINGLE,WHITE,2151-03-28 13:02:00,2151-03-28 17:46:00,SEPSIS,0,1,0
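
The `death_period` flag above can also be computed without a row-wise `apply`. The following is a minimal sketch of a vectorized alternative, assuming `admittime` and `deathtime` are datetime columns with missing death times stored as `NaT`. The helper name `death_within` and the toy frame are hypothetical: the first two rows reuse timestamps from the output above, and the third is made up to exercise the "died after the window" case.

```python
import pandas as pd

# Hypothetical three-row frame mirroring the admittime/deathtime columns
toy = pd.DataFrame({
    'admittime': pd.to_datetime(['2135-01-30 20:50:00',
                                 '2151-03-28 16:01:00',
                                 '2140-01-01 00:00:00']),
    'deathtime': pd.to_datetime(['2135-02-08 02:08:00',
                                 None,                    # survived: becomes NaT
                                 '2140-03-01 00:00:00']),  # made-up late death
})

def death_within(frame, window):
    """Return 1 where death occurred within `window` of admission, else 0."""
    # Subtracting the columns yields timedeltas; rows with no death time give NaT
    time_to_death = frame['deathtime'] - frame['admittime']
    # Comparisons against NaT evaluate to False, so survivors are flagged 0
    return (time_to_death <= window).astype(int)

toy['death_period'] = death_within(toy, pd.Timedelta('30 days'))
print(toy)
```

On the full `df`, the equivalent call would be `df['death_period'] = death_within(df, pd.Timedelta('30 days'))`, which avoids calling a Python function once per row.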
