# Sepsis-3 evaluation in the MIMIC-III database

This notebook evaluates the Sepsis-3 guidelines in the MIMIC-III database. The goals of this analysis are:

1. Evaluating the Sepsis-3 guidelines in MIMIC using the same methodology as in the research paper
2. Evaluating the Sepsis-3 guidelines against the Angus criteria
3. Assessing whether there are subgroups of interest which are missed by the criteria

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import psycopg2
import sys
import statsmodels.api as sm

import sepsis_utils as su

from sklearn.pipeline import Pipeline

# used for train/test splits
from sklearn.model_selection import train_test_split

# used to impute mean for data
from sklearn.preprocessing import Imputer

# normalize the data
from sklearn import preprocessing

# logistic regression is our model of choice
from sklearn.linear_model import LogisticRegression

# used to create confusion matrix
from sklearn.metrics import confusion_matrix

from sklearn.model_selection import cross_val_score

# used to calculate AUROC/accuracy
from sklearn import metrics

# for calibration curve of severity scores
from sklearn.calibration import calibration_curve


# default colours for prettier plots
col = [[0.9047, 0.1918, 0.1988],
    [0.2941, 0.5447, 0.7494],
    [0.3718, 0.7176, 0.3612],
    [1.0000, 0.5482, 0.1000],
    [0.4550, 0.4946, 0.4722],
    [0.6859, 0.4035, 0.2412],
    [0.9718, 0.5553, 0.7741],
    [0.5313, 0.3359, 0.6523]];
marker = ['v','o','d','^','s','o','+']
ls = ['-','-','-','-','-','s','--','--']
%matplotlib inline

In [None]:
# create a database connection

# below config used on pc70
sqluser = 'alistairewj'
dbname = 'mimic'
schema_name = 'mimiciii'

# Connect to local postgres version of mimic
con = psycopg2.connect(dbname=dbname, user=sqluser)

In [None]:
# call functions to extract the severity scores
qsofa = su.get_qsofa(con)
sofa = su.get_sofa(con)
oasis = su.get_oasis(con)
sirs = su.get_sirs(con)
angus = su.get_angus(con)

# Time of suspected infection

Suspected infection is defined as:

* Antibiotics given within 72 hours after a culture was taken, or
* A culture taken within 24 hours after antibiotics were given

We can extract antibiotic usage from the PRESCRIPTIONS, INPUTEVENTS_MV and INPUTEVENTS_CV tables. We can extract the time of cultures from the MICROBIOLOGYEVENTS table. Details are given in defining-suspected-infection.ipynb.
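The rule above can be sketched for a single antibiotic/culture pair as follows. This is a minimal illustration rather than the logic inside `su.get_suspected_infection_time`, which is not shown here; the function name is hypothetical, and taking the earlier of the two events as the time of suspected infection is an assumption.

```python
from datetime import datetime, timedelta

def suspected_infection_time(abx_time, culture_time):
    """Return the earlier event time if the pair qualifies, else None (hypothetical helper)."""
    if culture_time <= abx_time:
        # culture taken first: antibiotics must follow within 72 hours
        qualifies = abx_time - culture_time <= timedelta(hours=72)
    else:
        # antibiotics given first: culture must follow within 24 hours
        qualifies = culture_time - abx_time <= timedelta(hours=24)
    # assumption: the earlier of the two events marks the suspected infection
    return min(abx_time, culture_time) if qualifies else None

culture = datetime(2150, 1, 1, 8, 0)
# culture first, antibiotics 48 hours later: qualifies
print(suspected_infection_time(culture + timedelta(hours=48), culture))
# antibiotics first, culture 30 hours later: does not qualify
print(suspected_infection_time(culture - timedelta(hours=30), culture))
```

A patient with several antibiotic orders and cultures would need this check applied to each candidate pair, which the sketch does not attempt.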

In [None]:
ab = su.get_suspected_infection_time(con)

# Other data

This query extracts other data of interest:

* Age
* Gender
* Immunosuppression
* BMI
* Metastatic cancer (Elixhauser comorbidity)
* Diabetes (Elixhauser comorbidity)


In [None]:
misc = su.get_other_data(con)

In [None]:
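print('{} ICU stays.'.format(misc.shape[0]))
# age > 1 excludes neonates; the cohort defined later uses the stricter age >= 16
idx = misc.age > 1
print('{} adult ICU stays.'.format(np.sum(idx)))
demog_col = ['height','weight','bmi']
for c in demog_col:
    # percentage of these stays with a non-missing value for the variable
    print('\t{:2.2f}% have {}.'.format( (np.sum(idx) - misc[c][idx].isnull().sum())*100.0 / np.sum(idx), c ))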

# Clinical covariates

We are interested in studying the relationship between "true" sepsis cases (as defined by the Angus criteria) and true/false positives according to qSOFA/SOFA. To accomplish this, we extract a host of physiologic data for these patients.

In [None]:
dd = su.get_physiologic_data(con)
dd.head(n=5)

# Cohort

The code below creates our cohort of interest. This cohort is used to apply the inclusion criteria by means of an inner join. The inclusion criteria are:

* Adult patient, i.e. age >= 16
* First ICU stay for the patient
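As a rough illustration of how such a cohort filters other tables, the sketch below applies both criteria to a toy table and then inner joins a table of scores against it. The real `su.get_cohort` query is not shown here; the toy data, the `icustay_num` column and the `cohort_toy` name are assumptions made for the example.

```python
import pandas as pd

# toy ICU stays; values are made up for illustration
stays = pd.DataFrame({
    'subject_id': [1, 1, 2, 3],
    'icustay_id': [10, 11, 20, 30],
    'intime': pd.to_datetime(['2150-01-01', '2150-03-01', '2150-02-01', '2150-05-01']),
    'age': [65.0, 65.2, 0.0, 40.0],
})

# number the ICU stays of each patient in order of admission (1 = first stay)
stays = stays.sort_values(['subject_id', 'intime'])
stays['icustay_num'] = stays.groupby('subject_id').cumcount() + 1

# inclusion criteria: adult (age >= 16) and first ICU stay
keep = (stays['age'] >= 16) & (stays['icustay_num'] == 1)
cohort_toy = stays.loc[keep, ['icustay_id']]

# an inner join against the cohort drops every excluded stay from another table
scores = pd.DataFrame({'icustay_id': [10, 11, 20, 30], 'sofa': [4, 2, 1, 6]})
included = cohort_toy.merge(scores, how='inner', on='icustay_id')
print(included)
```

Here stay 11 is dropped as a second ICU stay and stay 20 is dropped on age, leaving stays 10 and 30.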

In [None]:
cohort = su.get_cohort(con)

In [None]:
# close the database connection as we are finished extracting data
con.close()

# Create dataframe with all covariates extracted

In [None]:
# initialize our dataframe to the cohort
df_all_pt = cohort

# merge in the various severity scores
df_all_pt = df_all_pt.merge(qsofa, how='left', on='icustay_id',
                suffixes=('','_qsofa'))
df_all_pt = df_all_pt.merge(sofa, how='left', on='icustay_id',
                suffixes=('','_sofa'))
df_all_pt = df_all_pt.merge(sirs, how='left', on='icustay_id',
                suffixes=('','_sirs'))
df_all_pt = df_all_pt.merge(ab, how='left', on='icustay_id',
                suffixes=('','_ab'))
df_all_pt = df_all_pt.merge(misc, how='left', on='icustay_id',
                suffixes=('','_misc'))
df_all_pt = df_all_pt.merge(oasis, how='left', on='icustay_id',
                suffixes=('','_oasis'))

df_all_pt = df_all_pt.merge(angus, how='left', on='hadm_id',
                suffixes=('','_angus'))

# define sepsis-3 as: qSOFA >= 2 and SOFA >= 2
df_all_pt['sepsis3'] = (df_all_pt.qsofa >= 2) & (df_all_pt.sofa >=2)

df_all_pt.head()
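One consequence of building the Sepsis-3 flag after left merges: a stay with a missing score has `NaN` in that column, and `NaN >= 2` evaluates to `False`, so the stay is silently labelled negative. The toy example below (made-up values; the `score_missing` column is a hypothetical addition) shows the behaviour and one way to keep track of such stays.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({'qsofa': [3, 2, np.nan, 1],
                    'sofa':  [5, np.nan, 4, 6]})

# same rule as above: both scores must be >= 2
toy['sepsis3'] = (toy.qsofa >= 2) & (toy.sofa >= 2)

# flag stays where at least one score is missing, so the rule was not fully evaluated
toy['score_missing'] = toy[['qsofa', 'sofa']].isnull().any(axis=1)
print(toy)
```

Whether missing scores should count as negative or be excluded is an analysis decision; the sketch only makes the default behaviour visible.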

In [None]:
df_all_pt.describe()

We can ask some basic questions of this data:

* What percentage of patients had antibiotics with a culture?
* What percentage of these cultures were positive?

In [None]:
print('{:5g} adult ICU stays (excluding subsequent ICU stays for the same patient).'.format(
    df_all_pt.shape[0]))

print('{:2.2f}% of patients with antibiotics/culture'.format(
    df_all_pt['suspected_infection_time'].count().astype(float) / df_all_pt.shape[0] * 100))

print('{:2.2f}% of patients with positive cultures'.format(
    df_all_pt['positiveculture'].sum().astype(float) / df_all_pt.shape[0] * 100))

print('{:2.2f}% of patients with antibiotics/culture had a positive culture'.format(
    df_all_pt['positiveculture'].sum().astype(float) / df_all_pt['suspected_infection_time'].count().astype(float) * 100))

The Sepsis-3 guidelines exclusively evaluated patients with suspected infection, so we subselect to this population.

In [None]:
df = df_all_pt.loc[(~df_all_pt['suspected_infection_time'].isnull().values)]

# Study population

Demographics and outcomes of the patients with suspected infection.

In [None]:
def print_demographics(df):
    all_vars = ['age','gender','bmi','hospital_expire_flag','thirtyday_expire_flag',
      'icu_los','hosp_los','mech_vent']

    for curr_var in all_vars:
        if curr_var in df.columns:
            if curr_var in ['age','bmi','icu_los','hosp_los']: # continuous, report mean +- STD
                print('{:20s}\t{:2.2f} +- {:2.2f}'.format(curr_var, df[curr_var].mean(), df[curr_var].std()))
            elif curr_var in ['gender']: # convert from M/F
                print('{:20s}\t{:2.2f}%'.format('male', 100.0*np.sum(df[curr_var].values=='M').astype(float) / df.shape[0]))
            elif curr_var in ['hospital_expire_flag','thirtyday_expire_flag','mech_vent']:
                # binary, report percentage
                print('{:20s}\t{:2.2f}%'.format(curr_var, 100.0*(df[curr_var].mean()).astype(float)))
        else:
            # variable is not available in this dataframe, print its name only
            print('{:20s}'.format(curr_var))

In [None]:
# TODO: switch to su.print_demographics
print_demographics(df)

In [None]:
print('{:5g} have qSOFA >= 2 ({:2.2f}%).'.format(
    (df.qsofa.values >= 2).sum(),100.0*(df.qsofa.values >= 2).mean()))

print('{:5g} have SOFA >= 2 ({:2.2f}%).'.format(
    (df.sofa.values >= 2).sum(),100.0*(df.sofa.values >= 2).mean()))

print('{:5g} have Sepsis-3 ({:2.2f}%).'.format(
    (df.sepsis3).sum(),100.0*(df.sepsis3).mean()))

print('{:5g} have SIRS >= 2 ({:2.2f}%).'.format(
    (df.sirs.values >= 2).sum(),100.0*(df.sirs.values >= 2).mean()))

# Study questions

1. How well do the guidelines detect sepsis (Angus criteria) in the antibiotics/culture subset?
2. How well do the guidelines predict mortality (in-hospital) in the antibiotics/culture subset?
3. What factors would improve the sensitivity of the guidelines?
4. What factors would improve the specificity of the guidelines?
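Questions 1, 3 and 4 are phrased in terms of sensitivity and specificity. The helpers `su.print_cm` and `su.print_op_stats` used below are not shown in this notebook, so the following is only a sketch of how these quantities can be derived from binary targets and predictions; the function name `operating_stats` and the toy labels are hypothetical.

```python
import numpy as np

def operating_stats(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV for binary targets/predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_pred & y_true)    # predicted positive, truly positive
    tn = np.sum(~y_pred & ~y_true)  # predicted negative, truly negative
    fp = np.sum(y_pred & ~y_true)   # predicted positive, truly negative
    fn = np.sum(~y_pred & y_true)   # predicted negative, truly positive
    # note: no guard against empty classes (division by zero)
    return {
        'sensitivity': tp / float(tp + fn),
        'specificity': tn / float(tn + fp),
        'ppv': tp / float(tp + fp),
        'npv': tn / float(tn + fn),
    }

# toy example: 3 true cases, 5 non-cases
y_toy = [1, 1, 1, 0, 0, 0, 0, 0]
yhat_toy = [1, 1, 0, 1, 0, 0, 0, 0]
stats_toy = operating_stats(y_toy, yhat_toy)
for name in ['sensitivity', 'specificity', 'ppv', 'npv']:
    print('{:12s} {:.3f}'.format(name, stats_toy[name]))
```

In the toy data two of the three true cases are detected (sensitivity 2/3) and four of the five non-cases are correctly ruled out (specificity 4/5).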

## Angus criteria evaluation

In [None]:
# define targets, Angus criteria
y = df.angus.values == 1

# define "predictions" according to the SEPSIS-3 guidelines:
#  suspicion of infection, qSOFA >= 2, and SOFA >= 2
yhat = (df.qsofa.values >= 2) & (df.sofa.values>=2)

print('\n SEPSIS-3 guidelines for Angus criteria sepsis \n')
# generate evaluation metrics
print('Accuracy = {}'.format(metrics.accuracy_score(y, yhat)))

su.print_cm(y, yhat) # print confusion matrix

In [None]:
y = df.angus.values == 1

# ROC for qSOFA
fpr_qsofa, tpr_qsofa, thresholds_qsofa = metrics.roc_curve(y, df.qsofa.values)
auc_qsofa = metrics.auc(fpr_qsofa, tpr_qsofa)

# ROC for SOFA
fpr_sofa, tpr_sofa, thresholds_sofa = metrics.roc_curve(y, df.sofa.values)
auc_sofa = metrics.auc(fpr_sofa, tpr_sofa)


# ROC for SEPSIS-3
fpr_s3, tpr_s3, thresholds_s3 = metrics.roc_curve(y, (df.qsofa.values >= 2) & (df.sofa.values >= 2))
auc_s3 = metrics.auc(fpr_s3, tpr_s3)

# ROC for SIRS
fpr_sirs, tpr_sirs, thresholds_sirs = metrics.roc_curve(y, df.sirs.values)
auc_sirs = metrics.auc(fpr_sirs, tpr_sirs)

# plot the data
plt.figure(figsize=[9,9])
plt.plot(fpr_qsofa, tpr_qsofa, 'o:',
         color=col[0], linewidth=2, markersize=10,
         label='qSOFA (AUC = %0.2f)' % auc_qsofa)
plt.plot(fpr_sofa, tpr_sofa, '^-',
         color=col[1], linewidth=2, markersize=10,
         label='SOFA (AUC = %0.2f)' % auc_sofa)
plt.plot(fpr_sirs, tpr_sirs, 's-',
         color=col[2], linewidth=2, markersize=10,
         label='SIRS (AUC = %0.2f)' % auc_sirs)

# optionally add in the combination of qSOFA/SOFA (Sepsis-3), a single operating point
#plt.plot(fpr_s3, tpr_s3, 'd--',
#         color=col[3], linewidth=2, markersize=10,
#         label='SEPSIS-3 (AUC = %0.2f)' % auc_s3)

plt.legend(loc="lower right")

plt.plot([0,1], [0,1], '--',
         color=[0,0,0], linewidth=2)
# reformat the plot
plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('False Positive Rate',fontsize=14)
plt.ylabel('True Positive Rate',fontsize=14)
plt.title('ROC against Angus criteria',fontsize=14)
plt.show()

In [None]:


# define "predictions" for each criterion:
#  qSOFA >= 2, SOFA >= 2, Sepsis-3 (both of the former), and SIRS >= 2
yhat_all = [df.qsofa.values >= 2,
            df.sofa.values >= 2,
            df.sepsis3.values,
            df.sirs.values >= 2]
yhat_names = ['qsofa','sofa','seps3','SIRS']

# define "targets", Angus criteria
y_all = [df.angus.values == 1,
         df.angus.values == 1,
         df.angus.values == 1,
         df.angus.values == 1]


stats_all = su.print_op_stats(yhat_all, y_all,
               yhat_names=yhat_names,
               header=['angus criteria sepsis'])

## Hospital mortality evaluation

In [None]:
# define targets, in-hospital mortality
y = df.hospital_expire_flag.values == 1

# define "predictions" according to the SEPSIS-3 guidelines:
#  suspicion of infection, qSOFA >= 2, and SOFA >= 2
yhat = (df.qsofa.values >= 2) & (df.sofa.values>=2)

print('\n SEPSIS-3 guidelines for hospital mortality \n')
# generate evaluation metrics
print('Accuracy = {}'.format(metrics.accuracy_score(y, yhat)))

su.print_cm(y, yhat) # print confusion matrix

In [None]:
y = df.hospital_expire_flag.values == 1

# ROC for qSOFA
fpr_qsofa, tpr_qsofa, thresholds_qsofa = metrics.roc_curve(y, df.qsofa.values)
auc_qsofa = metrics.auc(fpr_qsofa, tpr_qsofa)

# ROC for SOFA
fpr_sofa, tpr_sofa, thresholds_sofa = metrics.roc_curve(y, df.sofa.values)
auc_sofa = metrics.auc(fpr_sofa, tpr_sofa)


# ROC for SEPSIS-3
fpr_s3, tpr_s3, thresholds_s3 = metrics.roc_curve(y, (df.qsofa.values >= 2) & (df.sofa.values >= 2))
auc_s3 = metrics.auc(fpr_s3, tpr_s3)

# ROC for SIRS
fpr_sirs, tpr_sirs, thresholds_sirs = metrics.roc_curve(y, df.sirs.values)
auc_sirs = metrics.auc(fpr_sirs, tpr_sirs)

# plot the data
plt.figure(figsize=[9,9])
plt.plot(fpr_qsofa, tpr_qsofa, 'o:',
         color=col[0], linewidth=2, markersize=10,
         label='qSOFA (AUC = %0.2f)' % auc_qsofa)
plt.plot(fpr_sofa, tpr_sofa, '^-',
         color=col[1], linewidth=2, markersize=10,
         label='SOFA (AUC = %0.2f)' % auc_sofa)
plt.plot(fpr_sirs, tpr_sirs, 's-',
         color=col[2], linewidth=2, markersize=10,
         label='SIRS (AUC = %0.2f)' % auc_sirs)

# optionally add in the combination of qSOFA/SOFA (Sepsis-3), a single operating point
#plt.plot(fpr_s3, tpr_s3, 'd--',
#         color=col[3], linewidth=2, markersize=10,
#         label='SEPSIS-3 (AUC = %0.2f)' % auc_s3)

plt.legend(loc="lower right")

plt.plot([0,1], [0,1], '--',
         color=[0,0,0], linewidth=2)
# reformat the plot
plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('False Positive Rate',fontsize=14)
plt.ylabel('True Positive Rate',fontsize=14)
plt.title('ROC against hospital mortality',fontsize=14)
plt.show()

In [None]:
# define "predictions" for each criterion:
#  qSOFA >= 2, SOFA >= 2, Sepsis-3 (both of the former), and SIRS >= 2
yhat_all = [df.qsofa.values >= 2,
            df.sofa.values >= 2,
            df.sepsis3.values,
            df.sirs.values >= 2]
yhat_names = ['qsofa','sofa','seps3','SIRS']

# define "targets", in-hospital mortality
y_all = [df.hospital_expire_flag.values == 1,
         df.hospital_expire_flag.values == 1,
         df.hospital_expire_flag.values == 1,
         df.hospital_expire_flag.values == 1]


stats_all = su.print_op_stats(yhat_all, y_all,
               yhat_names=yhat_names,
               header=['in-hospital mortality'])

## What factors would improve Sepsis-3 guidelines?

In [None]:
# initialize our dataframe to the cohort
dm = cohort
dm = dm.merge(dd, how='inner', on='icustay_id',suffixes=('','_dd'))
dm = dm.merge(ab, how='inner', on='icustay_id',suffixes=('','_ab'))
dm = dm.merge(angus, how='inner', on='hadm_id',suffixes=('','_angus'))
dm = dm.merge(qsofa, how='inner', on='icustay_id',suffixes=('','_qsofa'))
dm = dm.merge(sofa, how='inner', on='icustay_id',suffixes=('','_sofa'))
dm = dm.merge(sirs, how='inner', on='icustay_id',suffixes=('','_sirs'))

dm.set_index('icustay_id',inplace=True)

# only look at icustay_ids with suspected infection
# (isin avoids a KeyError for stays which were dropped by the inner joins above)
iid_suspected = df_all_pt.loc[(~df_all_pt['suspected_infection_time'].isnull().values),'icustay_id'].values
dm = dm.loc[dm.index.isin(iid_suspected)]

# we subselect to patients classified as positive by sepsis-3
idxData = (dm.qsofa.values >= 2) & (dm.sofa.values>=2)

# define targets using angus criteria
y = dm.angus.values == 1

# select the covariates by name: every column of dm which is also in dd
# (icustay_id is now the index of dm, so it is not selected)
X_header = [c for c in dm.columns if c in dd.columns]
X_data = dm[X_header].values

X = X_data[idxData,:]
y = y[idxData]

# get feature/predictor matrix as numpy array
#X_nan = df[idx].isnull().values
# combine the arrays
#X = np.column_stack((X_data,X_nan))
#X_header = [X_header,[s + '_NaN' for s in X_header]]

# # flatten the list of lists into a single list
#X_header = [item for sublist in X_header for item in sublist]

# impute mean for missing values
imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
imp.fit(X)

# custom scaling of data to avoid normalizing to missing data
sigma = np.ones(X.shape[1])

for i in range(X.shape[1]):
    tmp = X[~np.isnan(X[:,i]),i]
    if tmp.size > 1:
        sigma[i] = np.sqrt(np.var(tmp))

# # print the imp statistics
# print('{:20s}: {:14s} {:10s}').format('Header','     Mean','stdev')
# for i in range(len(X_header)):
#     print('{:20s}: {:10.4f} {:10.4f}').format(X_header[i], imp.statistics_[i], sigma[i])

X_tr = imp.transform(X)

# Logit model
# N.B. no intercept is fitted, as sm.add_constant is not applied to X_tr
model = sm.Logit(y, X_tr)
results = model.fit()

y_hat = results.predict(exog=X_tr, transform=False)
y_pred = np.round(y_hat)


# generate evaluation metrics
print('Accuracy = {}'.format(metrics.accuracy_score(y, y_pred)))
print('AUROC = {}'.format(metrics.roc_auc_score(y, y_hat)))



su.print_cm(y, y_pred) # print confusion matrix

# train logit model using scikit
#model = LogisticRegression(fit_intercept=True)
#results = model.fit(X_train, y)
# predict class labels for the test set
#y_pred = results.predict(X_train)
#y_hat = results.predict_proba(X_train)
#y_hat = y_hat[:,1]


# let's look at adjusted odds ratios - everything is in standard deviation units
# N.B. if using sklearn, change to results.coef_.flatten()
oddsratio = np.exp(results.params * sigma)

# sort by value of odds ratio
sort_indices = np.argsort(oddsratio, axis=0)
oddsratio = oddsratio[sort_indices]

# create p-value array
pvalue = results.pvalues
pvalue = pvalue[sort_indices]

# also create a labels vector which is sorted
lbls = [X_header[i] for i in sort_indices]


# split into two vectors:
#   (i)  significant, p <= 0.05
#   (ii) not significant, p > 0.05
ytick = np.asarray(range(oddsratio.size))

idxSignificant = pvalue<=0.05

or_sig = oddsratio[idxSignificant]
lbl_sig = [lbl for i, lbl in enumerate(lbls) if idxSignificant[i] == True]
ytick_sig = ytick[idxSignificant]

or_insig = oddsratio[~idxSignificant]
lbl_insig = [lbl for i, lbl in enumerate(lbls) if idxSignificant[i] == False]
ytick_insig = ytick[~idxSignificant]

# now plot these odds ratios
plt.figure(figsize=[12,20])


plt.plot(or_insig, ytick_insig, 's', markersize=8, color=col[0],label='p >  0.05') # insignificant
plt.plot(or_sig, ytick_sig, 'o', markersize=8, color=col[1],label='p <= 0.05') # significant
plt.legend(loc='lower right')
plt.plot([1.,1.],[0,oddsratio.size],'k--')


ax = plt.gca()
ax.set_yticks(range(oddsratio.size))
ax.set_yticklabels(lbls,fontsize=14,fontweight='bold')
ax.set_ylim([-1,oddsratio.size])
ax.set_xticklabels( ['%2.2f' % i for i in ax.get_xticks()], fontsize=14 )
plt.xlabel('Odds ratios (exponentiated coefficient)')
plt.grid()
plt.show()
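The scaling used for the odds ratios above, `np.exp(results.params * sigma)`, converts a coefficient expressed per unit of a covariate into an odds ratio per standard deviation of that covariate. A short numerical check with made-up values for the coefficient and standard deviation (both hypothetical) illustrates why multiplying by sigma is equivalent to fitting on a rescaled covariate.

```python
import numpy as np

beta = 0.02    # hypothetical coefficient: log-odds per unit of the covariate
sigma = 15.0   # hypothetical standard deviation of the covariate

or_per_unit = np.exp(beta)        # odds ratio for a one unit increase
or_per_sd = np.exp(beta * sigma)  # odds ratio for a one standard deviation increase

# rescaling the covariate to standard deviation units leaves the linear
# predictor unchanged if the coefficient is multiplied by sigma
x = np.array([90.0, 105.0, 120.0])
logit_raw = beta * x
logit_scaled = (beta * sigma) * (x / sigma)
print(np.allclose(logit_raw, logit_scaled))

print('OR per unit: {:.4f}, OR per SD: {:.4f}'.format(or_per_unit, or_per_sd))
```

Expressing every odds ratio per standard deviation makes covariates with very different units comparable on the same axis of the plot.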

# Subsequent analyses

It would be interesting to evaluate if SIRS/Sepsis-3 differ in performance after subgrouping the data into categories based upon:

* WBC
* Temperature
* Age
* Gender
* Immunosuppression
    * Prednisone, Prednisolone (Orapred), Methylprednisolone (Medrol), Dexamethasone (Decadron), Hydrocortisone (Cortef), Cortisone
    * Cyclophosphamide (Cytoxan)
    * Cisplatin (Platinol), Carboplatin (Paraplatin)
    * Azathioprine  (Imuran)
    * Mercaptopurine (Purinethol)
    * Methotrexate/MTX (Trexall, Rasuvo)
    * Rituximab (Rituxan, MabThera, Zytux)
    * Basiliximab (Simulect)
    * Daclizumab (Zenapax)
    * Cyclosporin/Ciclosporin (Neoral, Sandimmune)
    * Tacrolimus (Prograf, Advagraf, Protopic)
    * Sirolimus (Rapamune)
    * Infliximab (Remicade)
    * Etanercept (Enbrel)
    * Adalimumab (Humira)
    * Mycophenolate (CellCept, Myfortic)
* BMI
    * < 18.5
    * 18.5 - 24.9
    * 25 - 29.9
    * 30 - 49.9
    * \>= 50
* Metastatic cancer (Elixhauser comorbidity)
* Diabetes (Elixhauser comorbidity)
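As a sketch of how one of these subgroupings could be built, the BMI categories listed above can be assigned with `pandas.cut`. The toy values and the `bmi_group` name are illustrative, and placing a BMI of exactly 50 in the top category is an assumption.

```python
import numpy as np
import pandas as pd

# toy BMI values, including boundary cases and a missing value
bmi = pd.Series([17.0, 18.5, 24.9, 27.3, 30.0, 49.9, 50.0, np.nan])

# left-closed bins: [0, 18.5), [18.5, 25), [25, 30), [30, 50), [50, inf)
edges = [0, 18.5, 25, 30, 50, np.inf]
labels = ['< 18.5', '18.5 - 24.9', '25 - 29.9', '30 - 49.9', '>= 50']
bmi_group = pd.cut(bmi, bins=edges, labels=labels, right=False)

# number of patients per category; missing BMI stays missing and is not counted
print(bmi_group.value_counts(sort=False))
```

The resulting categorical column could then be turned into boolean masks, one per category, to evaluate each score within each group.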

In [None]:
# evaluate qSOFA against the Angus criteria, stratified by age

# define "targets", Angus criteria (one copy per group)
y = df.angus.values == 1
y_all = [y, y, y]

# define "predictions": qSOFA >= 2, evaluated in each group
yhat_all = [ df.qsofa.values >= 2, df.qsofa.values >= 2, df.qsofa.values >= 2 ]
yhat_names = ['qSOFA','old','young']

# the below filters each group to a subset of patients:
#  all patients with a qSOFA score, age >= 70, and age < 70
idx_group = [ ~np.isnan(df.qsofa.values), df.age.values >= 70, df.age.values < 70 ]


su.print_op_stats(yhat_all, y_all, yhat_names=yhat_names,
                  idx=idx_group)

# Appendix

In [None]:
# debug plot for outliers

plt.figure(figsize=[12,9])
# the histogram of the data, normalized to a probability density
n, bins, patches = plt.hist(dm.pao2fio2_min.dropna().values, bins=np.arange(0, 2000, 10), density=True, facecolor='green', alpha=0.75)

plt.xlabel('pao2fio2_min')
plt.ylabel('Probability density')
plt.grid(True)

plt.show()