### Predicting Patient Risk of Meeting SIRS4 Criteria in a 4-Hour Window Within the Next 24 Hours
Austin Mishoe, Data Scientist, Medical University of South Carolina  
October 8th, 2018  

In [1]:
import pickle
import pandas as pd
import numpy as np
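# sklearn.metrics has no `bidev_corrcoef`; the MCC scores computed later map onto matthews_corrcoef, aliased here so later cells still run
from sklearn.metrics import f1_score, matthews_corrcoef as bidev_corrcoef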
from multiprocessing import cpu_count
import pathos.pools as pp
num_cores = cpu_count()
import seaborn as sns
import matplotlib.pyplot as plt
import datetime
from prettytable import PrettyTable
from IPython.display import display
from IPython.display import HTML
import IPython.core.display as di
di.display_html('<script>jQuery(function() {if (jQuery("body.notebook_app").length == 0) { jQuery(".input_area").toggle(); jQuery(".prompt").toggle();}});</script>', raw=True)
pd.set_option('display.expand_frame_repr', False)

In [4]:
#data is created from sql script, query_data
data = pd.read_pickle('C://Python/PyProjects/LabsVitalsHourly_AlertSimulations/data/raw_data.p')

mgb = pickle.load(open('C://Python/PyProjects/LabsVitalsHourly_AlertSimulations/Hourly_Predictions_Bundle.p','rb'))
data['REPORTING_DATE'] = pd.to_datetime(data['REPORTING_TIME'].dt.strftime('%Y-%m-%d'))
data['REPORTING_WEEK'] = list(map(lambda x: x.isocalendar()[1],data['REPORTING_DATE']))


key_col = 'PAT_ENC_CSN_ID'
time_col = 'REPORTING_TIME'
date_col = 'REPORTING_DATE'
date_type = 'Day'
# target_cols = ['MetSIRS4_4hr_8', 'MetSIRS4_4hr_24','MetSIRS4_4hr_48', 'MetMEWS4_8', 'MetMEWS4_24','MetMEWS4_48']
# pred_cols = ['MetSIRS4_4hr_8_preds', 'MetSIRS4_4hr_24_preds', 'MetSIRS4_4hr_48_preds', 'MetMEWS4_8_preds', 'MetMEWS4_24_preds', 'MetMEWS4_48_preds']
target_cols = ['MetSIRS4_4hr_24']
pred_cols = ['MetSIRS4_4hr_24_preds_Uniform']
sim_start_date = datetime.datetime(2018,5,1)
sim_end_date = datetime.datetime(2018,7,31)


# #['U4PC PEDIATRIC CARDIOVASCULAR ICU','C8PI PEDIATRIC ICU','A4CV CVICU','U4ST SURGICAL TRAUMA ICU','C8NN NEONATAL ICU', 'U6MI MICU', 'A3MS MSICU', 'U8NI NEURO SCIENCE ICU', 'U9NI 9TH FL NEURO ICU']
icu_depts = [dept for dept in data['DepartmentName'].unique() if dept is not None and 'icu' in dept.lower()]

## apply any filtering, in this case we already have the adult population -- we want to now filter out all icu hours
# icu_times = np.where(data['DepartmentName'].isin(icu_depts))[0]
# print('The percentage of ICU patient hours '+str(len(icu_times)/data.shape[0]))
# data = data.iloc[np.where(data['DepartmentName'].isin(icu_depts)==False)[0],:].reset_index(drop=True)

## Introduction
Systemic Inflammatory Response Syndrome (SIRS) is a life-threatening condition related to systemic inflammation, organ
dysfunction, and organ failure. The consensus definition of severe sepsis in adults requires suspected or proven
infection, organ failure, and signs that meet two or more of the SIRS criteria.
The SIRS criteria are listed below:

- Body temperature < 36°C (96.8°F) or > 38°C (100.4°F)
- Heart rate > 90 beats per minute
- Respiratory rate > 20 breaths per minute
- White blood cell count > 12,000/mm³ or < 4,000/mm³
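
As an illustration only, the four thresholds listed above can be written as a small check. The function and argument names here are hypothetical and are not taken from the MUSC pipeline; the sketch assumes temperature in °C and white blood cell count in cells/mm³.

```python
def sirs_criteria_met(temp_c, heart_rate, resp_rate, wbc_per_mm3):
    """Return one boolean per SIRS criterion (hypothetical helper)."""
    return {
        "temperature": temp_c < 36.0 or temp_c > 38.0,
        "heart_rate": heart_rate > 90,
        "respiratory_rate": resp_rate > 20,
        "wbc": wbc_per_mm3 > 12_000 or wbc_per_mm3 < 4_000,
    }


def meets_all_four(temp_c, heart_rate, resp_rate, wbc_per_mm3):
    """True only when every one of the four criteria is met."""
    return all(sirs_criteria_met(temp_c, heart_rate, resp_rate, wbc_per_mm3).values())


# Febrile, tachycardic, tachypneic, but with a normal white count
example = sirs_criteria_met(38.6, 104, 24, 9_500)
```

In this example three of the four criteria are met, so the patient would not count toward the SIRS4 target that this notebook predicts.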


## Objective: Predict the Probability of an Adult Patient at MUSC Meeting All of the SIRS Criteria Within a 4-Hour Window in the Next 24 Hours
The objective of this model is to flag patients who will meet all four of the SIRS criteria in a 4-hour window within the
next 24 hours. This model will generate predictions **hourly** for all inpatients at MUSC.
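
The notebook does not show how the target columns were built, so the following is only a sketch of one plausible construction. It assumes one row per patient-hour for a single patient, sorted by time with no gaps, and it assumes a criterion counts as met if it was flagged at least once in the trailing 4 hours. The function name and the criterion column names are hypothetical; `MetSIRS4_4hr` and `MetSIRS4_4hr_24` mirror the column names used in this notebook.

```python
import pandas as pd


def label_sirs4_4hr(hourly, criteria_cols, window_hours=4, horizon_hours=24):
    """Add a 'met in 4-hour window' flag and a 24-hour look-ahead target.

    `hourly` holds 0/1 flags per criterion for ONE patient, one row per hour.
    """
    out = hourly.copy()
    # Assumption: a criterion counts if it was flagged in any of the trailing hours
    met_in_window = out[criteria_cols].rolling(window_hours, min_periods=1).max()
    out["MetSIRS4_4hr"] = (met_in_window.min(axis=1) == 1).astype(int)
    # Look ahead: reverse the series, take a trailing max, then reverse back
    future = out["MetSIRS4_4hr"][::-1].rolling(horizon_hours, min_periods=1).max()[::-1]
    # Shift so the target covers the NEXT hours and excludes the current one
    out["MetSIRS4_4hr_24"] = future.shift(-1).fillna(0).astype(int)
    return out


toy = pd.DataFrame({
    "temp_flag": [1, 0, 0, 0, 0, 0],
    "hr_flag":   [0, 1, 0, 0, 0, 0],
    "rr_flag":   [0, 0, 1, 0, 0, 0],
    "wbc_flag":  [0, 0, 0, 1, 0, 0],
})
labeled = label_sirs4_4hr(toy, ["temp_flag", "hr_flag", "rr_flag", "wbc_flag"])
```

For a multi-patient table, the same function could be applied per encounter with a `groupby` on `PAT_ENC_CSN_ID`.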


## Variables Considered
The training data for the model was developed from the MUSC 'Hourly Labs and Vitals' pipeline, which contains approximately
75 lab and vital-sign measurements for each inpatient at MUSC. The labs and vitals used in this study were chosen based on
frequency of occurrence. This dataset is grained at the patient-hour for all patients in the hospital.
Please refer to the model markdown for the features that were used to build the model and for more information
on model performance.


### Results Table for SIRS4_4hr Adults Per Day
The following table breaks down the number of actual SIRS4_4hr patients at MUSC per day and per week.
These numbers were calculated from historical data between May 1, 2018 and July 31, 2018. For each time interval, the table reports the 5th percentile,
the median, and the 95th percentile of the number of patients with SIRS4_4hr, so the 5% and 95% columns bound a 90% interval.
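
The same percentiles can also be sketched with a `groupby`. This is an illustrative alternative, not the code that produced the table; the helper name is hypothetical and the column names follow this notebook.

```python
import numpy as np
import pandas as pd


def patients_per_period_percentiles(df, period_col, key_col, target_col, q=(5, 50, 95)):
    """Percentiles of the number of unique flagged patients per period."""
    counts = df[df[target_col] == 1].groupby(period_col)[key_col].nunique()
    # Periods with no flagged patient must still count as zero
    counts = counts.reindex(df[period_col].unique(), fill_value=0)
    return np.percentile(counts.to_numpy(), q)


toy = pd.DataFrame({
    "REPORTING_DATE": ["d1", "d1", "d1", "d2", "d3"],
    "PAT_ENC_CSN_ID": ["A", "B", "A", "C", "A"],
    "MetSIRS4_4hr_24": [1, 1, 1, 0, 1],
})
toy_percentiles = patients_per_period_percentiles(
    toy, "REPORTING_DATE", "PAT_ENC_CSN_ID", "MetSIRS4_4hr_24"
)
```

The `reindex` step matters: without it, days on which no patient met the target would be dropped and the lower percentile would be biased upward.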

In [5]:
# index=target_cols (not [target_cols]) keeps a plain index instead of a one-level MultiIndex of tuples
targets_per_date_df = pd.DataFrame(np.zeros((len(target_cols),6)),index=target_cols)

# iterate through each of the dates and find the number of unique pat_id's that met each target
# calculation_df has dimension (num_dates,num_targets) to keep track of how many unique patients met each target on each date
date_cols = ['REPORTING_DATE','REPORTING_WEEK']
for date_col_ind in range(len(date_cols)):
    date_col = date_cols[date_col_ind]
    calculation_df = np.zeros((len(data[date_col].unique()),len(target_cols)))
    all_dates = data[date_col].unique()
    for date_ind in range(len(all_dates)):
        date = all_dates[date_ind]
        tmp_data = data.iloc[np.where(data[date_col]==date)[0],:].reset_index(drop=True)
        for target_ind in range(len(target_cols)):
            target_col = target_cols[target_ind]
            calculation_df[date_ind,target_ind] = len(np.unique(tmp_data[key_col][np.where(tmp_data[target_col]==1)[0]]))


    for target_ind in range(len(target_cols)):
        target_col = target_cols[target_ind]
        # write to this target's row (the original always wrote to row 0)
        targets_per_date_df.iloc[target_ind,(date_col_ind*3):((date_col_ind+1)*3)] = np.percentile(calculation_df[:,target_ind],np.array([5,50,95]))

# column labels: 'Day 5%', 'Day 50%', 'Day 95%', 'Week 5%', 'Week 50%', 'Week 95%'
cols = [period+' '+percent for period in ['Day','Week'] for percent in ['5%','50%','95%']]
targets_per_date_df.columns = cols

targets_per_date_df.style.apply(lambda x: ['' for i in x], axis=1)

| Target | Day 5% | Day 50% | Day 95% | Week 5% | Week 50% | Week 95% |
|---|---|---|---|---|---|---|
| MetSIRS4_4hr_24 | 10 | 15.5 | 24.45 | 26.7 | 47 | 55.4 |


### Determining the Optimal Cutoff for Alerts

The optimal cutoff is typically found either by cost analysis (if the cost of intervention is known) or by using an evaluation metric to balance sensitivity against specificity.  

Our team was not given the cost of false positives or false negatives, so candidate cutoffs were scored with the F1 score, the harmonic mean of precision (positive predictive value) and recall (sensitivity), and with the Matthews correlation coefficient (MCC).
The optimal cutoff for each metric is the one that maximizes its score.

***An optimal cutoff of .45 was chosen.***
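
To make the two metrics concrete, here is a minimal sketch that computes them directly from confusion-matrix counts. The helper name and the example counts are made up for illustration; the notebook itself relies on scikit-learn for scoring.

```python
import math


def f1_and_mcc(tp, fp, fn, tn):
    """F1 and Matthews correlation coefficient from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom
    return f1, mcc


# Hypothetical counts: 8 true alerts, 2 false alerts, 2 misses, 88 true negatives
f1, mcc = f1_and_mcc(tp=8, fp=2, fn=2, tn=88)
```

F1 never looks at true negatives, whereas MCC uses all four cells of the confusion matrix, which is one reason to inspect both when the positive class is rare.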

In [6]:
optimal_f1_dict  = {}
optimal_mcc_dict = {}

p = pp.ProcessPool(num_cores)
# candidate probability cutoffs: 0.15, 0.20, ..., 0.60
cutoffs = np.round(np.arange(.15,.61,.05),2)
for target_ind in range(len(target_cols)):
    pred_col = pred_cols[target_ind]
    target_col = target_cols[target_ind]
    preds_binary_list = [1*(data[pred_col]>=cutoff).values for cutoff in cutoffs]
    true_binary_list = [data[target_col].values for cutoff in cutoffs]
    f1_scores = p.map(f1_score,true_binary_list,preds_binary_list)
    mcc_scores = p.map(bidev_corrcoef, true_binary_list, preds_binary_list)
    optimal_f1_dict[target_col]=cutoffs[f1_scores.index(max(f1_scores))]
    optimal_mcc_dict[target_col] = cutoffs[mcc_scores.index(max(mcc_scores))]

opt_cutoff=.45

### Alert Results By Cutoff
The following table shows the results of the model over various cutoff values on the entire MUSC inpatient population between
May 1, 2018 and July 31, 2018. This data was part of the holdout dataset, i.e., the model never saw this data during the training phase.

For the following analysis, all patients who met the SIRS4_4hr criteria were identified and the
time of their first SIRS4_4hr instance was calculated. Using this information, we calculated whether the model would have
alerted before, during/after, or entirely missed the patient using various cutoffs. The results over various cutoff values are shown below.
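
The before / during-after / missed logic described above can be sketched in vectorized form. This is an illustrative alternative to the loop in the next cell, not a replacement for it; the function name is hypothetical and the column names are passed in as arguments.

```python
import pandas as pd


def classify_alert_timing(df, key_col, time_col, target_col, alert_col):
    """Per target patient: first-alert lead time in hours and an outcome label."""
    first_target = df[df[target_col] == 1].groupby(key_col)[time_col].min()
    first_alert = df[df[alert_col] == 1].groupby(key_col)[time_col].min()
    # Keep only patients who met the target; never-alerted patients become NaT
    first_alert = first_alert.reindex(first_target.index)
    lead_hours = (first_target - first_alert).dt.total_seconds() / 3600
    outcome = pd.Series("During/After", index=first_target.index, dtype=object)
    outcome.loc[lead_hours > 0] = "Prior"
    outcome.loc[first_alert.isna()] = "Missed"
    return pd.DataFrame({"lead_hours": lead_hours, "outcome": outcome})


t0 = pd.Timestamp("2018-05-01")
toy = pd.DataFrame({
    "PAT_ENC_CSN_ID": ["A", "A", "B", "C", "D"],
    "REPORTING_TIME": [t0 + pd.Timedelta(hours=h) for h in (2, 5, 3, 4, 1)],
    "MetSIRS4_4hr": [0, 1, 1, 1, 0],
    "preds_binary": [1, 1, 1, 0, 1],
})
timing = classify_alert_timing(
    toy, "PAT_ENC_CSN_ID", "REPORTING_TIME", "MetSIRS4_4hr", "preds_binary"
)
```

Patient D is alerted on but never meets the target, so D does not appear in the result; such patients are what the 'Proportion Patient Wrong Alerts' column measures.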

In [7]:
target_col = target_cols[target_ind]
pred_col = pred_cols[target_ind]
alert_results_table_columns = ['Total Patient Alerts Per '+date_type,'Non-ICU Patient Alerts Per '+date_type,'Proportion Alerts Prior','Proportion Alerts During/After','Proportion Alerts Missed','Average Alert Time Before Target(Hours)','Proportion Patient Wrong Alerts']
alert_results_table = pd.DataFrame(np.zeros((len(cutoffs),len(alert_results_table_columns))),columns=alert_results_table_columns,index=cutoffs)

alert_time_bins = pd.DataFrame(np.zeros((len(cutoffs),6)),columns=['>24 Hours','24-13 Hours','12-9 Hours','8-5 Hours','4-1 Hours','Alerted During/After'],index=cutoffs)

for cutoff in cutoffs:
    alerts_dict = {col: 0 for col in alert_results_table_columns}
    data['preds_binary']= 1*(data[pred_col]>=cutoff).values
    ppd_list = []
    nonicu_ppd_list = []
    for day in data['REPORTING_DATE'].unique():
        tmp_data = data.iloc[np.where(data['REPORTING_DATE']==day)[0],:].reset_index(drop=True)
        ppd_list.append(len(tmp_data.loc[np.where(tmp_data['preds_binary']==1)[0],'PAT_ENC_CSN_ID'].unique()))
        nonicu_ppd_list.append(len(tmp_data.loc[np.where((tmp_data['preds_binary']==1) & (tmp_data['DepartmentName'].isin(icu_depts)==False))[0],'PAT_ENC_CSN_ID'].unique()))
    alerts_dict['Total Patient Alerts Per '+date_type] = np.mean(ppd_list).round(2)
    alerts_dict['Non-ICU Patient Alerts Per ' + date_type] = np.mean(nonicu_ppd_list).round(2)
    unique_target_keys = data.loc[np.where(data['MetSIRS4_4hr']==1)[0],key_col].unique()
    unique_alert_keys = data.loc[np.where(data['preds_binary']==1)[0],key_col].unique()
    for target_key in unique_target_keys:
        first_target_time = min(data.loc[np.where((data[key_col]==target_key) & (data['MetSIRS4_4hr']==1))[0],time_col])
        alert_inds = np.where((data[key_col]==target_key) & (data['preds_binary']==1))[0]
        if len(alert_inds)==0:
            alerts_dict['Proportion Alerts Missed'] += 1
        else:
            first_alert_time = min(data.loc[alert_inds, time_col])
            alert_diff = (first_target_time.to_pydatetime() - first_alert_time.to_pydatetime()).total_seconds() / 3600
            if first_alert_time<first_target_time:
                alerts_dict['Proportion Alerts Prior']+=1
                if alert_diff<=4:
                    alert_time_bins.loc[cutoff, '4-1 Hours'] += 1
                elif alert_diff<=8:
                    alert_time_bins.loc[cutoff, '8-5 Hours'] += 1
                elif alert_diff<=12:
                    alert_time_bins.loc[cutoff, '12-9 Hours'] += 1
                elif alert_diff <= 24:
                    alert_time_bins.loc[cutoff, '24-13 Hours'] += 1
                else:
                    alert_time_bins.loc[cutoff, '>24 Hours'] += 1
            else:
                alerts_dict['Proportion Alerts During/After'] += 1
                alert_time_bins.loc[cutoff, 'Alerted During/After'] += 1

            alerts_dict['Average Alert Time Before Target(Hours)'] += alert_diff
    # share of alerted patients who never met the SIRS4_4hr criteria
    alerts_dict['Proportion Patient Wrong Alerts'] = len(set(unique_alert_keys) - set(unique_target_keys)) / len(unique_alert_keys)
    for col in ['Proportion Alerts Missed','Proportion Alerts Prior','Proportion Alerts During/After','Average Alert Time Before Target(Hours)']:
        alert_results_table.loc[cutoff,col] = alerts_dict.get(col)/len(unique_target_keys)
    for col in ['Non-ICU Patient Alerts Per '+date_type,'Total Patient Alerts Per '+date_type,'Proportion Patient Wrong Alerts']:
        alert_results_table.loc[cutoff, col] = alerts_dict.get(col)

alert_time_bins = alert_time_bins/len(unique_target_keys)

alert_results_table = alert_results_table.round(2)
alert_results_table.style.apply(lambda x: ['background: lightgreen' if x.name == opt_cutoff else '' for i in x], axis=1)

| Cutoff | Total Patient Alerts Per Day | Non-ICU Patient Alerts Per Day | Proportion Alerts Prior | Proportion Alerts During/After | Proportion Alerts Missed | Average Alert Time Before Target (Hours) | Proportion Patient Wrong Alerts |
|---|---|---|---|---|---|---|---|
| 0.15 | 81.8 | 43.95 | 0.93 | 0.07 | 0.0 | 56.11 | 0.79 |
| 0.2 | 69.37 | 34.7 | 0.9 | 0.1 | 0.0 | 48.39 | 0.76 |
| 0.25 | 60.16 | 28.16 | 0.87 | 0.13 | 0.0 | 45.59 | 0.72 |
| 0.3 | 53.17 | 23.41 | 0.85 | 0.15 | 0.0 | 43.03 | 0.69 |
| 0.35 | 47.29 | 19.76 | 0.8 | 0.19 | 0.0 | 37.6 | 0.66 |
| 0.4 | 42.49 | 17.01 | 0.78 | 0.21 | 0.01 | 36.34 | 0.63 |
| 0.45 | 38.05 | 14.48 | 0.75 | 0.24 | 0.01 | 33.47 | 0.59 |
| 0.5 | 34.21 | 12.27 | 0.72 | 0.27 | 0.01 | 31.1 | 0.55 |
| 0.55 | 30.62 | 10.6 | 0.68 | 0.3 | 0.01 | 27.5 | 0.51 |
| 0.6 | 27.27 | 8.97 | 0.63 | 0.35 | 0.02 | 25.5 | 0.47 |


### Breakdown of Alert Times for SIRS4_4hr Patients

The following table further breaks down how many hours in advance this model would have fired for each of the SIRS4_4hr patients by cutoff.  

'Alerted During/After' means the model first fired at the same hour the patient met the SIRS4_4hr criteria or later. The '4-1 Hours' column gives the proportion of SIRS4_4hr patients whose first alert came 1-4 hours before they met the criteria. Each of the subsequent columns is defined similarly.
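
The binning can also be expressed with `pd.cut`, as in the sketch below. It assumes lead times in hours (first target time minus first alert time), with `NaN` for patients the model never alerted on; the helper name is hypothetical. Proportions are taken over all target patients, matching the normalization used in this notebook.

```python
import numpy as np
import pandas as pd

LEAD_BINS = [0, 4, 8, 12, 24, np.inf]
LEAD_LABELS = ["4-1 Hours", "8-5 Hours", "12-9 Hours", "24-13 Hours", ">24 Hours"]
DURING_AFTER = "Alerted During/After"


def bin_lead_times(lead_hours):
    """Proportion of target patients falling into each lead-time bin."""
    lead = pd.Series(lead_hours, dtype=float)
    alerted = lead.dropna()  # NaN = never alerted (missed)
    # Right-closed bins: (0, 4], (4, 8], (8, 12], (12, 24], (24, inf)
    labels = pd.cut(alerted, bins=LEAD_BINS, labels=LEAD_LABELS).astype(object)
    # Non-positive lead time means the alert came at or after the target
    labels = labels.where(alerted > 0, DURING_AFTER)
    order = LEAD_LABELS[::-1] + [DURING_AFTER]
    counts = labels.value_counts().reindex(order, fill_value=0)
    return counts / len(lead)


proportions = bin_lead_times([3, 4, 6, 30, 0, -2, float("nan"), 10])
```

Because missed patients stay in the denominator, the bin proportions sum to less than 1 whenever some patients were never alerted on.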

In [8]:
alert_time_bins = alert_time_bins.round(2)
alert_time_bins.style.apply(lambda x: ['background: lightgreen' if x.name == opt_cutoff else '' for i in x], axis=1)

| Cutoff | >24 Hours | 24-13 Hours | 12-9 Hours | 8-5 Hours | 4-1 Hours | Alerted During/After |
|---|---|---|---|---|---|---|
| 0.15 | 0.41 | 0.09 | 0.08 | 0.09 | 0.26 | 0.07 |
| 0.2 | 0.37 | 0.08 | 0.08 | 0.1 | 0.27 | 0.1 |
| 0.25 | 0.35 | 0.08 | 0.08 | 0.1 | 0.26 | 0.13 |
| 0.3 | 0.33 | 0.08 | 0.09 | 0.1 | 0.26 | 0.15 |
| 0.35 | 0.3 | 0.07 | 0.09 | 0.1 | 0.24 | 0.19 |
| 0.4 | 0.29 | 0.06 | 0.08 | 0.11 | 0.25 | 0.21 |
| 0.45 | 0.26 | 0.07 | 0.06 | 0.11 | 0.25 | 0.24 |
| 0.5 | 0.24 | 0.06 | 0.06 | 0.1 | 0.25 | 0.27 |
| 0.55 | 0.21 | 0.06 | 0.06 | 0.1 | 0.24 | 0.3 |
| 0.6 | 0.2 | 0.06 | 0.05 | 0.08 | 0.24 | 0.35 |


### Top Non-ICU Departments By Number of Alerts Per Week

This table stratifies how the model will alert by department each week using the optimal cutoff.  

The first column, 'Number MetSIRS4_4hr_24 Patients Per Week', gives the average number of MetSIRS4_4hr_24 patients in each department per week. The second column gives the average number of patients alerted on each week, and the third column gives the average number of unique patients that come through each department per week.
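
A compact way to compute such weekly averages is a two-level `groupby`, sketched below. The helper name is hypothetical and the toy department names are made up; dividing by the total number of weeks means that weeks with no matching patients count as zero, the same way the loop in the next cell does.

```python
import pandas as pd


def weekly_unique_patients(df, dept_col, week_col, key_col, flag_col=None):
    """Average number of unique patients per week for each department.

    If `flag_col` is given, only rows where that flag equals 1 are counted.
    """
    subset = df if flag_col is None else df[df[flag_col] == 1]
    per_week = subset.groupby([dept_col, week_col])[key_col].nunique()
    n_weeks = df[week_col].nunique()
    # Sum the weekly unique counts per department, then average over all weeks
    return per_week.groupby(level=0).sum() / n_weeks


toy = pd.DataFrame({
    "DepartmentName": ["ED", "ED", "ED", "ED", "WARD"],
    "REPORTING_WEEK": [1, 1, 2, 2, 1],
    "PAT_ENC_CSN_ID": ["A", "B", "A", "C", "D"],
    "Alerts": [1, 0, 1, 1, 0],
})
all_patients = weekly_unique_patients(toy, "DepartmentName", "REPORTING_WEEK", "PAT_ENC_CSN_ID")
alerted = weekly_unique_patients(toy, "DepartmentName", "REPORTING_WEEK", "PAT_ENC_CSN_ID", "Alerts")
```

A patient who stays in a department across two weeks is counted once in each week, so these are weekly census figures rather than counts of distinct encounters over the whole period.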


In [9]:
# use >= so the alert rule matches the cutoff sweep above (the original used a strict >)
data[target_col+'_Alerts'] = (data[pred_cols[0]] >= opt_cutoff).astype(int)
### these are the true number of individuals in each department with the flags
#agg_data_target = data[['DepartmentName','REPORTING_DATE',target_col, 'MetSIRS4_4hr']].groupby(['DepartmentName','REPORTING_DATE']).sum().groupby('DepartmentName').mean().sort_values(by=[target_col],ascending=False)
agg_data_target = data[['DepartmentName', 'REPORTING_DATE',target_col, target_col+'_Alerts', 'MetSIRS4_4hr']].groupby(
    ['DepartmentName', 'REPORTING_DATE']).sum().groupby('DepartmentName').mean().sort_values(by=[target_col],
                                                                                             ascending=False)
#agg_data_target = agg_data_target[:25]
agg_data_target['NumberUniquePatientsDaily'] = 0
for curr_dept in list(agg_data_target.index):
    # average number of patient-hour rows per day in this department
    agg_data_target.loc[curr_dept,'NumberUniquePatientsDaily'] = (data['DepartmentName']==curr_dept).sum()/data['REPORTING_DATE'].nunique()
#print(agg_data_target[:20])

depts = [dept for dept in agg_data_target.index if dept not in icu_depts][:15]
agg_data_df_cols = ['Number MetSIRS4_4hr_24 Patients Per Week','Number MetSIRS4_4hr_24 Alerts Per Week','Number Unique Patients Per Week By Dept']
agg_data_df = pd.DataFrame(np.zeros((len(depts),len(agg_data_df_cols))),columns=agg_data_df_cols,index=depts)
unique_weeks = data['REPORTING_WEEK'].unique()
for dept in depts:
    agg_dict ={col: 0 for col in agg_data_df_cols}
    for week in unique_weeks:
        agg_dict['Number MetSIRS4_4hr_24 Patients Per Week'] +=len(data.loc[np.where((data['DepartmentName']==dept)&(data['REPORTING_WEEK']==week)&(data[target_col]==1))[0],key_col].unique())
        agg_dict['Number MetSIRS4_4hr_24 Alerts Per Week'] += len(data.loc[np.where((data['DepartmentName']==dept)& (data['REPORTING_WEEK'] == week) & (data[target_col+'_Alerts'] == 1))[0], key_col].unique())
        agg_dict['Number Unique Patients Per Week By Dept']+=len(data.loc[np.where((data['DepartmentName']==dept)& (data['REPORTING_WEEK'] == week))[0], key_col].unique())
    for key in agg_dict.keys():
        agg_data_df.loc[dept,key] = agg_dict.get(key)/len(unique_weeks)

agg_data_df = agg_data_df.round(2).sort_values('Number MetSIRS4_4hr_24 Patients Per Week',ascending=False)
agg_data_df.style.apply(lambda x: ['' for i in x],axis=1)

| Department | Number MetSIRS4_4hr_24 Patients Per Week | Number MetSIRS4_4hr_24 Alerts Per Week | Number Unique Patients Per Week By Dept |
|---|---|---|---|
| MUSC ED 1 WEST (ADULT ED) | 6.5 | 14.71 | 813.07 |
| MUSC ED CPC | 2.64 | 5.29 | 209.86 |
| A07W HEMATOLOGY ONCOLOGY | 2.5 | 5.93 | 37.57 |
| A05W SPECIALTY | 2.0 | 3.93 | 53.07 |
| MUSC ART OR | 1.86 | 1.57 | 122.36 |
| A07E HEMATOLOGY ONCOLOGY | 1.07 | 3.14 | 41.43 |
| U06E RENAL TRANS NEPHR | 1.0 | 2.07 | 60.57 |
| U07W GYN ONC | 0.93 | 2.14 | 61.79 |
| A06E DDC IP UNIT | 0.86 | 2.43 | 52.21 |
| U09E PROGRESSIVE NEURO | 0.86 | 2.07 | 57.57 |


### Model Performance on Patients with Sepsis Primary DRG
The next section applies this model specifically to the population of individuals that had sepsis during their stay.
All 'PAT_ENC_CSN_ID' between July 1, 2014 and August 30, 2018 that were found to have sepsis were included
in the dataset. The number of patient visits with sepsis as their primary diagnosis: 6802. The number of patient visits with sepsis as their primary diagnosis
who also passed away: 1469.  

The table below shows how the model would have performed on the population of septic patients only. `SIRS_dttm`, the exact time at which each septic individual
met the sepsis SIRS criteria, is assumed to be time-zero for the sepsis infection (this time-zero was established by our analytics team). The following results
display how the model would have performed at alerting before/after the earliest time that each individual met the sepsis SIRS criteria.  

The proportion of alerts prior/after was calculated by looking at each sepsis patient's `SIRS_dttm`
and determining what proportion of these patients would have been alerted on before it.

In [10]:
sepsis_data = pd.read_pickle('C://Python/PyProjects/LabsVitalsHourly_AlertSimulations/data/sepsis_patient_data.p')
sepsis_keys = sepsis_data[key_col].unique()
mort_sepsis_keys = sepsis_data.loc[np.where(sepsis_data['Death_Num']==1)[0],key_col].unique()
sirs4_4hr_count = 0
for key in sepsis_keys:
    sirs4_4hr_count+= 1 if sepsis_data.loc[np.where(sepsis_data[key_col]==key)[0],'MetSIRS4_4hr'].sum()>0 else 0

target_col = target_cols[target_ind]
pred_col = pred_cols[target_ind]
alert_results_table_columns = ['Total Sepsis Primary DRGs','Number Sepsis Patients Alerted On','Total Deceased Sepsis Primary DRGs','Number Deceased Sepsis Patients Alerted On','Proportion Alerts Prior','Proportion Alerts During/After','Proportion Alerts Missed','Average Alert Time Before Target(Hours)','Proportion Sepsis Patients Missed']
alert_results_table = pd.DataFrame(np.zeros((len(cutoffs),len(alert_results_table_columns))),columns=alert_results_table_columns,index=cutoffs)
alert_time_bins = pd.DataFrame(np.zeros((len(cutoffs),6)),columns=['>24 Hours','24-13 Hours','12-9 Hours','8-5 Hours','4-1 Hours','Alerted After'],index=cutoffs)

for cutoff in cutoffs:
    alerts_dict = {col: 0 for col in alert_results_table_columns}
    sepsis_data['preds_binary']= 1*(sepsis_data[pred_col]>=cutoff).values
    alerts_dict['Total Sepsis Primary DRGs'] = len(sepsis_keys)
    alerts_dict['Number Sepsis Patients Alerted On'] = len(sepsis_data.loc[np.where(sepsis_data['preds_binary']==1)[0],key_col].unique())
    alerts_dict['Proportion Sepsis Patients Missed'] = (alerts_dict['Total Sepsis Primary DRGs'] - alerts_dict['Number Sepsis Patients Alerted On'])/alerts_dict['Total Sepsis Primary DRGs']
    alerts_dict['Total Deceased Sepsis Primary DRGs'] = len(mort_sepsis_keys)
    alerts_dict['Number Deceased Sepsis Patients Alerted On'] = len(sepsis_data.loc[np.where((sepsis_data['preds_binary'] == 1) & (sepsis_data['Death_Num']==1))[0], key_col].unique())
    unique_target_keys = sepsis_data.loc[np.where(sepsis_data['SIRS_dttm'].isnull()==False)[0],key_col].unique()
    unique_alert_keys = sepsis_data.loc[np.where(sepsis_data['preds_binary']==1)[0],key_col].unique()
    for target_key in unique_target_keys:
        first_target_time = sepsis_data.loc[np.where(sepsis_data[key_col] == target_key)[0], 'SIRS_dttm'].unique()[0]
        alert_inds = np.where((sepsis_data[key_col]==target_key) & (sepsis_data['preds_binary']==1))[0]
        if len(alert_inds)==0:
            alerts_dict['Proportion Alerts Missed'] += 1
        else:
            first_alert_time = min(sepsis_data.loc[alert_inds, time_col])
            # pd.Timestamp converts the numpy datetime64 directly; utcfromtimestamp is deprecated in recent Python versions
            alert_diff = (pd.Timestamp(first_target_time) - first_alert_time).total_seconds() / 3600
            if first_alert_time<first_target_time:
                alerts_dict['Proportion Alerts Prior']+=1
                if alert_diff<=4:
                    alert_time_bins.loc[cutoff, '4-1 Hours'] += 1
                elif alert_diff<=8:
                    alert_time_bins.loc[cutoff, '8-5 Hours'] += 1
                elif alert_diff<=12:
                    alert_time_bins.loc[cutoff, '12-9 Hours'] += 1
                elif alert_diff <= 24:
                    alert_time_bins.loc[cutoff, '24-13 Hours'] += 1
                else:
                    alert_time_bins.loc[cutoff, '>24 Hours'] += 1
            else:
                alerts_dict['Proportion Alerts During/After'] += 1
                alert_time_bins.loc[cutoff, 'Alerted After'] += 1

            alerts_dict['Average Alert Time Before Target(Hours)'] += alert_diff
    # alerts_dict['Proportion Patient Wrong Alerts'] = len(
    #     (set(unique_alert_keys) - set(unique_alert_keys).intersection(unique_target_keys))) / len(unique_alert_keys)
    # alerts_dict['Proportion Patient Wrong Alerts'] = len((set(unique_alert_keys) - set(unique_alert_keys).intersection(unique_target_keys)))/len(unique_alert_keys)
    for col in ['Proportion Alerts Missed','Proportion Alerts Prior','Proportion Alerts During/After','Average Alert Time Before Target(Hours)']:
        alert_results_table.loc[cutoff,col] = alerts_dict.get(col)/len(unique_target_keys)
    for col in ['Total Sepsis Primary DRGs','Number Sepsis Patients Alerted On','Total Deceased Sepsis Primary DRGs','Number Deceased Sepsis Patients Alerted On','Proportion Sepsis Patients Missed']:
        alert_results_table.loc[cutoff, col] = alerts_dict.get(col)

alert_results_table = alert_results_table.round(2)
alert_results_table.style.apply(lambda x: ['background: lightgreen' if x.name == opt_cutoff else '' for i in x], axis=1)

| Cutoff | Total Sepsis Primary DRGs | Number Sepsis Patients Alerted On | Total Deceased Sepsis Primary DRGs | Number Deceased Sepsis Patients Alerted On | Proportion Alerts Prior | Proportion Alerts During/After | Proportion Alerts Missed | Average Alert Time Before Target (Hours) | Proportion Sepsis Patients Missed |
|---|---|---|---|---|---|---|---|---|---|
| 0.15 | 6801 | 5942 | 1469 | 1446 | 0.43 | 0.5 | 0.07 | -1.09 | 0.13 |
| 0.2 | 6801 | 5729 | 1469 | 1435 | 0.37 | 0.53 | 0.1 | -5.46 | 0.16 |
| 0.25 | 6801 | 5510 | 1469 | 1425 | 0.32 | 0.55 | 0.12 | -8.26 | 0.19 |
| 0.3 | 6801 | 5320 | 1469 | 1401 | 0.29 | 0.56 | 0.15 | -11.62 | 0.22 |
| 0.35 | 6801 | 5156 | 1469 | 1391 | 0.27 | 0.56 | 0.17 | -13.8 | 0.24 |
| 0.4 | 6801 | 4970 | 1469 | 1374 | 0.24 | 0.56 | 0.2 | -16.65 | 0.27 |
| 0.45 | 6801 | 4822 | 1469 | 1361 | 0.22 | 0.56 | 0.22 | -19.89 | 0.29 |
| 0.5 | 6801 | 4644 | 1469 | 1339 | 0.2 | 0.55 | 0.24 | -21.46 | 0.32 |
| 0.55 | 6801 | 4453 | 1469 | 1319 | 0.19 | 0.54 | 0.27 | -23.7 | 0.35 |
| 0.6 | 6801 | 4259 | 1469 | 1298 | 0.17 | 0.53 | 0.3 | -26.2 | 0.37 |


### Breakdown of Alert Times for Sepsis Patients

The following table further breaks down how many hours in advance this model would have fired for each of the sepsis patients by cutoff, relative to their SIRS time-zero.  

'Alerted After' means the model first fired at or after the patient's time-zero. The '4-1 Hours' column
gives the proportion of patients that would have been alerted on 1-4 hours before their time-zero. Each of
the subsequent columns is defined similarly.

In [11]:
alert_time_bins = alert_time_bins/len(unique_target_keys)
alert_time_bins = alert_time_bins.round(2)
alert_time_bins.style.apply(lambda x: ['background: lightgreen' if x.name == opt_cutoff else '' for i in x], axis=1)

| Cutoff | >24 Hours | 24-13 Hours | 12-9 Hours | 8-5 Hours | 4-1 Hours | Alerted After |
|---|---|---|---|---|---|---|
| 0.15 | 0.08 | 0.04 | 0.03 | 0.05 | 0.23 | 0.5 |
| 0.2 | 0.08 | 0.03 | 0.02 | 0.04 | 0.2 | 0.53 |
| 0.25 | 0.07 | 0.03 | 0.02 | 0.04 | 0.17 | 0.55 |
| 0.3 | 0.06 | 0.03 | 0.02 | 0.03 | 0.15 | 0.56 |
| 0.35 | 0.06 | 0.02 | 0.02 | 0.03 | 0.14 | 0.56 |
| 0.4 | 0.06 | 0.02 | 0.01 | 0.03 | 0.13 | 0.56 |
| 0.45 | 0.05 | 0.02 | 0.01 | 0.02 | 0.11 | 0.56 |
| 0.5 | 0.05 | 0.02 | 0.01 | 0.02 | 0.1 | 0.55 |
| 0.55 | 0.04 | 0.02 | 0.01 | 0.02 | 0.09 | 0.54 |
| 0.6 | 0.04 | 0.02 | 0.01 | 0.02 | 0.09 | 0.53 |


### Concluding Remarks

- A patient's risk score will be updated hourly. If any of their labs and vitals have changed, this will be reflected in their subsequent risk score.  

- The cutoff used to trigger alerts can vary based on the resources available in each department. This analysis was meant to give a general overview of how the model performs on an hour-to-hour basis for various departments and for the hospital as a whole.  

- Our model is able to provide patient-specific insight into why a given person has a high or low risk score. With each alert, we seek to provide the top 10 features contributing to the patient's high prediction.  