# Assessment of vancomycin vs. another antibiotic

In clinical practice, the pertinent question is often not "should I use vancomycin?", but rather "should I use vancomycin or this other antibiotic?". This notebook aims to quantify the risk of nephrotoxicity when using vancomycin over the other antibiotic.


## Definitions

* **drug on admission:** patient received medication order -12 to 12 hours upon admission to the ICU
* **baseline creatinine:** first creatinine value between -12 to 12 hours upon admission to the ICU
* **AKI:** following KDIGO guidelines using only creatinine, any instance of AKI between 2-7 days after their ICU admission.

KDIGO guidelines for AKI are: >= 50% change from baseline over 7 days, or absolute increase of 0.3 in creatinine over 48 hours.

## 0. Setup

In [1]:
# Must install pandas-gbq. Link: https://pandas-gbq.readthedocs.io/en/latest/install.html#pip
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# helper functions stored in local py file
import utils

project_id='lcp-internal'

# Helper function to read data from BigQuery into pandas dataframes.
def run_query(query):
    return pd.io.gbq.read_gbq(query,
                              project_id=project_id, verbose=False,
                              dialect='standard')

# 1. Extract data and apply exclusions

For more detail on exclusions, see the main vancomycin analysis notebook.

In [2]:
# cohort with exclusions applied
query = """
SELECT co.*
FROM `lcp-internal.vanco.cohort` co
"""
co = run_query(query)

# covariates from APACHE table
query = """
SELECT dem.*
FROM `hst-953-2018.team_i.demographics` dem
"""
dem = run_query(query)

# abx drug doses
query = "SELECT * FROM `lcp-internal.vanco.vanco`"
va = run_query(query)

query = "SELECT * FROM `lcp-internal.vanco.cefepime`"
ce = run_query(query)

query = "SELECT * FROM `lcp-internal.vanco.zosyn`"
zo = run_query(query)

# other drugs
query = "SELECT * FROM `lcp-internal.vanco.nsaids`"
nsaid = run_query(query)

query = "SELECT * FROM `lcp-internal.vanco.loop_diuretics`"
ld = run_query(query)

# AKI
query = """
SELECT 
  patientunitstayid
  , chartoffset
  , creatinine, creatinine_reference, creatinine_baseline
  , aki_48h, aki_7d
FROM `lcp-internal.vanco.aki`
"""
aki = run_query(query)

## Merge data

The antibiotic administration data extracted above has all administrations from time 0.
Thus, three dataframes need to be extracted before we can merge data:

1. Extract vanco_adm and vanco_wk
2. Extract cefepime_adm and cefepime_wk
3. Extract zosyn_adm and zosyn_wk

In [3]:
v_df = utils.extract_adm_and_wk(va, 'vanco')
c_df = utils.extract_adm_and_wk(ce, 'cefepime')
z_df = utils.extract_adm_and_wk(zo, 'zosyn')

v_df.head()

Unnamed: 0_level_0,vanco_adm,vanco_wk
patientunitstayid,Unnamed: 1_level_1,Unnamed: 2_level_1
233176,1,1
233253,1,0
233338,1,1
233365,1,1
233381,1,1


Repeat the above for NSAIDs/loop diuretics.

In [4]:
nsaid_df = utils.extract_adm_and_wk(nsaid, 'nsaid')
ld_df = utils.extract_adm_and_wk(zo, 'diuretic')
ld_df.head()

Unnamed: 0_level_0,diuretic_adm,diuretic_wk
patientunitstayid,Unnamed: 1_level_1,Unnamed: 2_level_1
142917,1,0
143689,1,0
143775,0,1
143776,0,1
157326,1,1


## Apply exclusions and create final dataframe

In [5]:
# drop exclusions
idxKeep = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        idxKeep = idxKeep & (co[c]==0)

# combine data into single dataframe
df = co.loc[idxKeep, ['patientunitstayid']].merge(dem, how='inner', on='patientunitstayid')

# add abx administration
df = df.merge(v_df, how='left', on='patientunitstayid')
df = df.merge(c_df, how='left', on='patientunitstayid')
df = df.merge(z_df, how='left', on='patientunitstayid')

# if the patient unit stay ID was not present for above join, they did not receive the abx
abx_columns = list(v_df.columns) + list(c_df.columns) + list(z_df.columns)
for c in abx_columns:
    df[c].fillna(0, inplace=True)
    df[c] = df[c].astype(int)

# add nsaid/loop diuretics
df = df.merge(nsaid_df, how='left', on='patientunitstayid')
df = df.merge(ld_df, how='left', on='patientunitstayid')

# if the patient unit stay ID was not present for above join, they did not receive the drug
d_columns = list(nsaid_df.columns) + list(ld_df.columns)
for c in d_columns:
    df[c].fillna(0, inplace=True)
    df[c] = df[c].astype(int)
    
aki_grp = aki.groupby('patientunitstayid')[['creatinine', 'aki_48h', 'aki_7d']].max()
aki_grp.reset_index(inplace=True)
df = df.merge(aki_grp, how='inner', on='patientunitstayid')

df['aki'] = ((df['aki_48h'] == 1) | (df['aki_7d'] == 1)).astype(int)

print('{} patients.'.format(df.shape[0]))

print('Antibiotic use on admission to ICU:')
for abx in ['vanco', 'cefepime', 'zosyn']:
    N = df[abx + '_adm'].sum()
    mu = N/df.shape[0]*100.0
    print(f'  {N:5d} ({mu:4.1f}%) - {abx}')
    
    
print('\nConcurrent antibiotic use:')
for abx1, abx2 in [['vanco', 'cefepime'], ['vanco', 'zosyn']]:
    N = ((df[abx1 + '_adm'] == 1) & (df[abx2 + '_adm'] == 1)).sum()
    mu = N/df.shape[0]*100.0
    print(f'  {N:5d} ({mu:4.1f}%) - {abx1} & {abx2}')
    

print('\nOther drug use on admission to ICU:')
for c in ['nsaid', 'diuretic']:
    N = df[c + '_adm'].sum()
    mu = N/df.shape[0]*100.0
    print(f'  {N:5d} ({mu:4.1f}%) - {c}')
    
df.head()

30251 patients.
Antibiotic use on admission to ICU:
   4404 (14.6%) - vanco
   1348 ( 4.5%) - cefepime
   3796 (12.5%) - zosyn

Concurrent antibiotic use:
    764 ( 2.5%) - vanco & cefepime
   2009 ( 6.6%) - vanco & zosyn

Other drug use on admission to ICU:
   7490 (24.8%) - nsaid
   3796 (12.5%) - diuretic


Unnamed: 0,patientunitstayid,unitdischargeoffset,age,gender,weight,height,BMI,BMI_group,apachescore,apache_group,...,zosyn_adm,zosyn_wk,nsaid_adm,nsaid_wk,diuretic_adm,diuretic_wk,creatinine,aki_48h,aki_7d,aki
0,3036317,3980,72,Male,84.8,185.4,25.0,overweight,51,51-60,...,0,0,1,1,0,0,1.22,0,0,0
1,3052627,4260,71,Male,57.51,172.7,19.0,normal,47,41-50,...,0,0,0,0,0,0,2.19,0,0,0
2,3054721,2495,37,Male,72.2,190.5,20.0,normal,53,51-60,...,0,0,0,0,0,0,0.57,0,1,1
3,3072232,555,> 89,Female,66.8,124.5,43.0,overweight,61,61-70,...,0,0,1,0,0,0,1.4,0,0,0
4,3079003,8825,88,Male,87.0,175.3,28.0,overweight,85,81-90,...,0,0,1,1,0,0,3.39,1,0,1


# Propensity matching

Define the dataframes used for (1) vanco only, (2) vanco + zosyn, and (3) vanco + cefepime.

### Vanco + Zosyn vs. Vanco + Cefepime

First, perform the comparison looking only at ICU admission drug administration.

In [6]:
vanco_zosyn = df[(df['vanco_adm'] == 1) & (df['zosyn_adm'] == 1) & (df['cefepime_adm'] == 0)]
vanco_cefepime = df[(df['vanco_adm'] == 1) & (df['zosyn_adm'] == 0) & (df['cefepime_adm'] == 1)]
utils.match_and_print_or(exposure=vanco_zosyn, control=vanco_cefepime, seed=842)

584 in control group.
1829 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		4	19
21-30		11	71
31-40		50	169
41-50		84	249
51-60		111	291
61-70		95	264
71-80		78	207
81-90		33	158
91-100		30	103
101-110		22	62
111-120		8	43
121-130		6	28
131-140		0	15
>140		9	10

Absolute Mean Difference of APACHE Score: 0.48956973045370944

=== Match groups on APACHE ===

Shape of treatment group: (541, 24)
Shape of control group: (541, 24)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		4	4
21-30		11	11
31-40		50	50
41-50		84	84
51-60		111	111
61-70		95	95
71-80		78	78
81-90		33	33
91-100		30	30
101-110		22	22
111-120		8	8
121-130		6	6
131-140		0	0
>140		9	9

Absolute Mean Difference of APACHE Score: 0.4990757855822494

=== Odds ratio of exposure ===

Diseased + Exposed: 180
Healthy + Exposed: 361
Diseased + Nonexp

Next, perform the same analysis requiring the administration to last at least from ICU admission to 48 hours post ICU admission.

In [7]:
idx0 = (df['vanco_adm'] == 1) & (df['zosyn_adm'] == 1) & (df['cefepime_adm'] == 0)
idx1 = (df['vanco_wk'] == 1) & (df['zosyn_wk'] == 1) & (df['cefepime_wk'] == 0)
vanco_zosyn = df.loc[idx0 & idx1, :]

idx0 = (df['vanco_adm'] == 1) & (df['zosyn_adm'] == 0) & (df['cefepime_adm'] == 1)
idx1 = (df['vanco_wk'] == 1) & (df['zosyn_wk'] == 0) & (df['cefepime_wk'] == 1)
vanco_cefepime = df.loc[idx0 & idx1, :]

utils.match_and_print_or(exposure=vanco_zosyn, control=vanco_cefepime, seed=34246)

218 in control group.
646 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		3	5
21-30		4	25
31-40		15	65
41-50		31	85
51-60		47	109
61-70		34	83
71-80		28	76
81-90		19	49
91-100		11	45
101-110		6	24
111-120		3	14
121-130		2	13
131-140		0	7
>140		4	4

Absolute Mean Difference of APACHE Score: 0.018919111872634176

=== Match groups on APACHE ===

Shape of treatment group: (207, 24)
Shape of control group: (207, 24)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		3	3
21-30		4	4
31-40		15	15
41-50		31	31
51-60		47	47
61-70		34	34
71-80		28	28
81-90		19	19
91-100		11	11
101-110		6	6
111-120		3	3
121-130		2	2
131-140		0	0
>140		4	4

Absolute Mean Difference of APACHE Score: 0.3333333333333286

=== Odds ratio of exposure ===

Diseased + Exposed: 63
Healthy + Exposed: 144
Diseased + Nonexposed: 57
Healthy + 

### Vanco vs. Vanco + Zosyn Analysis

In [8]:
vanco_only = df[(df['vanco_adm'] == 1) & (df['zosyn_adm'] == 0)]
vanco_zosyn = df[(df['vanco_adm'] == 1) & (df['zosyn_adm'] == 1)]
utils.match_and_print_or(exposure=vanco_zosyn, control=vanco_only, seed=5513)

2395 in control group.
2009 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		2	0
11-20		16	20
21-30		83	81
31-40		210	179
41-50		336	267
51-60		385	318
61-70		341	291
71-80		266	236
81-90		187	178
91-100		122	116
101-110		75	70
111-120		48	48
121-130		43	32
131-140		7	15
>140		24	10

Absolute Mean Difference of APACHE Score: -0.1668084661568372

=== Match groups on APACHE ===

Shape of treatment group: (1849, 24)
Shape of control group: (1849, 24)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		16	16
21-30		81	81
31-40		179	179
41-50		267	267
51-60		318	318
61-70		291	291
71-80		236	236
81-90		178	178
91-100		116	116
101-110		70	70
111-120		48	48
121-130		32	32
131-140		7	7
>140		10	10

Absolute Mean Difference of APACHE Score: 0.14656571119523676

=== Odds ratio of exposure ===

Diseased + Exposed: 627
Health

### Vanco vs. Vanco + Cefepime Analysis

In [9]:
# Vanco + Cefepime Analysis
vanco_only = df[(df['vanco_adm'] == 1) & (df['cefepime_adm'] == 0)]
vanco_cefepime = df[(df['vanco_adm'] == 1) & (df['cefepime_adm'] == 1)]
utils.match_and_print_or(exposure=vanco_cefepime, control=vanco_only, seed=543289)

3640 in control group.
764 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		2	0
11-20		31	5
21-30		143	21
31-40		329	60
41-50		501	102
51-60		565	138
61-70		510	122
71-80		395	107
81-90		312	53
91-100		195	43
101-110		115	30
111-120		83	13
121-130		65	10
131-140		22	0
>140		25	9

Absolute Mean Difference of APACHE Score: -1.2296067459474074

=== Match groups on APACHE ===

Shape of treatment group: (713, 24)
Shape of control group: (713, 24)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		5	5
21-30		21	21
31-40		60	60
41-50		102	102
51-60		138	138
61-70		122	122
71-80		107	107
81-90		53	53
91-100		43	43
101-110		30	30
111-120		13	13
121-130		10	10
131-140		0	0
>140		9	9

Absolute Mean Difference of APACHE Score: -0.25666199158484915

=== Odds ratio of exposure ===

Diseased + Exposed: 194
Healthy + Exposed: 51

# Other drugs

We compare the use of vancomycin with the use of other drugs (either NSAIDs or loop diuretics).

## Unmatched comparisons

### Vanco vs. Vanco + NSAID (on admission)

In [10]:
vanco_only = df[(df['vanco_adm'] == 1) & (df['nsaid_adm'] == 0)]
vanco_drug = df[(df['vanco_adm'] == 1) & (df['nsaid_adm'] == 1)]

print('\n=== Odds ratio of exposure ===\n')
diseased_exposed = len(vanco_drug[vanco_drug['aki'] == 1])
healthy_exposed = len(vanco_drug[vanco_drug['aki'] == 0])
diseased_nonexposed = len(vanco_only[vanco_only['aki'] == 1])
healthy_nonexposed = len(vanco_only[vanco_only['aki'] == 0])

utils.get_odds_ratio(diseased_exposed, healthy_exposed, diseased_nonexposed, healthy_nonexposed)


=== Odds ratio of exposure ===

Diseased + Exposed: 357
Healthy + Exposed: 767
Diseased + Nonexposed: 1026
Healthy + Nonexposed: 2254
Odds Ratio: 1.022537874455805
95% CI: (0.8839309752669485, 1.1828793581770654)


### Vanco vs. Vanco + NSAID (admission + week)

In [11]:
idx0 = (df['vanco_adm'] == 1) & (df['nsaid_adm'] == 0)
idx1 = (df['vanco_wk'] == 1) & (df['nsaid_wk'] == 0)
vanco_only = df.loc[idx0 & idx1, :]

idx0 = (df['vanco_adm'] == 1) & (df['nsaid_adm'] == 1)
idx1 = (df['vanco_wk'] == 1) & (df['nsaid_wk'] == 1)
vanco_drug = df.loc[idx0 & idx1, :]

print('\n=== Odds ratio of exposure ===\n')
diseased_exposed = len(vanco_drug[vanco_drug['aki'] == 1])
healthy_exposed = len(vanco_drug[vanco_drug['aki'] == 0])
diseased_nonexposed = len(vanco_only[vanco_only['aki'] == 1])
healthy_nonexposed = len(vanco_only[vanco_only['aki'] == 0])

utils.get_odds_ratio(diseased_exposed, healthy_exposed, diseased_nonexposed, healthy_nonexposed)


=== Odds ratio of exposure ===

Diseased + Exposed: 102
Healthy + Exposed: 261
Diseased + Nonexposed: 431
Healthy + Nonexposed: 963
Odds Ratio: 0.8731898551884152
95% CI: (0.6763060949671102, 1.1273896965856045)


### Vanco vs. Vanco + Loop diuretic (on admission)

In [12]:
vanco_only = df[(df['vanco_adm'] == 1) & (df['diuretic_adm'] == 0)]
vanco_drug = df[(df['vanco_adm'] == 1) & (df['diuretic_adm'] == 1)]

print('\n=== Odds ratio of exposure ===\n')
diseased_exposed = len(vanco_drug[vanco_drug['aki'] == 1])
healthy_exposed = len(vanco_drug[vanco_drug['aki'] == 0])
diseased_nonexposed = len(vanco_only[vanco_only['aki'] == 1])
healthy_nonexposed = len(vanco_only[vanco_only['aki'] == 0])

utils.get_odds_ratio(diseased_exposed, healthy_exposed, diseased_nonexposed, healthy_nonexposed)


=== Odds ratio of exposure ===

Diseased + Exposed: 684
Healthy + Exposed: 1325
Diseased + Nonexposed: 699
Healthy + Nonexposed: 1696
Odds Ratio: 1.2525321888412018
95% CI: (1.102507703176998, 1.4229713584427162)


In [13]:
idx0 = (df['vanco_adm'] == 1) & (df['diuretic_adm'] == 0)
idx1 = (df['vanco_wk'] == 1) & (df['diuretic_wk'] == 0)
vanco_only = df.loc[idx0 & idx1, :]

idx0 = (df['vanco_adm'] == 1) & (df['diuretic_adm'] == 1)
idx1 = (df['vanco_wk'] == 1) & (df['diuretic_wk'] == 1)
vanco_drug = df.loc[idx0 & idx1, :]

print('\n=== Odds ratio of exposure ===\n')
diseased_exposed = len(vanco_drug[vanco_drug['aki'] == 1])
healthy_exposed = len(vanco_drug[vanco_drug['aki'] == 0])
diseased_nonexposed = len(vanco_only[vanco_only['aki'] == 1])
healthy_nonexposed = len(vanco_only[vanco_only['aki'] == 0])

utils.get_odds_ratio(diseased_exposed, healthy_exposed, diseased_nonexposed, healthy_nonexposed)


=== Odds ratio of exposure ===

Diseased + Exposed: 248
Healthy + Exposed: 463
Diseased + Nonexposed: 281
Healthy + Nonexposed: 747
Odds Ratio: 1.4239179726831819
95% CI: (1.1583668881789158, 1.7503456060607097)


## Matching on APACHE diagnosis

Only looks at ICU admission.

### Vanco vs. Vanco + NSAID (admission)

In [14]:
vanco_only = df[(df['vanco_adm'] == 1) & (df['nsaid_adm'] == 0)]
vanco_zosyn = df[(df['vanco_adm'] == 1) & (df['nsaid_adm'] == 1)]
utils.match_and_print_or(exposure=vanco_zosyn, control=vanco_only, seed=64324)

3280 in control group.
1124 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		1	1
11-20		32	4
21-30		130	34
31-40		292	97
41-50		454	149
51-60		502	201
61-70		476	156
71-80		373	129
81-90		269	96
91-100		184	54
101-110		102	43
111-120		68	28
121-130		52	23
131-140		18	4
>140		25	9

Absolute Mean Difference of APACHE Score: -1.1374704092528418

=== Match groups on APACHE ===

Shape of treatment group: (1028, 24)
Shape of control group: (1028, 24)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		1	1
11-20		4	4
21-30		34	34
31-40		97	97
41-50		149	149
51-60		201	201
61-70		156	156
71-80		129	129
81-90		96	96
91-100		54	54
101-110		43	43
111-120		28	28
121-130		23	23
131-140		4	4
>140		9	9

Absolute Mean Difference of APACHE Score: -0.046692607003890885

=== Odds ratio of exposure ===

Diseased + Exposed: 332
Healthy + Exposed

### Vanco vs. Vanco + Loop diuretic (admission)

In [15]:
vanco_only = df[(df['vanco_adm'] == 1) & (df['diuretic_adm'] == 0)]
vanco_zosyn = df[(df['vanco_adm'] == 1) & (df['diuretic_adm'] == 1)]
utils.match_and_print_or(exposure=vanco_zosyn, control=vanco_only, seed=4557)

2395 in control group.
2009 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		2	0
11-20		16	20
21-30		83	81
31-40		210	179
41-50		336	267
51-60		385	318
61-70		341	291
71-80		266	236
81-90		187	178
91-100		122	116
101-110		75	70
111-120		48	48
121-130		43	32
131-140		7	15
>140		24	10

Absolute Mean Difference of APACHE Score: -0.1668084661568372

=== Match groups on APACHE ===

Shape of treatment group: (1849, 24)
Shape of control group: (1849, 24)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment
0-10		0	0
11-20		16	16
21-30		81	81
31-40		179	179
41-50		267	267
51-60		318	318
61-70		291	291
71-80		236	236
81-90		178	178
91-100		116	116
101-110		70	70
111-120		48	48
121-130		32	32
131-140		7	7
>140		10	10

Absolute Mean Difference of APACHE Score: 0.14656571119523676

=== Odds ratio of exposure ===

Diseased + Exposed: 627
Health