# Assessment of nephrotoxicity of vancomycin

This study aims to quantify the association between vancomycin and nephrotoxicity in a large, multi-center US database.
It matches patients who were admitted via the emergency department and received vancomycin on ICU admission against those who did not receive vancomycin on admission. Matching uses the APACHE-IV score component (which is in fact equivalent to the APACHE-III score).


## Definitions

* **drug on admission:** patient received a medication order between 12 hours before and 12 hours after ICU admission
* **baseline creatinine:** first creatinine value between 12 hours before and 12 hours after ICU admission
* **AKI:** per KDIGO guidelines using creatinine only; any instance of AKI between days 2 and 7 after ICU admission

The KDIGO creatinine criteria for AKI are: an increase of >= 50% from baseline within 7 days, or an absolute increase of >= 0.3 mg/dL in creatinine within 48 hours.
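As a rough illustration of the two creatinine criteria, the sketch below flags each measurement for a single stay. It is not the query behind `vanco.aki`: the function name, its arguments, and the choice to compare against the lowest value in the preceding 48 hours are all assumptions made for this example, and it assumes every measurement passed in falls within 7 days of baseline.

```python
import numpy as np

def kdigo_aki_flags(hours, creatinine, baseline):
    """Hypothetical helper: flag creatinine values meeting the KDIGO criteria.

    hours      -- measurement times in hours from ICU admission
    creatinine -- creatinine values (mg/dL), same length as hours
    baseline   -- baseline creatinine (mg/dL) for this stay
    """
    hours = np.asarray(hours, dtype=float)
    cr = np.asarray(creatinine, dtype=float)

    # 48-hour criterion: rise of >= 0.3 mg/dL over the lowest value
    # measured in the preceding 48 hours (assumed reference value).
    aki_48h = np.zeros(len(cr), dtype=bool)
    for i in range(len(cr)):
        in_window = (hours < hours[i]) & (hours >= hours[i] - 48)
        if in_window.any():
            rise = np.round(cr[i] - cr[in_window].min(), 2)  # avoid float noise
            aki_48h[i] = rise >= 0.3

    # 7-day criterion: value at least 50% above baseline.
    aki_7d = cr >= 1.5 * baseline
    return aki_48h, aki_7d

# toy measurements, not study data
aki_48h, aki_7d = kdigo_aki_flags([0, 24, 48, 72], [1.0, 1.1, 1.4, 1.6], baseline=1.0)
```

Restricting the flags to days 2-7 after admission, as in the AKI definition above, would be an additional filter on `hours`.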

## 0. Setup

In [1]:
# Must install pandas-gbq. Link: https://pandas-gbq.readthedocs.io/en/latest/install.html#pip
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import psycopg2

# helper functions stored in local py file
import utils

# Helper function to read data from database
conn_info = "host='localhost' dbname='eicu' user='alistairewj' port=5647"
con = psycopg2.connect(conn_info)

def run_query(query):
    return pd.read_sql_query(query, con)

# 1. Summarize cohort

In [2]:
co = run_query("select * from vanco.cohort")

In [3]:
print('== EXCLUSIONS - TOTAL ==')
N = co.shape[0]
print(f'{N:6d} unique unit stays.')
for c in co.columns:
    if c.startswith('exclude_'):
        N = co[c].sum()
        mu = co[c].mean()*100.0
        print(f'  {N:6d} ({mu:4.1f}%) - {c}')
        
print('\n== EXCLUSIONS - SEQUENTIAL ==')
N = co.shape[0]
print(f'{N:7d} unique unit stays.')
idx = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        # index patients removed by this exclusion
        idxRem = (co[c]==1)
        # calculate number of patients being removed, after applying prev excl
        N = (idx & idxRem).sum()
        mu = N/co.shape[0]*100.0
        idx = idx & (~idxRem)
        n_rem = idx.sum()
        
        print(f'- {N:5d} = {n_rem:6d} ({mu:4.1f}% removed) - {c}')

== EXCLUSIONS - TOTAL ==
3336449 unique unit stays.
   15630 ( 0.5%) - exclude_before_2005
  456518 (13.7%) - exclude_sdu
  124485 ( 3.7%) - exclude_short_stay
  1990155 (59.6%) - exclude_non_ed_admit
  296774 ( 8.9%) - exclude_secondary_stay
  514249 (15.4%) - exclude_no_med_interface
  121025 ( 3.6%) - exclude_dialysis_chronic
  324262 ( 9.7%) - exclude_dialysis_first_week
  548095 (16.4%) - exclude_cr_missing_baseline
  1281815 (38.4%) - exclude_cr_missing_followup

== EXCLUSIONS - SEQUENTIAL ==
3336449 unique unit stays.
- 15630 = 3320819 ( 0.5% removed) - exclude_before_2005
- 456328 = 2864491 (13.7% removed) - exclude_sdu
- 85391 = 2779100 ( 2.6% removed) - exclude_short_stay
- 1531330 = 1247770 (45.9% removed) - exclude_non_ed_admit
- 270504 = 977266 ( 8.1% removed) - exclude_secondary_stay
- 166876 = 810390 ( 5.0% removed) - exclude_no_med_interface
- 26287 = 784103 ( 0.8% removed) - exclude_dialysis_chronic
- 54367 = 729736 ( 1.6% removed) - exclude_dialysis_first_week
- 37592

# 3. Analysis


## Get data from the database

In [4]:
# covariates from APACHE table
dem = run_query("SELECT dem.* FROM vanco.demographics dem")

# vancomycin drug doses
v = run_query("SELECT v.* FROM vanco.vanco v")

# AKI
query = """
SELECT 
  patientunitstayid
  , chartoffset
  , creatinine, creatinine_reference, creatinine_baseline
  , aki_48h, aki_7d
FROM vanco.aki
"""
aki = run_query(query)

## Collapse vancomycin data

The `v` dataframe has every vancomycin administration for a patient.

Here we collapse it into two binary columns:

* 'vanco_adm' - vancomycin was administered on ICU admission (between hours -12 and 12)
* 'vanco_wk' - vancomycin was administered sometime between 2-7 days after ICU admission

In [5]:
v_df = utils.extract_adm_and_wk(v, 'vanco')
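The source of `utils.extract_adm_and_wk` is not shown in this notebook. A minimal sketch of the collapse it describes might look like the following; the offset column name (`drugstartoffset`, in minutes from ICU admission) and the reading of "2-7 days" as hours 48 to 168 are assumptions for illustration only.

```python
import pandas as pd

def extract_adm_and_wk_sketch(dose_df, prefix, offset_col='drugstartoffset'):
    """Hypothetical stand-in for utils.extract_adm_and_wk.

    Collapses one row per administration into one row per stay with two
    binary flags: <prefix>_adm and <prefix>_wk.
    """
    hours = dose_df[offset_col] / 60.0  # assumed: offsets stored in minutes
    flags = pd.DataFrame({
        'patientunitstayid': dose_df['patientunitstayid'],
        # administered between hours -12 and 12 of ICU admission
        f'{prefix}_adm': ((hours >= -12) & (hours <= 12)).astype(int),
        # administered between days 2 and 7 (assumed: hours 48 to 168)
        f'{prefix}_wk': ((hours >= 48) & (hours <= 168)).astype(int),
    })
    # a stay is flagged if any of its administrations falls in the window
    return flags.groupby('patientunitstayid').max()
```

The result is indexed by `patientunitstayid`, which matches how `v_df.index` is used in the next cell.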

Print out the proportion of patients with/without vancomycin after exclusions.

In [6]:
# get patient unit stay ID after applying exclusions
idxKeep = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        idxKeep = idxKeep & (co[c]==0)
        
ptid = co.loc[idxKeep, 'patientunitstayid'].values
n_pt = len(ptid)
# limit to those in vancomycin dataframe
ptid = [x for x in ptid if x in v_df.index]

N = len(ptid)
print(f'{n_pt} stays after exclusions.')
for c in v_df.columns:
    N = v_df.loc[ptid, c].sum()
    mu = N / n_pt * 100.0
    print(f'  {N} ({mu:3.1f}%) with {c}')
    
# if they have both adm and wk, then the row-wise sum must be greater than 1
N = (v_df.loc[ptid, :] == 1).sum(axis=1)
N = (N>1).sum()
mu = N / n_pt * 100.0
print(f'  {N} ({mu:3.1f}%) with both')

394373 stays after exclusions.
  59935 (15.2%) with vanco_adm
  56968 (14.4%) with vanco_wk
  31769 (8.1%) with both


## Create a dataframe for analysis

The below code block:

* Applies exclusions
* Adds vancomycin binary flags
* Adds AKI flag

In [7]:
# drop exclusions
idxKeep = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        idxKeep = idxKeep & (co[c]==0)

# combine data into single dataframe
df = co.loc[idxKeep, ['patientunitstayid']].merge(dem, how='inner', on='patientunitstayid')

# add vanco administration
df = df.merge(v_df, how='left', on='patientunitstayid')
# if ptid missing in vanco dataframe, then no vanco was received
# therefore impute 0 (assign back rather than using inplace on a column,
# which is unreliable in recent pandas versions)
for c in v_df.columns:
    df[c] = df[c].fillna(0).astype(int)

aki_grp = aki.groupby('patientunitstayid')[['creatinine', 'aki_48h', 'aki_7d']].max()
aki_grp.reset_index(inplace=True)
df = df.merge(aki_grp, how='inner', on='patientunitstayid')

df['aki'] = ((df['aki_48h'] == 1) | (df['aki_7d'] == 1)).astype(int)
print(df.shape)
df.head()

(394373, 16)


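| | patientunitstayid | unitdischargeoffset | age | gender | weight | height | bmi | bmi_group | apachescore | apache_group | vanco_adm | vanco_wk | creatinine | aki_48h | aki_7d | aki |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2849416 | 16284 | 66 | Male | 60.2 | 170.0 | 21.0 | normal | 89.0 | 81-90 | 0 | 0 | 1.2 | 1 | 1 | 1 |
| 1 | 2852869 | 2262 | 55 | Male | 75.0 | 183.0 | 22.0 | normal | 25.0 | 21-30 | 0 | 0 | 1.7 | 0 | 1 | 1 |
| 2 | 2853177 | 2641 | 40 | Male | 96.6 | 173.0 | 32.0 | overweight | 43.0 | 41-50 | 0 | 0 | 3.3 | 0 | 1 | 1 |
| 3 | 2855935 | 7635 | 88 | Male | 64.1 | 160.0 | 25.0 | overweight | | | 0 | 0 | 1.9 | 0 | 0 | 0 |
| 4 | 2856186 | 53333 | 22 | Male | 115.6 | 183.0 | 35.0 | overweight | 115.0 | 111-120 | 0 | 0 | 2.1 | 1 | 0 | 1 |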


## Propensity matching

In [8]:
# Vanco + No Vanco Analysis
print('\n=== Cross-tabulation of vanco on admission vs. vanco during the week (days 2-7) ===')
display(pd.crosstab(df['vanco_adm'], df['vanco_wk'], margins=True))
print('Normalized:')
display(pd.crosstab(df['vanco_adm'], df['vanco_wk'], margins=True, normalize=True))


=== Cross-tabulation of vanco on admission vs. vanco during the week (days 2-7) ===


| vanco_adm | vanco_wk = 0 | vanco_wk = 1 | All |
|---|---|---|---|
| 0 | 309239 | 25199 | 334438 |
| 1 | 28166 | 31769 | 59935 |
| All | 337405 | 56968 | 394373 |


Normalized:


| vanco_adm | vanco_wk = 0 | vanco_wk = 1 | All |
|---|---|---|---|
| 0 | 0.784128 | 0.063896 | 0.848025 |
| 1 | 0.07142 | 0.080556 | 0.151975 |
| All | 0.855548 | 0.144452 | 1.0 |


### Primary analysis

* **exposure**: treated with vancomycin on admission to the ICU
* **control**: *not* treated with vancomycin, for the first 7 days of stay, starting at unit admit time
* **excluded**: patients treated with vanco later in the ICU stay, but not on admission
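The matched output further down shows identical control and treatment counts within each APACHE group, which is consistent with 1:1 matching inside each `apache_group` bin. The implementation of `utils.match_and_print_or` is not shown here, so the following is only a sketch of that idea; the function name and the random sampling strategy are assumptions.

```python
import pandas as pd

def match_on_apache_group(exposure, control, seed=0, group_col='apache_group'):
    """Hypothetical 1:1 matching within APACHE score bins.

    Stays with a missing group are dropped from the exposure set; for each
    bin, equal numbers of exposed and control stays are sampled at random.
    """
    matched_exp, matched_ctl = [], []
    for grp, exp_grp in exposure.dropna(subset=[group_col]).groupby(group_col):
        ctl_grp = control[control[group_col] == grp]
        n = min(len(exp_grp), len(ctl_grp))
        if n == 0:
            continue  # no counterpart available in this bin
        matched_exp.append(exp_grp.sample(n=n, random_state=seed))
        matched_ctl.append(ctl_grp.sample(n=n, random_state=seed))
    return pd.concat(matched_exp), pd.concat(matched_ctl)
```

Because controls far outnumber exposed stays in every bin of the unmatched data, this kind of sampling would keep all exposed stays with a valid APACHE group and discard the surplus controls.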

In [9]:
# INITIAL VANCO vs. NO VANCO
novanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 0)]
vanco = df[(df['vanco_adm'] == 1)]
utils.match_and_print_or(exposure=vanco, control=novanco, seed=12938)

309239 in control group.
59935 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment	Control (%)	Treat (%)
0-10		497	33	0.2	0.1
11-20		6481	490	2.1	0.8
21-30		22489	1996	7.3	3.3
31-40		38328	4623	12.4	7.7
41-50		44218	7042	14.3	11.7
51-60		40071	8158	13.0	13.6
61-70		29186	7752	9.4	12.9
71-80		18299	6273	5.9	10.5
81-90		10659	4486	3.4	7.5
91-100		6286	2935	2.0	4.9
101-110		3846	1820	1.2	3.0
111-120		2203	1297	0.7	2.2
121-130		1387	802	0.4	1.3
131-140		732	442	0.2	0.7
>140		731	482	0.2	0.8

Absolute Mean Difference of APACHE Score: -12.661493025285132

=== Match groups on APACHE ===

Shape of treatment group: (48631, 16)
Shape of control group: (48631, 16)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment	Control (%)	Treat (%)
0-10		33	33	0.1	0.1
11-20		490	490	1.0	1.0
21-30		1996	1996	4.1	4.1
31-40		4623	4623	9.5	9.5
41-50		7042	7042	14.5	

### Secondary analysis

Slight alterations in how the exposure and control groups are defined.

* **exposure**: treated with vancomycin on ICU admission *and* during the week (days 2-7)
* **control**: *not* treated with vancomycin, for the first 7 days of stay, starting at unit admit time
* **excluded**: patients treated with vanco later in the ICU stay, but not on admission
* **excluded**: (also) patients treated with vanco on admission, but not later in the week
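The helper name `match_and_print_or` suggests that each comparison reports an odds ratio for AKI between the matched groups. How the helper computes it is not shown, so the sketch below uses the standard 2x2 odds ratio with a Woolf (log-based) 95% confidence interval as one plausible approach. The function name is hypothetical and the counts in the usage line are invented for illustration; they are not study results.

```python
import math

def odds_ratio_ci(exposed_aki, exposed_total, control_aki, control_total, z=1.96):
    """Odds ratio of AKI (exposed vs. control) with a Woolf confidence interval.

    Assumes all four cells of the 2x2 table are non-zero.
    """
    a = exposed_aki                   # exposed, AKI
    b = exposed_total - exposed_aki   # exposed, no AKI
    c = control_aki                   # control, AKI
    d = control_total - control_aki   # control, no AKI

    or_ = (a * d) / (b * c)
    # standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# invented counts, for illustration only
or_, lower, upper = odds_ratio_ci(30, 100, 20, 100)
```

With matched groups of equal size, `exposed_total` and `control_total` would both equal the number of matched pairs.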

In [10]:
# Vanco on admission AND during the week (days 2-7) vs. no vanco
novanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 0)]
vanco = df[(df['vanco_wk'] == 1) & (df['vanco_adm'] == 1)]
utils.match_and_print_or(exposure=vanco, control=novanco, seed=12301)

309239 in control group.
31769 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment	Control (%)	Treat (%)
0-10		497	19	0.2	0.1
11-20		6481	268	2.1	0.8
21-30		22489	1037	7.3	3.3
31-40		38328	2409	12.4	7.6
41-50		44218	3742	14.3	11.8
51-60		40071	4372	13.0	13.8
61-70		29186	4164	9.4	13.1
71-80		18299	3501	5.9	11.0
81-90		10659	2558	3.4	8.1
91-100		6286	1705	2.0	5.4
101-110		3846	1064	1.2	3.3
111-120		2203	766	0.7	2.4
121-130		1387	488	0.4	1.5
131-140		732	284	0.2	0.9
>140		731	290	0.2	0.9

Absolute Mean Difference of APACHE Score: -13.660911951997939

=== Match groups on APACHE ===

Shape of treatment group: (26667, 16)
Shape of control group: (26667, 16)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment	Control (%)	Treat (%)
0-10		19	19	0.1	0.1
11-20		268	268	1.0	1.0
21-30		1037	1037	3.9	3.9
31-40		2409	2409	9.0	9.0
41-50		3742	3742	14.0	1

In [11]:
# Vanco on admission only (none during days 2-7) vs. no vanco
novanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 0)]
vanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 1)]
utils.match_and_print_or(exposure=vanco, control=novanco, seed=4765)

309239 in control group.
28166 in exposure group.

=== APACHE distribution, unmatched data ===

Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment	Control (%)	Treat (%)
0-10		497	14	0.2	0.0
11-20		6481	222	2.1	0.8
21-30		22489	959	7.3	3.4
31-40		38328	2214	12.4	7.9
41-50		44218	3300	14.3	11.7
51-60		40071	3786	13.0	13.4
61-70		29186	3588	9.4	12.7
71-80		18299	2772	5.9	9.8
81-90		10659	1928	3.4	6.8
91-100		6286	1230	2.0	4.4
101-110		3846	756	1.2	2.7
111-120		2203	531	0.7	1.9
121-130		1387	314	0.4	1.1
131-140		732	158	0.2	0.6
>140		731	192	0.2	0.7

Absolute Mean Difference of APACHE Score: -11.446612507453551

=== Match groups on APACHE ===

Shape of treatment group: (21964, 16)
Shape of control group: (21964, 16)
Counts of Apache Scores for Control Group and Treatment Group

ApacheGroups	Control	Treatment	Control (%)	Treat (%)
0-10		14	14	0.1	0.1
11-20		222	222	1.0	1.0
21-30		959	959	4.4	4.4
31-40		2214	2214	10.1	10.1
41-50		3300	3300	15.0	15.0

In [12]:
con.close()