# Assessment of nephrotoxicity of vancomycin

The aim of this study was to quantify the association between nephrotoxicity and vancomycin in a large, multi-center US database.
The study matches patients who were admitted to the emergency department and received vancomycin on ICU admission with patients who did not receive vancomycin on admission. Matching uses the APACHE-IV score component (which is in fact equivalent to the APACHE-III score).


## Definitions

* **drug on admission:** patient received a medication order between 12 hours before and 12 hours after admission to the ICU
* **baseline creatinine:** first creatinine value between 12 hours before and 12 hours after admission to the ICU
* **AKI:** following KDIGO guidelines using only creatinine, any instance of AKI between 2 and 7 days after ICU admission

The KDIGO creatinine criteria for AKI are: an increase of >= 50% from baseline within 7 days, or an absolute increase of >= 0.3 mg/dL in creatinine within 48 hours.
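
The AKI flags used in this notebook are precomputed in the `aki` table, and the code that builds them is not shown here. As a rough illustration of the two creatinine criteria, the sketch below flags a single stay. The function name, the `hours` and `creatinine` column names, the mg/dL unit, and the choice to compare each value against the lowest value in the preceding 48 hours are all assumptions made for this example, not the study's actual implementation.

```python
import pandas as pd

def kdigo_creatinine_flags(cr, baseline, start_h=48, end_h=168):
    """Flag creatinine-only KDIGO AKI for a single ICU stay.

    cr: DataFrame with hypothetical columns 'hours' (since ICU admission)
        and 'creatinine' (assumed mg/dL).
    baseline: baseline creatinine for the stay (assumed mg/dL).
    Only measurements taken between start_h and end_h can trigger a flag.
    """
    cr = cr.sort_values('hours')
    candidates = cr[cr['hours'].between(start_h, end_h)]

    # Criterion 1: creatinine at least 1.5x baseline (a >= 50% increase)
    aki_7d = bool((candidates['creatinine'] >= 1.5 * baseline).any())

    # Criterion 2: rise of >= 0.3 relative to the lowest value
    # measured in the preceding 48 hours
    aki_48h = False
    for _, row in candidates.iterrows():
        prior = cr[cr['hours'].between(row['hours'] - 48, row['hours'])]
        if row['creatinine'] - prior['creatinine'].min() >= 0.3:
            aki_48h = True
            break

    return {'aki_48h': aki_48h, 'aki_7d': aki_7d}

# Toy stays (made-up values, baseline of 1.0 in both cases)
stay_a = pd.DataFrame({'hours': [10, 50, 100], 'creatinine': [1.0, 1.4, 1.2]})
stay_b = pd.DataFrame({'hours': [0, 60, 150], 'creatinine': [1.0, 1.1, 1.6]})
flags_a = kdigo_creatinine_flags(stay_a, baseline=1.0)
flags_b = kdigo_creatinine_flags(stay_b, baseline=1.0)
```

In this toy data, `stay_a` meets only the 48-hour criterion (a rise of 0.4 within 40 hours), while `stay_b` meets only the relative criterion (1.6 against a baseline of 1.0).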

## 0. Setup

In [None]:
# Must install pandas-gbq. Link: https://pandas-gbq.readthedocs.io/en/latest/install.html#pip
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# helper functions stored in local py file
import utils

project_id='lcp-internal'

# Helper function to read data from BigQuery into pandas dataframes.
def run_query(query):
    # `verbose` is deprecated and no longer accepted by recent pandas versions,
    # so it is not passed here
    return pd.io.gbq.read_gbq(query,
                              project_id=project_id,
                              dialect='standard')

# 1. Summarize cohort

In [None]:
query = """
select *
from `lcp-internal.vanco.cohort`
"""
co = run_query(query)

In [None]:
print('== EXCLUSIONS - TOTAL ==')
N = co.shape[0]
print(f'{N:6d} unique unit stays.')
for c in co.columns:
    if c.startswith('exclude_'):
        N = co[c].sum()
        mu = co[c].mean()*100.0
        print(f'  {N:6d} ({mu:4.1f}%) - {c}')
        
print('\n== EXCLUSIONS - SEQUENTIAL ==')
N = co.shape[0]
print(f'{N:7d} unique unit stays.')
idx = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        # index patients removed by this exclusion
        idxRem = (co[c]==1)
        # calculate number of patients being removed, after applying prev excl
        N = (idx & idxRem).sum()
        mu = N/co.shape[0]*100.0
        idx = idx & (~idxRem)
        n_rem = idx.sum()
        
        print(f'- {N:5d} = {n_rem:6d} ({mu:4.1f}% removed) - {c}')

# 2. Analysis


## Get data from BigQuery

In [None]:
# covariates from APACHE table
query = """
SELECT dem.*
FROM `hst-953-2018.team_i.demographics` dem
"""
dem = run_query(query)

# vancomycin drug doses
query = """
SELECT v.*
FROM `lcp-internal.vanco.vanco` v
"""
v = run_query(query)

# AKI
query = """
SELECT 
  patientunitstayid
  , chartoffset
  , creatinine, creatinine_reference, creatinine_baseline
  , aki_48h, aki_7d
FROM `lcp-internal.vanco.aki`
"""
aki = run_query(query)

## Collapse vancomycin data

The `v` dataframe has every vancomycin administration for a patient.

Here we collapse it into two binary columns:

* 'vanco_adm' - vancomycin was administered on ICU admission (between hours -12 and 12)
* 'vanco_wk' - vancomycin was administered sometime between 2-7 days after ICU admission
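
The helper `utils.extract_adm_and_wk` lives in the local `utils` module and is not shown in this notebook. A minimal sketch of the collapse described above might look like the following. It assumes one row per administration, a hypothetical `drugstartoffset` column holding minutes from ICU admission, and reads "2-7 days" as hours 48 to 168; the function name `collapse_drug_flags` is invented for this example.

```python
import pandas as pd

def collapse_drug_flags(drug_df, prefix, offset_col='drugstartoffset'):
    """Collapse per-administration rows into per-stay binary flags."""
    # Assumption: offsets are minutes relative to ICU admission
    hours = drug_df[offset_col] / 60.0
    flags = pd.DataFrame({
        'patientunitstayid': drug_df['patientunitstayid'],
        # administered between hours -12 and 12
        f'{prefix}_adm': hours.between(-12, 12).astype(int),
        # administered between day 2 and day 7 (hours 48 to 168)
        f'{prefix}_wk': hours.between(48, 168).astype(int),
    })
    # A stay is flagged if any of its administrations falls in the window
    return flags.groupby('patientunitstayid').max()

# Toy data: stay 1 has doses at -1 h and ~67 h, stay 2 at ~83 h, stay 3 at ~333 h
example = pd.DataFrame({
    'patientunitstayid': [1, 1, 2, 3],
    'drugstartoffset': [-60, 4000, 5000, 20000],
})
example_flags = collapse_drug_flags(example, 'vanco')
```

The result is indexed by `patientunitstayid`, which matches how `v_df` is used later (`x in v_df.index` and `v_df.loc[ptid, c]`).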

In [None]:
v_df = utils.extract_adm_and_wk(v, 'vanco')

Print out the proportion of patients with/without vancomycin after exclusions.

In [None]:
# get patient unit stay ID after applying exclusions
idxKeep = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        idxKeep = idxKeep & (co[c]==0)
        
ptid = co.loc[idxKeep, 'patientunitstayid'].values
n_pt = len(ptid)
# limit to those in vancomycin dataframe
ptid = [x for x in ptid if x in v_df.index]

N = len(ptid)
print(f'{n_pt} stays after exclusions.')
for c in v_df.columns:
    N = v_df.loc[ptid, c].sum()
    mu = N / n_pt * 100.0
    print(f'  {N} ({mu:3.1f}%) with {c}')
    
# if they have both the adm and wk flags, the row-wise sum must be greater than 1
N = (v_df.loc[ptid, :] == 1).sum(axis=1)
N = (N>1).sum()
mu = N / n_pt * 100.0
print(f'  {N} ({mu:3.1f}%) with both')

## Create a dataframe for analysis

The code block below:

* Applies exclusions
* Adds vancomycin binary flags
* Adds AKI flag

In [None]:
# drop exclusions
idxKeep = co['patientunitstayid'].notnull()
for c in co.columns:
    if c.startswith('exclude_'):
        idxKeep = idxKeep & (co[c]==0)

# combine data into single dataframe
df = co.loc[idxKeep, ['patientunitstayid']].merge(dem, how='inner', on='patientunitstayid')

# add vanco administration
df = df.merge(v_df, how='left', on='patientunitstayid')
# if ptid missing in vanco dataframe, then no vanco was received
# therefore impute 0
for c in v_df.columns:
    # assign back rather than calling fillna(inplace=True) on a column,
    # which may not modify df under pandas copy-on-write
    df[c] = df[c].fillna(0).astype(int)

aki_grp = aki.groupby('patientunitstayid')[['creatinine', 'aki_48h', 'aki_7d']].max()
aki_grp.reset_index(inplace=True)
df = df.merge(aki_grp, how='inner', on='patientunitstayid')

df['aki'] = ((df['aki_48h'] == 1) | (df['aki_7d'] == 1)).astype(int)
print(df.shape)
df.head()

## Propensity matching

In [None]:
# Vanco + No Vanco Analysis
print('\n=== Cross-tabulation of vanco on admission vs. vanco during the week (days 2-7) ===')
display(pd.crosstab(df['vanco_adm'], df['vanco_wk'], margins=True))
print('Normalized:')
display(pd.crosstab(df['vanco_adm'], df['vanco_wk'], margins=True, normalize=True))

### Primary analysis

* exposure: treated with vancomycin on admission to the ICU
* control: *not* treated with vancomycin, for the first 7 days of stay, starting at unit admit time
* excluded
  * patients treated with vanco later in the ICU stay, but not on admission

In [None]:
# INITIAL VANCO vs. NO VANCO
novanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 0)]
vanco = df[(df['vanco_adm'] == 1)]
utils.match_and_print_or(exposure=vanco, control=novanco, seed=12938)
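
The helper `utils.match_and_print_or` is defined in the local `utils` module and is not shown in this notebook, so its exact matching procedure is not documented here. As a rough illustration of the general idea only, the sketch below performs greedy 1:1 nearest-neighbour matching on a single severity score, without replacement, and then computes an unadjusted odds ratio on the matched sample. The `apachescore` column name, the function names, and the toy data are assumptions made for this example.

```python
import numpy as np
import pandas as pd

def match_nearest(exposure, control, score_col='apachescore', seed=0):
    """Greedy 1:1 nearest-neighbour match on score_col, without replacement."""
    rng = np.random.default_rng(seed)
    # Visit exposed patients in random order so ties are not order-dependent
    order = rng.permutation(len(exposure))
    available = control[score_col].copy()
    pairs = []
    for i in order:
        if available.empty:
            break
        score = exposure[score_col].iloc[i]
        # index label of the closest remaining control
        j = (available - score).abs().idxmin()
        pairs.append((exposure.index[i], j))
        available = available.drop(j)
    return pairs

def odds_ratio(exposed_outcome, control_outcome):
    """Unadjusted odds ratio from two binary outcome vectors."""
    a = int((exposed_outcome == 1).sum())
    b = int((exposed_outcome == 0).sum())
    c = int((control_outcome == 1).sum())
    d = int((control_outcome == 0).sum())
    # Undefined when b or c is zero; a real analysis would need to handle that
    return (a * d) / (b * c)

# Toy data (made-up values)
toy_exposed = pd.DataFrame({'apachescore': [50, 60, 70, 80], 'aki': [1, 1, 0, 1]})
toy_control = pd.DataFrame({'apachescore': [49, 61, 72, 79, 30, 90],
                            'aki': [0, 1, 0, 0, 0, 1]})

pairs = match_nearest(toy_exposed, toy_control, seed=12938)
matched_control = toy_control.loc[[j for _, j in pairs]]
toy_or = odds_ratio(toy_exposed['aki'], matched_control['aki'])
```

In the toy data every exposed patient has a distinct closest control, so the matched set is the same for any seed. With real data, the visiting order (and therefore the seed) can change which controls are selected, which is one reason to fix a seed as in the call above.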

### Secondary analysis

Slight alterations in how the exposure and control groups are defined.

* exposure: treated with vancomycin on admission to the ICU *and* during the week (days 2-7)
* control: *not* treated with vancomycin, for the first 7 days of stay, starting at unit admit time
* excluded
  * patients treated with vanco later in the ICU stay, but not on admission
  * patients treated with vanco on admission, but not later in the week

In [None]:
# VANCO ON ADMISSION AND DURING THE WEEK vs. NO VANCO
novanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 0)]
vanco = df[(df['vanco_wk'] == 1) & (df['vanco_adm'] == 1)]
utils.match_and_print_or(exposure=vanco, control=novanco)

In [None]:
# VANCO ON ADMISSION ONLY (none during days 2-7) vs. NO VANCO
novanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 0)]
vanco = df[(df['vanco_wk'] == 0) & (df['vanco_adm'] == 1)]
utils.match_and_print_or(exposure=vanco, control=novanco)