# Inverse Probability of Treatment Weights
In the last tutorial, we were interested in the average treatment effect. We will now switch to a slightly different target estimand, the average treatment effect in the treated. It is defined as
$$E[Y^{a=1}|A=1] - E[Y^{a=0}|A=1]$$
By causal consistency, this reduces to 
$$E[Y|A=1] - E[Y^{a=0}|A=1]$$
which means we only need a slightly weaker version of conditional exchangeability, namely that $Y^{a=0} \amalg A | L$.

Weights used to estimate this effect are generally referred to as standardized mortality ratio (SMR) weights. Whichever weights we use, the important thing is to keep the target estimand of the study in mind.

## Standardized Mortality Ratio
The SMR weights are slightly different in form. Those who are treated ($A=1$) receive a weight of 1, so their contribution to the pseudo-population is unchanged. We only need to re-weight the untreated ($A=0$). For the untreated, the unstabilized weights take the following form
$$\frac{\widehat{\Pr}(A=1|L=l)}{\widehat{\Pr}(A=0|L=l)}$$
Technically, these are inverse odds weights, but I will ignore these semantics. Stabilized weights look like
$$\frac{\widehat{\Pr}(A=1|L=l)}{\widehat{\Pr}(A=0|L=l)} \frac{\widehat{\Pr}(A=0)}{\widehat{\Pr}(A=1)}$$
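The two formulas above can be sketched in a few lines of NumPy. This is only an illustration and not zEpid's internal code: the function name `smr_weights` and the argument `ps` (an array of estimated $\widehat{\Pr}(A=1|L=l)$) are hypothetical names chosen for this example.

```python
import numpy as np

def smr_weights(a, ps, stabilized=False):
    """Weights that standardize to the treated (hypothetical helper).

    a  : array of 0/1 treatment indicators
    ps : array of estimated Pr(A=1 | L) for each individual (assumed given)
    """
    a = np.asarray(a)
    ps = np.asarray(ps, dtype=float)
    odds = ps / (1 - ps)                 # Pr(A=1|L) / Pr(A=0|L)
    if stabilized:
        p_treated = a.mean()             # marginal Pr(A=1)
        odds = odds * (1 - p_treated) / p_treated
    # treated keep a weight of 1; untreated get the (stabilized) odds
    return np.where(a == 1, 1.0, odds)

a = np.array([1, 1, 0, 0])
ps = np.array([0.8, 0.5, 0.5, 0.2])
w = smr_weights(a, ps)
w_stab = smr_weights(a, ps, stabilized=True)
```

In this toy input half of the individuals are treated, so the stabilizing factor equals 1 and both sets of weights coincide.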

For the average effect of the treatment in the untreated, we construct weights using a similar approach.

To motivate our example, we will use a simulated data set included with *zEpid*. In the data set, we have a cohort of HIV-positive individuals. We are interested in the sample average treatment effect of antiretroviral therapy (ART) on all-cause mortality at 45 weeks. Based on substantive background knowledge, we believe that the treated and untreated populations are exchangeable conditional on gender, age, CD4 T-cell count, and detectable viral load.

In [1]:
%matplotlib inline

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import zepid
from zepid import load_sample_data, spline
from zepid.causal.ipw import IPTW

print(zepid.__version__)

0.9.0


In [2]:
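# Load the time-fixed version of the sample data
df = load_sample_data(False)
df.info()

# Restricted quadratic splines (3 knots) for age and baseline CD4 count
df[['age_rs1', 'age_rs2']] = spline(df, 'age0', n_knots=3, term=2, restricted=True)
df[['cd4_rs1', 'cd4_rs2']] = spline(df, 'cd40', n_knots=3, term=2, restricted=True)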

<class 'pandas.core.frame.DataFrame'>
Int64Index: 547 entries, 0 to 546
Data columns (total 9 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   id        547 non-null    int64  
 1   male      547 non-null    int64  
 2   age0      547 non-null    int64  
 3   cd40      547 non-null    int64  
 4   dvl0      547 non-null    int64  
 5   art       547 non-null    int64  
 6   dead      517 non-null    float64
 7   t         547 non-null    float64
 8   cd4_wk45  460 non-null    float64
dtypes: float64(3), int64(6)
memory usage: 42.7 KB


## Average Treatment Effect in the Treated
To start, we will estimate the average treatment effect in the treated. We can do that by using `IPTW` and specifying the option `standardize='exposed'`, which will calculate the appropriate weights for our target estimand. 

In [3]:
iptw = IPTW(df.drop(columns='cd4_wk45'), treatment='art', outcome='dead', standardize='exposed')



We then calculate the weights and fit the marginal structural model, following the same process detailed in the previous IPTW tutorial. The results are below.

In [4]:
iptw.treatment_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0', 
                     stabilized=False, print_results=False)
iptw.marginal_structural_model('art')
iptw.fit()
iptw.summary()



              Inverse Probability of Treatment Weights                
Treatment:        art             No. Observations:     547                 
Outcome:          dead            No. Missing Outcome:  30                  
g-Model:          Logistic        Missing Model:        None                
Risk Difference
----------------------------------------------------------------------
              RD  SE(RD)  95%LCL  95%UCL
labels                                  
Intercept  0.221   0.025   0.172   0.269
art       -0.091   0.046  -0.180  -0.002
----------------------------------------------------------------------
Risk Ratio
              RR  SE(log(RR))  95%LCL  95%UCL
labels                                       
Intercept  0.221        0.112   0.177   0.275
art        0.588        0.315   0.317   1.092
----------------------------------------------------------------------
Odds Ratio
              OR  SE(log(OR))  95%LCL  95%UCL
labels                                       
Interce

### Stabilized
We can also calculate stabilized weights. Here `stabilized=False` is simply left out of the call to `treatment_model`.

In [5]:
iptw = IPTW(df.drop(columns='cd4_wk45'), treatment='art', outcome='dead', standardize='exposed')
iptw.treatment_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0', 
                     print_results=False)
iptw.marginal_structural_model('art')
iptw.fit()
iptw.summary()



              Inverse Probability of Treatment Weights                
Treatment:        art             No. Observations:     547                 
Outcome:          dead            No. Missing Outcome:  30                  
g-Model:          Logistic        Missing Model:        None                
Risk Difference
----------------------------------------------------------------------
              RD  SE(RD)  95%LCL  95%UCL
labels                                  
Intercept  0.221   0.025   0.172   0.269
art       -0.091   0.046  -0.180  -0.002
----------------------------------------------------------------------
Risk Ratio
              RR  SE(log(RR))  95%LCL  95%UCL
labels                                       
Intercept  0.221        0.112   0.177   0.275
art        0.588        0.315   0.317   1.092
----------------------------------------------------------------------
Odds Ratio
              OR  SE(log(OR))  95%LCL  95%UCL
labels                                       
Interce

The results, as expected, are the same with unstabilized and stabilized weights. The same process can also be used to estimate the effect of ART on a continuous outcome, as detailed in the IPTW tutorial. I leave that as a challenge for you.
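One way to see why stabilization does not change the point estimate here: when the marginal structural model contains only treatment, stabilizing multiplies every untreated weight by the same constant, and a constant cancels out of a weighted mean. The sketch below checks this numerically. It is an illustration under simplifying assumptions, not zEpid's code: the data are random, `ps` is a stand-in for estimated propensity scores, and `weighted_rd` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
a = rng.binomial(1, 0.4, size=n)       # treatment indicator
y = rng.binomial(1, 0.3, size=n)       # binary outcome
ps = rng.uniform(0.1, 0.9, size=n)     # stand-in for estimated Pr(A=1 | L)

def weighted_rd(y, a, w):
    """Weighted risk difference comparing treated with untreated."""
    risk1 = np.average(y[a == 1], weights=w[a == 1])
    risk0 = np.average(y[a == 0], weights=w[a == 0])
    return risk1 - risk0

odds = ps / (1 - ps)
w_unstab = np.where(a == 1, 1.0, odds)
w_stab = np.where(a == 1, 1.0, odds * (1 - a.mean()) / a.mean())

rd_unstab = weighted_rd(y, a, w_unstab)
rd_stab = weighted_rd(y, a, w_stab)
print(np.isclose(rd_unstab, rd_stab))
```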

## Average Treatment Effect in the Untreated
We can also standardize to the untreated. Our estimand is then
$$E[Y^{a=1}|A=0] - E[Y|A=0]$$
Instead of setting `standardize='exposed'`, we set `standardize='unexposed'`. Let's look at an example with unstabilized weights.
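For this estimand the roles of the two groups are mirrored: the untreated keep a weight of 1, and the treated are re-weighted by $\widehat{\Pr}(A=0|L=l) / \widehat{\Pr}(A=1|L=l)$. A minimal sketch of the unstabilized version follows. The function name `atu_weights` and the argument `ps` are hypothetical names for this example, and this is not a copy of what zEpid does internally.

```python
import numpy as np

def atu_weights(a, ps):
    """Unstabilized weights that standardize to the untreated (hypothetical helper).

    a  : array of 0/1 treatment indicators
    ps : array of estimated Pr(A=1 | L) for each individual (assumed given)
    """
    a = np.asarray(a)
    ps = np.asarray(ps, dtype=float)
    inverse_odds = (1 - ps) / ps         # Pr(A=0|L) / Pr(A=1|L)
    # untreated keep a weight of 1; treated get the inverse odds
    return np.where(a == 1, inverse_odds, 1.0)

a = np.array([1, 1, 0, 0])
ps = np.array([0.8, 0.5, 0.5, 0.2])
w_atu = atu_weights(a, ps)
```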

In [6]:
iptw = IPTW(df.drop(columns='cd4_wk45'), treatment='art', outcome='dead', standardize='unexposed')
iptw.treatment_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0', 
                     stabilized=False, print_results=False)
iptw.marginal_structural_model('art')
iptw.fit()
iptw.summary()



              Inverse Probability of Treatment Weights                
Treatment:        art             No. Observations:     547                 
Outcome:          dead            No. Missing Outcome:  30                  
g-Model:          Logistic        Missing Model:        None                
Risk Difference
----------------------------------------------------------------------
              RD  SE(RD)  95%LCL  95%UCL
labels                                  
Intercept  0.175   0.018   0.139   0.211
art       -0.080   0.038  -0.154  -0.007
----------------------------------------------------------------------
Risk Ratio
              RR  SE(log(RR))  95%LCL  95%UCL
labels                                       
Intercept  0.175        0.104   0.143   0.214
art        0.543        0.361   0.267   1.101
----------------------------------------------------------------------
Odds Ratio
              OR  SE(log(OR))  95%LCL  95%UCL
labels                                       
Interce

Now with stabilized weights...

In [7]:
iptw = IPTW(df.drop(columns='cd4_wk45'), treatment='art', outcome='dead', standardize='unexposed')
iptw.treatment_model('male + age0 + age_rs1 + age_rs2 + cd40 + cd4_rs1 + cd4_rs2 + dvl0', 
                     stabilized=True, print_results=False)
iptw.marginal_structural_model('art')
iptw.fit()
iptw.summary()



              Inverse Probability of Treatment Weights                
Treatment:        art             No. Observations:     547                 
Outcome:          dead            No. Missing Outcome:  30                  
g-Model:          Logistic        Missing Model:        None                
Risk Difference
----------------------------------------------------------------------
              RD  SE(RD)  95%LCL  95%UCL
labels                                  
Intercept  0.175   0.018   0.139   0.211
art       -0.080   0.038  -0.154  -0.007
----------------------------------------------------------------------
Risk Ratio
              RR  SE(log(RR))  95%LCL  95%UCL
labels                                       
Intercept  0.175        0.104   0.143   0.214
art        0.543        0.361   0.267   1.101
----------------------------------------------------------------------
Odds Ratio
              OR  SE(log(OR))  95%LCL  95%UCL
labels                                       
Interce

So, why do the results differ between the two standardizations? Simply because the target estimand is different. The distribution of potential effect modifiers will differ between the treated and the untreated, and that difference results in different average treatment effects. This is why it is essential to clearly communicate the target estimand of your analysis.
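A small simulation can make this concrete. The sketch below is an assumption-laden illustration, unrelated to the zEpid sample data: it uses a single binary modifier `l`, a known propensity score rather than an estimated one, and made-up risks. Treatment lowers risk by 0.3 when `l` is 1 and by 0.1 when `l` is 0, and individuals with `l` equal to 1 are more likely to be treated, so the true effect is -0.26 in the treated and -0.14 in the untreated.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
l = rng.binomial(1, 0.5, size=n)             # binary effect modifier
ps = np.where(l == 1, 0.8, 0.2)              # true Pr(A=1 | L), assumed known
a = rng.binomial(1, ps)

# Potential outcomes: risk 0.4 untreated; 0.1 (l=1) or 0.3 (l=0) when treated
y0 = rng.binomial(1, 0.4, size=n)
y1 = rng.binomial(1, np.where(l == 1, 0.1, 0.3))
y = np.where(a == 1, y1, y0)                 # observed outcome (consistency)

def weighted_rd(y, a, w):
    """Weighted risk difference comparing treated with untreated."""
    risk1 = np.average(y[a == 1], weights=w[a == 1])
    risk0 = np.average(y[a == 0], weights=w[a == 0])
    return risk1 - risk0

w_att = np.where(a == 1, 1.0, ps / (1 - ps))     # standardize to the treated
w_atu = np.where(a == 1, (1 - ps) / ps, 1.0)     # standardize to the untreated

att = weighted_rd(y, a, w_att)   # should be close to the true value of -0.26
atu = weighted_rd(y, a, w_atu)   # should be close to the true value of -0.14
print(round(att, 3), round(atu, 3))
```

Both estimates come from the same data and the same propensity scores. Only the population being standardized to changes.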

# Conclusion
In this tutorial, I went through the basics of inverse probability of treatment weights modified to estimate the average treatment effect in the treated or in the untreated, and showed how to use them to fit marginal structural models. See the reference below for further details on these weights.

## References
Sato T, Matsuyama Y. (2003). Marginal structural models as a tool for standardization. Epidemiology, 14(6), 680-686.