# Lecture 11: Inverse Probability Treatment Weighting (IPTW)

[!["Open In Colab"](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/<ORG>/<REPO>/blob/main/lectures/L11_IP_Weighting/L11_IP_Weighting_student.ipynb)

## Learning Objectives
1. Define the **propensity score** and its role in reweighting.
2. Calculate **Inverse Probability Treatment Weights (IPTW)**.
3. Use **Love plots** of standardized mean differences (SMD) and **overlap plots** to diagnose the quality of the weights.
4. Estimate the Average Treatment Effect (ATE) using weighted models.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
from phs564_ci.datasets import load_data
from phs564_ci.diagnostics.balance import calculate_smd

# Load dataset with multiple confounders
df = load_data("l11_iptw.csv")
df.head()

--- 
## 🛑 Activity 1: Compute weights & balance (Slide 13)

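We will walk through the IPTW pipeline: fit a propensity score model, compute the weights, check overlap and balance, and then estimate the effect.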

### Step 1: Fit the Propensity Score Model

In [None]:
# Probability of treatment A given L1, L2, L3
ps_model = smf.logit("A ~ L1 + L2 + L3", data=df).fit()
df['ps'] = ps_model.predict(df)

### Step 2: Calculate IPTW Weights

In [None]:
# Formula: w = A/ps + (1-A)/(1-ps)
df['weights'] = np.where(df['A'] == 1, 1/df['ps'], 1/(1-df['ps']))
print(df[['A', 'ps', 'weights']].head())
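To see what the weight formula does, here is a minimal sketch on made-up data with a single binary confounder `L`, so the propensity score can be computed by hand as the share treated in each stratum rather than by logistic regression. The toy rows and the helpers `stratum_ps` and `iptw_weight` are illustrative assumptions, not part of the course package.

```python
# Toy data (illustrative): each row is (L, A).
# Stratum L=0: 2 of 8 treated. Stratum L=1: 6 of 8 treated.
rows = [(0, 1)] * 2 + [(0, 0)] * 6 + [(1, 1)] * 6 + [(1, 0)] * 2


def stratum_ps(rows, level):
    """Share treated among rows with L == level (a nonparametric propensity score)."""
    treatments = [a for l_val, a in rows if l_val == level]
    return sum(treatments) / len(treatments)


def iptw_weight(a, ps):
    """w = A/ps + (1-A)/(1-ps): inverse probability of the treatment actually received."""
    return a / ps + (1 - a) / (1 - ps)


ps_by_level = {level: stratum_ps(rows, level) for level in (0, 1)}
weights = [iptw_weight(a, ps_by_level[l_val]) for l_val, a in rows]

# Sum of weights in each arm: the size of that arm's pseudo-population.
treated_total = sum(w for (_, a), w in zip(rows, weights) if a == 1)
untreated_total = sum(w for (_, a), w in zip(rows, weights) if a == 0)

print(ps_by_level)
print(round(treated_total, 6), round(untreated_total, 6))
```

In this toy example each arm's weights sum to 16 (up to floating-point rounding), the full sample size: after weighting, each arm stands in for the whole population.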

--- 
### 🖼️ Figure Generation: Overlap Plot (Slide 07)
Check for positivity violations: ranges of the propensity score where only one treatment group is observed.

In [None]:
import os

plt.figure(figsize=(10, 6))
sns.kdeplot(data=df[df['A']==1], x='ps', label='Treated (A=1)', fill=True)
sns.kdeplot(data=df[df['A']==0], x='ps', label='Untreated (A=0)', fill=True)
plt.title("Propensity Score Overlap")
plt.xlabel("Propensity Score Pr(A=1|L)")
plt.legend()
os.makedirs("figures/L11", exist_ok=True)  # create the output folder if it is missing
plt.savefig("figures/L11/overlap.png")
plt.show()
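A numeric summary can complement the plot. The sketch below uses made-up propensity scores (not the lecture dataset) to find the range where both groups are observed and to flag the largest resulting weight. The range-based definition of common support is one simple convention, assumed here for illustration.

```python
# Illustrative propensity scores (made-up values), split by treatment group.
ps_treated = [0.04, 0.35, 0.50, 0.62, 0.80]
ps_untreated = [0.03, 0.20, 0.41, 0.55, 0.70]

# Region of common support: scores observed in both groups.
lower = max(min(ps_treated), min(ps_untreated))
upper = min(max(ps_treated), max(ps_untreated))

outside = [p for p in ps_treated + ps_untreated if p < lower or p > upper]

# A treated unit with a score near 0, or an untreated unit with a score
# near 1, receives a very large weight.
largest_weight = max(
    [1 / p for p in ps_treated] + [1 / (1 - p) for p in ps_untreated]
)

print(f"common support: [{lower}, {upper}]")
print(f"units outside common support: {len(outside)}")
print(f"largest weight: {largest_weight:.2f}")
```

Here the treated unit with a score of 0.04 gets a weight of about 25, so that single observation would dominate the weighted treated group.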

--- 
### 🖼️ Figure Generation: Balance (Love Plot) (Slide 09)
Compare SMDs before and after weighting.

In [None]:
confounders = ['L1', 'L2', 'L3']
smd_unweighted = calculate_smd(df, 'A', confounders)
smd_weighted = calculate_smd(df, 'A', confounders, weights='weights')

# Long format: one row per (variable, weighting type) pair
love_df = pd.DataFrame({
    'Variable': confounders * 2,
    'SMD': list(smd_unweighted.values()) + list(smd_weighted.values()),
    'Type': ['Unweighted'] * len(confounders) + ['Weighted'] * len(confounders)
})

import os

plt.figure(figsize=(8, 6))
sns.scatterplot(data=love_df, x='SMD', y='Variable', hue='Type', style='Type', s=100)
plt.axvline(0.1, color='gray', linestyle='--')  # common balance threshold: SMD = 0.1
plt.axvline(0, color='black')
plt.title("Love Plot: Covariate Balance")
os.makedirs("figures/L11", exist_ok=True)  # create the output folder if it is missing
plt.savefig("figures/L11/love_plot.png")
plt.show()
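For intuition about what a weighted SMD measures, here is a minimal sketch on toy data with one binary confounder. The function `smd` is a hypothetical stand-in, not the implementation of `calculate_smd`; in particular, standardizing by the unweighted pooled standard deviation is an assumption made for this example.

```python
import math

# Toy data (illustrative): binary confounder L and treatment A.
L = [0] * 2 + [0] * 6 + [1] * 6 + [1] * 2
A = [1] * 2 + [0] * 6 + [1] * 6 + [0] * 2
# IPTW weights from the stratum-specific propensity scores
# Pr(A=1 | L=0) = 0.25 and Pr(A=1 | L=1) = 0.75.
ps = [0.25 if l_val == 0 else 0.75 for l_val in L]
w = [a / p + (1 - a) / (1 - p) for a, p in zip(A, ps)]


def weighted_mean(x, weights):
    return sum(xi * wi for xi, wi in zip(x, weights)) / sum(weights)


def sample_var(x):
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)


def smd(x, a, weights=None):
    """Standardized mean difference (hypothetical stand-in for calculate_smd)."""
    if weights is None:
        weights = [1.0] * len(x)
    x1 = [xi for xi, ai in zip(x, a) if ai == 1]
    x0 = [xi for xi, ai in zip(x, a) if ai == 0]
    w1 = [wi for wi, ai in zip(weights, a) if ai == 1]
    w0 = [wi for wi, ai in zip(weights, a) if ai == 0]
    # Assumption: standardize by the unweighted pooled SD in both cases.
    pooled_sd = math.sqrt((sample_var(x1) + sample_var(x0)) / 2)
    return (weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd


smd_before = smd(L, A)
smd_after = smd(L, A, w)
print(f"unweighted SMD: {smd_before:.3f}, weighted SMD: {abs(smd_after):.3f}")
```

Before weighting, the treated group has a much higher share of `L=1` (SMD of roughly 1.08). After weighting, both groups have a weighted mean of 0.5 and the SMD is essentially zero.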

--- 
### 3. Estimating the Effect
Fit a weighted least squares (WLS) regression of the outcome `Y` on treatment `A`, using the IPTW weights. The coefficient on `A` is the estimate of the ATE.

In [None]:
# Use robust (sandwich) standard errors: the default WLS errors treat the
# weights as known precision weights and are not valid for IPTW.
weighted_model = smf.wls("Y ~ A", data=df, weights=df['weights']).fit(cov_type='HC1')
print(weighted_model.summary().tables[1])
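With only an intercept and a binary treatment in the model, the WLS coefficient on `A` equals the difference between the weighted mean outcomes of the two groups. The sketch below checks this idea by hand on toy data where the outcome is constructed so that the true effect is 2. The data and the helper `mean_difference` are illustrative assumptions.

```python
# Toy data (illustrative): rows are (L, A); the outcome is built as
# Y = 2*A + 3*L, so the true treatment effect is 2 by construction.
rows = [(0, 1)] * 2 + [(0, 0)] * 6 + [(1, 1)] * 6 + [(1, 0)] * 2
Y = [2 * a + 3 * l_val for l_val, a in rows]
A = [a for _, a in rows]

# Stratum-specific propensity scores, known here by construction.
ps = [0.25 if l_val == 0 else 0.75 for l_val, _ in rows]
w = [a / p + (1 - a) / (1 - p) for a, p in zip(A, ps)]


def weighted_mean(values, weights):
    return sum(v * wt for v, wt in zip(values, weights)) / sum(weights)


def mean_difference(y, a, weights):
    """Weighted mean of Y among the treated minus that among the untreated."""
    y1 = [yi for yi, ai in zip(y, a) if ai == 1]
    y0 = [yi for yi, ai in zip(y, a) if ai == 0]
    w1 = [wi for wi, ai in zip(weights, a) if ai == 1]
    w0 = [wi for wi, ai in zip(weights, a) if ai == 0]
    return weighted_mean(y1, w1) - weighted_mean(y0, w0)


naive = mean_difference(Y, A, [1.0] * len(Y))  # confounded by L
iptw = mean_difference(Y, A, w)                # confounding by L removed

print(f"naive difference: {naive:.2f}, IPTW estimate: {iptw:.2f}")
```

The unweighted difference is 3.5 because treated subjects are more likely to have `L=1`, which also raises `Y`. The weighted difference recovers the effect of 2 built into the toy outcome.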

### 4. Summary
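- The propensity score summarizes the measured confounders (`L1`, `L2`, `L3`) as a single probability of treatment, Pr(A=1|L).
- IPTW weights each subject by the inverse of the probability of the treatment they actually received, which balances the measured confounders across the treatment groups.
- Diagnostics are essential: check that the propensity score distributions overlap and that every confounder is balanced (a common rule of thumb is SMD < 0.1) before interpreting the effect estimate.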