# Lecture 04: Effect Modification and Interaction

[!["Open In Colab"](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/<ORG>/<REPO>/blob/main/lectures/L04_Effect_Modification/L04_Effect_Modification_student.ipynb)

## Learning Objectives
1. Distinguish between **effect modification** and **interaction**.
2. Understand **scale-dependence**: effects can be heterogeneous on the risk difference (RD) scale but not on the risk ratio (RR) scale, or vice versa.
3. Calculate stratum-specific effects and interpret them.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from phs564_ci.datasets import load_data

# Load data with a potential modifier 'V'
df = load_data("l04_effect_modification.csv")
df.head()

--- 
### 1. Stratum-Specific Risks
We calculate the risk of outcome `Y` by treatment `A` and modifier `V`.

In [None]:
risks = df.groupby(['V', 'A'])['Y'].mean().unstack()
print("Risk Table:")
print(risks)

--- 
### 🖼️ Figure Generation: Risk by Group (Slide 08)
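The grouped bar chart shows the risk of `Y` in each treatment arm within each level of `V`, so the treatment effect can be compared across subgroups.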

In [None]:
plt.figure(figsize=(8, 6))
risks.plot(kind='bar', ax=plt.gca())
plt.title("Risk of Outcome by Treatment and Subgroup V")
plt.ylabel("Risk (Mean of Y)")
plt.xlabel("Subgroup V")
plt.legend(title="Treatment A")
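import os
os.makedirs("figures/L04", exist_ok=True)  # create the output folder; savefig fails if it is missing
plt.savefig("figures/L04/risk_by_group.png")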
plt.show()

--- 
### 2. Calculating Effect Measures by Stratum
Is there effect modification on the RD scale? On the RR scale?

In [None]:
risks['RD'] = risks[1] - risks[0]
risks['RR'] = risks[1] / risks[0]

print("Effect Measures by Stratum:")
print(risks[['RD', 'RR']])
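
To see why the answer can differ by scale, the following sketch applies the same two formulas to a made-up risk table (the numbers are hypothetical and are not taken from `l04_effect_modification.csv`). Treatment doubles the risk in both strata, so the RR is constant, while the RD is three times larger in the stratum with the higher baseline risk.

```python
# Hypothetical stratum-specific risks, keyed by (V, A); illustrative values only.
risk = {
    (0, 0): 0.10, (0, 1): 0.20,   # stratum V=0: untreated, treated
    (1, 0): 0.30, (1, 1): 0.60,   # stratum V=1: untreated, treated
}

def effect_measures(risk, v):
    """Return (RD, RR) comparing A=1 with A=0 within stratum V=v."""
    r0, r1 = risk[(v, 0)], risk[(v, 1)]
    return r1 - r0, r1 / r0

rd0, rr0 = effect_measures(risk, 0)
rd1, rr1 = effect_measures(risk, 1)

print(f"V=0: RD={rd0:.2f}, RR={rr0:.2f}")
print(f"V=1: RD={rd1:.2f}, RR={rr1:.2f}")
print(f"Difference of RDs: {rd1 - rd0:.2f}")   # nonzero: heterogeneity on the additive scale
print(f"Ratio of RRs:      {rr1 / rr0:.2f}")   # one: homogeneity on the multiplicative scale
```

In this toy table, `V` is an effect modifier on the RD scale but not on the RR scale, which is the sense in which effect modification is scale-dependent.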

--- 
## 🛑 Activity 1: Choose scale for 2 questions (Slide 12)

**Question 1:** "If we have limited resources, should we prioritize giving this intervention to Group V=0 or Group V=1?"
- Which scale (RD or RR) tells you the 'bang for your buck' in terms of lives saved?

**Question 2:** "Is the drug's biological mechanism of action consistent across groups?"
- Which scale is often used to assess biological consistency?

--- 
### 3. Interaction in Regression
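Fitting a linear regression to the binary outcome `Y` with an `A:V` product term expresses effect modification on the additive (risk difference) scale.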

In [None]:
import statsmodels.formula.api as smf

# In the formula, `A * V` expands to A + V + A:V (both main effects plus the product term)
model = smf.ols("Y ~ A * V", data=df).fit()
print(model.summary().tables[1])

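**Interpretation:**
- With binary `A` and `V`, the coefficient for `A` is the Risk Difference in the $V=0$ stratum.
- The coefficient for `A:V` is the change in the Risk Difference when moving from $V=0$ to $V=1$. A value of zero means no effect modification on the additive scale.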
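
The link between the regression coefficients and the stratum-specific risks can be checked by hand. The sketch below assumes binary `A` and `V` and uses hypothetical cell risks (not values from the course dataset). Under that assumption the model `Y ~ A * V` has four parameters for four cells, so its fitted values reproduce the four cell risks and the coefficients can be read off the risk table.

```python
# Hypothetical risks P(Y=1 | V=v, A=a), keyed by (v, a); illustrative values only.
risk = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.60}

# Stratum-specific risk differences
rd_v0 = risk[(0, 1)] - risk[(0, 0)]
rd_v1 = risk[(1, 1)] - risk[(1, 0)]

# Coefficients of the saturated linear model E[Y | A, V] = b0 + bA*A + bV*V + bAV*A*V
b0 = risk[(0, 0)]                  # Intercept: risk when A=0 and V=0
bA = rd_v0                         # A: risk difference in stratum V=0
bV = risk[(1, 0)] - risk[(0, 0)]   # V: change in risk across strata among the untreated
bAV = rd_v1 - rd_v0                # A:V: difference between the two stratum-specific RDs

def predict(a, v):
    """Risk implied by the linear model for treatment a and modifier v."""
    return b0 + bA * a + bV * v + bAV * a * v

for (v, a), r in risk.items():
    print(f"V={v}, A={a}: cell risk={r:.2f}, model prediction={predict(a, v):.2f}")
print(f"A:V coefficient = {bAV:.2f} (RD at V=1 is {rd_v1:.2f}, RD at V=0 is {rd_v0:.2f})")
```

With real data, the same comparison can be made between the fitted `model.params` and the `RD` column computed in Section 2.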

### 4. Summary
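- Effect modification is scale-dependent: effects can be homogeneous on the RR scale yet heterogeneous on the RD scale, and vice versa.
- The RD scale is the relevant one for public health decisions, because it translates directly into the number of events prevented per person treated.
- Regression models use interaction (product) terms to test for heterogeneity on the scale of the model: additive for a linear model, multiplicative for a log-linear or logistic model.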
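
The scale of the regression determines which kind of heterogeneity its interaction term captures. As a rough sketch with hypothetical risks (chosen for illustration, not taken from the course data), the product term of a saturated linear model equals the difference of the stratum-specific RDs, while the product term of a saturated log-linear model equals the log of the ratio of the stratum-specific RRs. In this example the first is zero and the second is not.

```python
import math

# Hypothetical risks P(Y=1 | V=v, A=a), keyed by (v, a); illustrative values only.
risk = {(0, 0): 0.125, (0, 1): 0.25, (1, 0): 0.5, (1, 1): 0.625}

# Additive scale: difference of stratum-specific risk differences
rd_v0 = risk[(0, 1)] - risk[(0, 0)]
rd_v1 = risk[(1, 1)] - risk[(1, 0)]
additive_interaction = rd_v1 - rd_v0

# Multiplicative scale: log of the ratio of stratum-specific risk ratios
rr_v0 = risk[(0, 1)] / risk[(0, 0)]
rr_v1 = risk[(1, 1)] / risk[(1, 0)]
multiplicative_interaction = math.log(rr_v1 / rr_v0)

print(f"RD at V=0: {rd_v0:.3f}, RD at V=1: {rd_v1:.3f}")
print(f"RR at V=0: {rr_v0:.3f}, RR at V=1: {rr_v1:.3f}")
print(f"Interaction on the additive scale:       {additive_interaction:.3f}")
print(f"Interaction on the multiplicative scale: {multiplicative_interaction:.3f}")
```

A test of the `A:V` term therefore answers a different question depending on the model in which it is fitted.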