# 03 – Confounders, Colliders, and Mediators

In this notebook, we explore:
- What confounders, colliders, and mediators are
- Why controlling for the wrong variable can backfire
- How to use DAGs to reason about relationships between variables
- Worked examples using our dataset


## 🧠 Directed Acyclic Graphs (DAGs)

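To understand confounding, colliders, and mediation, we use **DAGs**. These are diagrams in which each arrow points from a cause to its effect, and in which no variable can end up causing itself (no cycles).

- **Confounder**: A variable that affects both the exposure and the outcome (a common cause). Leaving it out of the model biases the estimated effect.
- **Collider**: A variable that is affected by two other variables on a path (a common effect). Adjusting for it can create a spurious association between its causes.
- **Mediator**: A variable that lies on the causal path from exposure to outcome. Adjusting for it removes the part of the effect that runs through it.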
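
To make the confounder definition concrete, here is a minimal simulation sketch. It uses made-up data (not our dataset), and the variable names `z`, `x`, `y` and the helper `ols_coefs` are hypothetical. It uses plain NumPy least squares instead of `statsmodels` so that it runs on its own.

```python
import numpy as np

def ols_coefs(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

rng = np.random.default_rng(42)
n = 5000

z = rng.normal(size=n)                      # confounder (common cause)
x = 0.8 * z + rng.normal(size=n)            # exposure, partly driven by z
y = 0.5 * x + 1.0 * z + rng.normal(size=n)  # outcome; true effect of x is 0.5

b_unadj = ols_coefs(y, x)[1]     # ignores z: biased away from 0.5
b_adj = ols_coefs(y, x, z)[1]    # adjusts for z: close to 0.5

print(f"unadjusted: {b_unadj:.2f}, adjusted: {b_adj:.2f}")
```

Because `z` pushes `x` and `y` in the same direction, the unadjusted coefficient overstates the effect; adding `z` to the model recovers a value near the true 0.5.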

We recommend using [DAGitty](http://www.dagitty.net/) to try these out yourself.


## ⚗️ Example Setup

Let’s say we’re interested in the effect of **energy intake** on **BMI**.

We suspect:
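- **Age** is a confounder: it affects both energy intake and BMI.
- **Sex** may modify the relationship (effect modification), meaning the effect of energy intake on BMI may differ between sexes.
- **SBP** (systolic blood pressure) is a *collider* if it is affected by both BMI and age. Adjusting for it would then open a non-causal path between energy intake and BMI that runs through age.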
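
The assumptions above can be written down as a small edge list and queried in code. This is only a sketch: the edges encode our *suspected* DAG, the helper `parents` is hypothetical, and `sex` is left out because effect modification is not drawn as an arrow in a DAG. Note also that "collider" is strictly a property of a node on a specific path; counting parents is a simplification.

```python
# Suspected DAG: each tuple is (cause, effect).
edges = [
    ("age", "energy_kcal"),
    ("age", "bmi"),
    ("energy_kcal", "bmi"),   # the effect we want to estimate
    ("age", "sbp"),
    ("bmi", "sbp"),
]

def parents(node, edges):
    """Direct causes of `node`."""
    return {src for src, dst in edges if dst == node}

exposure, outcome = "energy_kcal", "bmi"
nodes = {n for edge in edges for n in edge}

# Simplified check: direct common causes of exposure and outcome.
common_causes = parents(exposure, edges) & parents(outcome, edges)

# Simplified check: other variables where two or more arrows meet.
candidate_colliders = {
    n for n in nodes - {exposure, outcome} if len(parents(n, edges)) >= 2
}

print("Adjust for:", common_causes)
print("Do not adjust for:", candidate_colliders)
```

For larger graphs, a dedicated tool such as DAGitty is a better choice than hand-written checks like these.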


In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf

# Load the synthetic FB2NEP dataset
df = pd.read_csv('https://raw.githubusercontent.com/ggkuhnle/FB2NEP_datascience/main/data/fb2nep_synthetic.csv')

### Step 1 – Unadjusted model

In [None]:
model_unadj = smf.ols('bmi ~ energy_kcal', data=df).fit()
print(model_unadj.summary())

### Step 2 – Adjusting for a Confounder (age)

In [None]:
model_adj = smf.ols('bmi ~ energy_kcal + age', data=df).fit()
print(model_adj.summary())
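
One way to compare Step 1 and Step 2 is to look at how much the `energy_kcal` coefficient changes after adjustment. The helper below is a hypothetical sketch, and the numbers in the example are invented for illustration (they are not results from our dataset). A change of more than roughly 10% is a commonly used rule of thumb for meaningful confounding, not a formal test.

```python
def percent_change(b_unadjusted: float, b_adjusted: float) -> float:
    """Relative change (in %) of a coefficient after adjustment."""
    if b_unadjusted == 0:
        raise ValueError("unadjusted coefficient is zero; relative change is undefined")
    return 100.0 * (b_adjusted - b_unadjusted) / b_unadjusted

# Hypothetical coefficients, for illustration only.
b_unadj_example = 0.0020
b_adj_example = 0.0015
change = percent_change(b_unadj_example, b_adj_example)
print(f"Change after adjustment: {change:.1f}%")

# With the fitted models from this notebook, one could call:
# percent_change(model_unadj.params["energy_kcal"], model_adj.params["energy_kcal"])
```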

### Step 3 – Adjusting for a Collider (sbp)

In [None]:
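# Deliberately incorrect: sbp is assumed to be a collider, so adjusting for it can bias the energy_kcal coefficient
model_collider = smf.ols('bmi ~ energy_kcal + sbp', data=df).fit()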
print(model_collider.summary())
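
Collider bias is easiest to see in data where the true effect is known to be zero. The sketch below uses simulated data (not our dataset); the names `x`, `y`, `c` and the helper `ols_coefs` are hypothetical, and NumPy least squares stands in for `statsmodels` so the example is self-contained.

```python
import numpy as np

def ols_coefs(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

rng = np.random.default_rng(0)
n = 5000

x = rng.normal(size=n)                      # exposure
y = rng.normal(size=n)                      # outcome, independent of x (true effect = 0)
c = x + y + rng.normal(scale=0.5, size=n)   # collider: caused by both x and y

b_unadj = ols_coefs(y, x)[1]       # close to 0, as it should be
b_coll = ols_coefs(y, x, c)[1]     # clearly negative: a spurious association

print(f"without collider: {b_unadj:.2f}, with collider: {b_coll:.2f}")
```

Here adjustment *creates* an association that does not exist, which is the opposite of what adjusting for a confounder does.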

### Step 4 – Mediation (Exploratory)

In [None]:
# Does age predict energy intake? Here energy_kcal is treated as a possible mediator of the age -> BMI effect
smf.ols('energy_kcal ~ age', data=df).fit().summary()

In [None]:
# Full model: BMI ~ age + energy_kcal (the age coefficient is now the effect of age not passing through energy_kcal)
smf.ols('bmi ~ age + energy_kcal', data=df).fit().summary()
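
The two regressions above follow the logic of a simple linear mediation analysis. As a sketch under strong assumptions (linear effects, no unmeasured confounding), the simulation below splits a total effect into a direct and an indirect part. The data are simulated; `x`, `m`, `y` loosely play the roles of age, energy intake, and BMI, and the helper `ols_coefs` is hypothetical.

```python
import numpy as np

def ols_coefs(y, *predictors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

rng = np.random.default_rng(1)
n = 5000

x = rng.normal(size=n)                       # exposure
m = 0.6 * x + rng.normal(size=n)             # mediator: x -> m
y = 0.3 * x + 0.5 * m + rng.normal(size=n)   # outcome: direct path plus m -> y

total = ols_coefs(y, x)[1]            # total effect of x on y
_, direct, b = ols_coefs(y, x, m)     # direct effect of x, and effect of m on y
a = ols_coefs(m, x)[1]                # effect of x on m
indirect = a * b                      # product-of-coefficients estimate

print(f"total={total:.2f}, direct={direct:.2f}, indirect={indirect:.2f}")
```

In linear OLS models fitted on the same sample, the total effect equals the direct plus the indirect effect. This decomposition does not carry over automatically to non-linear models or models with interactions.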

## 🧠 Exercise

1. Identify a potential confounder in the relationship between `sex` and `sbp`.  
2. Fit models with and without this confounder.  
3. Interpret how the coefficient for `sex` changes.  
4. Try a model where you (incorrectly) adjust for a collider.

✍️ Add your comments and reasoning below.


## 🧪 Playground – experiment here

In [None]:
# Your code here