# FB2NEP Workbook 12 – Causal Inference in Nutritional Epidemiology: Where Next?

This workbook provides a conceptual overview of methods that go beyond traditional regression:

- Mendelian randomisation (MR).
- Negative control designs.
- G‑methods.
- Trial emulation with cohort data.

## 1. Mendelian randomisation (MR)

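Idea: use genetic variants as instrumental variables for a modifiable exposure (for example, LDL cholesterol). Because alleles are allocated at conception, they are generally less prone to confounding by lifestyle and to reverse causation than the measured exposure itself.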

Key assumptions (simplified):

1. **Relevance:** the variant is associated with the exposure.
2. **Independence:** the variant is not associated with confounders of the exposure–outcome relation.
3. **Exclusion restriction:** the variant affects the outcome only through the exposure (no horizontal pleiotropy).

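```python
from __future__ import annotations

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Simulate a genetic instrument G, an exposure X and an outcome Y.
# No confounder of X and Y is simulated here, so the naïve and IV
# estimates should both be close to the true effect of 0.8.
np.random.seed(11088)
n = 1000
G = np.random.binomial(2, 0.3, size=n)  # allele count (0, 1 or 2)
X = 0.5 * G + np.random.normal(0, 1, size=n)
Y = 0.8 * X + np.random.normal(0, 1, size=n)
df_iv = pd.DataFrame({"G": G, "X": X, "Y": Y})

# Naïve regression of Y on X
X_design = sm.add_constant(df_iv["X"])
model_naive = sm.OLS(df_iv["Y"], X_design).fit()

# Two‑stage least squares (very simple)
# Stage 1: predict the exposure from the instrument.
X1 = sm.add_constant(df_iv["G"])
stage1 = sm.OLS(df_iv["X"], X1).fit()
X_hat = stage1.predict(X1).rename("X_hat")

# Stage 2: regress the outcome on the predicted exposure.
# The point estimate is the IV estimate, but the standard errors from
# this manual second stage are not valid; use a dedicated IV routine
# for inference.
X2 = sm.add_constant(X_hat)
stage2 = sm.OLS(df_iv["Y"], X2).fit()

print("Naïve association (Y on X):")
print(model_naive.params)
print("\nIV estimate (two‑stage):")
print(stage2.params)
```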
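
The simulation above contains no confounder, so it cannot show why an instrument is useful. The following sketch is an illustrative variation, not part of the original workbook: it adds a hypothetical unmeasured confounder `U` that raises both `X` and `Y`, and compares the naïve slope with the Wald ratio (the G–Y slope divided by the G–X slope), which equals the two‑stage estimate when there is a single instrument. The effect sizes, seed and sample size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2024)  # arbitrary seed (assumption)
n = 20_000
true_effect = 0.8

G = rng.binomial(2, 0.3, size=n)             # allele count
U = rng.normal(size=n)                       # hypothetical unmeasured confounder
X = 0.5 * G + 1.0 * U + rng.normal(size=n)   # exposure
Y = true_effect * X + 1.0 * U + rng.normal(size=n)  # outcome


def slope(x, y):
    """Slope from a simple linear regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


naive = slope(X, Y)                  # biased upwards by U
wald = slope(G, Y) / slope(G, X)     # Wald ratio IV estimate

print(f"True effect:     {true_effect:.2f}")
print(f"Naive estimate:  {naive:.2f}")
print(f"Wald ratio (IV): {wald:.2f}")
```

Under these assumptions the naïve slope overstates the effect, whereas the Wald ratio should be close to 0.8, because `G` is independent of `U` by construction. In real data that independence cannot be verified directly.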

## 2. Negative control designs

Negative controls are variables for which no causal effect is expected.

- A **negative control outcome** should not be affected by the exposure, but should share the confounders of the outcome of interest.
- A **negative control exposure** should not cause the outcome, but should share the confounders of the exposure of interest.

Observing an association where none should exist suggests residual confounding or bias.
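
As a minimal sketch of this logic, the simulation below generates a negative control outcome `N` that depends only on a hypothetical confounder `U` and not on the exposure `X`. All variable names and coefficients are illustrative assumptions. An unadjusted regression of `N` on `X` still yields a clearly non‑zero slope, which is the warning sign described above; adjusting for `U` (possible only in a simulation, where it is observed) removes it.

```python
import numpy as np

rng = np.random.default_rng(5)  # arbitrary seed (assumption)
n = 20_000

U = rng.normal(size=n)                # hypothetical confounder (e.g. health consciousness)
X = 0.7 * U + rng.normal(size=n)      # exposure, influenced by U
N = 0.5 * U + rng.normal(size=n)      # negative control outcome: no X term


def ols(columns, y):
    """Least-squares coefficients with an intercept in position 0."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


slope_unadjusted = ols([X], N)[1]     # X-N association ignoring U
slope_adjusted = ols([X, U], N)[1]    # X-N association given U

print(f"Unadjusted slope of N on X: {slope_unadjusted:.3f}")
print(f"Slope after adjusting for U: {slope_adjusted:.3f}")
```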

## 3. G‑methods (very brief)

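G‑methods (the parametric g‑formula, inverse probability weighting of marginal structural models, and g‑estimation of structural nested models) address time‑varying exposures and confounders.
They are needed when a time‑varying confounder is itself affected by earlier exposure, because conventional regression adjustment can then be biased whether or not it conditions on that confounder. Marginal structural models handle this by weighting each participant by the inverse probability of their observed exposure history, which allows the effect of sustained interventions to be estimated.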

Implementation requires longitudinal data and careful model specification.
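
The sketch below illustrates inverse probability weighting for a marginal structural model with two exposure occasions. It is a toy simulation under stated assumptions, not a recommended analysis: `A0` is randomised at baseline, the covariate `L1` is affected by `A0` and by an unmeasured factor `U`, the later exposure `A1` depends only on `L1`, and each exposure occasion adds 1.0 to the outcome. Because `L1` is binary, the exposure probabilities are estimated from simple cell frequencies instead of a fitted model.

```python
import numpy as np

rng = np.random.default_rng(12)  # arbitrary seed (assumption)
n = 50_000

U = rng.normal(size=n)                     # unmeasured cause of L1 and Y
A0 = rng.binomial(1, 0.5, size=n)          # baseline exposure, randomised here
L1 = (U + 0.5 * A0 + rng.normal(size=n) > 0).astype(int)  # affected by A0 and U
A1 = rng.binomial(1, 0.2 + 0.6 * L1)       # later exposure depends on L1
Y = 1.0 * A0 + 1.0 * A1 + 2.0 * U + rng.normal(size=n)

# Stabilised weights for A1: P(A1 | A0) / P(A1 | A0, L1).
# A0 is randomised, so its stabilised weight is 1 and is omitted.
p_num = np.empty(n)
p_denom = np.empty(n)
for a0 in (0, 1):
    in_a0 = A0 == a0
    p_num[in_a0] = A1[in_a0].mean()
    for l1 in (0, 1):
        cell = in_a0 & (L1 == l1)
        p_denom[cell] = A1[cell].mean()
weights = np.where(A1 == 1, p_num / p_denom, (1 - p_num) / (1 - p_denom))


def wls(design, y, w):
    """Weighted least squares via row scaling by sqrt(w)."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return beta


design = np.column_stack([np.ones(n), A0, A1])
unweighted = wls(design, Y, np.ones(n))    # ignores confounding by L1
ipw = wls(design, Y, weights)              # marginal structural model

print("Unweighted (intercept, A0, A1):", np.round(unweighted, 2))
print("IPW        (intercept, A0, A1):", np.round(ipw, 2))
```

In this set‑up the weighted coefficients for `A0` and `A1` should both be close to the simulated value of 1.0, while the unweighted coefficient for `A1` is inflated. With real data the weights would normally come from fitted exposure models, and standard errors would need a robust or bootstrap estimator.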

## 4. Trial emulation with cohort data

Using observational cohort data, we can emulate a target trial by clearly specifying:

- Eligibility criteria.
- Exposure strategies and timing.
- Outcome definition and follow‑up.
- Analysis approach (for example, intention‑to‑treat analogue).

Aligning eligibility, exposure assignment and the start of follow‑up at a single time zero helps avoid biases such as immortal time bias and time‑lag bias.
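
Immortal time bias can be demonstrated with a small simulation. In the sketch below (all distributions and parameters are illustrative assumptions), treatment has no effect on survival at all, everyone is followed until death, and treatment starts at a random time after baseline. Classifying people as "ever treated" from baseline credits the treated group with the time they had to survive before starting treatment. Classifying person‑time by current treatment status, one simple way of handling the misallocated time, removes the spurious benefit.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed (assumption)
n = 50_000

# Treatment has NO effect: death time is independent of treatment start.
death_time = rng.exponential(scale=5.0, size=n)   # years from baseline
start_time = rng.exponential(scale=5.0, size=n)   # intended treatment start
ever_treated = start_time < death_time            # started before dying

# Naive analysis: compare mean survival by "ever treated" status.
naive_ratio = death_time[ever_treated].mean() / death_time[~ever_treated].mean()

# Person-time analysis: time before treatment start counts as untreated.
untreated_time = np.minimum(start_time, death_time).sum()
untreated_deaths = (~ever_treated).sum()
treated_time = (death_time - start_time)[ever_treated].sum()
treated_deaths = ever_treated.sum()
rate_ratio = (treated_deaths / treated_time) / (untreated_deaths / untreated_time)

print(f"Naive ratio of mean survival (treated / untreated): {naive_ratio:.2f}")
print(f"Mortality rate ratio using person-time:             {rate_ratio:.2f}")
```

The naive comparison suggests that treated people live far longer, purely because they had to survive long enough to be treated. The person‑time rate ratio should be close to 1, consistent with the simulated null effect.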

## 5. Reflection

- Which of these modern methods do you find most convincing?
- How might they change your interpretation of nutritional epidemiology findings?
- If you were advising a policymaker, which types of evidence would you prioritise?