# 🩺 4.8 Advanced Clinical Trial Analysis: Pre/Post Blood Pressure Study

In this notebook, we simulate and analyse a two-arm clinical trial with **within-subject (pre/post) measurements**.  
Pre/post designs are common in nutrition research, where outcomes are measured **before and after** an intervention.

---

### 🧪 Hypothetical Study

We simulate a study testing the effect of a **dietary intervention** on **systolic blood pressure (SBP)**.

- Participants are randomly assigned to **Control** or **Intervention**.
- **SBP is measured before and after** the 12-week trial.

---

### 📘 You Will Learn to:
- Simulate repeated measures (pre/post)
- Visualise individual change
- Use **paired t-tests** and **Bayesian paired comparisons**
- Estimate effect sizes using repeated-measures logic


## 📊 Step 1: Simulate Data

We simulate 100 participants:
- `group`: 0 (Control) or 1 (Intervention)
- `sbp_pre`: Baseline systolic BP in mmHg (mean 140, SD 10)
- `sbp_post`: Baseline plus an individual change drawn with SD 5 mmHg:
  - Control group: mean change of 0 mmHg
  - Intervention group: mean change of -5 mmHg


In [None]:
import pandas as pd
import numpy as np

np.random.seed(123)
n = 100
group = np.random.choice([0, 1], size=n)  # 0 = Control, 1 = Intervention
sbp_pre = np.random.normal(140, 10, n)

# Post values depend on group
sbp_post = sbp_pre + np.where(group == 0,
                              np.random.normal(0, 5, n),     # Control
                              np.random.normal(-5, 5, n))    # Intervention

df = pd.DataFrame({
    'participant_id': range(1, n+1),
    'group': group,
    'sbp_pre': sbp_pre,
    'sbp_post': sbp_post
})

df.head()

## 📈 Step 2: Visualise Change

We plot SBP before and after for each group to explore:
- Individual trajectories
- Average change in each group

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style='whitegrid')
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

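# Grey lines: individual trajectories; red points: group mean with SD bars
for g, ax in zip([0, 1], axes):
    sub = df[df['group'] == g]
    # One line per participant, drawn in a single call at x positions 0 (pre) and 1 (post)
    ax.plot([0, 1], sub[['sbp_pre', 'sbp_post']].to_numpy().T, color='grey', alpha=0.3)
    long = sub.melt(id_vars='participant_id', value_vars=['sbp_pre', 'sbp_post'])
    # errorbar='sd' needs seaborn >= 0.12; on older versions use ci='sd' instead
    sns.pointplot(data=long, x='variable', y='value', errorbar='sd', color='red', ax=ax)
    ax.set_xticks([0, 1])
    ax.set_xticklabels(['Pre', 'Post'])
    ax.set_xlabel('')
    ax.set_title('Control Group' if g == 0 else 'Intervention Group')
    ax.set_ylabel('Systolic BP (mmHg)')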

plt.tight_layout()
plt.show()

## 📏 Step 3: Paired t-tests

We compare the **within-subject change** in each group separately,
and then compare the **change between groups**.
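
A paired t-test is the same as a one-sample t-test on the per-participant differences, which is why the analysis can work directly on change scores. The sketch below illustrates this on a small simulated sample; the seed, sample size and variable names are assumptions made for illustration and are separate from the notebook's `df`.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_1samp

rng = np.random.default_rng(0)              # illustrative seed (assumption)
pre = rng.normal(140, 10, size=30)          # illustrative baseline SBP values
post = pre + rng.normal(-5, 5, size=30)     # illustrative post values

# Paired t-test on the two measurements
t_paired, p_paired = ttest_rel(post, pre)

# One-sample t-test of the differences against zero
d = post - pre
t_one, p_one = ttest_1samp(d, 0.0)

# The same statistic by hand: mean(d) / (sd(d) / sqrt(n))
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

print(f"paired:     t = {t_paired:.3f}, p = {p_paired:.4f}")
print(f"one-sample: t = {t_one:.3f}, p = {p_one:.4f}")
print(f"by hand:    t = {t_manual:.3f}")
```

All three t values agree, so the within-group tests in the next cell could equally be run on the `change` column.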

In [None]:
from scipy.stats import ttest_rel, ttest_ind

df['change'] = df['sbp_post'] - df['sbp_pre']
control_change = df[df['group'] == 0]['change']
intervention_change = df[df['group'] == 1]['change']

# Paired t-tests within each group
t_ctrl, p_ctrl = ttest_rel(df[df['group'] == 0]['sbp_post'], df[df['group'] == 0]['sbp_pre'])
t_int, p_int = ttest_rel(df[df['group'] == 1]['sbp_post'], df[df['group'] == 1]['sbp_pre'])

# Independent-samples t-test on the change scores
# (assumes equal variances by default; pass equal_var=False for Welch's test)
t_diff, p_diff = ttest_ind(intervention_change, control_change)

print("Within-group change:")
print(f"  Control: t = {t_ctrl:.2f}, p = {p_ctrl:.3f}")
print(f"  Intervention: t = {t_int:.2f}, p = {p_int:.3f}")
print("\nBetween-group difference in change:")
print(f"  t = {t_diff:.2f}, p = {p_diff:.3f}")
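
The learning goals mention effect sizes, so here is a minimal sketch of two common standardised measures for change scores: Cohen's d_z within a group (mean change divided by the SD of the change) and a pooled-SD Cohen's d between groups. The helper names `cohens_dz` and `cohens_d_between` and the demo arrays are hypothetical and not part of the original notebook.

```python
import numpy as np

def cohens_dz(change):
    """Within-group standardised mean change: mean(change) / sd(change)."""
    change = np.asarray(change, dtype=float)
    return change.mean() / change.std(ddof=1)

def cohens_d_between(a, b):
    """Between-group Cohen's d (a minus b) using the pooled standard deviation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Illustrative change scores (assumed values, not the notebook's data)
rng = np.random.default_rng(1)
control_change_demo = rng.normal(0, 5, size=50)
intervention_change_demo = rng.normal(-5, 5, size=50)

print(f"d_z (control):      {cohens_dz(control_change_demo):.2f}")
print(f"d_z (intervention): {cohens_dz(intervention_change_demo):.2f}")
print(f"d (between groups): {cohens_d_between(intervention_change_demo, control_change_demo):.2f}")
```

In the notebook itself, the same helpers could be applied to `control_change` and `intervention_change`.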

## 🧠 Step 4: Bayesian Analysis of Change

We now use a Bayesian approach to compare **change in SBP** between groups, using the same logic as before.

This time we model the difference in **change** scores.
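
Written out, the model in the code cell for this step can be read roughly as follows. This restates the priors and likelihood used in the code; the symbols $\Delta_i$ (change score of participant $i$) and $g_i$ (group of participant $i$) are notation introduced here, and the second argument of each distribution is a standard deviation, as in PyMC.

$$
\begin{aligned}
\Delta_i &= \mathrm{SBP}^{\text{post}}_i - \mathrm{SBP}^{\text{pre}}_i \\
\mu_0,\ \mu_1 &\sim \mathcal{N}(0,\ 10) \\
\sigma &\sim \text{HalfNormal}(5) \\
\Delta_i &\sim \mathcal{N}(\mu_{g_i},\ \sigma) \\
\text{diff} &= \mu_1 - \mu_0
\end{aligned}
$$

A negative `diff` means SBP fell more (or rose less) in the Intervention group than in the Control group.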

In [None]:
import pymc as pm
import arviz as az

with pm.Model() as model:
    group_idx = df['group'].to_numpy()               # Integer index (0/1) selecting each participant's group mean
    mu = pm.Normal('mu', mu=0, sigma=10, shape=2)    # Mean change for each group
    sigma = pm.HalfNormal('sigma', sigma=5)          # Shared residual SD
    y_obs = pm.Normal('y_obs', mu=mu[group_idx], sigma=sigma, observed=df['change'].to_numpy())
    diff = pm.Deterministic('diff', mu[1] - mu[0])   # Intervention minus Control
    trace = pm.sample(1000, tune=1000, return_inferencedata=True)

az.plot_posterior(trace, var_names=['diff'], ref_val=0)
plt.title("Posterior Difference in SBP Change")
plt.show()

posterior_diff = trace.posterior['diff'].values.flatten()
posterior_mean = posterior_diff.mean()
hdi = az.hdi(posterior_diff, hdi_prob=0.95)

print(f"Posterior mean difference: {posterior_mean:.2f} mmHg")
print(f"95% HDI: [{hdi[0]:.2f}, {hdi[1]:.2f}]")

### 🔁 Step 4b: Visualise Sampling Chains

In [None]:
az.plot_trace(trace, var_names=['diff'])
plt.suptitle("Trace Plot for Posterior Difference", y=1.02)
plt.show()

## ✅ Summary

You have:
- Simulated a pre/post BP trial dataset
- Visualised change by group
- Compared changes within and between groups using t-tests and a Bayesian model

**Takeaway**:  
Paired designs reduce noise and can increase power.  
Bayesian methods help understand the magnitude and certainty of change.
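
To see the noise reduction from pairing, the following sketch analyses the same simulated pre/post values twice: once as paired data and once as if the two columns came from independent samples. The seed, sample size and variable names are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import ttest_rel, ttest_ind

rng = np.random.default_rng(42)             # illustrative seed (assumption)
n_demo = 50
pre = rng.normal(140, 10, n_demo)           # baseline varies a lot between people
post = pre + rng.normal(-5, 5, n_demo)      # each person changes by about -5 mmHg

# Paired analysis: between-person variation cancels out in the differences
t_paired, p_paired = ttest_rel(post, pre)

# Unpaired analysis of the same numbers: between-person variation stays in the error term
t_unpaired, p_unpaired = ttest_ind(post, pre)

print(f"paired:   t = {t_paired:.2f}, p = {p_paired:.2g}")
print(f"unpaired: t = {t_unpaired:.2f}, p = {p_unpaired:.2g}")
```

Because pre and post values are strongly correlated within a person, the paired test has a much smaller standard error for the same mean difference.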

---

### 🔁 Optional Exercises
1. Increase the sample size to 200. What happens to the results?
2. Change the intervention effect to -10 mmHg. What do you observe?
3. Model SBP with a hierarchical structure (e.g. clinic-level effects).
