# fMRI Lab: Day 2 Validation

## HYPOTHESIS-DRIVEN

Welcome back. Today you find out whether the finding you pre-registered on Day 1 holds up in an independent dataset: data collected separately and never used in your original analysis.

---

## Before You Begin

Fill in the cell below with your pre-registered finding exactly as it appeared in your Day 1 notebook. Then run all cells from top to bottom.

- **`TOPIC`**: your assigned topic (same as Day 1)
- **`my_edge`**: the two ROI names you presented in class
- **`covariates`**: any variables you pre-registered as covariates, or `None`
- **`outlier_threshold`**: the z-score cutoff you used, or `None`
- **`subgroup`**: any subgroup filter you pre-registered, or `None`

Filling these in first commits you to your analysis plan before you see the validation results. That commitment is what makes this a real test.
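
If you want to catch a typo or a leftover placeholder before running everything, a small sanity check along these lines may help. This is only a sketch: `check_plan` is a hypothetical function, not part of `lab_helpers`, and the expected types are inferred from the example comments in the fill-in cell.

```python
def check_plan(edge, covariates, outlier_threshold, subgroup):
    """Return a list of problems with the pre-registered settings (empty if none)."""
    problems = []

    # The edge should be two different ROI names, not the template placeholders.
    if not (isinstance(edge, (tuple, list)) and len(edge) == 2
            and all(isinstance(name, str) for name in edge)):
        problems.append("my_edge should hold exactly two ROI name strings")
    elif edge[0] == edge[1]:
        problems.append("my_edge lists the same ROI twice")
    elif 'ROI_A' in edge or 'ROI_B' in edge:
        problems.append("my_edge still contains the ROI_A / ROI_B placeholder")

    # Covariates: None, or a list of column-name strings (assumed format).
    if covariates is not None and not (
            isinstance(covariates, (list, tuple))
            and all(isinstance(c, str) for c in covariates)):
        problems.append("covariates should be None or a list of column names")

    # Outlier threshold: None, or a positive z-score cutoff.
    if outlier_threshold is not None and not (
            isinstance(outlier_threshold, (int, float)) and outlier_threshold > 0):
        problems.append("outlier_threshold should be None or a positive number")

    # Subgroup: None, or a dict such as {'Sex': 0} (assumed format).
    if subgroup is not None and not isinstance(subgroup, dict):
        problems.append("subgroup should be None or a dict like {'Sex': 0}")

    return problems


# Example with the unedited template values: the placeholder edge is flagged.
for problem in check_plan(('ROI_A', 'ROI_B'), None, None, None):
    print("Check:", problem)
```

In the notebook you would call `check_plan(my_edge, covariates, outlier_threshold, subgroup)` after filling in the cell; an empty list means nothing obvious is wrong.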

In [None]:
# ============================================================
# FILL IN YOUR FINDINGS FROM DAY 1 (then run all cells)
# ============================================================

TOPIC             = 'depression'          # 'depression' or 'pain'
my_edge           = ('ROI_A', 'ROI_B')   # replace with your pre-registered edge
covariates        = None                  # e.g. ['Sleep_Quality'] or None
outlier_threshold = None                  # e.g. 2 or None
subgroup          = None                  # e.g. {'Sex': 0} or None

In [None]:
# Install packages and download data files
import subprocess, sys
subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'nilearn', 'statsmodels', '-q'])

import os, urllib.request
base_url = 'https://raw.githubusercontent.com/cmahlen/python-stats-demo/main/'
files_needed = [
    'lab_helpers.py', 'atlas_labels.txt', 'data/roi_mni_coords.npy',
    f'data/{TOPIC}_discovery.npz', f'data/{TOPIC}_validation.npz',
]
os.makedirs('data', exist_ok=True)
for f in files_needed:
    if not os.path.exists(f):
        urllib.request.urlretrieve(base_url + f, f)

import lab_helpers as helpers
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Outcome variable is determined by topic
BEHAVIOR_COL = {'depression': 'PHQ9', 'pain': 'Pain_VAS', 'anxiety': 'Anxiety_GAD7'}[TOPIC]

print("Setup complete!")
print(f"Topic: {TOPIC}  |  Outcome: {BEHAVIOR_COL}")
print(f"Edge: {my_edge[0]} <-> {my_edge[1]}")
print(f"Covariates: {covariates}  |  Outlier threshold: {outlier_threshold}  |  Subgroup: {subgroup}")

---

## Part 1: Confirm Your Discovery Result

Before touching the validation data, let's confirm your Day 1 analysis runs correctly here. The result should match what you presented in class.

In [None]:
helpers.load_dataset(TOPIC, 'discovery')

r_disc, p_disc, n_disc = helpers.test_edge(
    my_edge[0], my_edge[1],
    behavior_col=BEHAVIOR_COL,
    covariates=covariates,
    exclude_outliers=outlier_threshold,
    subgroup=subgroup
)

print(f"Discovery:  r = {r_disc:.3f},  p = {p_disc:.4f},  n = {n_disc}")

helpers.plot_edge(
    my_edge[0], my_edge[1],
    behavior_col=BEHAVIOR_COL,
    covariates=covariates,
    exclude_outliers=outlier_threshold,
    subgroup=subgroup
)

---

## Part 2: Reflection Before Validation

Before you see the validation results, take a moment to think through these questions. Write your answers in the cell below.

1. How confident are you that your finding will replicate? Why?
2. What could still go wrong, even with a pre-registered analysis?
3. How does your approach differ from the exploratory group's approach? Why might that matter?
4. Would you consider the r value from your discovery dataset a large or a small effect? How does that affect your expectations?
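
One way to think about question 4 is to ask how precisely your discovery sample pins down r. The sketch below computes an approximate confidence interval using the Fisher z-transform. It assumes a plain Pearson correlation with no covariates, and the numbers in the example are hypothetical, not lab results; `r_confidence_interval` is an illustrative name, not a `lab_helpers` function.

```python
import math
from statistics import NormalDist


def r_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r via the Fisher z-transform."""
    z = math.atanh(r)                              # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)                    # standard error of z (needs n > 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # normal quantile for the chosen level
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical example: r = 0.30 observed in n = 100 participants.
low, high = r_confidence_interval(0.30, 100)
print(f"r = 0.30, n = 100 -> 95% CI [{low:.2f}, {high:.2f}]")
print(f"variance explained: r^2 = {0.30 ** 2:.2f}")
```

A wide interval suggests the true effect could be considerably smaller than the value you observed, which is worth keeping in mind when you predict whether it will replicate.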

*Write your thoughts here before running the next section.*

---

## Part 3: Validate Your Finding

Now we test the same edge, with the exact same analytic choices, in the independent validation dataset. This dataset was not used at any point in your Day 1 analysis.

You do not need to understand every line of code below; just run the cell and focus on the output.

In [None]:
helpers.load_dataset(TOPIC, 'validation')

r_val, p_val, n_val = helpers.test_edge(
    my_edge[0], my_edge[1],
    behavior_col=BEHAVIOR_COL,
    covariates=covariates,
    exclude_outliers=outlier_threshold,
    subgroup=subgroup
)

print(f"Validation: r = {r_val:.3f},  p = {p_val:.4f},  n = {n_val}")

helpers.plot_edge(
    my_edge[0], my_edge[1],
    behavior_col=BEHAVIOR_COL,
    covariates=covariates,
    exclude_outliers=outlier_threshold,
    subgroup=subgroup
)

In [None]:
# Replication verdict
same_direction = (r_disc is not None) and (r_val is not None) and (r_disc * r_val > 0)

print("=" * 55)
print(f"  Discovery:  r = {r_disc:.3f},  p = {p_disc:.4f}")
print(f"  Validation: r = {r_val:.3f},  p = {p_val:.4f}")
print("=" * 55)

if p_val < 0.05 and same_direction:
    print("  REPLICATED: significant in the same direction")
elif p_val < 0.05 and not same_direction:
    print("  FLIPPED: significant, but direction reversed")
else:
    print(f"  Did not replicate (p = {p_val:.4f} in validation)")
print("=" * 55)
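
The verdict above only looks at whether the validation p-value crosses 0.05. A complementary question is whether the two r values actually differ from each other. The sketch below uses a Fisher z-test for two independent correlations. It assumes plain Pearson correlations from separate samples, and `compare_correlations` is a hypothetical name, not a `lab_helpers` function. The example numbers are made up.

```python
import math
from statistics import NormalDist


def compare_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent Pearson r values."""
    z_diff = math.atanh(r1) - math.atanh(r2)          # difference on the Fisher z scale
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # standard error of that difference
    z = z_diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))            # two-sided p-value
    return z, p


# Hypothetical example: r = 0.35 in discovery, r = 0.15 in validation, n = 100 each.
z, p = compare_correlations(0.35, 100, 0.15, 100)
print(f"z = {z:.2f}, p = {p:.3f}")
```

In the notebook you could call `compare_correlations(r_disc, n_disc, r_val, n_val)`. A validation result can miss significance without being reliably different from the discovery result, so the two checks answer different questions.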

---

## Part 4: Your Turn

Pick one of the edges that did **not** reach significance in your Day 1 analysis and test it in the validation dataset. Does it reach significance here, or does it stay null?

<details><summary>Hint</summary>

```python
helpers.load_dataset(TOPIC, 'validation')
r, p, n = helpers.test_edge('ROI_A', 'ROI_B', behavior_col=BEHAVIOR_COL)
print(f"r = {r:.3f}, p = {p:.4f}, n = {n}")
```

</details>

In [None]:
# Your code here


---

## Part 5: Reflection Questions

1. Did your finding replicate? Does the outcome match what you predicted in Part 2?

2. Pre-registration means committing to your analysis before seeing the data. How did that feel compared to the flexibility the exploratory group had?

3. Even with pre-registration, there are still researcher degrees of freedom (choice of covariate, outlier threshold, subgroup). How did you decide on yours? Could those choices have been influenced by wanting to find something?

4. Effect sizes: your r values are probably in the 0.2 to 0.4 range, which psychology and neuroscience conventionally treat as small-to-medium effects (an r of 0.3 explains about 9% of the variance). What does that mean for how confident we should be in any single study?
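
To make question 4 concrete, the sketch below estimates how many participants a study would need to detect a true correlation of a given size. It uses the standard Fisher z approximation for a two-sided test on a plain Pearson correlation; `n_for_power` is an illustrative name, and the default values for alpha and power are conventional choices rather than anything specified by this lab.

```python
import math
from statistics import NormalDist


def n_for_power(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a true correlation r (two-sided test)."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_beta = norm.inv_cdf(power)            # quantile matching the desired power
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


for r in (0.2, 0.3, 0.4):
    print(f"true r = {r:.1f}: about {n_for_power(r)} participants for 80% power")
```

Compare these numbers with `n_disc` and `n_val` from your own output. If your samples are smaller than the estimate for your effect size, a failed replication is not strong evidence that the effect is absent.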