# 04 · Stats Primer
Quick reference on confidence intervals, multiple testing (Benjamini–Hochberg), and effect sizes.

## Bootstrap Confidence Intervals
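We use percentile intervals from 1000 bootstrap resamples:
1. Sample *n* evaluation pairs with replacement, where *n* is the original number of pairs.
2. Recompute AUROC/AUPRC on the resample.
3. After repeating steps 1 and 2 for every resample, take the 2.5th and 97.5th percentiles of the recomputed values as the 95% CI.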

In [None]:
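import numpy as np
from biofm.eval.metrics import bootstrap_metric, compute_auroc
# Toy evaluation pairs: binary labels and the corresponding model scores.
labels = np.array([0, 1, 0, 1, 0, 1])
scores = np.array([0.2, 0.8, 0.1, 0.9, 0.4, 0.6])
# This demo draws 100 resamples; the intervals described above use 1000.
bootstrap_metric(compute_auroc, labels, scores, n_bootstrap=100, seed=7)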
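
For readers without the project package at hand, the three steps above can be sketched in plain NumPy. This is not the `biofm` implementation, whose internals are not shown here. The names `auroc` and `percentile_bootstrap_ci` are hypothetical, the toy data is made up for illustration, and skipping single-class resamples is an assumption of this sketch rather than documented behaviour.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]  # every positive score minus every negative score
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

def percentile_bootstrap_ci(metric, labels, scores, n_bootstrap=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(labels, scores)."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    stats = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)  # step 1: sample pairs with replacement
        y, s = labels[idx], scores[idx]
        if y.min() == y.max():
            continue  # assumption: skip resamples with one class, where AUROC is undefined
        stats.append(metric(y, s))        # step 2: recompute the metric
    # step 3: the alpha/2 and 1 - alpha/2 percentiles bound the interval
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Made-up toy data with some overlap between the two classes.
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
scores = np.array([0.1, 0.3, 0.35, 0.6, 0.5, 0.4, 0.7, 0.8, 0.9, 0.55])
point = auroc(labels, scores)
lo, hi = percentile_bootstrap_ci(auroc, labels, scores, n_bootstrap=1000, seed=7)
print(f"AUROC = {point:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

With so few pairs the interval is wide. Resampling pairs, rather than labels and scores separately, keeps each label attached to its score.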

## Benjamini–Hochberg FDR
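1. Sort p-values ascending, so that `p_1 <= p_2 <= ... <= p_m`.
2. For each rank *i*, compute the threshold `i/m * q` (where *m* is the number of tests and *q* is the target FDR).
3. Find the largest *i* where `p_i <= i/m * q`; declare the hypotheses ranked 1 through *i* significant, including any whose p-value exceeds its own threshold. If no such *i* exists, declare none significant.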

In [None]:
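def benjamini_hochberg(p_values, q=0.1):
    """Return one boolean per hypothesis, in input order (True = significant)."""
    m = len(p_values)
    # Pair each p-value with its original position, then sort ascending by p-value.
    ranked = sorted((p, i) for i, p in enumerate(p_values))
    threshold = 0  # largest rank whose p-value passes; 0 means nothing is significant
    for rank, (p, _) in enumerate(ranked, 1):
        if p <= rank / m * q:
            threshold = rank
    # Compare each hypothesis's rank (not its original index) against the threshold.
    significant = [False] * m
    for rank, (_, i) in enumerate(ranked, 1):
        significant[i] = rank <= threshold
    return significant
benjamini_hochberg([0.01, 0.02, 0.2, 0.15])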
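
To make the per-rank comparison visible, the sketch below tabulates each sorted p-value against its threshold `i/m * q` for the four p-values used above. It is a vectorized NumPy variant written for illustration; `bh_table` is a hypothetical name, not part of any library.

```python
import numpy as np

def bh_table(p_values, q=0.1):
    """Return sorted p-values, their BH thresholds, and rejections in input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                    # positions of p-values, smallest first
    ranks = np.arange(1, m + 1)
    thresholds = ranks / m * q               # i/m * q for rank i
    passes = p[order] <= thresholds
    k = ranks[passes].max() if passes.any() else 0  # largest passing rank
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True                 # ranks 1..k are significant
    return p[order], thresholds, reject

sorted_p, thresholds, reject = bh_table([0.01, 0.02, 0.2, 0.15], q=0.1)
for rank, (p, t) in enumerate(zip(sorted_p, thresholds), 1):
    print(f"rank {rank}: p = {p:.2f}, threshold = {t:.3f}, passes: {p <= t}")
print(reject.tolist())  # [True, True, False, False]

# Step-up behaviour: only rank 3 passes its own threshold, yet all three are rejected.
_, _, step_up = bh_table([0.03, 0.04, 0.045], q=0.05)
print(step_up.tolist())  # [True, True, True]
```

The second call shows why the procedure looks for the largest passing rank instead of stopping at the first failure.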

## Effect Sizes
For binary outcomes we often report:
- Cohen's *d* for continuous biomarkers (the standardized mean difference between the two outcome groups)
- Risk difference or odds ratio for clinical metrics

Always accompany the effect size with its uncertainty estimate (an analytic CI or a bootstrap interval).
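
As a rough illustration, the three effect sizes can be computed as below. The function names and the toy numbers are hypothetical. Cohen's *d* is taken here with the pooled sample standard deviation, which is one common convention and an assumption of this sketch.

```python
import numpy as np

def cohens_d(x, y):
    """Standardized mean difference between two groups, using the pooled sample SD."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)

def risk_difference(events_a, n_a, events_b, n_b):
    """Difference in event proportions between group A and group B."""
    return events_a / n_a - events_b / n_b

def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds of the event in group A divided by the odds in group B."""
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return odds_a / odds_b

# Made-up biomarker values for the two outcome groups.
d = cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
# Made-up counts: 30 of 100 events in group A, 20 of 100 in group B.
rd = risk_difference(30, 100, 20, 100)
oratio = odds_ratio(30, 100, 20, 100)
print(f"d = {d:.2f}, risk difference = {rd:.2f}, odds ratio = {oratio:.2f}")
```

These are point estimates only. An interval for each can be obtained by resampling the underlying observations, as in the bootstrap section.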