# 04 — Statistical Arbitrage Overlay

This notebook implements Part 4:
- Screen candidate pairs by correlation of returns
- Test the candidates for cointegration (Engle–Granger)
- Visualize spreads and z-scores

The deliverable is the analysis plus a mathematically justified overlay idea; full code integration is not required.

In [None]:
import pandas as pd

from at.statsarb.pairs import top_correlated_pairs, cointegration_scan
from at.utils.paths import get_paths

In [None]:
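paths = get_paths()
# Load the processed feature table (long format: one row per date and ticker).
df = pd.read_parquet(paths.data_processed / 'features.parquet')
# Pivot to a wide close-price matrix: rows are dates, columns are tickers.
prices = df.pivot(index='date', columns='ticker', values='close').sort_index()
prices.shape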

In [None]:
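# Screen the top 50 pairs by Spearman (rank) correlation.
# Price levels are passed in, while the screen is meant to run on returns,
# so check that the helper converts prices to returns internally.
pairs = top_correlated_pairs(prices, top_k=50, method='spearman')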
pairs.head(10)
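
The internals of `top_correlated_pairs` are not shown in this notebook. As a rough illustration of the screening step, the sketch below ranks pairs by Spearman correlation of daily log returns. The function name `rank_pairs_by_return_corr`, the output column names, and the synthetic data are assumptions made for this example, not the project's API.

```python
import numpy as np
import pandas as pd


def rank_pairs_by_return_corr(prices: pd.DataFrame, top_k: int = 50) -> pd.DataFrame:
    """Rank ticker pairs by Spearman correlation of daily log returns."""
    # Correlate returns, not price levels: trending prices can look
    # correlated even when their day-to-day moves are unrelated.
    # Rows with any missing return are dropped (a simplification).
    returns = np.log(prices).diff().dropna(how="any")

    # Spearman correlation equals Pearson correlation computed on ranks.
    corr = returns.rank().corr().to_numpy()

    # Keep each unordered pair once (upper triangle, excluding the diagonal).
    i, j = np.triu_indices(corr.shape[0], k=1)
    tickers = prices.columns.to_numpy()
    out = pd.DataFrame(
        {"ticker_a": tickers[i], "ticker_b": tickers[j], "corr": corr[i, j]}
    )
    return out.sort_values("corr", ascending=False).head(top_k).reset_index(drop=True)


# Synthetic example: A and B share a common factor, C is independent.
rng = np.random.default_rng(0)
common = rng.normal(0.0, 0.01, 300)
rets = pd.DataFrame({
    "A": common + rng.normal(0.0, 0.002, 300),
    "B": common + rng.normal(0.0, 0.002, 300),
    "C": rng.normal(0.0, 0.01, 300),
})
demo_prices = 100 * np.exp(rets.cumsum())
top = rank_pairs_by_return_corr(demo_prices, top_k=3)
print(top)
```

High return correlation is only a screen. It says nothing about whether the price spread is stationary, which is why the cointegration test follows.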

In [None]:
# Run the Engle–Granger cointegration test on at most 30 of the screened pairs.
coint = cointegration_scan(prices, pairs, max_pairs=30)
coint.head(10)
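
For the spread and z-score step, a minimal sketch is given below. It assumes the spread is the residual of an OLS regression of one log price on the other (the first step of Engle–Granger) and that the z-score uses a rolling window. The function name `spread_zscore`, the 60-day window, and the synthetic pair are illustrative assumptions, not part of the `at.statsarb` package.

```python
import numpy as np
import pandas as pd


def spread_zscore(y: pd.Series, x: pd.Series, window: int = 60) -> pd.DataFrame:
    """Hedge-ratio spread and rolling z-score for one candidate pair."""
    # Align the two price series and drop dates where either is missing.
    pair = pd.concat([y, x], axis=1, keys=["y", "x"]).dropna()
    log_y, log_x = np.log(pair["y"]), np.log(pair["x"])

    # OLS of log(y) on log(x) with an intercept gives the hedge ratio beta.
    design = np.column_stack([np.ones(len(log_x)), log_x.to_numpy()])
    (alpha, beta), *_ = np.linalg.lstsq(design, log_y.to_numpy(), rcond=None)

    # The spread is the regression residual; the z-score standardizes it
    # against its own rolling mean and standard deviation.
    spread = log_y - (alpha + beta * log_x)
    roll = spread.rolling(window)
    zscore = (spread - roll.mean()) / roll.std()
    return pd.DataFrame({"spread": spread, "zscore": zscore, "beta": beta})


# Synthetic cointegrated pair: log(y) = 0.3 + 1.5 * log(x) + stationary noise.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=500)
log_x = np.log(50) + np.cumsum(rng.normal(0.0, 0.01, 500))
log_y = 0.3 + 1.5 * log_x + rng.normal(0.0, 0.02, 500)
x = pd.Series(np.exp(log_x), index=dates)
y = pd.Series(np.exp(log_y), index=dates)

result = spread_zscore(y, x)
print(result.tail())
```

Calling `result[["spread", "zscore"]].plot(subplots=True)` would give a quick visual check, assuming matplotlib is installed. The hedge ratio here is estimated on the full sample, so the z-score carries look-ahead bias and is suitable for exploration only, not for backtesting.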

## Overlay idea (write-up)

For example, if the main strategy is long-only, statistical arbitrage can be layered on as:
- A hedging overlay: when a strongly cointegrated pair diverges, reduce gross exposure
- A relative-value sleeve: allocate a small risk budget to mean-reversion trades
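
One possible way to express the hedging overlay as a rule is sketched below: gross exposure stays at 100% while the pair's z-score is inside an entry band, then shrinks linearly to a floor as the divergence grows. The function name `gross_exposure_scale` and the thresholds (entry at 2, cap at 4, floor at 50%) are hypothetical values chosen for illustration, not calibrated parameters.

```python
import numpy as np


def gross_exposure_scale(z, z_entry=2.0, z_cap=4.0, floor=0.5):
    """Map a spread z-score to a multiplier on gross exposure.

    Returns 1.0 while |z| <= z_entry, falls linearly to `floor`
    at |z| = z_cap, and stays at `floor` beyond that.
    """
    # Fraction of the way from the entry threshold to the cap, clipped to [0, 1].
    excess = np.clip((np.abs(z) - z_entry) / (z_cap - z_entry), 0.0, 1.0)
    return 1.0 - (1.0 - floor) * excess


# Example: z-scores of 0, 2, 3, 4 and -10 for a monitored pair.
scales = gross_exposure_scale(np.array([0.0, 2.0, 3.0, 4.0, -10.0]))
print(scales)
```

The rule is symmetric in the sign of the z-score because, for a long-only book, a divergence in either direction is treated here as a sign of stress rather than as a directional view.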

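Justify the overlay with spread stationarity tests (for example, an ADF test on the spread), an estimate of the mean-reversion half-life, and a check of regime dependence (whether the relationship holds across subperiods).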
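
The half-life can be estimated with a short regression. The sketch below assumes the spread follows an AR(1) process, fits the change in the spread against its lagged level, and converts the implied persistence into a half-life in periods. The function name `half_life` and the simulated series are assumptions made for this example.

```python
import numpy as np


def half_life(spread) -> float:
    """Mean-reversion half-life (in periods) from an AR(1) fit.

    Fits  s_t - s_{t-1} = a + b * s_{t-1} + e_t  by OLS, sets phi = 1 + b,
    and returns ln(2) / -ln(phi). Returns inf when phi is outside (0, 1),
    meaning the fit shows no monotone mean reversion.
    """
    s = np.asarray(spread, dtype=float)
    lagged, delta = s[:-1], np.diff(s)
    design = np.column_stack([np.ones(len(lagged)), lagged])
    (_, b), *_ = np.linalg.lstsq(design, delta, rcond=None)
    phi = 1.0 + b
    if not 0.0 < phi < 1.0:
        return float("inf")
    return float(-np.log(2.0) / np.log(phi))


# Simulated AR(1) spread with phi = 0.9.
# Theoretical half-life: ln(2) / -ln(0.9), about 6.6 periods.
rng = np.random.default_rng(0)
sim = np.zeros(20_000)
for t in range(1, len(sim)):
    sim[t] = 0.9 * sim[t - 1] + rng.normal()

estimated = half_life(sim)
print(estimated)
```

Re-estimating the half-life on separate subperiods is one simple way to look at regime dependence: a half-life that changes sharply between windows suggests the relationship is not stable.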