# CUSUM Regime Detection on S&P 500
## GARCH-Filtered Residuals · Portfolio-Level Signal · Calibrated Cooldown · Backtest

**Methodology:**
1. Simulate S&P 500-like returns with known regime structure (swap `yfinance` in for live data)
2. Fit GARCH(1,1) and use standardised residuals to remove volatility clustering
3. Run two-sided CUSUM with threshold *h* calibrated to a target ARL
4. Calibrate cooldown period from Monte Carlo inter-signal gap distribution
5. Backtest as regime-switching strategy and evaluate signal quality vs true regimes


## 0. Imports & Style

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import matplotlib.gridspec as gridspec
import warnings
warnings.filterwarnings("ignore")
from scipy import stats
from scipy.optimize import minimize

np.random.seed(42)
plt.rcParams.update({
    "figure.facecolor": "#0f1117", "axes.facecolor": "#0f1117",
    "axes.edgecolor": "#444", "axes.labelcolor": "#ccc",
    "xtick.color": "#aaa", "ytick.color": "#aaa", "text.color": "#eee",
    "grid.color": "#2a2a2a", "grid.linestyle": "--",
    "lines.linewidth": 1.4, "font.family": "monospace",
})
ACCENT = "#00d4aa"; RED = "#ff4d6d"; YELLOW = "#ffd166"
BLUE = "#4cc9f0"; PURPLE = "#9d4edd"
print("Imports OK")


## 1. Data — Simulated S&P 500 (2000-2024)

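Returns are simulated with an **explicit regime structure** stylised to resemble S&P 500 history.
Each regime has its own (mu, sigma) and its own GARCH(1,1) volatility dynamics (alpha, beta).
The table lists the main regimes; the code adds shorter recovery phases between them.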

| Regime | Characteristics |
|---|---|
| Dot-com bust 2000-2002 | Negative drift, elevated vol |
| Bull 2004-2007 | Low vol, steady gains |
| GFC crash 2008-2009 | Severe crash, extreme vol |
| QE recovery 2010-2019 | Long bull market |
| COVID crash Q1 2020 | Fastest bear on record |
| Rate hike bear 2022 | Negative, elevated vol |
| AI bull 2023-2024 | Strong positive |

> **Real data**: Uncomment the `yfinance` block and comment out `simulate_sp500()`.
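
The simulator in the next cell pins each regime's long-run volatility by setting the GARCH intercept to omega = sigma^2 (1 - alpha - beta). The sketch below illustrates why that works, using the GFC row's parameters: iterating the expected-variance recursion E[h_{t+1}] = omega + (alpha + beta) E[h_t] converges to sigma^2 from any starting value. The helper name `expected_variance_path` is hypothetical and not part of the notebook.

```python
import numpy as np

def expected_variance_path(sigma, alpha, beta, h0, n_steps=200):
    """Iterate E[h_{t+1}] = omega + (alpha + beta) * E[h_t] with variance targeting."""
    omega = sigma**2 * (1.0 - alpha - beta)   # intercept implied by the target variance
    path = np.empty(n_steps)
    h = h0
    for t in range(n_steps):
        h = omega + (alpha + beta) * h
        path[t] = h
    return path

# GFC-like regime: daily sigma 2.5%, alpha 0.18, beta 0.76 (persistence 0.94)
sigma, alpha, beta = 0.025, 0.18, 0.76
path = expected_variance_path(sigma, alpha, beta, h0=4 * sigma**2)  # start at twice the target vol

print(f"target daily vol : {sigma:.4f}")
print(f"after 200 steps  : {np.sqrt(path[-1]):.4f}")
print(f"annualised vol   : {sigma * np.sqrt(252):.1%}")
```

The gap to the target shrinks by a factor of alpha + beta each step, so regimes with higher persistence take longer to settle at their stated sigma.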


In [None]:
# ---- Real data swap-in (requires: pip install yfinance arch) ----
# import yfinance as yf
# raw     = yf.download("^GSPC", start="2000-01-01", end="2024-12-31", auto_adjust=True)
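# prices  = raw["Close"].squeeze().dropna()   # squeeze(): newer yfinance may return a one-column DataFrame
# returns = np.log(prices / prices.shift(1)).dropna()
# true_regimes = None   # no ground-truth labels: skip the regime shading and signal-quality cells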
# -----------------------------------------------------------------

def simulate_sp500(seed=42):
    """Simulate S&P 500-like daily log-returns with realistic regime structure."""
    rng = np.random.default_rng(seed)
    # (label, trading_days, daily_mu, daily_sigma, GARCH_alpha, GARCH_beta)
    regimes = [
        ("Dot-com bust",   504, -0.0008, 0.016, 0.12, 0.82),
        ("Recovery 2003",  252,  0.0005, 0.010, 0.07, 0.88),
        ("Bull 2004-07",  1008,  0.0004, 0.008, 0.06, 0.90),
        ("GFC crash",      504, -0.0015, 0.025, 0.18, 0.76),
        ("QE recovery",   2520,  0.0005, 0.009, 0.07, 0.89),
        ("COVID crash",     60, -0.0040, 0.040, 0.22, 0.70),
        ("COVID recovery", 192,  0.0015, 0.015, 0.10, 0.84),
        ("Bull 2021",      252,  0.0006, 0.010, 0.06, 0.90),
        ("Rate hike bear", 252, -0.0008, 0.017, 0.13, 0.81),
        ("AI bull",        504,  0.0007, 0.010, 0.07, 0.89),
    ]
    all_rets, all_labels, all_dates = [], [], []
    date = pd.Timestamp("2000-01-03")
    for label, ndays, mu, sigma, alpha, beta in regimes:
        h = sigma**2
        omega = sigma**2 * (1 - alpha - beta)
        eps = rng.standard_normal(ndays)
        rets = np.zeros(ndays)
        for t in range(ndays):
            # GARCH recursion on the squared innovation (return minus regime mean)
            h = omega + alpha*((rets[t-1] - mu)**2 if t > 0 else 0) + beta*h
            h = max(h, 1e-8)
            rets[t] = mu + np.sqrt(h)*eps[t]
        for r in rets:
            while date.weekday() >= 5:
                date += pd.Timedelta(days=1)
            all_rets.append(r); all_labels.append(label); all_dates.append(date)
            date += pd.Timedelta(days=1)
    idx = pd.DatetimeIndex(all_dates)
    return (pd.Series(all_rets, index=idx, name="SP500"),
            pd.Series(all_labels, index=idx, name="regime"))

returns, true_regimes = simulate_sp500()
price_index = 1000 * np.exp(returns.cumsum())

print(f"Date range    : {returns.index[0].date()} to {returns.index[-1].date()}")
print(f"Observations  : {len(returns):,}")
print(f"Ann. return   : {returns.mean()*252:.2%}")
print(f"Ann. vol      : {returns.std()*np.sqrt(252):.2%}")
print(f"Skewness      : {returns.skew():.3f}")
print(f"Excess kurt.  : {returns.kurt():.3f}")


In [None]:
regime_palette = {
    "Dot-com bust": "#ff4d6d22", "Recovery 2003": "#00d4aa22",
    "Bull 2004-07": "#00d4aa22", "GFC crash": "#ff4d6d22",
    "QE recovery":  "#00d4aa22", "COVID crash": "#ff4d6d22",
    "COVID recovery": "#ffd16622", "Bull 2021": "#00d4aa22",
    "Rate hike bear": "#ff4d6d22", "AI bull": "#00d4aa22",
}

fig, axes = plt.subplots(2, 1, figsize=(16, 8), sharex=True)

ax = axes[0]
ax.plot(price_index.index, price_index.values, color=ACCENT, lw=1.2)
ax.set_ylabel("Price Index (log)"); ax.set_yscale("log")
ax.set_title("Simulated S&P 500 — Price Index 2000-2024", fontsize=13)
prev_r = true_regimes.iloc[0]; seg_start = true_regimes.index[0]
for i, (dt, r) in enumerate(true_regimes.items()):
    if r != prev_r or i == len(true_regimes)-1:
        ax.axvspan(seg_start, dt, color=regime_palette.get(prev_r, "#fff1"), lw=0)
        seg_start = dt; prev_r = r
ax.grid(True, alpha=0.4)

ax = axes[1]
ax.fill_between(returns.index, returns.values, 0,
    where=returns.values >= 0, color=ACCENT, alpha=0.6, lw=0)
ax.fill_between(returns.index, returns.values, 0,
    where=returns.values < 0, color=RED, alpha=0.6, lw=0)
ax.set_ylabel("Log Return"); ax.set_title("Daily Log Returns"); ax.grid(True, alpha=0.4)
plt.tight_layout(); plt.show()


## 2. GARCH(1,1) Filtering

Raw returns exhibit **volatility clustering** — sustained high-vol periods create large
deviations that trigger CUSUM even without a mean regime change.

We fit GARCH(1,1) by maximum likelihood and use the **standardised residuals**:
$$z_t = \varepsilon_t / \sigma_t, \quad \text{where } \varepsilon_t = r_t - \mu \text{ and } \sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2$$

After filtering, $z_t \approx \mathcal{N}(0,1)$ under stable conditions. Regime changes
then appear as persistent shifts in the *mean* of $z_t$.
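
To make the effect of filtering concrete, here is a minimal sketch on synthetic data. It simulates a zero-mean GARCH(1,1) series, divides by the true conditional volatility, and compares the lag-1 autocorrelation of squared values before and after. The parameter values and helper names are assumptions for illustration. In the notebook the volatility is estimated, not known, so the clean-up will be less exact.

```python
import numpy as np

def simulate_garch(n, omega, alpha, beta, seed=0):
    """Simulate zero-mean GARCH(1,1) returns; also return the true conditional variance."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    r = np.zeros(n); h = np.zeros(n)
    h[0] = omega / (1.0 - alpha - beta)        # start at the unconditional variance
    r[0] = np.sqrt(h[0]) * eps[0]
    for t in range(1, n):
        h[t] = omega + alpha * r[t-1]**2 + beta * h[t-1]
        r[t] = np.sqrt(h[t]) * eps[t]
    return r, h

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    x = x - x.mean()
    return float((x[1:] * x[:-1]).sum() / (x * x).sum())

# Illustrative parameters (assumed, not the fitted values from this notebook)
r, h = simulate_garch(n=20000, omega=1e-6, alpha=0.10, beta=0.85, seed=0)
z = r / np.sqrt(h)                              # standardised residuals

acf_raw = lag1_autocorr(r**2)
acf_z = lag1_autocorr(z**2)
print(f"lag-1 ACF of r^2: {acf_raw:.3f}")
print(f"lag-1 ACF of z^2: {acf_z:.3f}")
print(f"std of z        : {z.std():.3f}")
```

Squared raw returns are clearly autocorrelated, while squared residuals should sit near zero. That is the property the ACF panel in the diagnostics cell checks on the fitted model.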


In [None]:
def garch_loglik(params, r):
    """Negative log-likelihood for GARCH(1,1) with Gaussian innovations."""
    mu, omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha+beta >= 1:
        return 1e10
    n = len(r); e = r - mu
    h = np.full(n, np.var(r))
    ll = 0.0
    for t in range(1, n):
        h[t] = omega + alpha*e[t-1]**2 + beta*h[t-1]
        if h[t] <= 0: return 1e10
        ll += -0.5*(np.log(2*np.pi) + np.log(h[t]) + e[t]**2/h[t])
    return -ll

def fit_garch(r):
    """Fit GARCH(1,1) via L-BFGS-B. Returns params, conditional vol, z-residuals."""
    rv = np.var(r)
    x0 = [np.mean(r), rv*0.05, 0.08, 0.88]
    bnds = [(None,None),(1e-7,None),(1e-5,0.5),(1e-5,0.998)]
    res = minimize(garch_loglik, x0, args=(r,), method="L-BFGS-B",
                   bounds=bnds, options={"maxiter":5000,"ftol":1e-10})
    mu, omega, alpha, beta = res.x
    n = len(r); e = r - mu
    h = np.full(n, np.var(r))
    for t in range(1, n):
        h[t] = max(omega + alpha*e[t-1]**2 + beta*h[t-1], 1e-8)
    sigma = np.sqrt(h)
    z = e / sigma
    return {"mu":mu,"omega":omega,"alpha":alpha,"beta":beta}, sigma, z

r_arr = returns.values
garch_params, cond_vol, z_resid = fit_garch(r_arr)
cond_vol_s = pd.Series(cond_vol, index=returns.index)
z_s        = pd.Series(z_resid,  index=returns.index)

gp = garch_params
print("GARCH(1,1) fitted parameters:")
print(f"  mu    = {gp['mu']:.6f}  ({gp['mu']*252:.3%} annualised)")
print(f"  omega = {gp['omega']:.3e}")
print(f"  alpha = {gp['alpha']:.4f}  (ARCH effect)")
print(f"  beta  = {gp['beta']:.4f}  (GARCH effect)")
print(f"  persistence (alpha+beta) = {gp['alpha']+gp['beta']:.4f}  (<1 = covariance stationary)")
uncond = np.sqrt(gp["omega"]/(1-gp["alpha"]-gp["beta"])) * np.sqrt(252)
print(f"  Unconditional ann. vol   = {uncond:.2%}")
print(f"\nResiduals z_t: mean={z_resid.mean():.4f}, std={z_resid.std():.4f}, "
      f"kurt={pd.Series(z_resid).kurt():.2f}")


In [None]:
fig, axes = plt.subplots(2, 2, figsize=(16, 8))

# Conditional vol
ax = axes[0,0]
v = cond_vol_s * np.sqrt(252) * 100
ax.plot(v.index, v.values, color=YELLOW, lw=0.8)
ax.fill_between(v.index, v.values, alpha=0.3, color=YELLOW)
ax.set_title("Conditional Volatility (Ann. %)"); ax.set_ylabel("Vol %"); ax.grid(True, alpha=0.4)

# Standardised residuals
ax = axes[0,1]
ax.plot(z_s.index, z_s.values, color=BLUE, lw=0.4, alpha=0.8)
ax.axhline(0, color="white", lw=0.8, ls="--")
ax.axhline( 3, color=RED, lw=0.8, ls=":"); ax.axhline(-3, color=RED, lw=0.8, ls=":")
ax.set_title("Standardised Residuals z_t"); ax.set_ylabel("z"); ax.grid(True, alpha=0.4)

# QQ plot
ax = axes[1,0]
qq = stats.probplot(z_resid)
ax.scatter(qq[0][0], qq[0][1], color=ACCENT, s=2, alpha=0.4)
lim = max(abs(qq[0][0].min()), abs(qq[0][0].max()))
ax.plot([-lim,lim],[qq[1][0]*(-lim)+qq[1][1], qq[1][0]*lim+qq[1][1]], color=RED, lw=1.5)
ax.set_title("QQ Plot (vs Normal)"); ax.set_xlabel("Theoretical Quantiles"); ax.grid(True, alpha=0.4)

# ACF of z^2 (ARCH effects)
ax = axes[1,1]
z2 = z_resid**2; lags = range(1,31)
acf = [pd.Series(z2).autocorr(lag=l) for l in lags]
ci  = 1.96/np.sqrt(len(z2))
ax.bar(lags, acf, color=PURPLE, alpha=0.7)
ax.axhline(ci, color=YELLOW, ls="--", lw=1, label="95% CI")
ax.axhline(-ci, color=YELLOW, ls="--", lw=1)
ax.set_title("ACF of z^2  (residual ARCH effects?)"); ax.set_xlabel("Lag"); ax.legend()
ax.grid(True, alpha=0.4)

plt.suptitle("GARCH(1,1) Diagnostics", fontsize=14, y=1.01)
plt.tight_layout(); plt.show()


## 3. CUSUM Engine

Two-sided CUSUM on the GARCH standardised residuals:

$$S^+_t = \max(0,\; S^+_{t-1} + z_t - k) \qquad \text{detects upward shift}$$
$$S^-_t = \max(0,\; S^-_{t-1} - z_t - k) \qquad \text{detects downward shift}$$

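- Signal fires when $S^\pm_t > h$ (the upward side is checked first if both exceed $h$)
- After a signal both statistics **reset to 0**
- **Cooldown** suppresses new signals for `cooldown` observations after a signal; the statistics keep accumulating during that window, so a signal can fire as soon as it ends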
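
A small worked example may help before the full engine. The sketch below applies the two recursions to five made-up residuals with illustrative values k = 0.5 and h = 2.5 (not the calibrated ones); `cusum_step` is a hypothetical helper, separate from the notebook's `run_cusum`.

```python
def cusum_step(s_pos, s_neg, z, k):
    """One update of the two-sided CUSUM recursions."""
    return max(0.0, s_pos + z - k), max(0.0, s_neg - z - k)

z_toy = [0.2, 1.1, 1.4, 0.9, 1.6]   # made-up residuals drifting upward
k, h = 0.5, 2.5                      # illustrative values, not the calibrated ones
s_pos = s_neg = 0.0
fired_at = None
trace = []
for t, z in enumerate(z_toy):
    s_pos, s_neg = cusum_step(s_pos, s_neg, z, k)
    trace.append((round(s_pos, 3), round(s_neg, 3)))
    if fired_at is None and (s_pos > h or s_neg > h):
        fired_at = t
        s_pos = s_neg = 0.0          # reset both sides after a signal
print(trace)
print("first signal at t =", fired_at)
```

Here S+ climbs through 0, 0.6, 1.5, 1.9 and 3.0, crossing h on the last observation, while S- stays at 0 throughout. The first value is absorbed because 0.2 is below the allowance k.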


In [None]:
def run_cusum(z, h, k, cooldown=0):
    """
    Two-sided CUSUM with cooldown.
    Returns dict: S_pos, S_neg, signals (1=up, -1=down, 0=none)
    """
    z = np.asarray(z); n = len(z)
    S_pos = np.zeros(n); S_neg = np.zeros(n)
    signals = np.zeros(n, dtype=int)
    last_signal = -cooldown - 1
    for t in range(1, n):
        S_pos[t] = max(0, S_pos[t-1] + z[t] - k)
        S_neg[t] = max(0, S_neg[t-1] - z[t] - k)
        if (t - last_signal) > cooldown:
            if S_pos[t] > h:
                signals[t] = 1; last_signal = t
                S_pos[t] = 0; S_neg[t] = 0
            elif S_neg[t] > h:
                signals[t] = -1; last_signal = t
                S_pos[t] = 0; S_neg[t] = 0
    return {"S_pos": S_pos, "S_neg": S_neg, "signals": signals}

def compute_arl(h, k, n_sim=3000, n_obs=2000, seed=0):
    """Monte Carlo ARL under in-control conditions (z ~ N(0,1)).

    Runs that never signal are censored at n_obs, which biases the estimate
    downward when the true ARL is close to n_obs.
    """
    rng = np.random.default_rng(seed)
    rls = []
    for _ in range(n_sim):
        z = rng.standard_normal(n_obs)
        Sp = Sn = 0.0
        for t, zt in enumerate(z):
            Sp = max(0, Sp + zt - k); Sn = max(0, Sn - zt - k)
            if Sp > h or Sn > h: rls.append(t + 1); break   # run length counts observations, so t + 1
        else: rls.append(n_obs)
    return np.mean(rls)

print("CUSUM engine defined.")


## 4. Calibration — Threshold h and Cooldown
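
As a cross-check on the Monte Carlo calibration in this section, Siegmund's closed-form approximation gives the in-control ARL of a one-sided CUSUM on N(0,1) data. For a symmetric two-sided scheme the sides combine as 1/ARL = 1/ARL+ + 1/ARL-. The sketch below is an approximation, not a replacement for simulation, and the function names are illustrative.

```python
import numpy as np
from scipy.optimize import brentq

def siegmund_arl_one_sided(h, k, delta=0.0):
    """Siegmund's approximation to the ARL of a one-sided CUSUM.

    delta is the true mean shift in sigma units (0 = in control).
    """
    d = delta - k                 # drift of the CUSUM increments
    b = h + 1.166                 # overshoot correction
    if abs(d) < 1e-12:
        return b**2               # limiting value as d -> 0
    return (np.exp(-2 * d * b) + 2 * d * b - 1) / (2 * d**2)

def siegmund_arl_two_sided(h, k):
    """In-control ARL of a symmetric two-sided CUSUM: 1/ARL = 1/ARL+ + 1/ARL-."""
    return siegmund_arl_one_sided(h, k, 0.0) / 2.0

for h in (4.0, 5.0):
    print(f"h={h:.1f}, k=0.5 -> approx in-control ARL {siegmund_arl_two_sided(h, 0.5):.0f}")

# Solve for the threshold that gives an in-control ARL of 500 at k = 0.5
h_star = brentq(lambda h: siegmund_arl_two_sided(h, 0.5) - 500, 2.0, 8.0)
print(f"approximate h for ARL 500: {h_star:.2f}")
```

If the simulated h lands far from this value, the usual suspects are a coarse grid or censoring of long runs in the simulation.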

In [None]:
# ---- 4a: Calibrate h to achieve target ARL ----
K_OPT      = 0.50    # allowance: optimal for detecting ~1 sigma mean shifts
TARGET_ARL = 500     # ~2 years between false alarms

print(f"Calibrating h for ARL target = {TARGET_ARL} days  (k={K_OPT})")
print("Running Monte Carlo simulation...")

h_grid = np.linspace(2.0, 8.0, 20)
arls = []
for h_val in h_grid:
    a = compute_arl(h_val, K_OPT, n_sim=2000, n_obs=3000)
    arls.append(a)
    if a >= TARGET_ARL * 2.5: break

arls_arr = np.array(arls); h_grid_c = h_grid[:len(arls)]
best_idx = np.argmin(np.abs(arls_arr - TARGET_ARL))
H_OPT = float(h_grid_c[best_idx])
achieved = compute_arl(H_OPT, K_OPT, n_sim=3000, n_obs=3000)  # same horizon as the grid search
print(f"Optimal h = {H_OPT:.2f}  (achieved ARL = {achieved:.0f} days)")

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(h_grid_c, arls_arr, color=ACCENT, marker="o", ms=5)
ax.axhline(TARGET_ARL, color=YELLOW, ls="--", lw=1.5, label=f"Target ARL={TARGET_ARL}")
ax.axvline(H_OPT, color=RED, ls="--", lw=1.5, label=f"h={H_OPT:.2f}")
ax.set_xlabel("Threshold h"); ax.set_ylabel("Average Run Length (days)")
ax.set_title("ARL Calibration Curve  (k=0.5)"); ax.legend(); ax.grid(True, alpha=0.4)
plt.tight_layout(); plt.show()


In [None]:
# ---- 4b: Calibrate cooldown from in-control inter-signal gap distribution ----
def measure_gaps(h, k, n_sim=400, n_obs=5000, seed=1):
    rng = np.random.default_rng(seed); gaps = []
    for _ in range(n_sim):
        z = rng.standard_normal(n_obs)
        res = run_cusum(z, h=h, k=k, cooldown=0)
        idx = np.where(res["signals"] != 0)[0]
        if len(idx) > 1: gaps.extend(np.diff(idx).tolist())
    return np.array(gaps)

print("Measuring inter-signal gap distribution on in-control data...")
gaps_ic = measure_gaps(H_OPT, K_OPT)

if len(gaps_ic) > 10:
    COOLDOWN = max(int(np.percentile(gaps_ic, 25)), 10)
    print(f"Gap stats: mean={gaps_ic.mean():.1f}d, P25={np.percentile(gaps_ic,25):.1f}d, "
          f"P50={np.median(gaps_ic):.1f}d")
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.hist(gaps_ic, bins=40, color=PURPLE, alpha=0.7, edgecolor="none")
    ax.axvline(COOLDOWN, color=RED, ls="--", lw=2, label=f"Cooldown={COOLDOWN}d (P25)")
    ax.axvline(np.median(gaps_ic), color=YELLOW, ls=":", lw=1.5,
               label=f"Median={np.median(gaps_ic):.0f}d")
    ax.set_xlabel("Days Between Consecutive Signals"); ax.set_ylabel("Frequency")
    ax.set_title("Inter-Signal Gap Distribution (In-Control)"); ax.legend()
    ax.grid(True, alpha=0.4); plt.tight_layout(); plt.show()
else:
    COOLDOWN = 20
    print("Few gaps found; defaulting cooldown=20")

print(f"\nFinal parameters: k={K_OPT}, h={H_OPT:.2f}, cooldown={COOLDOWN} days")


## 5. Run CUSUM on Full History

In [None]:
cusum_result = run_cusum(z_resid, h=H_OPT, k=K_OPT, cooldown=COOLDOWN)
S_pos    = pd.Series(cusum_result["S_pos"],   index=returns.index)
S_neg    = pd.Series(cusum_result["S_neg"],   index=returns.index)
signals  = pd.Series(cusum_result["signals"], index=returns.index)

up_signals   = signals[signals ==  1].index
down_signals = signals[signals == -1].index
all_signals  = signals[signals !=  0].index

print(f"Total signals : {(signals!=0).sum()}")
print(f"  Up   (bull) : {(signals==1).sum()}")
print(f"  Down (bear) : {(signals==-1).sum()}")
print("\nSignal dates:")
for dt in all_signals:
    d = signals[dt]
    regime = true_regimes.loc[dt] if true_regimes is not None else "N/A"
    print(f"  {dt.date()}  {'UP   (+)' if d==1 else 'DOWN (-)'}  [{regime}]")


In [None]:
fig = plt.figure(figsize=(18, 14))
gs  = gridspec.GridSpec(4, 1, hspace=0.06, height_ratios=[3,1.5,1.5,1.5])
ax1 = fig.add_subplot(gs[0])
ax2 = fig.add_subplot(gs[1], sharex=ax1)
ax3 = fig.add_subplot(gs[2], sharex=ax1)
ax4 = fig.add_subplot(gs[3], sharex=ax1)

# Panel 1: Price + signal overlays
ax1.plot(price_index.index, price_index.values, color="#888", lw=1.2, zorder=2)
ax1.set_ylabel("Price (log)"); ax1.set_yscale("log")
ax1.set_title(f"CUSUM Regime Detection (h={H_OPT:.2f}, k={K_OPT}, cooldown={COOLDOWN}d)", fontsize=13)
ax1.grid(True, alpha=0.3, zorder=1)

# Shade alternating regimes by signal
sig_list = [(dt, signals[dt]) for dt in all_signals]
if sig_list:
    cur_col = "#ffffff08"; prev_dt = returns.index[0]
    for sig_dt, sig_dir in sig_list:
        ax1.axvspan(prev_dt, sig_dt, color=cur_col, lw=0, zorder=0)
        cur_col = "#00d4aa18" if sig_dir==1 else "#ff4d6d18"
        prev_dt = sig_dt
    ax1.axvspan(prev_dt, returns.index[-1], color=cur_col, lw=0, zorder=0)

for dt in up_signals:
    ax1.axvline(dt, color=ACCENT, lw=1.2, alpha=0.9, zorder=3)
    ax1.annotate("U", xy=(dt, price_index.loc[dt]), color=ACCENT, fontsize=9,
                 ha="center", va="bottom", xytext=(0,4), textcoords="offset points")
for dt in down_signals:
    ax1.axvline(dt, color=RED, lw=1.2, alpha=0.9, zorder=3)
    ax1.annotate("D", xy=(dt, price_index.loc[dt]), color=RED, fontsize=9,
                 ha="center", va="top", xytext=(0,-4), textcoords="offset points")

legend_elems = [
    mpatches.Patch(color=ACCENT, alpha=0.7, label="Up signal (mean increase)"),
    mpatches.Patch(color=RED,    alpha=0.7, label="Down signal (mean decrease)"),
]
ax1.legend(handles=legend_elems, loc="upper left")

# Panel 2: GARCH residuals
ax2.fill_between(z_s.index, z_s, 0, where=z_s>=0, color=ACCENT, alpha=0.5, lw=0)
ax2.fill_between(z_s.index, z_s, 0, where=z_s<0,  color=RED,   alpha=0.5, lw=0)
ax2.axhline(0, color="white", lw=0.5); ax2.set_ylabel("z (GARCH)"); ax2.grid(True, alpha=0.3)

# Panel 3: S+
ax3.plot(S_pos.index, S_pos.values, color=ACCENT, lw=1.2, label="S+ (up detector)")
ax3.axhline(H_OPT, color=YELLOW, ls="--", lw=1.5, label=f"h={H_OPT:.2f}")
ax3.fill_between(S_pos.index, S_pos.values, 0, alpha=0.2, color=ACCENT)
ax3.set_ylabel("S+"); ax3.legend(loc="upper right", fontsize=8); ax3.grid(True, alpha=0.3)

# Panel 4: S-
ax4.plot(S_neg.index, S_neg.values, color=RED, lw=1.2, label="S- (down detector)")
ax4.axhline(H_OPT, color=YELLOW, ls="--", lw=1.5, label=f"h={H_OPT:.2f}")
ax4.fill_between(S_neg.index, S_neg.values, 0, alpha=0.2, color=RED)
ax4.set_ylabel("S-"); ax4.legend(loc="upper right", fontsize=8); ax4.grid(True, alpha=0.3)

plt.setp(ax1.get_xticklabels(), visible=False)
plt.setp(ax2.get_xticklabels(), visible=False)
plt.setp(ax3.get_xticklabels(), visible=False)
plt.savefig("cusum_dashboard.png", dpi=150, bbox_inches="tight"); plt.show()


## 6. Backtest

**Strategy**: Regime-switching between full equity and cash.
- Down signal → exit to cash (0% equity)
- Up signal → re-enter market (100% equity)
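- A signal on day *t* changes the position applied to the return of day *t+1* (1-day lag to avoid look-ahead bias). With close-to-close returns this amounts to trading at roughly the signal day's close, not at the next open.

We also test with a 10bps transaction cost charged per trade (one-way).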
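
The one-day lag is easiest to see on a toy series. In this sketch the four returns and signals are made up, and the variable names are illustrative, not taken from the notebook.

```python
import pandas as pd

rets = pd.Series([0.010, -0.020, 0.030, -0.010])
sigs = pd.Series([0, -1, 0, 1])              # down signal on day 1, up signal on day 3

# Target position implied by the signals seen so far (start fully invested)
target = sigs.map({-1: 0.0, 1: 1.0}).ffill().fillna(1.0)
# Position actually held: yesterday's target, so a signal never trades its own day
held = target.shift(1).fillna(1.0)
strat = held * rets

print(pd.DataFrame({"ret": rets, "signal": sigs, "target": target,
                    "held": held, "strat": strat}))
```

The strategy still takes the -2% loss on day 1, the day the down signal fires, and only sits in cash from day 2. Dropping the `shift(1)` would avoid that loss on paper, which is exactly the look-ahead the lag is there to prevent.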


In [None]:
def backtest(returns, signals, tc=0.0):
    """Regime-switching backtest. tc = one-way cost per position change (e.g. 0.001 = 10bps)."""
    pos = 1.0; positions = np.ones(len(returns))
    for t in range(len(returns)):
        if   signals.iloc[t] == -1: pos = 0.0
        elif signals.iloc[t] ==  1: pos = 1.0
        positions[t] = pos
    # Shift by one day so a signal at t only affects the return at t+1 (no look-ahead)
    pos_s = pd.Series(positions, index=returns.index).shift(1).fillna(1.0)
    # Count trades and charge costs only when the held position actually changes
    # (a repeated signal in the same direction is not a trade)
    changed = pos_s.diff().abs().fillna(0.0)
    trades = int((changed > 0).sum())
    strat_rets = pos_s * returns - changed * tc
    return strat_rets, pos_s, trades

def perf_metrics(r, ann=252):
    r = np.asarray(r)
    ann_ret = r.mean() * ann; ann_vol = r.std() * np.sqrt(ann)
    sharpe  = r.mean() / r.std() * np.sqrt(ann) if r.std() > 0 else 0
    cum = np.exp(np.cumsum(r)); peak = np.maximum.accumulate(cum)
    dd = (cum-peak)/peak; max_dd = dd.min()
    calmar = ann_ret/abs(max_dd) if max_dd != 0 else np.nan
    win_rate = (r > 0).mean()
    return {"Ann.Return":ann_ret,"Ann.Vol":ann_vol,"Sharpe":sharpe,
            "MaxDD":max_dd,"Calmar":calmar,"WinRate":win_rate}

strat_rets, position, n_trades   = backtest(returns, signals, tc=0.0)
strat_rets_tc, _, _              = backtest(returns, signals, tc=0.001)
m_bh     = perf_metrics(returns.values)
m_strat  = perf_metrics(strat_rets.values)
m_strat_tc = perf_metrics(strat_rets_tc.values)

print("=== Performance Comparison ===")
hdr = f"{'Strategy':<22} {'Ann.Ret':>9} {'Ann.Vol':>9} {'Sharpe':>8} {'MaxDD':>9} {'Calmar':>8} {'WinRate':>8}"
print(hdr); print("-"*75)
for name, m in [("Buy & Hold",m_bh),("CUSUM (no TC)",m_strat),("CUSUM (10bps TC)",m_strat_tc)]:
    print(f"{name:<22} {m['Ann.Return']:>9.2%} {m['Ann.Vol']:>9.2%} "
          f"{m['Sharpe']:>8.3f} {m['MaxDD']:>9.2%} {m['Calmar']:>8.3f} {m['WinRate']:>8.2%}")
print(f"\nTotal trades executed: {n_trades}")


In [None]:
# Signal quality vs true bear regimes
BEAR_REGIMES = {"Dot-com bust","GFC crash","COVID crash","Rate hike bear"}
true_bear     = true_regimes.isin(BEAR_REGIMES).astype(int)
inferred_bear = (position < 0.5).astype(int)
tb = true_bear.values; ib = inferred_bear.values

TP = int(((ib==1)&(tb==1)).sum()); TN = int(((ib==0)&(tb==0)).sum())
FP = int(((ib==1)&(tb==0)).sum()); FN = int(((ib==0)&(tb==1)).sum())
precision   = TP/(TP+FP+1e-9); recall    = TP/(TP+FN+1e-9)
f1          = 2*precision*recall/(precision+recall+1e-9)
specificity = TN/(TN+FP+1e-9); accuracy  = (TP+TN)/len(tb)

print("=== Signal Quality vs True Regimes ===")
print(f"  Confusion matrix:  TP={TP:5d}  FN={FN:5d}")
print(f"                     FP={FP:5d}  TN={TN:5d}")
print(f"  Precision   : {precision:.3f}  (detected-bear days that were truly bear)")
print(f"  Recall      : {recall:.3f}  (true-bear days that were captured)")
print(f"  F1 Score    : {f1:.3f}")
print(f"  Specificity : {specificity:.3f}")
print(f"  Accuracy    : {accuracy:.3f}")

print("\n=== Signal Lag Analysis ===")
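# diff() is NaN on the first day, so seed it with the initial state to catch a series that starts in a bear regime
bear_starts = true_bear.index[true_bear.diff().fillna(float(true_bear.iloc[0])) == 1]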
lags = []
for bs in bear_starts:
    reg = true_regimes.loc[bs]
    fd  = down_signals[down_signals >= bs]
    if len(fd) > 0:
        lag = (fd[0]-bs).days; lags.append(lag)
        print(f"  {reg:<22} {bs.date()} -> signal {fd[0].date()} (lag {lag}d)")
    else:
        print(f"  {reg:<22} {bs.date()} -> NO signal detected")
if lags:
    print(f"\nMean detection lag: {np.mean(lags):.1f} days")


In [None]:
def drawdown(cum): peak = cum.cummax(); return (cum-peak)/peak
cum_bh    = np.exp(returns.cumsum())
cum_strat = np.exp(strat_rets.cumsum())
dd_bh = drawdown(cum_bh); dd_strat = drawdown(cum_strat)

fig, axes = plt.subplots(3,1,figsize=(16,12),sharex=True,
                          gridspec_kw={"height_ratios":[3,1.5,1.5]})

ax = axes[0]
ax.plot(cum_bh.index, cum_bh.values, color="#888", lw=1.2, label="Buy & Hold")
ax.plot(cum_strat.index, cum_strat.values, color=ACCENT, lw=1.5, label="CUSUM Strategy")
ax.set_yscale("log"); ax.set_ylabel("Cumulative Return (log)")
ax.set_title("Backtest: CUSUM Regime-Switching vs Buy & Hold", fontsize=13)
ax.grid(True, alpha=0.3)
# Shade bear periods
in_bear=False; bear_start=None
for dt in position.index:
    if position[dt]<0.5 and not in_bear: in_bear=True; bear_start=dt
    elif position[dt]>=0.5 and in_bear:
        ax.axvspan(bear_start,dt,color="#ff4d6d20",lw=0); in_bear=False
if in_bear: ax.axvspan(bear_start,position.index[-1],color="#ff4d6d20",lw=0)
for dt in down_signals: ax.axvline(dt,color=RED,lw=0.8,alpha=0.6)
for dt in up_signals:   ax.axvline(dt,color=ACCENT,lw=0.8,alpha=0.6)
patch = mpatches.Patch(color="#ff4d6d50", label="CUSUM bear period")
ax.legend(handles=[ax.get_lines()[0],ax.get_lines()[1],patch],loc="upper left")

ax = axes[1]
ax.fill_between(dd_bh.index, dd_bh*100, 0, color="#666", alpha=0.5, label="B&H")
ax.fill_between(dd_strat.index, dd_strat*100, 0, color=ACCENT, alpha=0.5, label="CUSUM")
ax.set_ylabel("Drawdown %"); ax.set_title("Drawdown Comparison")
ax.legend(loc="lower left",fontsize=9); ax.grid(True, alpha=0.3)

ax = axes[2]
ax.fill_between(position.index, position.values, 0,
    where=position.values>=0.5, color=ACCENT, alpha=0.6, label="Long (equity)")
ax.fill_between(position.index, position.values, 0,
    where=position.values<0.5, color="#555", alpha=0.5, label="Cash")
ax.set_ylim(-0.1,1.2); ax.set_ylabel("Position"); ax.set_title("CUSUM-Derived Position")
ax.legend(loc="lower right",fontsize=9); ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig("backtest.png", dpi=150, bbox_inches="tight"); plt.show()


In [None]:
fig, axes = plt.subplots(2,1,figsize=(16,6),sharex=True)
ax = axes[0]
ax.fill_between(true_bear.index, true_bear.values, 0, color=RED, alpha=0.7, step="post", label="Bear")
ax.fill_between(true_bear.index, 1-true_bear.values, 0, color=ACCENT, alpha=0.4, step="post", label="Bull")
ax.set_ylabel("Regime"); ax.set_title("True Regime Labels (from simulation)")
ax.set_ylim(-0.05,1.1); ax.legend(loc="upper right"); ax.grid(True, alpha=0.3)

ax = axes[1]
ax.fill_between(inferred_bear.index, inferred_bear.values, 0, color=RED, alpha=0.7, step="post", label="CUSUM: Cash")
ax.fill_between(inferred_bear.index, 1-inferred_bear.values, 0, color=ACCENT, alpha=0.4, step="post", label="CUSUM: Long")
ax.set_ylabel("Regime"); ax.set_title("CUSUM-Inferred Regime")
ax.set_ylim(-0.05,1.1); ax.legend(loc="upper right"); ax.grid(True, alpha=0.3)

plt.suptitle(f"Regime Alignment  |  Precision={precision:.2f}  Recall={recall:.2f}  F1={f1:.2f}", fontsize=12)
plt.tight_layout()
plt.savefig("regime_alignment.png", dpi=150, bbox_inches="tight"); plt.show()


## 7. Sensitivity Analysis — Parameter Grid

In [None]:
h_vals = np.linspace(2.0, 7.0, 6)
k_vals = [0.25, 0.50, 0.75]
results = []
for h_v in h_vals:
    for k_v in k_vals:
        res   = run_cusum(z_resid, h=h_v, k=k_v, cooldown=COOLDOWN)
        sigs  = pd.Series(res["signals"], index=returns.index)
        sr, pos, nt = backtest(returns, sigs)
        ib_  = (pos<0.5).astype(int).values
        TP_  = ((ib_==1)&(tb==1)).sum(); FP_ = ((ib_==1)&(tb==0)).sum()
        FN_  = ((ib_==0)&(tb==1)).sum()
        p_   = TP_/(TP_+FP_+1e-9); r_ = TP_/(TP_+FN_+1e-9)
        f1_  = 2*p_*r_/(p_+r_+1e-9)
        m    = perf_metrics(sr.values)
        results.append({"h":h_v,"k":k_v,"F1":f1_,"Sharpe":m["Sharpe"],
                        "MaxDD":m["MaxDD"],"Signals":int((sigs!=0).sum())})

res_df = pd.DataFrame(results)
fig, axes = plt.subplots(1, 3, figsize=(16, 5))
for ax, metric, cm in zip(axes,["F1","Sharpe","MaxDD"],["viridis","RdYlGn","RdYlGn_r"]):
    pivot = res_df.pivot(index="h", columns="k", values=metric)
    im = ax.imshow(pivot.values, aspect="auto", cmap=cm, origin="lower",
                   extent=[k_vals[0]-0.13, k_vals[-1]+0.13, h_vals[0]-0.5, h_vals[-1]+0.5])
    plt.colorbar(im, ax=ax)
    for i, h_v in enumerate(pivot.index):
        for j, k_v in enumerate(pivot.columns):
            ax.text(k_v, h_v, f"{pivot.iloc[i,j]:.2f}", ha="center", va="center",
                    color="white", fontsize=8, fontweight="bold")
    ax.set_xlabel("k (allowance)"); ax.set_ylabel("h (threshold)")
    ax.set_title(f"{metric} vs (h, k)"); ax.set_xticks(k_vals)
plt.suptitle("Sensitivity Analysis", fontsize=13)
plt.tight_layout()
plt.savefig("sensitivity.png", dpi=150, bbox_inches="tight"); plt.show()

print("Top 5 by F1:")
print(res_df.sort_values("F1",ascending=False).head(5).to_string(index=False))
print("\nTop 5 by Sharpe:")
print(res_df.sort_values("Sharpe",ascending=False).head(5).to_string(index=False))


## 8. Rolling CUSUM — Adaptive Baseline

Re-standardise the GARCH residuals against a rolling 252-day window (about one trading year).
This makes the detector **local**: it asks whether the recent mean has shifted
relative to the near past, which is more robust to secular drift over multi-decade periods.
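
A minimal sketch of the re-standardisation on synthetic data follows. A constant +0.3 offset stands in for slow secular drift; the series and variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic residuals with a constant +0.3 offset standing in for slow secular drift
z_drift = pd.Series(rng.standard_normal(3000) + 0.3)

window = 252
roll_mean = z_drift.rolling(window, min_periods=60).mean()
roll_std = z_drift.rolling(window, min_periods=60).std()
z_local = ((z_drift - roll_mean) / roll_std).dropna()   # first 59 rows have no baseline yet

print(f"global mean of z       : {z_drift.mean():.3f}")
print(f"mean after re-centring : {z_local.mean():.3f}")
print(f"std after re-scaling   : {z_local.std():.3f}")
```

The trade-off is that any shift lasting longer than the window is absorbed into the baseline, so the rolling detector reacts to transitions but not to a regime that has already persisted for a year.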


In [None]:
WINDOW = 252
z_roll_mean = z_s.rolling(WINDOW, min_periods=60).mean().fillna(0)
z_roll_std  = z_s.rolling(WINDOW, min_periods=60).std().fillna(1).clip(lower=0.1)
z_adaptive  = ((z_s - z_roll_mean) / z_roll_std).fillna(0)

rc = run_cusum(z_adaptive.values, h=H_OPT, k=K_OPT, cooldown=COOLDOWN)
rc_signals = pd.Series(rc["signals"], index=returns.index)
rc_sr, rc_pos, rc_nt = backtest(returns, rc_signals)
m_rc = perf_metrics(rc_sr.values)
cum_rc = np.exp(rc_sr.cumsum())

fig, ax = plt.subplots(figsize=(16,5))
ax.plot(cum_bh.index,    cum_bh.values,    color="#666",  lw=1.2, label="Buy & Hold")
ax.plot(cum_strat.index, cum_strat.values, color=ACCENT,  lw=1.5, ls="--", label="Global CUSUM")
ax.plot(cum_rc.index,    cum_rc.values,    color=PURPLE,  lw=1.5, label=f"Rolling CUSUM ({WINDOW}d)")
ax.set_yscale("log")
rc_down = rc_signals[rc_signals==-1].index
for dt in rc_down: ax.axvline(dt, color=PURPLE, lw=0.7, alpha=0.5)
ax.set_title("Rolling CUSUM vs Global CUSUM vs Buy & Hold")
ax.legend(); ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig("rolling_cusum.png", dpi=150, bbox_inches="tight"); plt.show()

print(f"Rolling CUSUM: {(rc_signals!=0).sum()} signals, {rc_nt} trades")
print(f"  Sharpe: {m_rc['Sharpe']:.3f}  MaxDD: {m_rc['MaxDD']:.2%}")


## 9. Final Summary Report

In [None]:
print("="*65)
print("  CUSUM REGIME DETECTION - FINAL SUMMARY")
print("="*65)
print(f"\nDATA")
print(f"  Period         : {returns.index[0].date()} to {returns.index[-1].date()}")
print(f"  Observations   : {len(returns):,}")
gp = garch_params
print(f"\nGARCH(1,1)")
print(f"  alpha          : {gp['alpha']:.4f}  (ARCH effect)")
print(f"  beta           : {gp['beta']:.4f}  (GARCH persistence)")
print(f"  alpha+beta     : {gp['alpha']+gp['beta']:.4f}")
print(f"\nCUSUM CALIBRATION")
print(f"  k (allowance)  : {K_OPT}")
print(f"  h (threshold)  : {H_OPT:.2f}  (target ARL = {TARGET_ARL} days)")
print(f"  cooldown       : {COOLDOWN} days")
print(f"\nSIGNALS (Global CUSUM)")
print(f"  Total          : {(signals!=0).sum()}")
print(f"  Down (to cash) : {(signals==-1).sum()}")
print(f"  Up (to long)   : {(signals==1).sum()}")
print(f"\nREGIME DETECTION QUALITY")
print(f"  Precision      : {precision:.3f}")
print(f"  Recall         : {recall:.3f}")
print(f"  F1 Score       : {f1:.3f}")
print(f"  Accuracy       : {accuracy:.3f}")
print(f"\nBACKTEST PERFORMANCE")
hdr = f"  {'Strategy':<20} {'Ann.Ret':>9} {'Vol':>9} {'Sharpe':>8} {'MaxDD':>9} {'Calmar':>8}"
print(hdr); print("  "+"-"*63)
for name, m in [("Buy & Hold",m_bh),("Global CUSUM",m_strat),("Rolling CUSUM",m_rc)]:
    print(f"  {name:<20} {m['Ann.Return']:>9.2%} {m['Ann.Vol']:>9.2%} "
          f"{m['Sharpe']:>8.3f} {m['MaxDD']:>9.2%} {m['Calmar']:>8.3f}")
print("="*65)


## Appendix: Extensions & Real Data

```python
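# --- Plug in real S&P 500 ---
# pip install yfinance arch
import numpy as np
import yfinance as yf
from arch import arch_model

raw     = yf.download("^GSPC", start="2000-01-01", auto_adjust=True)
close   = raw["Close"].squeeze()        # newer yfinance may return a one-column DataFrame
returns = np.log(close / close.shift(1)).dropna()
# Fit on percent returns: arch warns about poorly scaled inputs, and std_resid is scale-free
am      = arch_model(100 * returns, vol="Garch", p=1, q=1, dist="t")  # t-dist for fat tails
res     = am.fit(disp="off")
z_resid = res.std_resid.values  # pass to run_cusum()
# Note: h was calibrated on N(0,1) draws; with fat-tailed residuals the realised ARL may differ

# --- Multi-asset portfolio ---
# Fit GARCH per asset -> z_i residuals
# Compute: z_portfolio = (weights @ Z_matrix) / sqrt(weights @ R @ weights),
#   where R is the correlation matrix of the residuals, so z_portfolio keeps unit variance
# Then: run_cusum(z_portfolio, ...)

# --- Variance regime detection ---
# Use squared residuals as CUSUM input to detect vol regime shifts.
# Under N(0,1), z^2 has mean 1 and variance 2, so divide by sqrt(2) to restore unit variance.
# The input is still right-skewed, so h should ideally be recalibrated for it.
z_var = (z_resid**2 - 1) / np.sqrt(2)
cusum_vol = run_cusum(z_var, h=H_OPT, k=K_OPT, cooldown=COOLDOWN)

# --- Add transaction costs ---
strat_rets_tc, _, _ = backtest(returns, signals, tc=0.001)  # 10bps per trade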
```

| Design Choice | Rationale |
|---|---|
| GARCH(1,1) filtering | Removes vol clustering; prevents false alarms in turbulent periods |
| k = 0.5 | Balanced detection of 1-sigma mean shifts |
| ARL-calibrated h | Principled false-alarm control (not arbitrary) |
| Cooldown = P25 gap | Blocks burst re-signalling; calibrated on in-control data |
| Reset after signal | Prevents old accumulation contaminating new regime detection |
| Rolling baseline option | Adapts to secular drift across multi-decade periods |
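
For the multi-asset extension, a weighted sum of standardised residuals no longer has unit variance, so a threshold calibrated on N(0,1) input would be mis-scaled. The sketch below shows one way to rescale, on synthetic residuals for three assets. The weights, correlation level and variable names are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, rho = 5000, 0.6
# Synthetic standardised residuals for 3 assets with pairwise correlation rho (assumed)
R_true = np.full((3, 3), rho) + (1 - rho) * np.eye(3)
Z = rng.standard_normal((n_obs, 3)) @ np.linalg.cholesky(R_true).T

w = np.array([0.5, 0.3, 0.2])               # hypothetical portfolio weights
R_hat = np.corrcoef(Z, rowvar=False)        # estimated residual correlation
z_raw = Z @ w                               # plain weighted average
z_port = z_raw / np.sqrt(w @ R_hat @ w)     # rescaled to roughly unit variance

print(f"std of plain weighted average : {z_raw.std():.3f}")
print(f"std after rescaling           : {z_port.std():.3f}")
```

The rescaled series can then be passed to `run_cusum` with the calibrated h and k. In live use the correlation matrix would need to be estimated on past data only.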
