# Walk-Forward Validation — BTC/USDT 1h (Full-Year 2024)

Single-period backtests are unreliable: BollingerMeanReversion scored Sharpe +2.0 in
Jan–Mar 2024 but Sharpe −0.76 in Jan–Jun 2024 (F4). Walk-forward validation evaluates
the strategy sequentially on non-overlapping out-of-sample (OOS) windows, giving a more
realistic estimate that is much harder to cherry-pick by regime.
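
To make the procedure concrete, here is a minimal sketch of how non-overlapping test windows and their rolling or anchored training windows could be laid out. `walk_forward_splits` and its `train_size` argument are hypothetical names used only for illustration; the repo's `walk_forward` (which takes `train_frac`) may size its windows differently.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_bars: int,
    n_splits: int,
    train_size: int,
    window_type: str = "rolling",
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges with non-overlapping test windows.

    Hypothetical helper for illustration; not the repo's `walk_forward`.
    """
    if window_type not in ("rolling", "anchored"):
        raise ValueError(f"unknown window_type: {window_type!r}")
    test_size = (n_bars - train_size) // n_splits
    if train_size <= 0 or test_size <= 0:
        raise ValueError("not enough bars for the requested splits")
    for k in range(n_splits):
        test_start = train_size + k * test_size
        test_end = test_start + test_size
        # Rolling: fixed-length train window that slides forward.
        # Anchored: train window always starts at bar 0 and grows.
        train_start = test_start - train_size if window_type == "rolling" else 0
        yield range(train_start, test_start), range(test_start, test_end)


rolling = list(walk_forward_splits(120, n_splits=5, train_size=20, window_type="rolling"))
anchored = list(walk_forward_splits(120, n_splits=5, train_size=20, window_type="anchored"))
for (r_train, r_test), (a_train, _) in zip(rolling, anchored):
    print(f"test {r_test.start:>3}-{r_test.stop - 1:<3}  "
          f"rolling train {r_train.start:>3}-{r_train.stop - 1:<3}  "
          f"anchored train {a_train.start:>3}-{a_train.stop - 1}")
```

With these toy numbers both window types share the same five test ranges; only the training range differs (for the last fold, bars 80-99 for rolling and bars 0-99 for anchored).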

## §1 — Config

In [1]:
import sys
from pathlib import Path

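# Ensure repo root is on the path. `__file__` is undefined in a notebook, so
# this assumes the notebook runs from a directory one level below the repo root.
repo_root = Path.cwd().parent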
if str(repo_root) not in sys.path:
    sys.path.insert(0, str(repo_root))

import warnings
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

plt.rcParams.update({
    "figure.dpi":      120,
    "axes.spines.top": False,
    "axes.spines.right": False,
})

# ── Walk-forward config ──────────────────────────────────────────────────────
SYMBOL      = "BTC/USDT"
TIMEFRAME   = "1h"
SINCE       = "2024-01-01"
UNTIL       = "2025-01-01"

N_SPLITS    = 5
TRAIN_FRAC  = 0.6

# Strategy params (fixed — optimisation is P7)
BB_PARAMS   = {"period": 20, "num_std": 2.0}
MA_PARAMS   = {"fast_period": 20, "slow_period": 50}

print(f"Config: {SYMBOL} {TIMEFRAME} | {SINCE} → {UNTIL}")
print(f"Walk-forward: N_SPLITS={N_SPLITS}, TRAIN_FRAC={TRAIN_FRAC}")

Config: BTC/USDT 1h | 2024-01-01 → 2025-01-01
Walk-forward: N_SPLITS=5, TRAIN_FRAC=0.6


## §2 — Fetch data

In [2]:
from data.fetch import fetch_ohlcv

df = fetch_ohlcv(since=SINCE, until=UNTIL)
print(f"Loaded {len(df):,} bars  |  {df.index[0]} → {df.index[-1]}")
df.tail(3)

Loaded 8,785 bars  |  2024-01-01 00:00:00+00:00 → 2025-01-01 00:00:00+00:00


| timestamp | open | high | low | close | volume |
|---|---|---|---|---|---|
| 2024-12-31 22:00:00+00:00 | 93892.6 | 93892.6 | 93379.6 | 93484.0 | 82.08504 |
| 2024-12-31 23:00:00+00:00 | 93484.0 | 93764.9 | 93380.0 | 93585.1 | 125.375445 |
| 2025-01-01 00:00:00+00:00 | 93585.0 | 94495.2 | 93500.0 | 94399.9 | 322.777877 |


## §3 — Walk-forward: BollingerMeanReversion (rolling)

In [3]:
from backtesting import walk_forward
from strategies.single import BollingerMeanReversion

wf_bmr = walk_forward(
    BollingerMeanReversion,
    df,
    BB_PARAMS,
    n_splits    = N_SPLITS,
    train_frac  = TRAIN_FRAC,
    window_type = "rolling",
)

pd.set_option("display.float_format", "{:.4f}".format)
print("BollingerMeanReversion — per-window OOS summary")
wf_bmr.summary_df

BollingerMeanReversion — per-window OOS summary


|   | test_start | test_end | n_bars | total_return | sharpe_ratio | sortino_ratio | max_drawdown | win_rate |
|---|---|---|---|---|---|---|---|---|
| 0 | 2024-03-02 00:00:00+00:00 | 2024-05-01 23:00:00+00:00 | 1464 | -0.1047 | -2.1467 | -0.855 | -0.1706 | 0.0663 |
| 1 | 2024-05-02 00:00:00+00:00 | 2024-07-01 23:00:00+00:00 | 1464 | 0.0083 | 0.3856 | 0.1555 | -0.0405 | 0.0718 |
| 2 | 2024-07-02 00:00:00+00:00 | 2024-08-31 23:00:00+00:00 | 1464 | -0.1716 | -4.7099 | -2.0045 | -0.1832 | 0.0574 |
| 3 | 2024-09-01 00:00:00+00:00 | 2024-10-31 23:00:00+00:00 | 1464 | 0.0149 | 0.6031 | 0.2632 | -0.0645 | 0.056 |
| 4 | 2024-11-01 00:00:00+00:00 | 2024-12-31 23:00:00+00:00 | 1464 | 0.0459 | 1.2382 | 0.5012 | -0.111 | 0.0725 |
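
The dispersion across windows can be summarised numerically. The sketch below hard-codes the five per-window Sharpe ratios from the table above instead of reading `wf_bmr.summary_df`, so it runs on its own; the choice of summary statistics is illustrative rather than part of the original analysis.

```python
import pandas as pd

# Per-window OOS Sharpe ratios copied from the summary table above.
sharpe = pd.Series(
    [-2.1467, 0.3856, -4.7099, 0.6031, 1.2382],
    index=[f"W{i}" for i in range(1, 6)],
    name="sharpe_ratio",
)

spread = {
    "min": sharpe.min(),
    "max": sharpe.max(),
    "mean": sharpe.mean(),
    "std": sharpe.std(),  # sample standard deviation (ddof=1)
    "share_positive": (sharpe > 0).mean(),
}
for name, value in spread.items():
    print(f"{name:<15}{value: .4f}")
```

Three of the five windows have a positive Sharpe ratio, yet the two losing windows are large enough to pull the simple average below zero. Note that an average of per-window Sharpe ratios is not the same quantity as the Sharpe ratio of the stitched OOS equity curve.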


## §4 — Per-window equity curves (small-multiple panel)

In [4]:
fig, axes = plt.subplots(1, N_SPLITS, figsize=(18, 3), sharey=False)
fig.suptitle("BollingerMeanReversion — OOS equity per window", fontsize=12, y=1.02)

for ax, wr in zip(axes, wf_bmr.windows):
    wr.equity.plot(ax=ax, color="steelblue", linewidth=1.2)
    sharpe = wr.metrics["sharpe_ratio"]
    ret    = wr.metrics["total_return"] * 100
    ax.set_title(
        f"W{wr.window_idx+1}\n"
        f"{wr.test_start.strftime('%b %d')} – {wr.test_end.strftime('%b %d')}",
        fontsize=8,
    )
    ax.set_xlabel("")
    ax.tick_params(axis="x", labelsize=6, rotation=30)
    ax.tick_params(axis="y", labelsize=7)
    ax.text(
        0.05, 0.95,
        f"SR={sharpe:.2f}\nRet={ret:.1f}%",
        transform=ax.transAxes,
        fontsize=7,
        va="top",
        color="darkslategray",
    )
    ax.axhline(1, color="gray", linewidth=0.7, linestyle="--")

plt.tight_layout()
plt.show()

## §5 — Per-window metrics bar chart

In [5]:
metrics_of_interest = ["total_return", "sharpe_ratio", "max_drawdown"]
labels = [f"W{wr.window_idx+1}" for wr in wf_bmr.windows]

fig, axes = plt.subplots(1, 3, figsize=(14, 4))
fig.suptitle("BollingerMeanReversion — per-window key metrics", fontsize=12)

for ax, metric in zip(axes, metrics_of_interest):
    values = [wr.metrics[metric] for wr in wf_bmr.windows]
    colors = ["#2ecc71" if v >= 0 else "#e74c3c" for v in values]
    ax.bar(labels, values, color=colors, edgecolor="white", linewidth=0.5)
    ax.axhline(0, color="black", linewidth=0.8)
    ax.set_title(metric.replace("_", " ").title(), fontsize=10)
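    # Bind `metric` as a default argument: a plain closure would only see the
    # last loop value by the time the formatter runs at draw time.
    ax.yaxis.set_major_formatter(mticker.FuncFormatter(
        lambda x, _, m=metric: f"{x*100:.1f}%" if m != "sharpe_ratio" else f"{x:.2f}"
    ))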

plt.tight_layout()
plt.show()

## §6 — Stitched OOS equity vs buy-and-hold

In [6]:
# Build buy-and-hold equity over the same OOS period
oos_start = wf_bmr.oos_equity.index[0]
oos_end   = wf_bmr.oos_equity.index[-1]
bah_slice = df.loc[oos_start:oos_end, "close"]
bah_equity = bah_slice / bah_slice.iloc[0]

fig, ax = plt.subplots(figsize=(14, 4))
wf_bmr.oos_equity.plot(ax=ax, label="BollingerMeanReversion OOS", color="steelblue", linewidth=1.5)
bah_equity.plot(ax=ax, label="Buy-and-Hold", color="darkorange", linewidth=1.2, linestyle="--")

# Mark window boundaries with dotted vertical lines
for wr in wf_bmr.windows:
    ax.axvline(wr.test_start, color="gray", linewidth=0.5, linestyle=":")

ax.axhline(1, color="black", linewidth=0.5, linestyle="-")
ax.set_title("Stitched OOS equity vs buy-and-hold (BollingerMeanReversion, rolling)", fontsize=11)
ax.set_ylabel("Equity (normalised)")
ax.legend()

oos_sr  = wf_bmr.oos_metrics["sharpe_ratio"]
oos_ret = wf_bmr.oos_metrics["total_return"] * 100
ax.text(
    0.01, 0.97,
    f"OOS Sharpe={oos_sr:.2f}  |  OOS Return={oos_ret:.1f}%",
    transform=ax.transAxes, fontsize=9, va="top", color="steelblue",
)
plt.tight_layout()
plt.show()

## §7 — Anchored vs rolling OOS equity comparison

In [7]:
wf_bmr_anchored = walk_forward(
    BollingerMeanReversion,
    df,
    BB_PARAMS,
    n_splits    = N_SPLITS,
    train_frac  = TRAIN_FRAC,
    window_type = "anchored",
)

fig, ax = plt.subplots(figsize=(14, 4))
wf_bmr.oos_equity.plot(
    ax=ax, label="Rolling window", color="steelblue", linewidth=1.5
)
wf_bmr_anchored.oos_equity.plot(
    ax=ax, label="Anchored window", color="mediumseagreen", linewidth=1.5, linestyle="--"
)
ax.axhline(1, color="black", linewidth=0.5)
ax.set_title("Anchored vs rolling — BollingerMeanReversion OOS equity", fontsize=11)
ax.set_ylabel("Equity (normalised)")
ax.legend()

for label, wfr in [("Rolling", wf_bmr), ("Anchored", wf_bmr_anchored)]:
    print(f"{label:<10}  Sharpe={wfr.oos_metrics['sharpe_ratio']:.3f}  "
          f"Return={wfr.oos_metrics['total_return']*100:.1f}%  "
          f"MaxDD={wfr.oos_metrics['max_drawdown']*100:.1f}%")

plt.tight_layout()
plt.show()

Rolling     Sharpe=-1.121  Return=-20.6%  MaxDD=-33.5%
Anchored    Sharpe=-1.121  Return=-20.6%  MaxDD=-33.5%


## §8 — Strategy comparison: MeanReversion vs Breakout (rolling WF)

In [8]:
from strategies.single import BollingerBreakout

wf_bbo = walk_forward(
    BollingerBreakout,
    df,
    BB_PARAMS,
    n_splits    = N_SPLITS,
    train_frac  = TRAIN_FRAC,
    window_type = "rolling",
)

# ── Side-by-side summary tables ─────────────────────────────────────────────
summary_merged = wf_bmr.summary_df[["test_start", "test_end", "sharpe_ratio", "total_return", "max_drawdown"]].copy()
summary_merged.columns = ["test_start", "test_end", "MR_sharpe", "MR_return", "MR_mdd"]
summary_merged["BO_sharpe"] = wf_bbo.summary_df["sharpe_ratio"].values
summary_merged["BO_return"] = wf_bbo.summary_df["total_return"].values
summary_merged["BO_mdd"]    = wf_bbo.summary_df["max_drawdown"].values
print("MeanReversion (MR) vs Breakout (BO) — rolling walk-forward")
display(summary_merged)

# ── Equity comparison plot ───────────────────────────────────────────────────
fig, ax = plt.subplots(figsize=(14, 4))
wf_bmr.oos_equity.plot(ax=ax, label="MeanReversion", color="steelblue",    linewidth=1.5)
wf_bbo.oos_equity.plot(ax=ax, label="Breakout",      color="darkorange",   linewidth=1.5)
bah_equity.plot(        ax=ax, label="Buy-and-Hold",  color="gray",         linewidth=1.0, linestyle=":")
ax.axhline(1, color="black", linewidth=0.5)
ax.set_title("MeanReversion vs Breakout — stitched OOS equity (rolling)", fontsize=11)
ax.set_ylabel("Equity (normalised)")
ax.legend()
plt.tight_layout()
plt.show()

# ── OOS metrics comparison ───────────────────────────────────────────────────
keys = ["total_return", "sharpe_ratio", "sortino_ratio", "max_drawdown", "win_rate"]
comp = pd.DataFrame({
    "MeanReversion": {k: wf_bmr.oos_metrics[k] for k in keys},
    "Breakout":      {k: wf_bbo.oos_metrics[k]  for k in keys},
})
print("\nOOS aggregate metrics")
display(comp)

MeanReversion (MR) vs Breakout (BO) — rolling walk-forward


|   | test_start | test_end | MR_sharpe | MR_return | MR_mdd | BO_sharpe | BO_return | BO_mdd |
|---|---|---|---|---|---|---|---|---|
| 0 | 2024-03-02 00:00:00+00:00 | 2024-05-01 23:00:00+00:00 | -2.1467 | -0.1047 | -0.1706 | 2.1467 | 0.1015 | -0.067 |
| 1 | 2024-05-02 00:00:00+00:00 | 2024-07-01 23:00:00+00:00 | 0.3856 | 0.0083 | -0.0405 | -0.3856 | -0.0125 | -0.0431 |
| 2 | 2024-07-02 00:00:00+00:00 | 2024-08-31 23:00:00+00:00 | -4.7099 | -0.1716 | -0.1832 | 4.7099 | 0.1962 | -0.0527 |
| 3 | 2024-09-01 00:00:00+00:00 | 2024-10-31 23:00:00+00:00 | 0.6031 | 0.0149 | -0.0645 | -0.6031 | -0.0195 | -0.0786 |
| 4 | 2024-11-01 00:00:00+00:00 | 2024-12-31 23:00:00+00:00 | 1.2382 | 0.0459 | -0.111 | -1.2382 | -0.0531 | -0.1657 |



OOS aggregate metrics


|   | MeanReversion | Breakout |
|---|---|---|
| total_return | -0.2063 | 0.2081 |
| sharpe_ratio | -1.1209 | 1.1209 |
| sortino_ratio | -0.449 | 0.6637 |
| max_drawdown | -0.3349 | -0.1657 |
| win_rate | 0.0648 | 0.0545 |
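
The aggregate total returns can be cross-checked against the per-window figures. The sketch below assumes the stitched curve starts each window's equity where the previous window ended, so that window returns compound multiplicatively. That is an inference from the reported numbers, not a confirmed description of how `walk_forward` builds `oos_equity`.

```python
import math

# Per-window OOS total returns copied from the rolling walk-forward table.
window_returns = {
    "MeanReversion": [-0.1047, 0.0083, -0.1716, 0.0149, 0.0459],
    "Breakout": [0.1015, -0.0125, 0.1962, -0.0195, -0.0531],
}
# Aggregate OOS total returns as reported in the metrics table.
reported = {"MeanReversion": -0.2063, "Breakout": 0.2081}


def compound(returns):
    """Chain simple returns multiplicatively: prod(1 + r) - 1."""
    return math.prod(1.0 + r for r in returns) - 1.0


for name, rets in window_returns.items():
    total = compound(rets)
    print(f"{name:<14} compounded={total:+.4f}  reported={reported[name]:+.4f}")
```

Both compounded totals agree with the reported aggregates to within the rounding of the four-decimal inputs, which is consistent with the chaining assumption.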


## §9 — Conclusions

**Key findings from walk-forward validation (2024 BTC/USDT 1h, N=5, rolling):**

1. **Single-period results are unreliable.** Per-window Sharpe ratios for
   BollingerMeanReversion range from −4.71 (W3) to +1.24 (W5), which confirms
   finding F4: a single backtest period can dramatically over- or under-state
   true strategy performance.

2. **Walk-forward gives a regime-robust estimate.** The stitched equity curve joins
   N non-overlapping OOS windows, so it spans multiple market regimes (trending,
   ranging, volatile) and is much harder to cherry-pick.

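3. **Rolling vs anchored** give identical OOS metrics here (Sharpe −1.121, return
   −20.6%, max drawdown −33.5%), which is consistent with fixed-parameter strategies
   not using the training window; the difference matters when parameters are
   re-optimised per fold (P7).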

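4. **Strategy comparison** is now apples-to-apples: both strategies are evaluated
   on the same OOS windows under identical conditions. Over the stitched OOS period,
   Breakout returned +20.8% (Sharpe 1.12) against −20.6% (Sharpe −1.12) for
   MeanReversion.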

**Next steps:**
- P5: Regime-aware advanced strategy (combine BB width + ADX, switch between MR/Breakout)
- P6: Per-trade metrics (profit factor, trade count, avg duration)
- P7: Optuna parameter optimisation wired to the `optimize_fn` hook in `walk_forward()`