# Walk-Forward Validation

Learn how Mantis helps you avoid overfitting with robust validation.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/johan-gras/mantis/blob/main/notebooks/validation.ipynb)

In [None]:
!pip install mantis-bt -q

In [None]:
import mantis as mt
import numpy as np

## Why Validation Matters

Every backtest is an in-sample test. Without validation, you can't distinguish between:
- A robust strategy that captures real market inefficiencies
- An overfit strategy that memorized historical noise

Mantis provides first-class validation tools to help you make this distinction.

## Load Data

In [None]:
data = mt.load_sample("SPY")
print(f"Loaded {len(data['bars'])} bars")

## Walk-Forward Analysis

Walk-forward validation splits your data into multiple folds:
- **In-sample (IS)**: Used to train/optimize the strategy
- **Out-of-sample (OOS)**: Used to test the strategy on unseen data

Mantis uses 12 folds by default.
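To make the fold layout concrete, here is a minimal sketch of how in-sample and out-of-sample index ranges could be generated. It is not Mantis's implementation: the helper name `walk_forward_splits` and the equal-sized block scheme are assumptions chosen for illustration, and Mantis may size its windows differently.

```python
from typing import List, Tuple


def walk_forward_splits(
    n_bars: int, folds: int, anchored: bool = False
) -> List[Tuple[range, range]]:
    """Return (in_sample, out_of_sample) index ranges for each fold.

    Hypothetical scheme: the data is cut into folds + 1 equal blocks.
    Fold k tests on block k and trains on the block just before it
    (rolling) or on everything before it (anchored).
    """
    if folds < 1 or n_bars < folds + 1:
        raise ValueError("need at least folds + 1 bars")
    block = n_bars // (folds + 1)
    splits = []
    for k in range(1, folds + 1):
        is_start = 0 if anchored else (k - 1) * block
        is_end = k * block
        # The last fold absorbs any leftover bars.
        oos_end = (k + 1) * block if k < folds else n_bars
        splits.append((range(is_start, is_end), range(is_end, oos_end)))
    return splits


for in_sample, out_of_sample in walk_forward_splits(130, folds=3):
    print(f"IS {in_sample.start}-{in_sample.stop - 1}  "
          f"OOS {out_of_sample.start}-{out_of_sample.stop - 1}")
```

In every fold the out-of-sample range starts exactly where the in-sample range ends, so the test data always lies strictly after the training data.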

In [None]:
# Run backtest
results = mt.backtest(data, strategy="sma-crossover")

# Validate with walk-forward analysis
validation = results.validate()
print(validation)

## Understanding the Verdict

Mantis classifies strategies into three categories:

| Verdict | OOS/IS Ratio | Interpretation |
|---------|-------------|----------------|
| `robust` | >= 80% | Strategy performs well out-of-sample |
| `borderline` | >= 60% and < 80% | Strategy shows some OOS degradation |
| `likely_overfit` | < 60% | Strategy likely overfit to historical data |
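The thresholds in the table can be expressed as a small function. This is a sketch only: `efficiency_ratio` and `classify` are hypothetical helpers, and computing the ratio as mean OOS score over mean IS score is an assumption, since the exact metric Mantis uses is not stated here.

```python
from statistics import mean
from typing import Sequence


def efficiency_ratio(is_scores: Sequence[float], oos_scores: Sequence[float]) -> float:
    """Assumed definition: mean out-of-sample score / mean in-sample score."""
    is_mean = mean(is_scores)
    if is_mean <= 0:
        raise ValueError("ratio is only meaningful for a positive in-sample score")
    return mean(oos_scores) / is_mean


def classify(ratio: float) -> str:
    """Map an OOS/IS ratio to the verdict labels from the table."""
    if ratio >= 0.80:
        return "robust"
    if ratio >= 0.60:
        return "borderline"
    return "likely_overfit"


# Made-up per-fold scores, for illustration only.
ratio = efficiency_ratio(is_scores=[1.2, 1.0, 1.4], oos_scores=[0.9, 0.7, 1.0])
print(f"{ratio:.1%} -> {classify(ratio)}")
```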

In [None]:
print(f"Verdict: {validation.verdict}")
print(f"OOS/IS efficiency: {validation.efficiency_ratio:.1%}")

## Validation Warnings

Mantis automatically flags suspicious metrics that often indicate overfitting.

In [None]:
warnings = validation.warnings()
if warnings:
    print("Warnings detected:")
    for w in warnings:
        print(f"  - {w}")
else:
    print("No warnings raised.")

## Configuring Validation

You can customize the validation parameters.

In [None]:
# More folds for more thorough validation
validation_strict = results.validate(folds=20)
print(f"20-fold verdict: {validation_strict.verdict}")

In [None]:
# Anchored windows (expanding in-sample period)
validation_anchored = results.validate(anchored=True)
print(f"Anchored verdict: {validation_anchored.verdict}")

## Deflated Sharpe Ratio

When you test many strategies, some will appear profitable by chance. The Deflated Sharpe Ratio adjusts for multiple testing.

Pass the `trials` parameter to account for how many strategies you've tested.
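For intuition about why more trials lower the score, the sketch below follows the published Deflated Sharpe Ratio formula (Bailey and López de Prado). It is not taken from Mantis: the function names, the per-period (non-annualized) Sharpe convention, and the `sharpe_std` input (the standard deviation of Sharpe ratios across the strategies tried) are assumptions, and Mantis may estimate these quantities differently.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(trials: int, sharpe_std: float) -> float:
    """Approximate expected best Sharpe among `trials` strategies with no skill."""
    if trials < 2:
        return 0.0  # a single trial needs no multiple-testing adjustment
    z = NormalDist()
    return sharpe_std * (
        (1 - EULER_GAMMA) * z.inv_cdf(1 - 1 / trials)
        + EULER_GAMMA * z.inv_cdf(1 - 1 / (trials * math.e))
    )


def deflated_sharpe(
    sharpe: float,
    n_obs: int,
    trials: int,
    sharpe_std: float,
    skew: float = 0.0,
    kurtosis: float = 3.0,
) -> float:
    """Probability that the observed Sharpe beats the expected best-of-N fluke."""
    benchmark = expected_max_sharpe(trials, sharpe_std)
    denom = math.sqrt(1 - skew * sharpe + (kurtosis - 1) / 4 * sharpe ** 2)
    return NormalDist().cdf((sharpe - benchmark) * math.sqrt(n_obs - 1) / denom)


# Made-up inputs: daily Sharpe of 0.10 over 1000 observations.
for n_trials in (1, 10, 100):
    dsr = deflated_sharpe(0.10, n_obs=1000, trials=n_trials, sharpe_std=0.05)
    print(f"{n_trials:>3} trials: {dsr:.3f}")
```

The benchmark grows with the number of trials, so the same observed Sharpe ratio earns a lower score as more strategies are tested.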

In [None]:
# Account for testing 10 strategies
validation_deflated = results.validate(trials=10)
print(f"Deflated Sharpe (10 trials): {validation_deflated.deflated_sharpe:.3f}")

In [None]:
# Account for testing 100 strategies
validation_100 = results.validate(trials=100)
print(f"Deflated Sharpe (100 trials): {validation_100.deflated_sharpe:.3f}")

## Visualize Validation Results

In [None]:
# Plot fold-by-fold performance
validation.plot()

## Parameter Sensitivity

A robust strategy should work across a range of parameter values, not just one "optimal" setting.
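One way to think about this is as a grid of parameter combinations, where a robust strategy scores well over a broad plateau instead of a single spike. The sketch below is independent of Mantis: `expand_grid` and `plateau_share` are hypothetical helpers, and the 20% tolerance is an arbitrary illustrative choice.

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_grid(params: Dict[str, Sequence[int]]) -> List[Dict[str, int]]:
    """List every combination of the given parameter values."""
    names = list(params)
    return [dict(zip(names, values)) for values in product(*(params[n] for n in names))]


def plateau_share(scores: Sequence[float], tolerance: float = 0.2) -> float:
    """Fraction of combinations scoring within `tolerance` of the best score."""
    best = max(scores)
    if best <= 0:
        raise ValueError("best score must be positive")
    threshold = (1 - tolerance) * best
    return sum(s >= threshold for s in scores) / len(scores)


grid = expand_grid({
    "fast_period": [5, 10, 15, 20, 25],
    "slow_period": [30, 40, 50, 60, 70],
})
print(len(grid))  # 25 combinations
print(grid[0])
```

A high plateau share suggests the result does not hinge on one lucky setting; a share close to `1 / len(scores)` means only the single best combination performs well.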

In [None]:
# Test parameter sensitivity
sensitivity = mt.sensitivity(
    data,
    strategy="sma-crossover",
    params={
        "fast_period": [5, 10, 15, 20, 25],
        "slow_period": [30, 40, 50, 60, 70]
    }
)
print(sensitivity)

## Cost Sensitivity

Test how your strategy holds up when transaction costs increase.
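The underlying arithmetic can be sketched in a few lines. This is an illustration, not the Mantis implementation: the helper names, the fixed per-trade cost, the additive (simple) return model, and the multiplier values are all assumptions.

```python
from typing import Dict, Sequence


def net_total_by_multiplier(
    gross_trade_returns: Sequence[float],
    cost_per_trade: float,
    multipliers: Sequence[float] = (1.0, 2.0, 5.0, 10.0),
) -> Dict[float, float]:
    """Total return after charging multiplier * cost_per_trade on every trade."""
    gross_total = sum(gross_trade_returns)
    n_trades = len(gross_trade_returns)
    return {m: gross_total - m * cost_per_trade * n_trades for m in multipliers}


def breakeven_multiplier(gross_trade_returns: Sequence[float], cost_per_trade: float) -> float:
    """Cost multiplier at which the total net return falls to zero."""
    if cost_per_trade <= 0 or not gross_trade_returns:
        raise ValueError("need a positive cost and at least one trade")
    return sum(gross_trade_returns) / (cost_per_trade * len(gross_trade_returns))


# Made-up trade returns, for illustration only.
trades = [0.02, -0.01, 0.03]
for multiplier, total in net_total_by_multiplier(trades, cost_per_trade=0.001).items():
    print(f"{multiplier:>4}x costs: {total:.2%}")
```

A breakeven multiplier close to 1 means the edge disappears with only a small increase in costs.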

In [None]:
# Test at different cost multipliers
cost_sensitivity = mt.cost_sensitivity(data, strategy="sma-crossover")
print(cost_sensitivity)

## Monte Carlo Simulation

Monte Carlo simulation helps assess the range of possible outcomes.
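One common way to do this is to bootstrap the trade returns: resample them with replacement many times and look at the spread of compounded outcomes. The sketch below shows that technique using only the standard library. Whether Mantis resamples trades in this way is an assumption, and `bootstrap_total_returns` and `percentile` are hypothetical helpers.

```python
import math
import random
from typing import List, Sequence


def bootstrap_total_returns(
    trade_returns: Sequence[float], n_simulations: int = 1000, seed: int = 0
) -> List[float]:
    """Resample trades with replacement and compound each simulated path."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    n_trades = len(trade_returns)
    totals = []
    for _ in range(n_simulations):
        growth = 1.0
        for r in rng.choices(trade_returns, k=n_trades):
            growth *= 1.0 + r
        totals.append(growth - 1.0)
    return sorted(totals)


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted sequence (q in 0..100)."""
    rank = math.ceil(q / 100 * len(sorted_values))
    return sorted_values[min(len(sorted_values) - 1, max(0, rank - 1))]


# Made-up trade returns, for illustration only.
simulated = bootstrap_total_returns([0.05, -0.03, 0.02, -0.01, 0.04])
print(f"5th percentile return: {percentile(simulated, 5):.2%}")
print(f"95th percentile return: {percentile(simulated, 95):.2%}")
```

Resampling trades independently discards any serial dependence between them, so the resulting range is a rough guide, not a guarantee.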

In [None]:
# Run Monte Carlo simulation
mc = mt.monte_carlo(results, n_simulations=1000)
print(f"5th percentile return: {mc['p5']:.2%}")
print(f"95th percentile return: {mc['p95']:.2%}")

## Key Takeaways

1. **Always validate** - In-sample results alone cannot show whether a strategy generalizes
2. **Use multiple folds** - A single train/test split can mislead; Mantis defaults to 12 folds
3. **Account for multiple testing** - Use Deflated Sharpe when testing many strategies
4. **Check parameter sensitivity** - Robust strategies work across parameter ranges
5. **Test cost sensitivity** - Ensure profits survive realistic costs

## Next Steps

- [Quick Start](https://colab.research.google.com/github/johan-gras/mantis/blob/main/notebooks/quickstart.ipynb) - Basic usage
- [Multi-Symbol](https://colab.research.google.com/github/johan-gras/mantis/blob/main/notebooks/multi_symbol.ipynb) - Portfolio backtesting