Event-driven backtesting engine for quantitative trading strategies with realistic execution modeling.
This library is one of five interconnected libraries supporting the machine learning for trading workflow described in Machine Learning for Trading:
Each library addresses a distinct stage: data infrastructure, feature engineering, signal evaluation, strategy backtesting, and live deployment.
Backtesting requires accurate simulation of order execution, position tracking, and risk management. ml4t-backtest provides:
- Event-driven architecture with point-in-time correctness (no look-ahead bias)
- Exit-first order processing matching real broker behavior
- Configurable execution modes (same-bar or next-bar fills)
- Quote-aware execution and marking with
price, bid, ask, midpoint, and side-aware sources - Position-level risk rules (stop-loss, take-profit, trailing stops)
- Portfolio-level constraints (max positions, drawdown limits)
- Cash, margin, and crypto account policies
- First-class trade, fill, and portfolio-state export for audit and downstream analysis
- 40+ behavioral knobs for framework-specific parity
The same Strategy class used in backtesting works unchanged in ml4t-live for production deployment.
pip install ml4t-backtestimport polars as pl
from ml4t.backtest import Engine, Strategy, BacktestConfig, DataFeed
class SignalStrategy(Strategy):
def on_data(self, timestamp, data, context, broker):
for asset, bar in data.items():
signal = bar.get("signals", {}).get("prediction", 0)
price = bar.get("price", bar.get("close", 0))
position = broker.get_position(asset)
if position is None and signal > 0.5:
shares = (broker.get_account_value() * 0.10) / price
if shares > 0:
broker.submit_order(asset, shares)
elif position is not None and signal < -0.5:
broker.close_position(asset)
config = BacktestConfig(
initial_cash=100_000,
commission_rate=0.001,
slippage_rate=0.0005,
)
feed = DataFeed(prices_df=prices, signals_df=signals)
engine = Engine(feed, SignalStrategy(), config)
result = engine.run()
print(f"Total Return: {result.metrics['total_return_pct']:.2f}%")
print(f"Sharpe Ratio: {result.metrics['sharpe']:.2f}")
print(result.to_fills_dataframe().head())bar["price"] follows FeedSpec.price_col when you provide one, so the same strategy works for close-based bars and quote-aware feeds.
Position-level exit rules:
from ml4t.backtest import Strategy, StopLoss, TakeProfit, TrailingStop, RuleChain
class MyStrategy(Strategy):
def on_start(self, broker):
broker.set_position_rules(RuleChain([
StopLoss(pct=0.05),
TakeProfit(pct=0.15),
TrailingStop(pct=0.03),
]))Portfolio-level controls:
from ml4t.backtest.risk.portfolio.limits import MaxDrawdownLimit, DailyLossLimitBuilt-in profiles replicate the behavioral semantics of major backtesting frameworks:
from ml4t.backtest import BacktestConfig
# Match VectorBT behavior (same-bar close fills, fractional shares)
config = BacktestConfig.from_preset("vectorbt")
# Match Backtrader behavior (next-bar open fills, integer shares)
config = BacktestConfig.from_preset("backtrader")
# Match Zipline behavior (next-bar open fills, integer shares, per-share commission)
config = BacktestConfig.from_preset("zipline")
# Match QuantConnect LEAN behavior (same-bar close fills, integer shares)
config = BacktestConfig.from_preset("lean")
# Conservative production settings (higher costs, cash buffer)
config = BacktestConfig.from_preset("realistic")Each profile sets 40+ behavioral knobs (fill timing, execution price, share type, commission model, order processing, etc.) to match the target framework exactly.
from ml4t.backtest import ExecutionMode, StopFillMode
# Same-bar fills (VectorBT style)
config = BacktestConfig(
execution_mode=ExecutionMode.SAME_BAR,
stop_fill_mode=StopFillMode.STOP_PRICE,
)
# Next-bar fills (Backtrader style)
config = BacktestConfig(
execution_mode=ExecutionMode.NEXT_BAR,
stop_fill_mode=StopFillMode.STOP_PRICE,
)from ml4t.backtest import BacktestConfig, DataFeed
from ml4t.backtest.config import ExecutionPrice
feed = DataFeed(
prices_df=quotes,
price_col="mid_price",
bid_col="bid",
ask_col="ask",
bid_size_col="bid_size",
ask_size_col="ask_size",
)
config = BacktestConfig(
execution_price=ExecutionPrice.QUOTE_SIDE,
mark_price=ExecutionPrice.QUOTE_SIDE,
)With QUOTE_SIDE, buys fill at the ask and sells fill at the bid when quotes are present. mark_price is configured separately, so you can trade on one source and mark the book on another.
Quote-aware runs also preserve the microstructure context in the result surface:
result.to_fills_dataframe()includes bid/ask/midpoint/spread/size contextresult.to_trades_dataframe()includes nullable entry/exit quote summariesresult.to_portfolio_state_dataframe()reflects the configured mark source over timeresult.to_predictions_dataframe()preserves the raw model/input surface for downstream diagnostics
BacktestConfig is also the serializable backtest preset surface. You can keep
input configs sparse, then persist the fully resolved config that actually ran.
config = BacktestConfig.from_yaml("config/my_backtest.yaml")
result = Engine(feed, strategy, config).run()
resolved_config = result.config.to_dict()
runtime_spec = result.to_spec_dict()
written = result.to_parquet("results/run_001")The exported result directory includes:
config.yamlfor the replayable resolved config payloadspec.yamlfor the richer runtime snapshot with library version and realized run window
Use top-level feed in BacktestConfig for generic feed semantics and top-level
metadata for user-defined provenance like input paths or strategy ids.
from ml4t.backtest import BacktestConfig, CommissionType
config = BacktestConfig(
commission_rate=0.001, # 10 bps percentage
slippage_rate=0.0005, # 5 bps slippage
stop_slippage_rate=0.001, # Additional slippage for stop exits
)
# Or per-share (Interactive Brokers style)
config = BacktestConfig(
commission_type=CommissionType.PER_SHARE,
commission_per_share=0.005,
commission_minimum=1.0,
)from ml4t.backtest import Strategy, TargetWeightExecutor, RebalanceConfig
class WeightStrategy(Strategy):
def __init__(self):
self.executor = TargetWeightExecutor(RebalanceConfig(
min_trade_value=100,
min_weight_change=0.01,
))
self.bar_count = 0
def on_data(self, timestamp, data, context, broker):
self.bar_count += 1
if self.bar_count % 21 != 1: # Monthly rebalance
return
# ML predictions → portfolio weights
weights = {}
for asset, bar in data.items():
signal = bar.get("signals", {}).get("prediction", 0)
if signal and signal > 0:
weights[asset] = signal
if weights:
total = sum(weights.values())
weights = {a: w / total for a, w in weights.items()}
self.executor.execute(weights, data, broker)ml4t-backtest is validated by configuring profiles to match each framework's behavior exactly:
| Framework | Scenarios | Trade Match | Notes |
|---|---|---|---|
| VectorBT Pro | 16/16 | 100% | Full feature coverage |
| VectorBT OSS | 16/16 | 100% | Open-source subset |
| Backtrader | 16/16 | 100% | Next-bar execution |
| Zipline | 15/15 | 100% | NYSE calendar alignment |
Large-scale validation (250 assets x 20 years, real data):
| Profile | Trades | Value | Gap |
|---|---|---|---|
| zipline_strict | 225,583 | match | 0 trades, $19 (0.0001%) |
| backtrader_strict | 216,980 | match | 1 trade (0.0005%) |
| vectorbt_strict | 210,352 | match | 91 trades (0.04%) |
| lean | 428,459 fills | match | 0 fills, $1.55 (0.0002%) |
See validation/README.md for methodology and detailed results.
Release-gate commands:
# Fast parity contract gate (scenario 01 across vectorbt/backtrader/zipline)
ML4T_COMPARISON_INPROC=1 uv run pytest tests/contracts/test_cross_engine_contracts.py -q
# Full correctness runner (selected scenarios)
python validation/run_all_correctness.py --framework vectorbt_oss --scenarios 01,03,05,09
python validation/run_all_correctness.py --framework backtrader --scenarios 01,03,05,09
python validation/run_all_correctness.py --framework zipline --scenarios 01,03,05,09Benchmark on 250 assets x 20 years daily data (1.26M bars):
| Metric | Value |
|---|---|
| Runtime | ~30s |
| Speed | ~40,000 bars/sec |
| Memory | ~290 MB |
| vs Backtrader | 19x faster |
| vs Zipline | 8x faster |
| vs LEAN | 5x faster |
- Getting Started — your first backtest
- Data Feed —
price_col, quote columns, and feed wiring - Strategies — strategy interface and templates
- Stateful Strategies — advanced event-driven patterns (Kelly sizing, pairs trading, circuit breakers)
- Execution Semantics — fill timing, ordering, stops
- Configuration — 40+ behavioral knobs
- Risk Management — stops, trails, portfolio limits
- Rebalancing — weight-based portfolio management
- Results & Analysis — trades, fills, equity, and Parquet export
- Market Impact — commission, slippage, and impact models
- Profiles — framework parity presets
- Event-driven: Each bar processes sequentially with exit-first logic
- Point-in-time: No access to future data within strategy callbacks
- Configurable fills: Match behavior of different backtesting frameworks
- Quote-aware: Optional bid/ask/mid/size caches with side-aware market fills
- Parquet export: Trades, fills, equity, daily P&L, and config are serializable
- Type-safe: 0 type diagnostics (ty/Astral), full type annotations
- ml4t-data: Market data acquisition and storage
- ml4t-engineer: Feature engineering and technical indicators
- ml4t-diagnostic: Signal evaluation and statistical validation
- ml4t-live: Live trading with broker integration
git clone https://github.com/ml4t/ml4t-backtest.git
cd ml4t-backtest
uv sync
uv run pytest tests/ -q
uv run ty checkSee LIMITATIONS.md for documented assumptions:
- No intrabar stop simulation (uses bar OHLC)
- Calendar overnight sessions require configuration
- See LIMITATIONS.md for full list
MIT License - see LICENSE for details.

