ml4t-backtest

Event-driven backtesting engine for quantitative trading strategies with realistic execution modeling.

Part of the ML4T Library Ecosystem

This library is one of five interconnected libraries supporting the machine learning for trading workflow described in Machine Learning for Trading:

Each library addresses a distinct stage: data infrastructure, feature engineering, signal evaluation, strategy backtesting, and live deployment.

What This Library Does

Backtesting requires accurate simulation of order execution, position tracking, and risk management. ml4t-backtest provides:

Event-driven architecture with point-in-time correctness (no look-ahead bias)
Exit-first order processing matching real broker behavior
Configurable execution modes (same-bar or next-bar fills)
Quote-aware execution and marking with price, bid, ask, midpoint, and side-aware sources
Position-level risk rules (stop-loss, take-profit, trailing stops)
Portfolio-level constraints (max positions, drawdown limits)
Cash, margin, and crypto account policies
First-class trade, fill, and portfolio-state export for audit and downstream analysis
40+ behavioral knobs for framework-specific parity

The same Strategy class used in backtesting works unchanged in ml4t-live for production deployment.

Installation

pip install ml4t-backtest

Quick Start

import polars as pl
from ml4t.backtest import Engine, Strategy, BacktestConfig, DataFeed

class SignalStrategy(Strategy):
    def on_data(self, timestamp, data, context, broker):
        for asset, bar in data.items():
            signal = bar.get("signals", {}).get("prediction", 0)
            price = bar.get("price", bar.get("close", 0))
            position = broker.get_position(asset)

            if position is None and signal > 0.5:
                shares = (broker.get_account_value() * 0.10) / price
                if shares > 0:
                    broker.submit_order(asset, shares)
            elif position is not None and signal < -0.5:
                broker.close_position(asset)

config = BacktestConfig(
    initial_cash=100_000,
    commission_rate=0.001,
    slippage_rate=0.0005,
)

feed = DataFeed(prices_df=prices, signals_df=signals)
engine = Engine(feed, SignalStrategy(), config)
result = engine.run()

print(f"Total Return: {result.metrics['total_return_pct']:.2f}%")
print(f"Sharpe Ratio: {result.metrics['sharpe']:.2f}")
print(result.to_fills_dataframe().head())

bar["price"] follows FeedSpec.price_col when you provide one, so the same strategy works for close-based bars and quote-aware feeds.

Risk Management

Position-level exit rules:

from ml4t.backtest import Strategy, StopLoss, TakeProfit, TrailingStop, RuleChain

class MyStrategy(Strategy):
    def on_start(self, broker):
        broker.set_position_rules(RuleChain([
            StopLoss(pct=0.05),
            TakeProfit(pct=0.15),
            TrailingStop(pct=0.03),
        ]))

Portfolio-level controls:

from ml4t.backtest.risk.portfolio.limits import MaxDrawdownLimit, DailyLossLimit

Framework Profiles

Built-in profiles replicate the behavioral semantics of major backtesting frameworks:

from ml4t.backtest import BacktestConfig

# Match VectorBT behavior (same-bar close fills, fractional shares)
config = BacktestConfig.from_preset("vectorbt")

# Match Backtrader behavior (next-bar open fills, integer shares)
config = BacktestConfig.from_preset("backtrader")

# Match Zipline behavior (next-bar open fills, integer shares, per-share commission)
config = BacktestConfig.from_preset("zipline")

# Match QuantConnect LEAN behavior (same-bar close fills, integer shares)
config = BacktestConfig.from_preset("lean")

# Conservative production settings (higher costs, cash buffer)
config = BacktestConfig.from_preset("realistic")

Each profile sets 40+ behavioral knobs (fill timing, execution price, share type, commission model, order processing, etc.) to match the target framework exactly.

Execution Modes

from ml4t.backtest import ExecutionMode, StopFillMode

# Same-bar fills (VectorBT style)
config = BacktestConfig(
    execution_mode=ExecutionMode.SAME_BAR,
    stop_fill_mode=StopFillMode.STOP_PRICE,
)

# Next-bar fills (Backtrader style)
config = BacktestConfig(
    execution_mode=ExecutionMode.NEXT_BAR,
    stop_fill_mode=StopFillMode.STOP_PRICE,
)

Quote-Aware Execution

from ml4t.backtest import BacktestConfig, DataFeed
from ml4t.backtest.config import ExecutionPrice

feed = DataFeed(
    prices_df=quotes,
    price_col="mid_price",
    bid_col="bid",
    ask_col="ask",
    bid_size_col="bid_size",
    ask_size_col="ask_size",
)

config = BacktestConfig(
    execution_price=ExecutionPrice.QUOTE_SIDE,
    mark_price=ExecutionPrice.QUOTE_SIDE,
)

With QUOTE_SIDE, buys fill at the ask and sells fill at the bid when quotes are present. mark_price is configured separately, so you can trade on one source and mark the book on another.

Quote-aware runs also preserve the microstructure context in the result surface:

result.to_fills_dataframe() includes bid/ask/midpoint/spread/size context
result.to_trades_dataframe() includes nullable entry/exit quote summaries
result.to_portfolio_state_dataframe() reflects the configured mark source over time
result.to_predictions_dataframe() preserves the raw model/input surface for downstream diagnostics

Reproducible Config Snapshots

BacktestConfig is also the serializable backtest preset surface. You can keep input configs sparse, then persist the fully resolved config that actually ran.

config = BacktestConfig.from_yaml("config/my_backtest.yaml")
result = Engine(feed, strategy, config).run()

resolved_config = result.config.to_dict()
runtime_spec = result.to_spec_dict()
written = result.to_parquet("results/run_001")

The exported result directory includes:

config.yaml for the replayable resolved config payload
spec.yaml for the richer runtime snapshot with library version and realized run window

Use top-level feed in BacktestConfig for generic feed semantics and top-level metadata for user-defined provenance like input paths or strategy ids.

Commission and Slippage

from ml4t.backtest import BacktestConfig, CommissionType

config = BacktestConfig(
    commission_rate=0.001,         # 10 bps percentage
    slippage_rate=0.0005,          # 5 bps slippage
    stop_slippage_rate=0.001,      # Additional slippage for stop exits
)

# Or per-share (Interactive Brokers style)
config = BacktestConfig(
    commission_type=CommissionType.PER_SHARE,
    commission_per_share=0.005,
    commission_minimum=1.0,
)

Multi-Asset Rebalancing

from ml4t.backtest import Strategy, TargetWeightExecutor, RebalanceConfig

class WeightStrategy(Strategy):
    def __init__(self):
        self.executor = TargetWeightExecutor(RebalanceConfig(
            min_trade_value=100,
            min_weight_change=0.01,
        ))
        self.bar_count = 0

    def on_data(self, timestamp, data, context, broker):
        self.bar_count += 1
        if self.bar_count % 21 != 1:  # Monthly rebalance
            return

        # ML predictions → portfolio weights
        weights = {}
        for asset, bar in data.items():
            signal = bar.get("signals", {}).get("prediction", 0)
            if signal and signal > 0:
                weights[asset] = signal
        if weights:
            total = sum(weights.values())
            weights = {a: w / total for a, w in weights.items()}
            self.executor.execute(weights, data, broker)

Cross-Framework Validation

ml4t-backtest is validated by configuring profiles to match each framework's behavior exactly:

Framework	Scenarios	Trade Match	Notes
VectorBT Pro	16/16	100%	Full feature coverage
VectorBT OSS	16/16	100%	Open-source subset
Backtrader	16/16	100%	Next-bar execution
Zipline	15/15	100%	NYSE calendar alignment

Large-scale validation (250 assets x 20 years, real data):

Profile	Trades	Value	Gap
zipline_strict	225,583	match	0 trades, $19 (0.0001%)
backtrader_strict	216,980	match	1 trade (0.0005%)
vectorbt_strict	210,352	match	91 trades (0.04%)
lean	428,459 fills	match	0 fills, $1.55 (0.0002%)

See validation/README.md for methodology and detailed results.

Release-gate commands:

# Fast parity contract gate (scenario 01 across vectorbt/backtrader/zipline)
ML4T_COMPARISON_INPROC=1 uv run pytest tests/contracts/test_cross_engine_contracts.py -q

# Full correctness runner (selected scenarios)
python validation/run_all_correctness.py --framework vectorbt_oss --scenarios 01,03,05,09
python validation/run_all_correctness.py --framework backtrader --scenarios 01,03,05,09
python validation/run_all_correctness.py --framework zipline --scenarios 01,03,05,09

Performance

Benchmark on 250 assets x 20 years daily data (1.26M bars):

Metric	Value
Runtime	~30s
Speed	~40,000 bars/sec
Memory	~290 MB
vs Backtrader	19x faster
vs Zipline	8x faster
vs LEAN	5x faster

Documentation

Getting Started — your first backtest
Data Feed — price_col, quote columns, and feed wiring
Strategies — strategy interface and templates
Stateful Strategies — advanced event-driven patterns (Kelly sizing, pairs trading, circuit breakers)
Execution Semantics — fill timing, ordering, stops
Configuration — 40+ behavioral knobs
Risk Management — stops, trails, portfolio limits
Rebalancing — weight-based portfolio management
Results & Analysis — trades, fills, equity, and Parquet export
Market Impact — commission, slippage, and impact models
Profiles — framework parity presets

Technical Characteristics

Event-driven: Each bar processes sequentially with exit-first logic
Point-in-time: No access to future data within strategy callbacks
Configurable fills: Match behavior of different backtesting frameworks
Quote-aware: Optional bid/ask/mid/size caches with side-aware market fills
Parquet export: Trades, fills, equity, daily P&L, and config are serializable
Type-safe: 0 type diagnostics (ty/Astral), full type annotations

Related Libraries

ml4t-data: Market data acquisition and storage
ml4t-engineer: Feature engineering and technical indicators
ml4t-diagnostic: Signal evaluation and statistical validation
ml4t-live: Live trading with broker integration

Development

git clone https://github.com/ml4t/ml4t-backtest.git
cd ml4t-backtest
uv sync
uv run pytest tests/ -q
uv run ty check

Known Limitations

See LIMITATIONS.md for documented assumptions:

No intrabar stop simulation (uses bar OHLC)
Calendar overnight sessions require configuration
See LIMITATIONS.md for full list

License

MIT License - see LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 115 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
scripts		scripts
src/ml4t/backtest		src/ml4t/backtest
tests		tests
validation		validation
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
CHANGELOG.md		CHANGELOG.md
LIMITATIONS.md		LIMITATIONS.md
README.md		README.md
api.yaml		api.yaml
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ml4t-backtest

Part of the ML4T Library Ecosystem

What This Library Does

Installation

Quick Start

Risk Management

Framework Profiles

Execution Modes

Quote-Aware Execution

Reproducible Config Snapshots

Commission and Slippage

Multi-Asset Rebalancing

Cross-Framework Validation

Performance

Documentation

Technical Characteristics

Related Libraries

Development

Known Limitations

License

About

Uh oh!

Releases 17

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ml4t-backtest

Part of the ML4T Library Ecosystem

What This Library Does

Installation

Quick Start

Risk Management

Framework Profiles

Execution Modes

Quote-Aware Execution

Reproducible Config Snapshots

Commission and Slippage

Multi-Asset Rebalancing

Cross-Framework Validation

Performance

Documentation

Technical Characteristics

Related Libraries

Development

Known Limitations

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 17

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages