feat: ship shared validation runners by stefan-jansen · Pull Request #11 · ml4t/backtest

stefan-jansen · 2026-03-29T19:39:30Z

Summary

ship shared internal validation runners for LEAN, VectorBT, Backtrader, and Zipline under
refactor the benchmark suite to reuse those shared runners and update the LEAN parity profile/docs to the current verified semantics
add focused runner/parity tests and refresh canonical validation docs/results for the Chapter 16 parity notebooks

Verification

uv run ty check
uv run pytest tests/ -q
uv build

Copilot

Pull request overview

This PR factors out shared cross-framework validation/benchmark runners (LEAN, VectorBT, Backtrader, Zipline) into ml4t.backtest._validation, refactors the benchmark harness to use them, and updates the LEAN parity profile plus associated docs/tests.

Changes:

Add shared validation runners for LEAN/VectorBT/Backtrader/Zipline under src/ml4t/backtest/_validation/.
Refactor validation/benchmark_suite.py to reuse shared runners and update LEAN benchmarking artifacts (fills/equity/order-events).
Update the LEAN profile semantics (next-bar open + leverage) and refresh parity docs/results and focused tests.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
validation/benchmark_suite.py	Refactors the benchmark harness to consume shared runners; updates LEAN artifact generation/parsing.
validation/README.md	Refreshes benchmark/parity result tables.
validation/METHODOLOGY.md	Updates methodology tables for the new LEAN “fills” parity framing.
uv.lock	Updates lock metadata/extras for dependency wiring.
tests/test_config_wiring.py	Updates preset wiring tests; adds EXIT_FIRST edge-case regression tests; adds lean preset assertions.
tests/contracts/test_profile_parity_basics.py	Updates LEAN profile contract assertions to reflect new semantics.
tests/benchmark/test_lean_adapter.py	Adds tests for LEAN artifact loader behavior surfaced via benchmark suite adapter.
tests/benchmark/test_backtrader_zipline_runners.py	Adds unit tests for new shared runner helpers (trade log + normalization/flattening).
src/ml4t/backtest/types.py	Adds internal `_created_bar_index` to `Order` for next-bar queue handling.
src/ml4t/backtest/profiles.py	Updates LEAN profile semantics; removes `lean_strict`; enables queue shadow validation for strict-like parity.
src/ml4t/backtest/core/order_book.py	Stamps orders with `_created_bar_index` at submission time.
src/ml4t/backtest/core/execution_engine.py	Adds next-bar queue shadow validation path; adjusts EXIT_FIRST processing.
src/ml4t/backtest/config.py	Wires new `next_bar_queue_shadow_validation` knob through serialization and presets.
src/ml4t/backtest/broker.py	Threads `next_bar_queue_shadow_validation` from config into runtime broker.
src/ml4t/backtest/_validation/zipline_runner.py	New shared Zipline orchestration + bundle caching + artifact extraction helpers.
src/ml4t/backtest/_validation/vectorbt_runner.py	New shared VectorBT import/execution/artifact helpers.
src/ml4t/backtest/_validation/lean_runner.py	New shared LEAN CLI orchestration, data export, and artifact parsing.
src/ml4t/backtest/_validation/backtrader_runner.py	New shared Backtrader orchestration + PyFolio trade-log helpers.
src/ml4t/backtest/_validation/init.py	Introduces internal `_validation` package marker.
docs/user-guide/profiles.md	Removes `lean_strict` references; updates LEAN parity row to fills-based metrics.
docs/user-guide/accounts.md	Removes `lean_strict` from the “strict variants” list.
docs/index.md	Updates LEAN parity row to fills-based metrics.
README.md	Updates top-level validation summary to fills-based LEAN parity.
LIMITATIONS.md	Removes `lean_strict` mention; updates LEAN parity row.
CHANGELOG.md	Adds release notes for the new shared validation runners and parity updates.

Comments suppressed due to low confidence (1)

validation/benchmark_suite.py:1940

In the generated LEAN algorithm, on_data returns early when the timestamp key is missing from _targets. Because targets.csv is written sparsely (non-zero targets only), a day with all-zero targets (possible when signals are missing/NaN early in the series) would skip both equity logging and the logic that closes any existing positions. Consider treating missing keys as an empty target set (targets = self._targets.get(key, {})) and still appending the equity row.

    def on_data(self, data: Slice):
        key = self.time.strftime("%Y-%m-%d")
        targets = self._targets.get(key)
        if targets is None:
            return

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-29T19:46:14Z

+                broker._partial_orders.pop(order.order_id, None)
+            else:
+                fill.update_partial_order(order)
+            broker.mark_account_positions(use_open=False)


broker.mark_account_positions(use_open=False) forces close/price_col marking even when this path is triggered for next-bar open execution (use_open=True). This can desync position marks (and any mark-to-market dependent logic) within the bar. Pass through use_open (or call once after the loop with the correct flag).

Suggested change

broker.mark_account_positions(use_open=False)

broker.mark_account_positions(use_open=use_open)

Copilot · 2026-03-29T19:46:15Z

    cache_hit = False
    if manifest_path.exists():
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
            cache_hit = manifest.get("signature") == lean_data_sig
        except Exception:
            cache_hit = False



The pre-check that sets cache_hit by comparing manifest['signature'] to lean_data_sig no longer matches the manifest format written by export_lean_daily_data (it hashes the full payload). This makes the local check misleading/dead code. Either remove this block and rely on export_lean_daily_data’s return value, or align the signature scheme so both sides compare the same value.

Copilot · 2026-03-29T19:46:15Z

+| Zipline | **0 (0.00%)** | **$10.30 (0.0014%)** | **99.9%+** | DONE |
+| Backtrader | **0 (0.00%)** | float noise | **99.9%+** | DONE |
 | VBT OSS | +91 (0.04%) | $0 (0.00%) | **99%+** | Production |
 | LEAN | +663 (0.29%) | $13K (1.2%) | **97%+** | Buying-power reservation gap |


The “Parity Confidence” table still lists the old LEAN trade/value diffs (+663 trades, $13K) even though this doc now reports LEAN parity as 0 fill gap and ~$1.55 value gap above. Update the LEAN row to reflect the current LEAN semantics/metrics (fills-based) to avoid contradictory guidance.

Suggested change

| LEAN | +663 (0.29%) | $13K (1.2%) | **97%+** | Buying-power reservation gap |

| LEAN | **0 (0.00%)** | **$1.55 (0.0002%)** | **99.9%+** | DONE |

Copilot · 2026-03-29T19:46:15Z

+            validation_price = broker._current_prices.get(order.asset, fill_price)
+            valid, rejection_reason = self._validate_shadow_queue_order(
+                order=order,
+                validation_price=validation_price,
+                shadow_cash=shadow_cash,
+                shadow_positions=shadow_positions,
+            )


In _process_orders_next_bar_queue_shadow, validation_price is derived from broker._current_prices, which will typically be the feed’s price_col (often close). In NEXT_BAR + open execution this can validate commissions/buying power against a different price than the actual fill_price, causing incorrect accepts/rejects. Use fill_price (or price) for validation/commission inputs, and only use mark prices for marking positions.

feat: ship shared validation runners

91834e5

Copilot AI review requested due to automatic review settings March 29, 2026 19:39

Copilot started reviewing on behalf of stefan-jansen March 29, 2026 19:40 View session

style: format validation runners

d04e877

Copilot AI reviewed Mar 29, 2026

View reviewed changes

stefan-jansen merged commit 53db538 into main Mar 29, 2026
7 checks passed

stefan-jansen deleted the feat/shared-validation-runners branch March 29, 2026 19:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: ship shared validation runners#11

feat: ship shared validation runners#11
stefan-jansen merged 2 commits into
mainfrom
feat/shared-validation-runners

stefan-jansen commented Mar 29, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 29, 2026

Uh oh!

Copilot AI Mar 29, 2026

Uh oh!

Copilot AI Mar 29, 2026

Uh oh!

Copilot AI Mar 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	broker.mark_account_positions(use_open=False)
	broker.mark_account_positions(use_open=use_open)

	\| LEAN \| +663 (0.29%) \| $13K (1.2%) \| 97%+ \| Buying-power reservation gap \|
	\| LEAN \| 0 (0.00%) \| $1.55 (0.0002%) \| 99.9%+ \| DONE \|

Conversation

stefan-jansen commented Mar 29, 2026

Summary

Verification

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants