Skip to content

feat: ship shared validation runners#11

Merged
stefan-jansen merged 2 commits into
mainfrom
feat/shared-validation-runners
Mar 29, 2026
Merged

feat: ship shared validation runners#11
stefan-jansen merged 2 commits into
mainfrom
feat/shared-validation-runners

Conversation

@stefan-jansen
Copy link
Copy Markdown
Contributor

Summary

  • ship shared internal validation runners for LEAN, VectorBT, Backtrader, and Zipline under
  • refactor the benchmark suite to reuse those shared runners and update the LEAN parity profile/docs to the current verified semantics
  • add focused runner/parity tests and refresh canonical validation docs/results for the Chapter 16 parity notebooks

Verification

  • uv run ty check
  • uv run pytest tests/ -q
  • uv build

Copilot AI review requested due to automatic review settings March 29, 2026 19:39
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR factors out shared cross-framework validation/benchmark runners (LEAN, VectorBT, Backtrader, Zipline) into ml4t.backtest._validation, refactors the benchmark harness to use them, and updates the LEAN parity profile plus associated docs/tests.

Changes:

  • Add shared validation runners for LEAN/VectorBT/Backtrader/Zipline under src/ml4t/backtest/_validation/.
  • Refactor validation/benchmark_suite.py to reuse shared runners and update LEAN benchmarking artifacts (fills/equity/order-events).
  • Update the LEAN profile semantics (next-bar open + leverage) and refresh parity docs/results and focused tests.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
validation/benchmark_suite.py Refactors the benchmark harness to consume shared runners; updates LEAN artifact generation/parsing.
validation/README.md Refreshes benchmark/parity result tables.
validation/METHODOLOGY.md Updates methodology tables for the new LEAN “fills” parity framing.
uv.lock Updates lock metadata/extras for dependency wiring.
tests/test_config_wiring.py Updates preset wiring tests; adds EXIT_FIRST edge-case regression tests; adds lean preset assertions.
tests/contracts/test_profile_parity_basics.py Updates LEAN profile contract assertions to reflect new semantics.
tests/benchmark/test_lean_adapter.py Adds tests for LEAN artifact loader behavior surfaced via benchmark suite adapter.
tests/benchmark/test_backtrader_zipline_runners.py Adds unit tests for new shared runner helpers (trade log + normalization/flattening).
src/ml4t/backtest/types.py Adds internal _created_bar_index to Order for next-bar queue handling.
src/ml4t/backtest/profiles.py Updates LEAN profile semantics; removes lean_strict; enables queue shadow validation for strict-like parity.
src/ml4t/backtest/core/order_book.py Stamps orders with _created_bar_index at submission time.
src/ml4t/backtest/core/execution_engine.py Adds next-bar queue shadow validation path; adjusts EXIT_FIRST processing.
src/ml4t/backtest/config.py Wires new next_bar_queue_shadow_validation knob through serialization and presets.
src/ml4t/backtest/broker.py Threads next_bar_queue_shadow_validation from config into runtime broker.
src/ml4t/backtest/_validation/zipline_runner.py New shared Zipline orchestration + bundle caching + artifact extraction helpers.
src/ml4t/backtest/_validation/vectorbt_runner.py New shared VectorBT import/execution/artifact helpers.
src/ml4t/backtest/_validation/lean_runner.py New shared LEAN CLI orchestration, data export, and artifact parsing.
src/ml4t/backtest/_validation/backtrader_runner.py New shared Backtrader orchestration + PyFolio trade-log helpers.
src/ml4t/backtest/_validation/init.py Introduces internal _validation package marker.
docs/user-guide/profiles.md Removes lean_strict references; updates LEAN parity row to fills-based metrics.
docs/user-guide/accounts.md Removes lean_strict from the “strict variants” list.
docs/index.md Updates LEAN parity row to fills-based metrics.
README.md Updates top-level validation summary to fills-based LEAN parity.
LIMITATIONS.md Removes lean_strict mention; updates LEAN parity row.
CHANGELOG.md Adds release notes for the new shared validation runners and parity updates.
Comments suppressed due to low confidence (1)

validation/benchmark_suite.py:1940

  • In the generated LEAN algorithm, on_data returns early when the timestamp key is missing from _targets. Because targets.csv is written sparsely (non-zero targets only), a day with all-zero targets (possible when signals are missing/NaN early in the series) would skip both equity logging and the logic that closes any existing positions. Consider treating missing keys as an empty target set (targets = self._targets.get(key, {})) and still appending the equity row.
    def on_data(self, data: Slice):
        key = self.time.strftime("%Y-%m-%d")
        targets = self._targets.get(key)
        if targets is None:
            return

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

broker._partial_orders.pop(order.order_id, None)
else:
fill.update_partial_order(order)
broker.mark_account_positions(use_open=False)
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

broker.mark_account_positions(use_open=False) forces close/price_col marking even when this path is triggered for next-bar open execution (use_open=True). This can desync position marks (and any mark-to-market dependent logic) within the bar. Pass through use_open (or call once after the loop with the correct flag).

Suggested change
broker.mark_account_positions(use_open=False)
broker.mark_account_positions(use_open=use_open)

Copilot uses AI. Check for mistakes.
Comment on lines 2024 to 2031
cache_hit = False
if manifest_path.exists():
try:
manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
cache_hit = manifest.get("signature") == lean_data_sig
except Exception:
cache_hit = False

Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pre-check that sets cache_hit by comparing manifest['signature'] to lean_data_sig no longer matches the manifest format written by export_lean_daily_data (it hashes the full payload). This makes the local check misleading/dead code. Either remove this block and rely on export_lean_daily_data’s return value, or align the signature scheme so both sides compare the same value.

Copilot uses AI. Check for mistakes.
Comment thread validation/METHODOLOGY.md
| Zipline | **0 (0.00%)** | **$10.30 (0.0014%)** | **99.9%+** | DONE |
| Backtrader | **0 (0.00%)** | float noise | **99.9%+** | DONE |
| VBT OSS | +91 (0.04%) | $0 (0.00%) | **99%+** | Production |
| LEAN | +663 (0.29%) | $13K (1.2%) | **97%+** | Buying-power reservation gap |
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The “Parity Confidence” table still lists the old LEAN trade/value diffs (+663 trades, $13K) even though this doc now reports LEAN parity as 0 fill gap and ~$1.55 value gap above. Update the LEAN row to reflect the current LEAN semantics/metrics (fills-based) to avoid contradictory guidance.

Suggested change
| LEAN | +663 (0.29%) | $13K (1.2%) | **97%+** | Buying-power reservation gap |
| LEAN | **0 (0.00%)** | **$1.55 (0.0002%)** | **99.9%+** | DONE |

Copilot uses AI. Check for mistakes.
Comment on lines +153 to +159
validation_price = broker._current_prices.get(order.asset, fill_price)
valid, rejection_reason = self._validate_shadow_queue_order(
order=order,
validation_price=validation_price,
shadow_cash=shadow_cash,
shadow_positions=shadow_positions,
)
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In _process_orders_next_bar_queue_shadow, validation_price is derived from broker._current_prices, which will typically be the feed’s price_col (often close). In NEXT_BAR + open execution this can validate commissions/buying power against a different price than the actual fill_price, causing incorrect accepts/rejects. Use fill_price (or price) for validation/commission inputs, and only use mark prices for marking positions.

Copilot uses AI. Check for mistakes.
@stefan-jansen stefan-jansen merged commit 53db538 into main Mar 29, 2026
7 checks passed
@stefan-jansen stefan-jansen deleted the feat/shared-validation-runners branch March 29, 2026 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants