feat(backtesting): wire config fields into implementation by w7-mgfcode · Pull Request #34 · w7-mgfcode/ForecastLabAI

w7-mgfcode · 2026-02-01T04:59:16Z

Summary

- Add _validate_config() method to enforce settings constraints at runtime:
  - Validates n_splits does not exceed BACKTEST_MAX_SPLITS (default: 20)
  - Validates gap does not exceed BACKTEST_MAX_GAP (default: 30)
  - Logs warning if min_train_size is below BACKTEST_DEFAULT_MIN_TRAIN_SIZE (default: 30)
- Add save_results() method using BACKTEST_RESULTS_DIR for persisting backtest results as JSON
- Add 6 unit tests for config validation and result saving functionality
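
For readers without the diff at hand, the runtime checks described in this summary can be sketched roughly as follows. This is an illustration only, not the PR's code: the `Settings` and `SplitConfig` dataclasses are hypothetical stand-ins for the project's settings object and Pydantic schemas, and raising `ValueError` when a limit is exceeded is an assumption (the summary only says the limits are validated).

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("backtesting")


@dataclass(frozen=True)
class Settings:
    # Hypothetical stand-in for app/core/config.py; defaults come from the PR summary.
    backtest_max_splits: int = 20
    backtest_max_gap: int = 30
    backtest_default_min_train_size: int = 30


@dataclass(frozen=True)
class SplitConfig:
    # Hypothetical stand-in for the split configuration schema.
    n_splits: int
    gap: int
    min_train_size: int


def validate_config(config: SplitConfig, settings: Settings) -> None:
    """Reject configs above the hard limits; only warn about a small training window."""
    if config.n_splits > settings.backtest_max_splits:
        raise ValueError(
            f"n_splits={config.n_splits} exceeds "
            f"BACKTEST_MAX_SPLITS={settings.backtest_max_splits}"
        )
    if config.gap > settings.backtest_max_gap:
        raise ValueError(
            f"gap={config.gap} exceeds BACKTEST_MAX_GAP={settings.backtest_max_gap}"
        )
    if config.min_train_size < settings.backtest_default_min_train_size:
        # Soft constraint: the run proceeds, but the operator is told about it.
        logger.warning(
            "min_train_size=%d is below BACKTEST_DEFAULT_MIN_TRAIN_SIZE=%d",
            config.min_train_size,
            settings.backtest_default_min_train_size,
        )
```

Keeping the two maxima as hard errors and the minimum training size as a warning mirrors the wording of the summary ("validates" versus "logs warning").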

This wires the previously unused config fields from app/core/config.py (lines 50-54) into the backtesting implementation.
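
A minimal sketch of what persisting results under BACKTEST_RESULTS_DIR might look like. The function signature, the `<backtest_id>.json` file naming, and the use of `default=str` for non-JSON types are all assumptions made for illustration; the PR's actual save_results() may differ.

```python
import json
from pathlib import Path
from typing import Any


def save_results(results: dict[str, Any], backtest_id: str, results_dir: str) -> Path:
    """Write results to <results_dir>/<backtest_id>.json and return the file path."""
    out_dir = Path(results_dir)
    # Create the results directory on first use.
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{backtest_id}.json"
    # default=str stringifies dates and other values json cannot encode natively.
    path.write_text(json.dumps(results, indent=2, default=str), encoding="utf-8")
    return path
```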

Test plan

All 21 service tests pass
All 101 backtesting unit tests pass
mypy and pyright pass with no errors
ruff linting passes

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Backtesting with expanding and sliding window time-series cross-validation
- Configurable gap parameter to simulate data latency
- Comprehensive metric suite: MAE, sMAPE, WAPE, Bias, Stability Index
- Baseline comparisons with Naive and Seasonal Naive models
- Data lineage recording of actuals vs. predictions per fold
- New API endpoint: POST /backtesting/run
Documentation
- Backtesting protocol specification and architecture updates
- Testing guide with example scripts
Tests
- Comprehensive unit and integration test coverage for backtesting functionality
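
To make the expanding/sliding window and gap terminology in these notes concrete, here is a small index-based sketch. It is not the PR's TimeSeriesSplitter: the function name, the `test_size` parameter, and the choice to anchor the last test window at the end of the series are assumptions.

```python
from collections.abc import Iterator


def time_series_splits(
    n_samples: int,
    n_splits: int,
    test_size: int,
    gap: int = 0,
    min_train_size: int = 1,
    strategy: str = "expanding",
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges in chronological order."""
    if strategy not in ("expanding", "sliding"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if n_splits < 1 or test_size < 1 or gap < 0:
        raise ValueError("n_splits and test_size must be >= 1 and gap >= 0")
    # The last test window ends at the end of the series; earlier ones precede it.
    first_test_start = n_samples - n_splits * test_size
    window = first_test_start - gap  # training length of the first fold
    if window < min_train_size:
        raise ValueError("not enough samples for the requested splits")
    for fold in range(n_splits):
        test_start = first_test_start + fold * test_size
        train_end = test_start - gap  # exclusive; `gap` points are skipped
        # Expanding keeps every earlier point; sliding keeps a fixed-length window.
        train_start = 0 if strategy == "expanding" else train_end - window
        yield range(train_start, train_end), range(test_start, test_start + test_size)
```

With 20 samples, 3 splits, a test size of 4 and a gap of 2, the expanding strategy trains on indices 0..5, 0..9 and 0..13, while the sliding strategy trains on 0..5, 4..9 and 8..13. In both cases the two points immediately before each test window are left out, which is how the gap simulates data latency.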

- Add _validate_config() to enforce settings constraints:
  - Validate n_splits <= BACKTEST_MAX_SPLITS
  - Validate gap <= BACKTEST_MAX_GAP
  - Warn if min_train_size < BACKTEST_DEFAULT_MIN_TRAIN_SIZE
- Add save_results() method using BACKTEST_RESULTS_DIR
- Add unit tests for config validation and result saving

Closes issue with unused config fields in app/core/config.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

sourcery-ai

Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

coderabbitai · 2026-02-01T04:59:29Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This PR introduces a complete backtesting vertical for time-series forecasting, featuring time-based cross-validation with expanding/sliding windows and gap support, a comprehensive metrics suite (MAE, sMAPE, WAPE, Bias, Stability Index), baseline model comparisons, data lineage tracking per fold, and API endpoints for backtesting orchestration and result retrieval.
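
As a reference for the per-fold metrics named in the walkthrough, the sketch below uses common textbook definitions. These are assumptions: the PR's MetricsCalculator may use different conventions, in particular for the sign of Bias (taken here as prediction minus actual) and for zero denominators. Stability Index is omitted because its definition is not given on this page.

```python
def _paired(actuals, predictions):
    """Validate the inputs and return them as (actual, prediction) pairs."""
    if len(actuals) != len(predictions) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    return list(zip(actuals, predictions))


def mae(actuals, predictions):
    """Mean absolute error."""
    pairs = _paired(actuals, predictions)
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


def smape(actuals, predictions):
    """Symmetric MAPE in percent; a 0/0 term counts as zero error."""
    pairs = _paired(actuals, predictions)
    total = 0.0
    for a, p in pairs:
        denom = abs(a) + abs(p)
        total += 0.0 if denom == 0 else 2 * abs(a - p) / denom
    return 100 * total / len(pairs)


def wape(actuals, predictions):
    """Weighted absolute percentage error in percent; NaN if all actuals are zero."""
    pairs = _paired(actuals, predictions)
    denom = sum(abs(a) for a, _ in pairs)
    if denom == 0:
        return float("nan")
    return 100 * sum(abs(a - p) for a, p in pairs) / denom


def bias(actuals, predictions):
    """Mean signed error; positive means over-forecasting under this convention."""
    pairs = _paired(actuals, predictions)
    return sum(p - a for a, p in pairs) / len(pairs)
```

For actuals `[10, 20, 30]` and predictions `[12, 18, 33]`, the absolute errors are 2, 2 and 3, so MAE is 7/3, WAPE is 100 · 7/60 (about 11.67%), and Bias is (2 − 2 + 3)/3 = 1.0.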

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Documentation & Configuration `INITIAL-6.md`, `README.md`, `docs/ARCHITECTURE.md`, `docs/validation/pytest-standard.md`, `app/core/config.py` | Added backtesting feature documentation, testing guidelines, architecture spec update, and 4 new configuration settings (backtest_max_splits, backtest_default_min_train_size, backtest_max_gap, backtest_results_dir). |
| Core Data Models & Schemas `app/features/backtesting/schemas.py` | Introduced 7 immutable Pydantic models: SplitConfig, BacktestConfig (with config_hash method), SplitBoundary, FoldResult, ModelBacktestResult, BacktestRequest, BacktestResponse for versioned, hashable backtesting configuration and results. |
| Time-Series Splitting Logic `app/features/backtesting/splitter.py` | Implemented TimeSeriesSplitter class supporting expanding and sliding window strategies with gap parameter for latency simulation, boundary extraction, and leakage validation; includes TimeSeriesSplit dataclass for fold metadata. |
| Metrics Computation `app/features/backtesting/metrics.py` | Created MetricsCalculator with static methods for MAE, sMAPE, WAPE, Bias, and Stability Index (with edge-case handling for zeros, empty arrays, NaN filtering); aggregation utilities for per-fold metrics with std deviation tracking. |
| Backtesting Service & Orchestration `app/features/backtesting/service.py` | Implemented BacktestingService to orchestrate end-to-end backtesting: data loading, per-fold train/predict/evaluate, baseline comparisons, leakage checking, result aggregation, and JSON persistence; includes SeriesData container and internal helper methods. |
| API Routes & Module Export `app/features/backtesting/routes.py`, `app/features/backtesting/__init__.py`, `app/main.py` | Added POST /backtesting/run and GET /backtesting/results/{backtest_id} endpoints; centralized public API exports via `__init__.py` with an `__all__` list; registered router in main app. |
| Test Fixtures & Scaffolding `app/features/backtesting/tests/conftest.py` | Provided 16+ fixtures for integration testing including async DB session, HTTP client, sample store/product, 120-day calendar and sales data, date sequences, and BacktestConfig variants for expanding/sliding/gap scenarios. |
| Unit & Integration Tests `app/features/backtesting/tests/test_*.py` (metrics, schemas, splitter, routes_integration, service, service_integration) | Comprehensive test coverage (2050+ lines) validating metrics calculations, schema validation, splitter behavior across strategies/gaps, API integration with real DB, service orchestration, and end-to-end backtesting flows. |
| Example Scripts `examples/backtest/inspect_splits.py`, `examples/backtest/metrics_demo.py`, `examples/backtest/run_backtest.py` | Three example scripts demonstrating TimeSeriesSplitter visualization, MetricsCalculator usage across scenarios, and end-to-end backtest execution via HTTP API with result parsing and display. |
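
The "immutable, hashable configuration" idea from the schemas row can be illustrated with the standard library alone. The PR uses Pydantic models; the frozen dataclass below is a stand-in, and the field set, canonical-JSON serialization, and 16-character truncation are assumptions rather than the actual config_hash implementation.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BacktestConfigSketch:
    # Hypothetical fields; the real BacktestConfig schema is not shown on this page.
    strategy: str = "expanding"
    n_splits: int = 5
    gap: int = 0
    min_train_size: int = 30

    def config_hash(self) -> str:
        """Deterministic short hash of the configuration contents."""
        # sort_keys and fixed separators give a canonical serialization,
        # so equal configs always produce the same digest.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Because the object is frozen, the hash computed when a run starts still describes the configuration when results are stored, which is what makes it usable as a version key.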

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API Route
    participant BacktestingService
    participant Database
    participant TimeSeriesSplitter
    participant Model
    participant MetricsCalculator

    Client->>API Route: POST /backtesting/run (store, product, dates, config)
    API Route->>BacktestingService: run_backtest(db, config, dates)
    
    BacktestingService->>Database: _load_series_data(store_id, product_id, date_range)
    Database-->>BacktestingService: SeriesData (dates, values)
    
    BacktestingService->>TimeSeriesSplitter: split(dates, values)
    TimeSeriesSplitter-->>BacktestingService: Iterator[TimeSeriesSplit] (train/test indices per fold)
    
    loop For each fold
        BacktestingService->>Model: train(X_train, y_train)
        Model-->>BacktestingService: trained_model
        BacktestingService->>Model: predict(X_test)
        Model-->>BacktestingService: predictions
        
        BacktestingService->>MetricsCalculator: calculate_all(actuals, predictions)
        MetricsCalculator-->>BacktestingService: fold_metrics (MAE, sMAPE, WAPE, Bias)
    end
    
    BacktestingService->>MetricsCalculator: aggregate_fold_metrics(all_fold_metrics)
    MetricsCalculator-->>BacktestingService: aggregated_metrics, stability_indices
    
    BacktestingService-->>API Route: BacktestResponse (main results, baselines, comparison summary)
    API Route-->>Client: 200 OK with BacktestResponse JSON
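
The per-fold loop in the diagram can be sketched in a few lines. This is a simplified illustration, not BacktestingService: the `NaiveForecaster` class, its `fit`/`predict(horizon)` interface, and the `run_backtest` signature are hypothetical, and only MAE is computed to keep the example short.

```python
from statistics import mean, pstdev


class NaiveForecaster:
    """Baseline that repeats the last observed training value."""

    def fit(self, y_train):
        self.last_ = y_train[-1]
        return self

    def predict(self, horizon):
        return [self.last_] * horizon


def run_backtest(values, folds):
    """Train, predict and score on each (train_range, test_range) fold."""
    fold_mae = []
    for train_idx, test_idx in folds:
        # Leakage check: every training index must precede the test window.
        if train_idx.stop > test_idx.start:
            raise ValueError("training window overlaps the test window")
        y_train = [values[i] for i in train_idx]
        actuals = [values[i] for i in test_idx]
        preds = NaiveForecaster().fit(y_train).predict(len(actuals))
        fold_mae.append(mean([abs(a - p) for a, p in zip(actuals, preds)]))
    # Aggregate across folds; the spread indicates how stable the error is.
    return {
        "fold_mae": fold_mae,
        "mae_mean": mean(fold_mae),
        "mae_std": pstdev(fold_mae),
    }
```

For the series 1..12 with folds `(range(0, 6), range(6, 9))` and `(range(0, 9), range(9, 12))`, the naive forecast is off by 1, 2 and 3 in each fold, so both folds have an MAE of 2 and the standard deviation across folds is 0.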

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Release: Feature Engineering + Forecasting Module (PRP-4 & PRP-5) #29: Introduces forecasting module interfaces (model_factory, Naive/SeasonalNaive forecasters, model config schemas) that this PR directly depends on for model training and prediction within backtesting folds.

Suggested reviewers

w7-learn

Poem

🐰 Hops through time with expanding grace,
Splits and gaps in data's embrace,
Metrics bloom like clover in spring,
Backtests verified, comparisons sing! 🌱✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main focus: wiring config fields (backtest_max_splits, backtest_max_gap, backtest_default_min_train_size, backtest_results_dir) from app/core/config.py into the BacktestingService implementation via _validate_config() and save_results() methods. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 97.59% which is sufficient. The required threshold is 80.00%. |

- Add SIGNED_METRICS class constant to identify signed metrics (e.g., "bias")
- Update _generate_comparison_summary to use absolute values for percentage improvement calculations on signed metrics
- Original signed values are preserved in main/naive/seasonal_naive keys
- Add 3 unit tests for signed metric handling:
  - test_comparison_signed_metric_uses_absolute_values
  - test_comparison_signed_metric_positive_values
  - test_comparison_signed_metric_mixed_signs

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
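
Why signed metrics need absolute values in the comparison can be seen in a short sketch. The improvement formula (relative reduction against the baseline, lower is better) and the function name are assumptions for illustration; only the SIGNED_METRICS idea comes from the commit message.

```python
from typing import Optional

SIGNED_METRICS = frozenset({"bias"})


def pct_improvement(metric: str, main: float, baseline: float) -> Optional[float]:
    """Percentage by which `main` improves on `baseline` (positive = better)."""
    if metric in SIGNED_METRICS:
        # For signed metrics, closeness to zero is what matters, not the sign.
        main, baseline = abs(main), abs(baseline)
    if baseline == 0:
        return None  # improvement is undefined against a zero baseline
    return (baseline - main) / baseline * 100
```

With a main-model bias of -2 and a naive bias of 4, the raw values would give (4 − (−2)) / 4 = 150%, which overstates the gain. Comparing magnitudes gives (4 − 2) / 4 = 50%, while the reported main and baseline values themselves can stay signed.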

sourcery-ai Bot reviewed Feb 1, 2026

w7-mgfcode changed the base branch from main to dev February 1, 2026 05:02

w7-mgfcode merged commit daef9ce into dev Feb 1, 2026
6 of 7 checks passed

w7-mgfcode deleted the fix/wire-backtest-config branch February 1, 2026 05:06

This was referenced May 19, 2026

feat(release): ship the MLZOO advanced ML model zoo (A–C2) to main (#252) #253

Merged

release: prepare v0.2.17 — MLZOO-D + dogfood automation + cross-slice import pattern #266

Merged

w7-mgfcode merged 2 commits into dev from fix/wire-backtest-config
