feat(backtesting): implement time-series backtesting module (PRP-6)#32

Merged: w7-mgfcode merged 11 commits into dev from feat/prp-6-backtesting on Feb 1, 2026

Conversation

@w7-mgfcode
Owner

@w7-mgfcode w7-mgfcode commented Feb 1, 2026

Summary

  • Implement complete backtesting infrastructure for time-series model evaluation (PRP-6)
  • Add TimeSeriesSplitter with expanding/sliding window strategies and configurable gap parameter
  • Add MetricsCalculator with MAE, sMAPE (on a 0-200 scale), WAPE, Bias, and Stability Index
  • Add BacktestingService for orchestrating backtests with mandatory baseline comparisons
  • Add POST /backtesting/run endpoint with full response schema
  • Add comprehensive integration tests for routes and service layer
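
For readers unfamiliar with the two window strategies named above, here is a minimal sketch of how expanding and sliding splits with a gap can be generated. It is not the code in splitter.py: the function name `time_series_splits` and its parameters (`horizon`, `min_train_size`) are hypothetical and chosen only for illustration.

```python
from typing import Iterator


def time_series_splits(
    n_samples: int,
    n_splits: int,
    horizon: int,
    gap: int = 0,
    strategy: str = "expanding",
    min_train_size: int = 1,
) -> Iterator[tuple[range, range]]:
    """Yield (train_indices, test_indices) pairs in chronological order.

    Hypothetical helper, not the PR's TimeSeriesSplitter.
    """
    if strategy not in ("expanding", "sliding"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    if horizon <= 0 or n_splits <= 0 or gap < 0:
        raise ValueError("horizon and n_splits must be > 0, gap must be >= 0")

    # Place the folds so that the last test window ends at the last sample.
    first_train_end = n_samples - gap - n_splits * horizon
    if first_train_end < min_train_size:
        raise ValueError("not enough samples for the requested splits")

    for fold in range(n_splits):
        train_end = first_train_end + fold * horizon
        # Expanding: always start at 0. Sliding: keep the train size fixed.
        train_start = 0 if strategy == "expanding" else fold * horizon
        test_start = train_end + gap  # the gap rows are used by neither side
        yield range(train_start, train_end), range(test_start, test_start + horizon)


if __name__ == "__main__":
    for train, test in time_series_splits(120, 3, 14, gap=2):
        print(f"train=[{train.start}, {train.stop}) test=[{test.start}, {test.stop})")
```

Because every test window starts at `train_end + gap`, the largest training index is always smaller than the smallest test index, which is the property a leakage check would assert.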

Changes

New Module: app/features/backtesting/

  • schemas.py - Pydantic schemas (SplitConfig, BacktestConfig, FoldResult, etc.)
  • splitter.py - TimeSeriesSplitter with leakage validation
  • metrics.py - MetricsCalculator with edge case handling
  • service.py - BacktestingService orchestrator
  • routes.py - FastAPI endpoint
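
As a rough illustration of the metrics listed for metrics.py, the sketch below uses the common textbook definitions. The formulas are not shown in this PR, so the following are assumptions: WAPE is reported as a percentage, Bias is mean(predicted - actual), and zero denominators are handled as commented. Stability Index is left out because its definition is not given here.

```python
from typing import Optional, Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Symmetric MAPE on a 0-200 scale."""
    _check(actual, predicted)
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        if denom > 0:  # assumption: a pair where both values are 0 adds no error
            total += 2.0 * abs(a - p) / denom
    return 100.0 * total / len(actual)


def wape(actual: Sequence[float], predicted: Sequence[float]) -> Optional[float]:
    """Weighted absolute percentage error; None when all actuals are zero."""
    _check(actual, predicted)
    denom = sum(abs(a) for a in actual)
    if denom == 0:  # assumption: report "undefined" rather than dividing by zero
        return None
    return 100.0 * sum(abs(a - p) for a, p in zip(actual, predicted)) / denom


def bias(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean signed error; positive means over-forecasting (assumed convention)."""
    _check(actual, predicted)
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)
```

The 0-200 range of sMAPE follows from the formula: when one of the two values is zero and the other is not, each term equals 2, so the scaled mean reaches 200.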

Tests: 95 unit tests + 16 integration tests (111 total)

  • test_schemas.py - Schema validation (28 tests)
  • test_splitter.py - Splitter behavior (22 tests)
  • test_metrics.py - Metrics calculation (28 tests)
  • test_service.py - Service unit tests (17 tests)
  • test_routes_integration.py - Route integration tests (8 tests)
  • test_service_integration.py - Service integration tests (8 tests)

Examples:

  • examples/backtest/run_backtest.py - API usage
  • examples/backtest/inspect_splits.py - Split visualization
  • examples/backtest/metrics_demo.py - Metrics explanation

Documentation:

  • Updated README.md with testing section
  • Updated docs/validation/pytest-standard.md with integration test patterns

Test plan

  • All 95 backtesting unit tests pass
  • All 16 backtesting integration tests pass
  • All 352 project tests pass
  • Ruff linting clean
  • MyPy type checking clean
  • Pyright type checking clean
  • CI green

🤖 Generated with Claude Code

w7-learn and others added 2 commits February 1, 2026 03:20
Add complete backtesting infrastructure for model evaluation:
- TimeSeriesSplitter with expanding/sliding window strategies and gap support
- MetricsCalculator with MAE, sMAPE, WAPE, Bias, and Stability Index
- BacktestingService for orchestrating backtests with baseline comparisons
- POST /backtesting/run endpoint with full response schema
- 95 unit tests covering schemas, splitter, metrics, and service
- Example scripts for API usage, split visualization, and metrics demo

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@sourcery-ai sourcery-ai Bot left a comment

Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai

coderabbitai Bot commented Feb 1, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

w7-learn and others added 4 commits February 1, 2026 03:57
- README.md: Add backtesting endpoint, examples, and project structure
- ARCHITECTURE.md: Mark backtesting as implemented with full details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add 16 integration tests that run against real PostgreSQL database:
- 8 route tests for POST /backtesting/run endpoint
- 8 service tests for BacktestingService._load_series_data

Tests use @pytest.mark.integration marker and require docker-compose.
Test data: 120 days of sequential sales (quantity = day number 1-120).
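
A minimal sketch of how such a fixture series could be built. The start date of 2024-01-01 and the row layout are assumptions made for illustration; the commit only states the length and the quantity rule.

```python
from datetime import date, timedelta

# Assumed start date; the commit message does not state it.
START = date(2024, 1, 1)

# 120 consecutive days, quantity equal to the 1-based day number.
rows = [
    {"date": START + timedelta(days=i), "quantity": i + 1}
    for i in range(120)
]

if __name__ == "__main__":
    print(rows[0], rows[-1])
```

A linear ramp like this keeps expected values easy to compute by hand, so assertions on loaded data can be written without a reference implementation.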

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use savepoint-based transaction isolation instead of table drop/create
- Fix client dependency override to use async generator
- Format example files (inspect_splits.py, metrics_demo.py)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn previously approved these changes Feb 1, 2026
Remove complex savepoint-based isolation that caused issues with
FastAPI dependency injection. Use simpler session pattern that
matches other working integration tests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn previously approved these changes Feb 1, 2026
- Generate unique store codes and SKUs using UUID per test
- Use merge() for calendar fixture to handle existing records
- Clean up test data after each test (SalesDaily, TEST-* stores/products)
- Preserve shared Calendar data between tests
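
One way to generate per-test identifiers like those described above is sketched here. The helper name `unique_code` and the exact code format are hypothetical; only the TEST- prefix and the use of UUIDs come from the commit message.

```python
import uuid


def unique_code(kind: str) -> str:
    """Return a collision-resistant test identifier such as TEST-STORE-1A2B3C4D.

    Hypothetical helper: the TEST- prefix lets cleanup code find and delete
    rows created by tests without touching other data.
    """
    return f"TEST-{kind.upper()}-{uuid.uuid4().hex[:8].upper()}"


if __name__ == "__main__":
    print(unique_code("store"), unique_code("sku"))
```

Using a fresh code per test avoids unique-constraint conflicts when tests share one database and an earlier run left rows behind.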

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn previously approved these changes Feb 1, 2026
w7-learn previously approved these changes Feb 1, 2026
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn previously approved these changes Feb 1, 2026
…te coercion

The strict=True config prevented Pydantic from automatically converting
ISO date strings to date objects in JSON requests, causing 422 errors.
Changed to extra="forbid" to still reject unknown fields while allowing
normal type coercion.
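
The difference between the two settings can be reproduced in isolation. This sketch assumes Pydantic v2; the model and field names are hypothetical and not taken from schemas.py. It validates a plain dict, which is how an already-parsed JSON body would be checked.

```python
from datetime import date

from pydantic import BaseModel, ConfigDict, ValidationError


class StrictRequest(BaseModel):
    # strict=True: no coercion, so a str is not accepted for a date field
    model_config = ConfigDict(strict=True)
    start_date: date


class ForbidExtraRequest(BaseModel):
    # extra="forbid": unknown fields are rejected, normal coercion still applies
    model_config = ConfigDict(extra="forbid")
    start_date: date


def accepts(model: type[BaseModel], data: dict) -> bool:
    """Return True if the data validates against the model."""
    try:
        model.model_validate(data)
        return True
    except ValidationError:
        return False


payload = {"start_date": "2024-01-01"}

strict_ok = accepts(StrictRequest, payload)            # ISO string under strict mode
lax_ok = accepts(ForbidExtraRequest, payload)          # ISO string with coercion
extra_ok = accepts(ForbidExtraRequest, {**payload, "unknown": 1})

if __name__ == "__main__":
    print(strict_ok, lax_ok, extra_ok)
```

With `extra="forbid"` the schema keeps its protection against misspelled or unexpected fields while still accepting the ISO strings that JSON clients send.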

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
w7-learn previously approved these changes Feb 1, 2026
Delete calendar entries from 2024-01-01 to 2024-04-29 during test
cleanup to prevent conflicts with other test modules that insert
calendar records in the same date range.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@w7-mgfcode w7-mgfcode merged commit 8aca4d1 into dev Feb 1, 2026
8 checks passed
@w7-mgfcode w7-mgfcode deleted the feat/prp-6-backtesting branch February 1, 2026 04:41