Skip to content

feat(release): ship the MLZOO advanced ML model zoo (A–C2) to main (#252)#253

Merged
w7-mgfcode merged 25 commits into
mainfrom
dev
May 19, 2026
Merged

feat(release): ship the MLZOO advanced ML model zoo (A–C2) to main (#252)#253
w7-mgfcode merged 25 commits into
mainfrom
dev

Conversation

@w7-mgfcode
Copy link
Copy Markdown
Owner

@w7-mgfcode w7-mgfcode commented May 19, 2026

Releases the MLZOO advanced ML model zoo arc (feature-aware forecasting A → C2) plus accumulated fixes from dev to main. Closes #252.

main is at v0.2.15. Merging this PR lets release-please open its Release PR; merging that tags the next version (pre-1.0 feat: → PATCH ⇒ v0.2.16).

What ships

Area PR Summary
MLZOO-A #238 Feature-aware forecasting foundation — shared leakage-safe feature-frame contract
MLZOO-B #242 LightGBM feature-aware model (optional ml-lightgbm extra)
MLZOO-B.2 #244 Feature-aware models wired into the backtesting fold loop (per-fold leakage-safe X)
MLZOO-C1 #251 XGBoost feature-aware model (optional ml-xgboost extra)
MLZOO-C2 #250 Prophet-like additive model — pure scikit-learn, decompose() trend/seasonality/regressor
fix(api) #249 Tailscale CGNAT origins in the dev CORS allow-list
fix(ui/jobs) #228, #229 What-if planner: empty assumption dates + model_exogenous reachability
docs / chore #238#248 MLZOO PRPs & planning briefs, feature-frame contract docs, uv.lock sync

Validation

  • dev @ 7531eac — CI workflow green (Lint, Type Check, Test, Migration, Schema).
  • Both new models verified end-to-end: ruff · ruff format · mypy · pyright (0 errors) · 1376 unit + integration tests.

⚠️ Merge instructions (release-please subject trap)

This PR is titled feat(release): … so the bump fires regardless of merge method. If merging from the CLI, do not let a non-bumping subject through — either merge via the GitHub web UI (default Merge pull request #… subject) or keep the feat(release): subject explicitly. See docs/_base/RUNBOOKS.md → "release-please skipped the bump".

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced advanced forecasting models: LightGBM, XGBoost, and Prophet-like additive models available as optional dependencies.
    • Enabled feature-aware backtesting for regression and advanced models with safe, leakage-free feature matrices.
    • Added feature-aware model detection via capability flags for flexible model selection.
  • Documentation

    • Added comprehensive model interface and feature-frame contract guides.
    • Included example scripts demonstrating each new model type.
    • Updated README with optional-dependency installation and supported model types.
  • Tests

    • Added unit and integration tests covering all new models and backtesting paths.
    • Included leakage-safety validation tests for feature-frame construction.

Review Change Stack

w7-mgfcode and others added 25 commits May 19, 2026 13:14
chore(release): back-merge v0.2.15 release commits into dev (#232)
Add a regression branch to JobService._execute_train so a train job can
train a regression model and complete with result.run_id set. Rewire the
What-If Planner baseline picker to train jobs and read the artifact key
from job.result.run_id, making the model_exogenous re-forecast path
reachable from the browser.
Add a pure assumptionDateErrors helper (unit-tested) and use it in the
What-If Planner to disable Run simulation and Save as plan while any
enabled Price/Promotion assumption has a blank date, with inline Required
hints. The empty-string-date 422 is now structurally unreachable from the
form.
…nd-empty-dates

fix(jobs,ui): reach model_exogenous + block empty assumption dates in the planner
docs: land MLZOO planning briefs and sync uv.lock
…g-foundation

feat(forecast): feature-aware forecasting foundation — shared feature-frame contract
Implements PRP-30 (MLZOO-B) — the first advanced feature-aware
forecasting model, LightGBMForecaster wrapping lightgbm.LGBMRegressor,
on top of the PRP-29 shared feature-frame contract.

- LightGBMForecaster (requires_features=True), structural twin of
  RegressionForecaster; lightgbm is lazy-imported inside fit() so the
  module never requires the optional dependency. Deterministic via
  n_jobs=1 + deterministic=True + force_col_wise=True + random_state.
- model_factory lightgbm branch (behind forecast_enable_lightgbm).
- ScenarioService.simulate dispatch generalised from a model_type
  string to bundle.model.requires_features — a LightGBM bundle now
  takes the genuine model_exogenous re-forecast path.
- JobService._execute_train accepts model_type=lightgbm.
- ModelBundle.lightgbm_version + registry runtime_info captured
  best-effort; no migration (runtime_info is JSONB).
- LightGBM ships as an optional [project.optional-dependencies]
  group ml-lightgbm; uv.lock regenerated.

No baseline-model change, no API contract change, no migration.
Feature-aware backtesting stays loud-fail (deferred to PRP-MLZOO-B.2).
…st-model

feat(forecast): add LightGBM feature-aware forecasting model (#242)
…oop (#244)

Implements PRP-MLZOO-B.2. Feature-aware models (regression, lightgbm — any
model with requires_features=True) are now evaluated by POST /backtesting/run
and backtest jobs, with per-fold leakage-safe X_train / X_future matrices.

- Promote the historical row builder to app/shared/feature_frames/rows.py
  (build_historical_feature_rows) and add build_future_feature_rows — the
  leakage-critical per-fold test-window assembler. ForecastingService.
  _assemble_regression_rows becomes a delegating shim; its leakage spec
  (test_regression_features_leakage.py) stays green, unedited.
- BacktestingService: resolve exogenous data once async (ExogenousFrame),
  branch the fold loop on requires_features, slice X_train from one historical
  matrix, rebuild X_future per fold. min_train_size >= 30 enforced loud.
- ModelBacktestResult gains additive feature_aware / exogenous_policy fields;
  _shape_backtest_result keys unchanged (frontend contract byte-stable).
- JobService._execute_backtest accepts regression + lightgbm.
- Repurpose the interim loud-fail test (PRP-29 #7 / PRP-30 #6 superseded);
  add the shared X_future leakage spec incl. a gap>0 fold.
…e-folds

feat(backtest): wire feature-aware models into the backtesting fold loop (#244)
Add ProphetLikeForecaster — a deterministic, feature-aware additive
linear model (MLZOO-C2). It is a scikit-learn Pipeline of a
SimpleImputer(median) + Ridge(solver="cholesky") over the canonical
14-column feature frame, plus a decompose() method that splits any
forecast into its additive trend / seasonality / holiday-regressor
contributions.

Pure scikit-learn — ships always-enabled like the regression model:
no new dependency, no optional extra, no feature flag. The model
trains, scenario-re-forecasts (method="model_exogenous"), and
backtests through the existing requires_features-based dispatch with
zero changes to the train/predict/scenarios/backtesting service layers.

- ProphetLikeModelConfig (alpha + feature_config_hash); added to the
  ModelConfig union.
- _PROPHET_LIKE_COMPONENTS constant + ForecastDecomposition dataclass.
- "prophet_like" added to the ModelType literal + model_factory branch.
- jobs _execute_train + _execute_backtest gain a prophet_like branch.
- test_prophet_like_forecaster.py: contract tests + model-specific
  invariants (additive invariant, NaN tolerance, imputer leakage-safety,
  determinism).
- prophet_like coverage extended into the forecasting/jobs/backtesting
  unit tests and the scenarios model_exogenous integration test.
- examples/models/prophet_like_additive.py: runnable train / predict /
  decompose example for ProphetLikeForecaster.
- model_interface.md: ProphetLikeModelConfig entry + Prophet-like
  Forecaster formula section (additive decomposition + component
  column grouping).
- feature_frame_contract.md: record prophet_like as an implemented
  feature-aware model and the worked non-NaN-tolerant imputer example.
- README.md: add prophet_like to the Supported Model Types list.
Adds the [project.optional-dependencies] ml-xgboost = ["xgboost>=2.1.0"]
extra, mirroring ml-lightgbm. uv.lock regenerated so CI's
uv sync --frozen --all-extras installs it. No core dependency change.
Implements XGBoostForecaster, the second advanced feature-aware tree
model (MLZOO-C1), mirroring the merged LightGBMForecaster byte-for-byte
with xgboost.XGBRegressor in place of lightgbm.LGBMRegressor.

- XGBoostModelConfig: conservative schema (n_estimators / max_depth /
  learning_rate / feature_config_hash), added to the ModelConfig union.
- XGBoostForecaster: requires_features=True, lazy xgboost import inside
  fit(), deterministic via n_jobs=1 + tree_method=hist + fixed seed.
- model_factory: xgboost branch gated on forecast_enable_xgboost; the
  ModelType literal gains "xgboost".
- JobService._execute_train / _execute_backtest: xgboost branches.
- POST /forecasting/train: xgboost feature-flag gate (400 when off).
- ModelBundle.xgboost_version + registry runtime_info xgboost block,
  both best-effort; compute_hash unchanged.
- Tests mirror the LightGBM suite, gated with importorskip("xgboost").
- Docs + examples/models/advanced_xgboost.py additive.

train / predict / scenarios / backtesting services are unchanged: each
branches on requires_features, so an xgboost model routes through every
path automatically. No migration, no API-contract change. XGBoost reuses
the regression historical/future feature builders, so the existing
leakage specs cover it by construction.
fix(api): allow Tailscale CGNAT origins in dev CORS allow-list (#246)
feat(forecast): add XGBoost feature-aware forecasting model (#247)
# Conflicts:
#	README.md
#	app/features/backtesting/tests/test_feature_aware_backtest.py
#	app/features/forecasting/models.py
#	app/features/forecasting/tests/test_service.py
#	app/features/jobs/service.py
#	app/features/jobs/tests/test_service.py
#	app/features/scenarios/tests/test_routes_integration.py
#	examples/models/feature_frame_contract.md
#	examples/models/model_interface.md
…-model

feat(forecast): add Prophet-like additive forecasting model (#248)
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2026

📝 Walkthrough

Walkthrough

Adds shared feature-frame contract, introduces LightGBM/XGBoost/Prophet-like models, wires feature-aware backtesting using exogenous data, updates scenarios and planner validation, extends schemas, persistence, routes, registry, tests, examples, and documentation.

Changes

Advanced ML Model Zoo release (A–C2) and integrations

Layer / File(s) Summary
Feature-frames, models, backtesting, scenarios, UI, and docs wiring
app/shared/feature_frames/*, app/features/forecasting/*, app/features/backtesting/*, app/features/scenarios/*, frontend/src/*, examples/models/*, docs/*, pyproject.toml, app/core/config.py, app/main.py, PRPs/*
Creates shared feature-frame contract and builders; adds LightGBM/XGBoost/Prophet-like forecasters and schemas/factory; records optional lib versions; enables feature-aware backtesting with ExogenousFrame; updates scenarios to capability flag; adds planner date validation; expands docs/tests/examples and optional dependency extras.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • w7-learn

Poem

A rabbit taps keys with gentle might,
Stitches lags to moons and Friday’s light.
Trees and boosts, a prophet’s line—
Fourteen columns, forecasts fine.
Backtests hum, no future peek;
Features aligned, results we seek.
Hop! Ship MLZOO—sleek.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev

@socket-security
Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedpypi/​xgboost@​3.2.098100100100100
Addedpypi/​lightgbm@​4.6.099100100100100

View full report

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (1)
app/features/forecasting/tests/test_routes.py (1)

78-79: ⚡ Quick win

Strengthen error-path assertions to lock RFC 7807 responses.

These checks currently validate only status/text. Also assert application/problem+json and core fields (type, title, status, detail) so the error contract is enforced, not just the message substring.

Suggested test assertion upgrade
     response = await client.post("/forecasting/train", json=payload)
     assert response.status_code == 400
-    assert "lightgbm" in response.text.lower()
+    assert response.headers["content-type"].startswith("application/problem+json")
+    body = response.json()
+    assert body["status"] == 400
+    assert "lightgbm" in body.get("detail", "").lower()
@@
     response = await client.post("/forecasting/train", json=payload)
     assert response.status_code == 400
-    assert "xgboost" in response.text.lower()
+    assert response.headers["content-type"].startswith("application/problem+json")
+    body = response.json()
+    assert body["status"] == 400
+    assert "xgboost" in body.get("detail", "").lower()
As per coding guidelines, "app/**/*.py: Use RFC 7807 `application/problem+json` error responses via `app/core/problem_details.py`." and "`app/**/tests/test_routes.py`: New API endpoints must have route tests covering the 2xx happy path plus at least one error path."

Also applies to: 97-98

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/features/forecasting/tests/test_routes.py` around lines 78 - 79, The test
currently only checks status code and substring; update it to assert the RFC
7807 problem response contract by verifying response.headers["Content-Type"] (or
response.content_type) equals "application/problem+json", parse response.json()
and assert it contains the keys "type", "title", "status", and "detail", assert
json["status"] == 400 and that the "detail" or "title" field includes "lightgbm"
(to keep the original semantic check); apply these same assertions for the other
occurrence mentioned (lines 97-98) so error-path tests enforce the problem+json
contract rather than only text matching.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/features/backtesting/service.py`:
- Around line 39-40: The backtesting service is directly importing Calendar,
Product, Promotion, SalesDaily from app.features.data_platform.models which
violates slice boundaries; create a thin data access facade in app/core or
app/shared (e.g., app.core.data_platform_repo with functions like
get_calendar(...), get_products(...), get_promotions(...), get_sales_daily(...))
that encapsulates all DB queries and returns the same domain objects or DTOs,
then replace the direct imports/usages of Calendar, Product, Promotion,
SalesDaily in app/features/backtesting/service.py (and the similar usages around
the referenced 752-852 region) to instead call those new repository functions;
keep model_factory usage unchanged but ensure any data-platform queries in
functions in service.py are routed through the new core/shared facade.

In `@app/features/backtesting/tests/test_routes_integration.py`:
- Around line 476-477: Replace the raw substring check with RFC 7807 assertions:
assert response.status_code == 400, assert
response.headers["content-type"].startswith("application/problem+json"), parse
payload = response.json() and assert it contains the keys "type", "title",
"status", "detail", that payload["status"] == 400 and that payload["detail"]
includes the existing "at least 30" message; ensure the test uses the same RFC
7807 formatting produced by app/core/problem_details.py rather than raw
HTTPException text.

In `@app/features/forecasting/models.py`:
- Around line 704-705: Replace the bare lazy imports of optional ML libs with
guarded imports that convert missing-module errors into ValueError with
actionable install guidance: wrap the "import lightgbm as lgb" (symbol: lgb) in
a try/except ImportError and raise a ValueError like "LightGBM is required for
this feature; install with 'pip install lightgbm'" (or similar) so the route's
validation handler returns a 400; do the same for the XGBoost import (e.g.,
"import xgboost as xgb" / symbol: xgb) at the other location, catching
ImportError/ModuleNotFoundError and raising a ValueError with a clear message
telling the user how to install the missing package.

In `@app/features/forecasting/routes.py`:
- Around line 75-79: Replace the bare HTTPException used when
request.config.model_type == "xgboost" and when the LightGBM check occurs with
the problem_response helper: import problem_response from
app.core.problem_details and return
problem_response(status=status.HTTP_400_BAD_REQUEST, title="XGBoost disabled",
detail="XGBoost is disabled. Set forecast_enable_xgboost=True in settings.",
error_code="BAD_REQUEST") for the XGBoost branch; do the analogous change for
the LightGBM branch (use a title like "LightGBM disabled" and the corresponding
settings flag), so responses use RFC 7807 application/problem+json instead of
raising HTTPException.

In `@app/features/jobs/schemas.py`:
- Line 28: The docs for the model_type field are out of date—update the
model_type documentation in schemas.py to include 'lightgbm', 'xgboost', and
'prophet_like' in addition to 'naive', 'seasonal_naive', 'moving_average', and
'regression'; preferably add or reuse a single SUPPORTED_MODEL_TYPES (or
similar) constant and reference it from the TrainJob schema's model_type
docstring/Field description so the docs and runtime validation stay in sync
(update any occurrences of model_type docstring or Field on the TrainJob/Job
schema accordingly).

In `@app/shared/feature_frames/rows.py`:
- Line 97: Validate baseline_price once at the start of the feature construction
(where price_factor is computed) and raise a clear ValueError if baseline_price
is zero, None, or otherwise invalid, instead of performing division; update both
places that compute ratios using baseline_price (the expressions that do
prices[index] / baseline_price and similar at the other ratio site) to assume
the validated baseline_price and remove redundant guards, or wrap the division
with a pre-check that raises the same ValueError to prevent crashes or poisoned
features.
- Around line 88-100: Before the loop that builds rows, validate that the input
sequences have matching lengths to avoid IndexError: check that len(dates) ==
len(quantities) == len(prices) and also that each entry in CALENDAR_COLUMNS
(e.g., calendar_columns[name]) has length == len(dates); if any mismatch raise a
ValueError with a clear message referencing the mismatched variable names
(dates, quantities, prices, calendar_columns) so callers get an immediate,
descriptive failure before the for index, day in enumerate(dates) loop that uses
EXOGENOUS_LAGS and accesses quantities[index - lag] and prices[index].

In `@examples/models/feature_frame_contract.md`:
- Around line 55-59: The fenced code block containing the feature list starting
with "lag_1, lag_7, lag_14, lag_28, dow_sin, dow_cos, month_sin, month_cos,
is_weekend, is_month_end, price_factor, promo_active, is_holiday,
days_since_launch" needs a language identifier to satisfy markdownlint MD040;
update that fence (the triple-backtick block around that comma-separated list)
to start with a language tag such as ```text (or ```txt) so the block becomes
```text ... ``` and the linter will stop flagging MD040.

In `@examples/models/model_interface.md`:
- Around line 241-243: The fenced formula blocks (e.g., the lines with ŷ[t+h] =
HistGradientBoostingRegressor.predict(X[t+h]), ŷ[t+h] =
LGBMRegressor.predict(X[t+h]), ŷ[t+h] = XGBRegressor.predict(X[t+h]), and ŷ[t+h]
= intercept + trend[t+h] + seasonality[t+h] + holiday_regressor[t+h]) are
missing a language tag and trigger MD040; update each triple-backtick fence
around these formulas to include a language identifier such as text (e.g.,
replace ``` with ```text) for all occurrences mentioned (also at the other
locations noted) so markdownlint no longer flags them.

In `@PRPs/PRP-MLZOO-B.2-feature-aware-backtesting.md`:
- Around line 186-236: The markdown has unlabeled fenced code blocks in
PRP-MLZOO-B.2-feature-aware-backtesting.md (the file/trees and diffs under the
app/ and "Desired Codebase tree" / "Files to MODIFY" sections); add appropriate
language identifiers to each fence (e.g., ```text or ```tree for file tree
listings and ```diff for the diff-like block) so the blocks are labeled and
MD040 is resolved—update each opening ``` to include the chosen language token
for the three/any unlabeled blocks in that file.

In `@PRPs/PRP-MLZOO-C2-prophet-like-additive-model.md`:
- Around line 19-23: Remove the blank line inside the blockquote so the quote
lines are contiguous (fixing markdownlint MD028): locate the blockquote
beginning with "> **Sibling PRP:** `PRPs/PRP-MLZOO-C1-xgboost-model.md`" and the
following quoted lines, delete the empty line between them so every quoted line
starts with ">" with no intervening blank line, preserving the existing text and
punctuation.

In `@README.md`:
- Around line 399-404: Update the "Feature-Aware Models" bullet to include
"prophet_like" alongside "regression", "lightgbm", and "xgboost" so README
aligns with supported models; specifically add `prophet_like` to the list and
keep the existing guidance about setting model_config_main.model_type, per-fold
leakage-safe feature matrices (min_train_size >= 30), and the resulting metadata
(feature_aware: true, exogenous_policy: "observed").

---

Nitpick comments:
In `@app/features/forecasting/tests/test_routes.py`:
- Around line 78-79: The test currently only checks status code and substring;
update it to assert the RFC 7807 problem response contract by verifying
response.headers["Content-Type"] (or response.content_type) equals
"application/problem+json", parse response.json() and assert it contains the
keys "type", "title", "status", and "detail", assert json["status"] == 400 and
that the "detail" or "title" field includes "lightgbm" (to keep the original
semantic check); apply these same assertions for the other occurrence mentioned
(lines 97-98) so error-path tests enforce the problem+json contract rather than
only text matching.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1ddb62d7-9b66-4f3c-a5f2-116f0c9c8f24

📥 Commits

Reviewing files that changed from the base of the PR and between 0333bc8 and 7531eac.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (63)
  • PRPs/INITIAL/INITIAL-MLZOO-A-foundation-feature-frames.md
  • PRPs/INITIAL/INITIAL-MLZOO-B-lightgbm-first-model.md
  • PRPs/INITIAL/INITIAL-MLZOO-B.2-feature-aware-backtesting.md
  • PRPs/INITIAL/INITIAL-MLZOO-C-xgboost-prophet-extensions.md
  • PRPs/INITIAL/INITIAL-MLZOO-D-frontend-registry-explainability.md
  • PRPs/INITIAL/INITIAL-MLZOO-index.md
  • PRPs/PRP-29-feature-aware-forecasting-foundation.md
  • PRPs/PRP-30-lightgbm-first-advanced-model.md
  • PRPs/PRP-MLZOO-B.2-feature-aware-backtesting.md
  • PRPs/PRP-MLZOO-C1-xgboost-model.md
  • PRPs/PRP-MLZOO-C2-prophet-like-additive-model.md
  • README.md
  • app/core/config.py
  • app/features/backtesting/schemas.py
  • app/features/backtesting/service.py
  • app/features/backtesting/tests/test_feature_aware_backtest.py
  • app/features/backtesting/tests/test_routes_integration.py
  • app/features/backtesting/tests/test_schemas.py
  • app/features/backtesting/tests/test_service.py
  • app/features/backtesting/tests/test_service_integration.py
  • app/features/forecasting/models.py
  • app/features/forecasting/persistence.py
  • app/features/forecasting/routes.py
  • app/features/forecasting/schemas.py
  • app/features/forecasting/service.py
  • app/features/forecasting/tests/test_lightgbm_forecaster.py
  • app/features/forecasting/tests/test_persistence.py
  • app/features/forecasting/tests/test_prophet_like_forecaster.py
  • app/features/forecasting/tests/test_regression_features_leakage.py
  • app/features/forecasting/tests/test_routes.py
  • app/features/forecasting/tests/test_service.py
  • app/features/forecasting/tests/test_xgboost_forecaster.py
  • app/features/jobs/schemas.py
  • app/features/jobs/service.py
  • app/features/jobs/tests/test_service.py
  • app/features/registry/service.py
  • app/features/registry/tests/test_service.py
  • app/features/scenarios/feature_frame.py
  • app/features/scenarios/service.py
  • app/features/scenarios/tests/conftest.py
  • app/features/scenarios/tests/test_feature_frame.py
  • app/features/scenarios/tests/test_future_frame_leakage.py
  • app/features/scenarios/tests/test_routes_integration.py
  • app/main.py
  • app/shared/feature_frames/__init__.py
  • app/shared/feature_frames/contract.py
  • app/shared/feature_frames/rows.py
  • app/shared/feature_frames/tests/__init__.py
  • app/shared/feature_frames/tests/test_contract.py
  • app/shared/feature_frames/tests/test_leakage.py
  • docs/PHASE/5-BACKTESTING.md
  • docs/optional-features/10-baseforecaster-feature-contract.md
  • docs/optional-features/11-feature-aware-predict-serving.md
  • docs/optional-features/README.md
  • examples/models/advanced_lightgbm.py
  • examples/models/advanced_xgboost.py
  • examples/models/feature_frame_contract.md
  • examples/models/model_interface.md
  • examples/models/prophet_like_additive.py
  • frontend/src/lib/scenario-utils.test.ts
  • frontend/src/lib/scenario-utils.ts
  • frontend/src/pages/visualize/planner.tsx
  • pyproject.toml

Comment on lines +39 to 40
from app.features.data_platform.models import Calendar, Product, Promotion, SalesDaily
from app.features.forecasting.models import model_factory
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move exogenous DB access behind a shared/core boundary instead of querying another feature slice directly.

The new backtesting path now directly imports and queries app.features.data_platform.models, which deepens cross-slice coupling. Please route this through app/core/ or app/shared/ (for example via a shared repository/helper) and keep the backtesting slice decoupled from other feature internals.

As per coding guidelines, "Slices must NOT import from each other; cross-cutting code goes through app/core/ or app/shared/."

Also applies to: 752-852

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/features/backtesting/service.py` around lines 39 - 40, The backtesting
service is directly importing Calendar, Product, Promotion, SalesDaily from
app.features.data_platform.models which violates slice boundaries; create a thin
data access facade in app/core or app/shared (e.g., app.core.data_platform_repo
with functions like get_calendar(...), get_products(...), get_promotions(...),
get_sales_daily(...)) that encapsulates all DB queries and returns the same
domain objects or DTOs, then replace the direct imports/usages of Calendar,
Product, Promotion, SalesDaily in app/features/backtesting/service.py (and the
similar usages around the referenced 752-852 region) to instead call those new
repository functions; keep model_factory usage unchanged but ensure any
data-platform queries in functions in service.py are routed through the new
core/shared facade.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marked needs-human — not applied.

app/features/data_platform is the de-facto shared ORM layer in this repo: its models carry every fact-table FK and are already imported directly by multiple slices (forecasting, analytics, backtesting). Routing exogenous DB access through a new app/core/app/shared facade is a Major architectural change (new module + the ~752–852 region) that warrants a maintainer decision, not an in-review fix. Deferring.

Comment thread app/features/backtesting/tests/test_routes_integration.py
Comment thread app/features/forecasting/models.py
Comment thread app/features/forecasting/routes.py
Comment thread app/features/jobs/schemas.py
Comment thread examples/models/feature_frame_contract.md
Comment thread examples/models/model_interface.md
Comment thread PRPs/PRP-MLZOO-B.2-feature-aware-backtesting.md
Comment thread PRPs/PRP-MLZOO-C2-prophet-like-additive-model.md
Comment thread README.md
@w7-mgfcode w7-mgfcode merged commit 689ac94 into main May 19, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

release: ship the MLZOO advanced ML model zoo (A–C2) to main

1 participant