feat(forecast): add XGBoost feature-aware forecasting model (#247) #251

Merged: w7-mgfcode merged 4 commits into dev from feat/forecasting-xgboost-model on May 19, 2026

@w7-mgfcode
Owner

Summary

Implements XGBoostForecaster, the second advanced feature-aware tree model (MLZOO-C1), wrapping xgboost.XGBRegressor. It mirrors the merged LightGBMForecaster (PRP-30 / MLZOO-B) byte-for-byte, apart from two library swaps: lgb.LGBMRegressor → xgb.XGBRegressor, and deterministic/force_col_wise → tree_method="hist".

This is C1 of two MLZOO-C review units. The sibling PRPs/PRP-MLZOO-C2-prophet-like-additive-model.md ships the Prophet-like additive model on a separate branch — C1 and C2 are intentionally separate, additive, and order-independent.

What changed

  • XGBoostModelConfig: conservative schema (n_estimators / max_depth / learning_rate / feature_config_hash), added to the ModelConfig union.
  • XGBoostForecaster: requires_features=True; xgboost is lazy-imported inside fit(), so importing models.py never requires the optional extra. Deterministic via n_jobs=1 + tree_method="hist" + fixed random_state + no stochastic subsampling; NaN-tolerant (missing=np.nan).
  • model_factory: new xgboost branch gated on forecast_enable_xgboost; the ModelType literal gains "xgboost".
  • forecast_enable_xgboost runtime flag in app/core/config.py (default False).
  • ml-xgboost optional dependency extra (xgboost>=2.1.0); uv.lock regenerated.
  • Jobs: xgboost branches in _execute_train and _execute_backtest.
  • Route gate: POST /forecasting/train returns 400 for xgboost when the flag is off.
  • Reproducibility metadata: ModelBundle.xgboost_version (best-effort on save, warns on mismatch at load) and an xgboost_version block in registry runtime_info. compute_hash is unchanged, so bundle hashes do not shift.
  • Tests mirror the LightGBM suite, all gated with pytest.importorskip("xgboost").
  • Docs: additive entries in model_interface.md, feature_frame_contract.md, README.md, plus examples/models/advanced_xgboost.py.
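
For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the lazy import, the deterministic constructor arguments, and the flag-gated factory branch. It is not the merged code: the default values, the helper name deterministic_xgb_kwargs, the random_state field on the config, and the factory signature are assumptions made for illustration. Only the XGBRegressor keyword arguments are the library's real parameters.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class XGBoostModelConfig:
    """Illustrative stand-in for the PR's config schema (defaults are assumed)."""

    n_estimators: int = 100
    max_depth: int = 6
    learning_rate: float = 0.1
    feature_config_hash: Optional[str] = None
    random_state: int = 42  # assumption: the real seed may live elsewhere


def deterministic_xgb_kwargs(config: XGBoostModelConfig) -> Dict[str, Any]:
    """Build XGBRegressor kwargs intended to make repeated fits identical."""
    return {
        "n_estimators": config.n_estimators,
        "max_depth": config.max_depth,
        "learning_rate": config.learning_rate,
        "n_jobs": 1,                # single thread
        "tree_method": "hist",
        "random_state": config.random_state,
        "subsample": 1.0,           # no stochastic row subsampling
        "colsample_bytree": 1.0,    # no stochastic column subsampling
        "missing": float("nan"),    # NaN cells are treated as missing values
    }


class XGBoostForecaster:
    requires_features = True

    def __init__(self, config: XGBoostModelConfig) -> None:
        self.config = config
        self._model: Any = None

    def fit(self, X: Any, y: Any) -> "XGBoostForecaster":
        # Lazy import: constructing the class never needs the optional extra.
        import xgboost as xgb

        self._model = xgb.XGBRegressor(**deterministic_xgb_kwargs(self.config))
        self._model.fit(X, y)
        return self

    def predict(self, X: Any) -> Any:
        if self._model is None:
            raise RuntimeError("fit() must be called before predict()")
        return self._model.predict(X)


def model_factory(
    model_type: str,
    config: XGBoostModelConfig,
    *,
    forecast_enable_xgboost: bool = False,
) -> XGBoostForecaster:
    """Hypothetical factory showing only the flag-gated xgboost branch."""
    if model_type == "xgboost":
        if not forecast_enable_xgboost:
            raise ValueError("xgboost is disabled; enable forecast_enable_xgboost")
        return XGBoostForecaster(config)
    raise ValueError(f"unknown model type: {model_type!r}")
```

With the flag left at its default, model_factory("xgboost", XGBoostModelConfig()) raises before any xgboost import is attempted, which is the behaviour the route gate turns into a 400.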

ForecastingService.train_model / predict, scenarios/service.py, and backtesting/service.py are unchanged — each branches on requires_features, so an XGBoost model routes through every path automatically. No Alembic migration, no API-contract change. XGBoost reuses the regression historical/future feature builders, so the existing leakage specs (test_regression_features_leakage.py, app/shared/feature_frames/tests/test_leakage.py) cover it by construction — no new leakage test added.
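
The routing claim above can be illustrated with a small sketch. The class names and the prepare_fit_inputs helper are hypothetical and do not appear in the PR; the only thing taken from the description is that callers branch on requires_features rather than on the concrete model type.

```python
from typing import Any, Callable, List, Optional, Tuple


class NaiveForecaster:
    """Hypothetical target-only model: it needs no feature frame."""

    requires_features = False


class FeatureAwareForecaster:
    """Hypothetical stand-in for LightGBMForecaster or XGBoostForecaster."""

    requires_features = True


def prepare_fit_inputs(
    model: Any,
    target: List[float],
    build_features: Callable[[], Any],
) -> Tuple[Optional[Any], List[float]]:
    """Build the feature frame only when the model declares it needs one."""
    if getattr(model, "requires_features", False):
        return build_features(), target
    return None, target
```

Because the branch keys on the attribute, a new feature-aware model is picked up by every caller written this way without further edits to those callers.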

Validation

  • uv run ruff check . — PASS
  • uv run ruff format --check . — PASS
  • uv run mypy app/ — PASS (272 files; no xgboost.* mypy override needed — xgboost ships py.typed)
  • uv run pyright app/ — PASS (0 errors, 68 pre-existing warnings)
  • uv run pytest -v -m "not integration" — PASS (1358 passed, 247 deselected)
  • uv run pytest -m integration on forecasting/scenarios/jobs/registry — PASS (20 passed), including test_xgboost_baseline_returns_model_exogenous (method == model_exogenous) and test_train_xgboost_rejected_when_disabled (400)
  • Dogfood: XGBoost determinism (assert_array_equal) and requires_features is True confirmed; examples/models/advanced_xgboost.py runs end-to-end.
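
A rough sketch of what the determinism dogfood amounts to, for anyone repeating it locally. The helper names, the synthetic data, and the hyperparameters are assumptions; the actual check in this PR used assert_array_equal, while this version compares plain lists so the helper itself has no third-party dependency.

```python
from typing import Callable, List, Sequence


def assert_deterministic(
    fit_predict: Callable[[], Sequence[float]], runs: int = 2
) -> List[float]:
    """Call fit_predict `runs` times and require identical predictions."""
    baseline = list(fit_predict())
    for _ in range(runs - 1):
        if list(fit_predict()) != baseline:
            raise AssertionError("predictions differ between identical fits")
    return baseline


def xgboost_fit_predict() -> List[float]:
    """One fresh fit and predict on fixed synthetic data.

    Needs numpy and the optional xgboost extra; both are imported lazily
    so that defining this function does not require them.
    """
    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 3))
    y = X @ np.array([1.0, -2.0, 0.5])
    model = xgb.XGBRegressor(
        n_estimators=20,
        max_depth=3,
        n_jobs=1,
        tree_method="hist",
        random_state=0,
    )
    model.fit(X, y)
    return model.predict(X).tolist()
```

With the ml-xgboost extra installed, assert_deterministic(xgboost_fit_predict) should complete without raising if the settings behave as described above.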

Closes #247

Adds the [project.optional-dependencies] ml-xgboost = ["xgboost>=2.1.0"]
extra, mirroring ml-lightgbm. uv.lock regenerated so CI's
uv sync --frozen --all-extras installs it. No core dependency change.

Implements XGBoostForecaster, the second advanced feature-aware tree
model (MLZOO-C1), mirroring the merged LightGBMForecaster byte-for-byte
with xgboost.XGBRegressor in place of lightgbm.LGBMRegressor.

- XGBoostModelConfig: conservative schema (n_estimators / max_depth /
  learning_rate / feature_config_hash), added to the ModelConfig union.
- XGBoostForecaster: requires_features=True, lazy xgboost import inside
  fit(), deterministic via n_jobs=1 + tree_method=hist + fixed seed.
- model_factory: xgboost branch gated on forecast_enable_xgboost; the
  ModelType literal gains "xgboost".
- JobService._execute_train / _execute_backtest: xgboost branches.
- POST /forecasting/train: xgboost feature-flag gate (400 when off).
- ModelBundle.xgboost_version + registry runtime_info xgboost block,
  both best-effort; compute_hash unchanged.
- Tests mirror the LightGBM suite, gated with importorskip("xgboost").
- Docs + examples/models/advanced_xgboost.py additive.

train / predict / scenarios / backtesting services are unchanged: each
branches on requires_features, so an xgboost model routes through every
path automatically. No migration, no API-contract change. XGBoost reuses
the regression historical/future feature builders, so the existing
leakage specs cover it by construction.
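
The best-effort version metadata can be sketched as follows. This is an assumption about the approach, not the merged implementation: the function names are hypothetical, and the real ModelBundle may record and compare versions differently.

```python
import warnings
from importlib import metadata
from typing import Optional


def installed_version(package: str = "xgboost") -> Optional[str]:
    """Best-effort lookup: return None instead of failing when not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def warn_on_version_mismatch(
    saved: Optional[str], current: Optional[str], package: str = "xgboost"
) -> bool:
    """Warn, but do not raise, when a bundle was saved under another version."""
    if saved is None or current is None or saved == current:
        return False
    warnings.warn(
        f"{package} version mismatch: bundle saved with {saved}, "
        f"running {current}",
        RuntimeWarning,
        stacklevel=2,
    )
    return True
```

Keeping the version out of the hashed payload matches the note that compute_hash is unchanged: the field is informational and does not alter bundle identity.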

@socket-security

socket-security Bot commented May 19, 2026

Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | pypi/xgboost@3.2.0 | 98 | 100 | 100 | 100 | 100 |

@w7-mgfcode merged commit 2091f2f into dev on May 19, 2026 (9 checks passed)
@w7-mgfcode deleted the feat/forecasting-xgboost-model branch on May 19, 2026 19:41