Merged
6 changes: 6 additions & 0 deletions INITIAL-4.md
@@ -18,7 +18,13 @@
## DOCUMENTATION:
- Time-series feature engineering best practices
- scikit-learn transformers/pipelines (if used)
- [scikit-learn Pipeline Composition](https://scikit-learn.org/stable/modules/compose.html)
- [MLForecast Feature Engineering](https://www.nixtla.io/blog/automated-time-series-feature-engineering-with-mlforecast?utm_source=chatgpt.com#introduction-to-mlforecast)
- [sktime Transformations API](https://www.sktime.net/en/stable/api_reference/transformations.html)

## OTHER CONSIDERATIONS:
- Feature configs must be persisted per run in the registry.
- Reproducibility: a run with the same config and the same data window must be re-runnable with the same result.
- **Imputation Logic**: Define behavior for missing price data (forward-fill) vs missing sales data (zero-fill).
- **Agent Tooling**: Expose the Feature Pipeline as a tool for PydanticAI to "inspect" the shape of the data before suggesting ModelConfigs.
- **Computation Overhead**: Evaluate whether features should be computed on the fly in FastAPI or pre-computed in a materialized view for performance.
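
The imputation rule above (forward-fill prices, zero-fill sales) could look roughly like the following. This is a minimal, library-free sketch: the helper names `forward_fill` and `zero_fill` and the sample series are hypothetical and not part of the spec, and missing values are assumed to arrive as `None`.

```python
from typing import Optional, Sequence


def forward_fill(values: Sequence[Optional[float]]) -> list[Optional[float]]:
    """Carry the last observed value forward; leading gaps stay None."""
    filled: list[Optional[float]] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def zero_fill(values: Sequence[Optional[float]]) -> list[float]:
    """Treat a missing observation as zero units sold."""
    return [0.0 if value is None else value for value in values]


# Hypothetical daily series for a single store x product pair.
prices = [None, 9.99, None, None, 10.49]
sales = [3.0, None, 5.0, None, 2.0]

filled_prices = forward_fill(prices)  # a price persists until it changes
filled_sales = zero_fill(sales)       # no record is read as no sales
```

The leading price stays missing because there is no earlier observation to carry forward. Whether to back-fill, drop, or reject such rows is a decision the spec still has to make.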
12 changes: 11 additions & 1 deletion INITIAL-5.md
@@ -11,6 +11,13 @@
- Extensible “Global ML” hook:
  - regression pipeline (scikit-learn)
  - enabled/disabled via feature flags
- Unified Estimator Pipeline:
  - scikit-learn Pipeline chaining Scaling -> Encoding -> Regressor.
  - Integration with FeatureEngineeringService for automated lag injection.
- Persistence Layer:
  - Joblib-based serialization of a 'ModelBundle' (Model + Metadata + FeatureHash).
- Multi-Horizon Support:
  - Recursive forecasting: predict one day at a time, feeding each prediction back into the lag features.
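
The recursive forecasting item can be sketched as below. This assumes a one-step model that takes the most recent `n_lags` values and returns the next one. The names `recursive_forecast`, `predict_one`, and `mean_of_lags` are hypothetical stand-ins; in the project, `predict_one` would presumably wrap the fitted estimator pipeline.

```python
from typing import Callable, Sequence


def recursive_forecast(
    history: Sequence[float],
    predict_one: Callable[[Sequence[float]], float],
    horizon: int,
    n_lags: int,
) -> list[float]:
    """Forecast `horizon` steps, feeding each prediction back in as a lag."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    if len(history) < n_lags:
        raise ValueError("history must contain at least n_lags observations")

    window = list(history[-n_lags:])  # most recent lags, oldest first
    forecasts: list[float] = []
    for _ in range(horizon):
        y_hat = float(predict_one(window))
        forecasts.append(y_hat)
        # Slide the window: drop the oldest lag, append the new prediction.
        window = window[1:] + [y_hat]
    return forecasts


def mean_of_lags(lags: Sequence[float]) -> float:
    """Toy one-step model (assumption): the average of the lag window."""
    return sum(lags) / len(lags)


forecast = recursive_forecast(
    [10.0, 12.0, 14.0], mean_of_lags, horizon=3, n_lags=2
)
```

The horizon is passed in as an argument rather than fixed in the function, which keeps it driven by the request or config. Errors compound with each step because later predictions depend on earlier ones.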

## EXAMPLES:
- `examples/models/baseline_naive.py`
@@ -21,8 +28,11 @@
## DOCUMENTATION:
- scikit-learn estimators + pipelines
- joblib serialization patterns
- [scikit-learn Pipeline Composition](https://scikit-learn.org/stable/modules/compose.html)
- [scikit-learn Glossary](https://scikit-learn.org/stable/glossary.html)
- [scikit-learn Model Persistence](https://scikit-learn.org/stable/model_persistence.html)

## OTHER CONSIDERATIONS:
- No hardcoded horizons: driven by request/config.
- Determinism: random seed from Settings.
- Enforce input grain validation (store×product×date).
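
One way to enforce the store×product×date grain is to reject rows that lack a key or repeat a key combination. The sketch below is an assumption about what "grain validation" means here: the column names in `GRAIN`, the `validate_grain` helper, and the sample rows are all hypothetical.

```python
from collections import Counter
from datetime import date
from typing import Iterable, Mapping

# Assumed column names for the store x product x date grain.
GRAIN = ("store_id", "product_id", "date")


def validate_grain(rows: Iterable[Mapping[str, object]]) -> None:
    """Raise ValueError if a row lacks a grain key or a key combination repeats."""
    counts: Counter = Counter()
    for index, row in enumerate(rows):
        missing = [key for key in GRAIN if row.get(key) is None]
        if missing:
            raise ValueError(f"row {index} is missing grain keys: {missing}")
        counts[tuple(row[key] for key in GRAIN)] += 1

    duplicates = [key for key, n in counts.items() if n > 1]
    if duplicates:
        raise ValueError(f"duplicate grain keys: {duplicates}")


# Hypothetical input rows: one observation per store, product, and day.
rows = [
    {"store_id": 1, "product_id": "A", "date": date(2024, 1, 1), "sales": 3},
    {"store_id": 1, "product_id": "A", "date": date(2024, 1, 2), "sales": 5},
    {"store_id": 2, "product_id": "A", "date": date(2024, 1, 1), "sales": 1},
]

validate_grain(rows)  # passes silently: every key combination is unique
```

Running this check before feature computation keeps duplicate rows from silently distorting lag and aggregate features.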