Summary
Phase 2 of the seeder (PRs #92 / #93) added retail-depth generators (lifecycle, replenishment, markdowns, bundles) producing realistic per-(store, product, date) columns. A grep across app/features/featuresets/, forecasting/, and backtesting/ returns zero references to any Phase 2 generator: the data shape exists, but no downstream consumer reads it. Only Phase 1's exogenous columns are wired (via _compute_exogenous_features at service.py:360).
This issue scopes a single vertical-slice extension: take three of the four Phase 2 generators (lifecycle, replenishment, markdowns/promotion) and expose them as time-safe feature columns inside app/features/featuresets/, mirroring the proven exogenous pattern. bundles is intentionally deferred because its cross-product JOIN deserves its own leakage analysis.
Locked design decisions
Lifecycle: continuous-only. Drop the categorical stage; emit days_since_launch_lag{N} + days_since_discontinue_lag{N} only. LightGBM splits discover stage boundaries naturally; no encoding decision needed; no seeder-threshold coupling.
Promotion: generalized. PromotionConfig handles all four promotion.kind values (pct_off | bogo | bundle | markdown) via one JOIN. Default kinds_to_track=("markdown",) preserves the original markdown-only scope; bundles/BOGO/pct_off ride along when the caller opts in.
Replenishment: in-method async JOIN. _load_replenishment_events_up_to_cutoff filters event_date <= cutoff_date at the SQL boundary; a pandas left-merge follows inside _compute_replenishment_features.
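The three decisions above can be sketched in plain Python (standard library only, no pandas, SQL, or Pydantic). This is an illustration under stated assumptions, not the service code: it assumes days_since_*_lag{N} means "days since the anchor date, as observed N days before the row date", and that an unknown or not-yet-reached anchor yields None. The helper names days_since_lagged, filter_promotions, and events_up_to_cutoff are hypothetical.

```python
from __future__ import annotations

from datetime import date, timedelta


def days_since_lagged(row_date: date, anchor: date | None, lag: int) -> int | None:
    """Days elapsed since `anchor`, observed `lag` days before `row_date`.

    Assumption: an unknown anchor, or one still in the future at the
    observation date, yields None so it cannot leak into the feature.
    """
    if anchor is None:
        return None
    observed_on = row_date - timedelta(days=lag)
    if anchor > observed_on:
        return None
    return (observed_on - anchor).days


def filter_promotions(
    promos: list[dict], kinds_to_track: tuple[str, ...] = ("markdown",)
) -> list[dict]:
    """Keep promotions whose kind is tracked; the default keeps markdown only."""
    return [p for p in promos if p["kind"] in kinds_to_track]


def events_up_to_cutoff(
    events: list[tuple[date, int]], cutoff: date
) -> list[tuple[date, int]]:
    """Keep only (event_date, quantity) pairs with event_date <= cutoff."""
    return [(d, q) for d, q in events if d <= cutoff]


# Hypothetical sample data, for illustration only.
launch = date(2024, 1, 1)
discontinue = date(2024, 3, 1)
row = date(2024, 1, 15)

launch_lag1 = days_since_lagged(row, launch, lag=1)            # observed on 2024-01-14
discontinue_lag1 = days_since_lagged(row, discontinue, lag=1)  # anchor is still in the future

promos = [
    {"kind": "markdown", "discount_pct": 0.2},
    {"kind": "bogo", "discount_pct": 0.5},
]
default_promos = filter_promotions(promos)
opted_in = filter_promotions(promos, kinds_to_track=("markdown", "bogo"))

events = [(date(2024, 1, 10), 40), (date(2024, 1, 20), 25)]
visible = events_up_to_cutoff(events, cutoff=row)
```

In the real slice the cutoff filter is described as living at the SQL boundary and the promotion filter as part of one JOIN; the list comprehensions here only mirror that logic in memory.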
MVP scope
✅ 3 new Pydantic v2 *Config models (LifecycleConfig, ReplenishmentConfig, PromotionConfig)
✅ 3 new _compute_*_features methods in app/features/featuresets/service.py
✅ 3 new leakage cases in tests/test_leakage.py (one per family); the leakage test ships FIRST
✅ 1 end-to-end LightGBM 14-day rolling backtest proving the extended matrix trains cleanly
/featuresets/compute callers receive byte-identical responses
app/features/data_platform/models.py)
bundles wiring: separate slice (cross-product JOIN)
Execution plan (5-slice PRP)
Umbrella: PRPs/PRP-3.1-phase2-feature-wiring.md · Slices written individually under PRPs/PRP-3.1A...E.md.
replenishment_event JOIN) + leakage test
PHASE/3-FEATURE_ENGINEERING.md, _base/DOMAIN_MODEL.md)
Recommended sequence: A → B → (C ∥ D) → E.
Time-safety contract
Every new feature ships with a leakage test FIRST. app/features/featuresets/tests/test_leakage.py is load-bearing per .claude/rules/product-vision.md Principle 5. Code review specifically eyeballs every .rolling() for a preceding .shift(1).
Verified DB shapes (no migration needed)
product.launch_date + product.discontinue_date (both Date | None), models.py:101-102
promotion.kind ∈ {pct_off, bogo, bundle, markdown} (CHECK constraint), models.py:305, 334
promotion.discount_pct: Numeric(5,4) (range 0..1), models.py:306
replenishment_event table with index (store_id, product_id, date), models.py:471-514
Quality gates (every slice)
uv run ruff check app/features/featuresets/
uv run mypy app/features/featuresets/
uv run pyright app/features/featuresets/
uv run pytest app/features/featuresets/ -v
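The time-safety contract (every .rolling() preceded by .shift(1), with a leakage test shipped first) can be illustrated with a small perturbation check. This is a sketch, not the contents of test_leakage.py: the function names rolling_mean_lagged, rolling_mean_leaky, and is_time_safe are hypothetical, and the sample series is made up. The idea is to change values at and after a cutoff row and confirm that features for rows up to the cutoff do not move.

```python
import pandas as pd


def rolling_mean_lagged(sales: pd.Series, window: int) -> pd.Series:
    """Rolling mean over the `window` rows strictly before each row."""
    return sales.shift(1).rolling(window, min_periods=1).mean()


def rolling_mean_leaky(sales: pd.Series, window: int) -> pd.Series:
    """Same window without the shift, so the current row leaks in."""
    return sales.rolling(window, min_periods=1).mean()


def is_time_safe(feature_fn, sales: pd.Series, cutoff: int, window: int = 3) -> bool:
    """Perturb rows at and after `cutoff`; features up to `cutoff` must not change."""
    perturbed = sales.copy()
    perturbed.iloc[cutoff:] = perturbed.iloc[cutoff:] + 1000.0
    before = feature_fn(sales, window).iloc[: cutoff + 1]
    after = feature_fn(perturbed, window).iloc[: cutoff + 1]
    # Series.equals treats NaNs in the same position as equal.
    return bool(before.equals(after))


# Hypothetical daily sales for one (store, product) pair.
sales = pd.Series([10.0, 12.0, 9.0, 15.0, 11.0, 14.0])

safe_ok = is_time_safe(rolling_mean_lagged, sales, cutoff=3)
leaky_ok = is_time_safe(rolling_mean_leaky, sales, cutoff=3)
```

The shifted variant passes because the feature at the cutoff row only sees earlier rows; the unshifted variant fails because the cutoff row's own value enters its window.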
Related
main branch protection (paired CI-hardening track, out-of-scope here)