feat(features): wire phase 2 seeder columns into time-safe featuresets #109

@w7-mgfcode

Summary

Phase 2 of the seeder (PRs #92 / #93) added retail-depth generators (lifecycle, replenishment, markdowns, bundles) producing realistic per-(store, product, date) columns. A grep across app/features/featuresets/, forecasting/, and backtesting/ returns zero references to any Phase 2 generator — the data shape exists, but no downstream consumer reads it. Only Phase 1's exogenous columns are wired (via _compute_exogenous_features at service.py:360).

This issue scopes a single vertical-slice extension: take three of the four Phase 2 generators (lifecycle, replenishment, markdowns/promotion) and expose them as time-safe feature columns inside app/features/featuresets/, mirroring the proven exogenous pattern. bundles is intentionally deferred — its cross-product JOIN deserves its own leakage analysis.

Locked design decisions

  • Lifecycle: continuous-only. Drop the categorical stage; emit days_since_launch_lag{N} + days_since_discontinue_lag{N} only. LightGBM splits discover stage boundaries naturally; no encoding decision needed; no seeder-threshold coupling.
  • Promotion: generalized. PromotionConfig handles all four promotion.kind values (pct_off | bogo | bundle | markdown) via one JOIN. The default kinds_to_track=("markdown",) preserves the original markdown-only scope; the pct_off, bogo, and bundle kinds are included only when the caller opts in.
  • Replenishment: in-method async JOIN. _load_replenishment_events_up_to_cutoff filters event_date <= cutoff_date at the SQL boundary, and _compute_replenishment_features then left-merges the loaded events in pandas.
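
To make the promotion decision concrete, here is a minimal sketch of how a kinds_to_track default could gate which promotion kinds produce feature columns. It is an illustration under assumptions, not the planned implementation: PromotionConfigSketch is a plain dataclass standing in for the Pydantic PromotionConfig, and PromotionRow, its start_date/end_date fields, promo_flags, and the promo_{kind}_active column names are all hypothetical. It shows only the kind filtering, not any lagging needed for time safety.

```python
from dataclasses import dataclass
from datetime import date

# The four values allowed by the promotion.kind CHECK constraint.
ALLOWED_KINDS = frozenset({"pct_off", "bogo", "bundle", "markdown"})


@dataclass(frozen=True)
class PromotionConfigSketch:
    """Hypothetical stand-in for PromotionConfig (assumed to be a Pydantic model)."""

    kinds_to_track: tuple[str, ...] = ("markdown",)

    def __post_init__(self) -> None:
        # Reject kinds the CHECK constraint would never allow.
        unknown = set(self.kinds_to_track) - ALLOWED_KINDS
        if unknown:
            raise ValueError(f"unknown promotion kinds: {sorted(unknown)}")


@dataclass(frozen=True)
class PromotionRow:
    """Hypothetical row shape; start_date and end_date are assumed column names."""

    kind: str
    discount_pct: float  # fraction in 0..1, as in promotion.discount_pct
    start_date: date
    end_date: date


def promo_flags(
    day: date, promos: list[PromotionRow], config: PromotionConfigSketch
) -> dict[str, int]:
    """Return one 0/1 flag per tracked kind: is a promotion of that kind active on `day`?"""
    flags = {f"promo_{kind}_active": 0 for kind in config.kinds_to_track}
    for promo in promos:
        tracked = promo.kind in config.kinds_to_track
        if tracked and promo.start_date <= day <= promo.end_date:
            flags[f"promo_{promo.kind}_active"] = 1
    return flags
```

With the default config, only a markdown column is emitted; passing kinds_to_track=("markdown", "bogo") adds a bogo column without changing the matching logic.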

MVP scope

Execution plan (5-slice PRP)

Umbrella: PRPs/PRP-3.1-phase2-feature-wiring.md · Slices written individually under PRPs/PRP-3.1A...E.md.

| Slice | Scope | Status |
| --- | --- | --- |
| PRP-3.1A | Pydantic configs + Phase 2-shaped fixtures (schema-only); unblocks B/C/D in parallel | PRP ready |
| PRP-3.1B | Lifecycle compute method + leakage test | Pending |
| PRP-3.1C | Replenishment compute method (async replenishment_event JOIN) + leakage test | Pending |
| PRP-3.1D | Promotion compute method (date-range JOIN per kind) + leakage test | Pending |
| PRP-3.1E | E2E LightGBM backtest + docs (PHASE/3-FEATURE_ENGINEERING.md, _base/DOMAIN_MODEL.md) | Pending |

Recommended sequence: A → B → (C ∥ D) → E.

Time-safety contract

Every new feature ships with a leakage test FIRST. app/features/featuresets/tests/test_leakage.py is load-bearing per .claude/rules/product-vision.md Principle 5. Code review specifically eyeballs every .rolling() for a preceding .shift(1).
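
As an illustration of the .shift(1)-before-.rolling() rule, the sketch below pairs a time-safe rolling mean with a perturbation-style leakage check. It assumes pandas; rolling_mean_time_safe and leaks_future are hypothetical helper names, and the actual tests in test_leakage.py may be structured differently.

```python
from typing import Callable

import pandas as pd


def rolling_mean_time_safe(series: pd.Series, window: int) -> pd.Series:
    """Rolling mean that only sees values strictly before each row."""
    # shift(1) comes first, so the value at row t is excluded from row t's window.
    return series.shift(1).rolling(window, min_periods=1).mean()


def leaks_future(
    feature_fn: Callable[[pd.Series], pd.Series], series: pd.Series, cutoff: int
) -> bool:
    """True if changing values at positions >= cutoff alters features at positions <= cutoff."""
    baseline = feature_fn(series)
    tampered = series.copy()
    # Perturb the target at the cutoff row and every later row.
    tampered.iloc[cutoff:] = tampered.iloc[cutoff:] + 1000.0
    after = feature_fn(tampered)
    # A time-safe feature up to and including the cutoff row must be unchanged.
    return not baseline.iloc[: cutoff + 1].equals(after.iloc[: cutoff + 1])


sales = pd.Series([10.0, 12.0, 9.0, 11.0, 13.0, 8.0])

# The shifted version passes: row 3 only averages rows 1 and 2.
safe_leaks = leaks_future(lambda s: rolling_mean_time_safe(s, 2), sales, cutoff=3)

# The unshifted version fails: row 3's window includes row 3 itself.
unsafe_leaks = leaks_future(lambda s: s.rolling(2, min_periods=1).mean(), sales, cutoff=3)
```

Here safe_leaks is False and unsafe_leaks is True, which is the distinction the review rule is meant to catch by eye.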

Verified DB shapes (no migration needed)

  • product.launch_date + product.discontinue_date (both Date | None) — models.py:101-102
  • promotion.kind ∈ {pct_off, bogo, bundle, markdown} (CHECK constraint) — models.py:305, 334
  • promotion.discount_pct: Numeric(5,4) (range 0..1) — models.py:306
  • replenishment_event table with index (store_id, product_id, date) — models.py:471-514
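
Because product.launch_date and product.discontinue_date are both nullable, the continuous lifecycle features need a defined behaviour for missing dates. The sketch below is one possible reading, not the agreed design: it assumes lag{N} means "evaluated as of N days before the row date" and that an unset or not-yet-reached date yields a missing value (which LightGBM can treat as missing). days_since and lifecycle_features are hypothetical names.

```python
from datetime import date, timedelta
from typing import Optional


def days_since(event_date: Optional[date], as_of: date) -> Optional[int]:
    """Whole days from event_date to as_of; None if the event is unset or still in the future."""
    if event_date is None or event_date > as_of:
        return None
    return (as_of - event_date).days


def lifecycle_features(
    launch_date: Optional[date],
    discontinue_date: Optional[date],
    row_date: date,
    lag_days: int,
) -> dict[str, Optional[int]]:
    """Compute both lifecycle columns as of `lag_days` before `row_date` (assumed lag semantics)."""
    as_of = row_date - timedelta(days=lag_days)
    return {
        f"days_since_launch_lag{lag_days}": days_since(launch_date, as_of),
        f"days_since_discontinue_lag{lag_days}": days_since(discontinue_date, as_of),
    }


# A product launched on 2024-01-01 and never discontinued, viewed on 2024-01-11 with a 7-day lag.
example = lifecycle_features(date(2024, 1, 1), None, date(2024, 1, 11), lag_days=7)
```

In this example the as-of date is 2024-01-04, so days_since_launch_lag7 is 3 and days_since_discontinue_lag7 is None.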

Quality gates (every slice)

uv run ruff check app/features/featuresets/
uv run mypy app/features/featuresets/
uv run pyright app/features/featuresets/
uv run pytest app/features/featuresets/ -v

Related
