Skip to content

feat(features): implement replenishment compute method (#109)#114

Merged
w7-mgfcode merged 2 commits into
devfrom
feat/featuresets-replenishment
May 12, 2026
Merged

feat(features): implement replenishment compute method (#109)#114
w7-mgfcode merged 2 commits into
devfrom
feat/featuresets-replenishment

Conversation

@w7-mgfcode
Copy link
Copy Markdown
Owner

Summary

Lands PRP-3.1C — the replenishment compute slice for time-safe featuresets. Rebased onto dev after PRs #111 (lifecycle) and #112 (promotion) landed; patched to align with the PRP-3.1D sidecar pattern before pushing (see "Patch notes" below).

  • Adds FeatureEngineeringService._compute_replenishment_features(df, events_df).
  • Wires it at step 8 of compute_features, gated on self.config.replenishment_config.
  • Sidecar pattern (matches PRP-3.1D promotion): events DataFrame is attached out-of-band via setattr(service, "_replenishment_events_df", events_df) in the loader-orchestrator (or in tests). The compute method reads via getattr(self, "_replenishment_events_df", None). No new public API.
  • Two features when fully configured:
    • days_since_last_replenishment_lag{N} (float64; NaN sentinel for no-prior-event).
    • replenishment_count_w{W}_lag{N} (int64; 0 for no-events).
  • Adds FeatureDataLoader.load_replenishment_events() with SQL-side date <= cutoff_date filter (enforces time-safety BEFORE pandas sees the rows).
  • Extends compute_features_for_series to load events when replenishment_config is set.

Time-safety

  • merge_asof(..., by=["store_id", "product_id"]) plus per-entity groupby(...).shift(...) is the leakage gate for days_since_last.
  • shift(1).rolling(W).sum() is the canonical ordering for count_windownever rolling(W).sum().shift(1). For lag_days > 1, an extra shift(lag_days - 1) layers on top.
  • TestReplenishmentLeakage (4 cases) asserts: same-day event invisible at lag-1; shift(1).rolling(W) ordering; cross-series isolation; on-cutoff inclusion via the SQL <= filter.

Scope

  • Files touched: service.py, tests/test_leakage.py, tests/test_service.py (3 files, +749/-0 vs dev after the patch).
  • No schemas.py / routes.py / conftest.py / migration / docs edits.
  • Additive-contract snapshot (config_hash_unchanged_when_phase2_omitted) still passes.

Patch notes — three deviations from the original agent commit

  1. Removed public set_replenishment_events(events_df) setter. The original commit exposed it as a new public method on FeatureEngineeringService. Replaced with the matching PRP-3.1D pattern: dynamic sidecar attribute (setattr in loader/orchestrator, direct assignment in tests). Result: no new public surface; both replenishment and promotion families share one idiom.
  2. W=7 vs W=3 in test_count_window_uses_shift_then_rolling — kept as-is in this PR. The PRP-3.1A schema (feat(features): pydantic configs + PRP set for phase 2 feature wiring (#109) #110) bounds count_window_days at ge=7, so the test uses W=7 as the smallest allowed. Follow-up tracked in feat(features): revisit count_window_days minimum in PRP-3.1A ReplenishmentConfig schema #113 to decide whether to lower the schema floor or document it.
  3. days_since returning [NaN, NaN, NaN, 0, 1, 2, 3] at the post-shift step — accepted as-is after review. This matches the canonical groupby.shift(1) leakage idiom: the event-day's per-row value (0) becomes invisible because shift(1) reads the previous row, which had no event yet. The post-event rows correctly see the gap as observed at the lagged row. The PRP pseudocode's [NaN, NaN, NaN, NaN, 1, 2, 3] would be over-conservative without leakage benefit.

Test plan

  • ruff check + ruff format --check clean
  • mypy --strict app/features/featuresets/ clean (0 errors)
  • pyright --strict app/features/featuresets/ clean (0 errors)
  • pytest -v on app/features/featuresets/ — 104 tests pass (4 new replenishment leakage + 5 new replenishment service cases on top of lag/rolling/calendar/lifecycle/promotion)
  • CI green on PR

Refs

Closes the replenishment portion of #109. Follow-up: #113 (revisit count_window_days lower bound). PRP-3.1E will document the days_since semantic choice in the featuresets reference and exercise the loader extension end-to-end.

Land PRP-3.1C — the replenishment compute slice for time-safe featuresets.

Add `FeatureEngineeringService._compute_replenishment_features` plus a sidecar
events attribute, a one-line orchestrator branch after the exogenous family,
and `FeatureDataLoader.load_replenishment_events` with a SQL-side
`date <= cutoff_date` filter for time-safety before pandas sees the rows.

Replenishment features land as two columns when the config is set:
  * `days_since_last_replenishment_lag{N}` (float64, NaN sentinel)
  * `replenishment_count_w{W}_lag{N}` (int64, 0 for no-events)

Cross-series isolation is enforced by `merge_asof(..., by=["store_id",
"product_id"])` plus per-entity `groupby.shift(...)` — belt-and-braces.
The rolling count uses `shift(1).rolling(W).sum()` (NEVER `rolling().shift()`),
and `lag_days > 1` layers an additional shift on top of the canonical safety
boundary (PRP-3.1C §15 Decision C).

Tests:
  * `TestReplenishmentLeakage` (4 cases) asserts shift(N) invariance,
    shift(1).rolling(W) ordering, cross-series isolation, and the on-cutoff
    inclusion semantics implied by the SQL `<=` filter.
  * `TestReplenishmentFeatures` (5 cases) covers happy path, zero-event
    entity, single-event entity, cutoff alignment, and dtype contracts.

No schema, route, or migration changes — additive only. The PRP-3.1A
`config_hash_unchanged_when_phase2_omitted` snapshot still passes.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2074956c-3e64-471f-a27d-315bd857adac

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/featuresets-replenishment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @w7-mgfcode, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant