feat(features): pydantic configs + PRP set for phase 2 feature wiring (#109) #110

Merged: w7-mgfcode merged 2 commits into dev from feat/featuresets-phase2-configs on May 12, 2026
@w7-mgfcode (Owner)

Summary

Lands the foundation of the PRP-3.1 umbrella (issue #109 — wire Phase 2 seeder columns into time-safe featuresets):

  • Schema slice (PRP-3.1A, implemented): 3 new Pydantic v2 Configs (LifecycleConfig, ReplenishmentConfig, PromotionConfig), 3 optional FeatureSetConfig sub-config slots, extended get_enabled_features(), 3 Phase 2 DataFrame fixtures, 18 new test cases. No service/routes edits. No DB migration.
  • PRP set (3.1A → 3.1E): full execution plan for the parallel implementation of B/C/D + the integrating E2E slice. Each PRP is self-contained, cites exact file paths + line numbers, and targets 9/10 confidence.

This PR alone is a schema-only additive diff (+370 / -1 in app/features/featuresets/). It unblocks PRP-3.1B/C/D to run as parallel branches off dev without conflicting on schemas.py.

What changed

| Area | File | Net |
| --- | --- | --- |
| Code | app/features/featuresets/schemas.py | +115 |
| Code | app/features/featuresets/tests/conftest.py | +88 |
| Code | app/features/featuresets/tests/test_schemas.py | +168 |
| Docs | PRPs/PRP-3.1A-…md + PRP-3.1B-…md + PRP-3.1C-…md + PRP-3.1D-…md + PRP-3.1E-…md | +5503 (new files) |

Notable decisions made during execution

  • Decision E — config_hash() uses exclude_none=True. The PRP's "byte-identical pre/post PR" additive-contract invariant was unsatisfiable without this fix: Pydantic's default model_dump_json() serializes None-defaulted fields as null keys, so adding optional fields changes the dump regardless of whether the caller sets them. The 1-line fix in FeatureConfigBase.config_hash() makes the additive contract TRUE going forward. New baseline hash for FeatureSetConfig(name="x") is 6c12b1a783eccdd4 and pinned via a snapshot test (test_config_hash_unchanged_when_phase2_omitted). Verified safe: no existing test pins a literal hash value; downstream consumers (forecasting/registry) all green.
  • Decision F — test breadth over LOC target. PRP target was ≤150 net LOC; actual is +370. The overrun is concentrated in test_schemas.py (+168 vs ~50 target): 18 test cases across the 3 new Configs instead of the PRP's "≥4 cases" minimum. Schema is the contract for PRP-3.1B/C/D; more validation cases now means downstream slices land cleaner.
  • Decision G — tuple[str, ...] validator signature for PromotionConfig.kinds_to_track. The validator is annotated with tuple[str, ...] rather than tuple[Literal[...], ...] because, per the PRP gotcha, pyright --strict narrows the latter poorly; Pydantic still preserves Literal narrowing at construction.
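
To illustrate Decision E, here is a minimal stdlib-only sketch, not the repository's code. It stands in for Pydantic's `model_dump_json(exclude_none=True)` with a dataclass and `json.dumps`. The class names, the `lifecycle_config` field name, the SHA-256 choice, and the 16-character truncation are all assumptions made for illustration, and unlike Pydantic it only drops top-level `None` values.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


def config_hash(cfg, *, exclude_none: bool) -> str:
    """Hash a config's canonical JSON, optionally dropping None-valued fields first."""
    data = asdict(cfg)
    if exclude_none:
        # Rough stand-in for exclude_none=True (top level only in this sketch).
        data = {k: v for k, v in data.items() if v is not None}
    payload = json.dumps(data, sort_keys=True, separators=(",", ":"))
    # SHA-256 truncated to 16 hex chars is an assumption, not the project's scheme.
    return hashlib.sha256(payload.encode()).hexdigest()[:16]


@dataclass(frozen=True)
class ConfigBefore:
    """Hypothetical config as it looked before the PR."""
    name: str


@dataclass(frozen=True)
class ConfigAfter:
    """Hypothetical config after adding one optional, unset slot."""
    name: str
    lifecycle_config: Optional[dict] = None


before, after = ConfigBefore(name="x"), ConfigAfter(name="x")

# Serializing the None field as a null key changes the payload, so the hash drifts.
drifted = config_hash(before, exclude_none=False) != config_hash(after, exclude_none=False)

# Dropping None fields makes the new optional slot invisible to existing callers.
stable = config_hash(before, exclude_none=True) == config_hash(after, exclude_none=True)
```

In this sketch `drifted` and `stable` both come out true, which is the additive-contract property the snapshot test is described as pinning.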

Test plan

  • uv run ruff check app/features/featuresets/ — clean
  • uv run ruff format --check app/features/featuresets/ — clean
  • uv run mypy app/ — 188 source files, 0 errors
  • uv run pyright app/features/featuresets/ — 0 errors, 0 warnings
  • uv run pytest app/features/featuresets/tests/test_schemas.py -v — 46 passed
  • uv run pytest app/features/featuresets/ -m "not integration" -v — 75 passed (no regression)
  • uv run pytest app/features/forecasting/ app/features/registry/ -m "not integration" — 188 passed (downstream config_hash consumers safe)
  • CI green on this PR

Follow-ups (out of scope here)

  • PRP-3.1B (lifecycle compute) — depends on this PR
  • PRP-3.1C (replenishment compute) — depends on this PR
  • PRP-3.1D (promotion compute) — depends on this PR
  • PRP-3.1E (E2E + docs) — depends on B/C/D

…sets (#109)

Schema-only, additive PR landing the foundation for PRP-3.1B/C/D
parallel implementation:

- LifecycleConfig: include_days_since_launch, include_days_since_discontinue,
  lag_days (1-30). Continuous-only encoding per PRP-3.1 decisions log §1.
- ReplenishmentConfig: include_days_since_last, include_count_window,
  lag_days, count_window_days (7-60).
- PromotionConfig: kinds_to_track (tuple[Literal["pct_off","bogo","bundle",
  "markdown"], ...]) with non-empty/unique validator, include_active,
  include_intensity, lag_days. Generalized from MarkdownConfig per
  decisions §3.
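
A rough sketch of the non-empty/unique rule on kinds_to_track, written with a stdlib dataclass rather than the Pydantic v2 model the commit adds. The class name, the field defaults, and the 1-30 bound on lag_days (borrowed from LifecycleConfig) are assumptions; only the four promotion kinds and the field names come from the commit message.

```python
from dataclasses import dataclass

# The four kinds named in the commit message.
ALLOWED_KINDS = ("pct_off", "bogo", "bundle", "markdown")


@dataclass(frozen=True)
class PromotionConfigSketch:
    """Hypothetical stand-in for PromotionConfig; defaults are illustrative."""
    kinds_to_track: tuple[str, ...] = ALLOWED_KINDS
    include_active: bool = True
    include_intensity: bool = True
    lag_days: int = 1

    def __post_init__(self) -> None:
        if not self.kinds_to_track:
            raise ValueError("kinds_to_track must not be empty")
        if len(set(self.kinds_to_track)) != len(self.kinds_to_track):
            raise ValueError("kinds_to_track must not contain duplicates")
        unknown = [k for k in self.kinds_to_track if k not in ALLOWED_KINDS]
        if unknown:
            raise ValueError(f"unknown promotion kinds: {unknown}")
        # Assumed bound, mirroring the range stated for LifecycleConfig.lag_days.
        if not 1 <= self.lag_days <= 30:
            raise ValueError("lag_days must be between 1 and 30")
```

In the real model, the membership check would presumably be handled by the Literal type itself, leaving only the non-empty and uniqueness checks to the custom validator.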

FeatureSetConfig gains three optional sub-config slots (all `T | None = None`,
positioned between exogenous_config and imputation_config) and
get_enabled_features() emits "lifecycle","replenishment","promotion" tokens.
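
The token emission could look roughly like the following. This is a sketch under assumptions: the slot names lifecycle_config, replenishment_config, and promotion_config are guessed from the existing exogenous_config / imputation_config naming, plain dicts stand in for the sub-config models, and the real method also reports the pre-existing feature groups, which are omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeatureSetConfigSketch:
    """Hypothetical, trimmed-down FeatureSetConfig with only the three new slots."""
    name: str
    lifecycle_config: Optional[dict] = None
    replenishment_config: Optional[dict] = None
    promotion_config: Optional[dict] = None

    def get_enabled_features(self) -> list[str]:
        """Return one token per sub-config slot that the caller has set."""
        slots = (
            ("lifecycle", self.lifecycle_config),
            ("replenishment", self.replenishment_config),
            ("promotion", self.promotion_config),
        )
        # A slot counts as enabled when it is not None, even if it is empty.
        return [token for token, cfg in slots if cfg is not None]


only_promo = FeatureSetConfigSketch(name="x", promotion_config={"lag_days": 1})
enabled = only_promo.get_enabled_features()
```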

Phase 2 DataFrame fixtures (phase2_product_attrs_df,
phase2_replenishment_events_df, phase2_promotion_rows_df) added to
conftest.py with sequential / derivable values so downstream leakage tests
can mathematically detect contamination.
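
Why sequential values help: if the raw series is strictly increasing, a correctly lagged feature is fully predictable, so any row that saw same-day or future data stands out. The sketch below uses plain lists instead of DataFrames, and the helper names are hypothetical rather than taken from conftest.py.

```python
def lagged(values, lag):
    """Shift a series forward by `lag` rows; the first `lag` rows have no history."""
    return ([None] * lag + list(values))[: len(values)]


def leaky(values, lag):
    """Deliberately buggy feature: ignores the lag and copies the same-day value."""
    return list(values)


def find_leaks(values, feature, lag):
    """Return row indices where the feature is not exactly the value `lag` rows earlier."""
    leaks = []
    for t, got in enumerate(feature):
        expected = values[t - lag] if t >= lag else None
        if got != expected:
            leaks.append(t)
    return leaks


series = list(range(1, 11))  # sequential fixture values 1..10

clean_rows = find_leaks(series, lagged(series, 1), lag=1)
leaky_rows = find_leaks(series, leaky(series, 1), lag=1)
```

With a constant or random series, a leaky feature could match the expected value by chance; with sequential values it cannot, so here `clean_rows` is empty and `leaky_rows` flags every row.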

Behavioral note: FeatureConfigBase.config_hash() now uses
model_dump_json(exclude_none=True) so that adding new optional fields is
hash-invariant for callers that don't set them (additive-contract guarantee
required by PRD §6/§11; previously violated by Pydantic's default null-key
serialization). New baseline hash for FeatureSetConfig(name="x") is
6c12b1a783eccdd4 and pinned via a snapshot test.

Tests: 46 passing in test_schemas.py (incl. 18 new cases across the three
new Configs + 3 new TestFeatureSetConfig cases). Full featuresets module
sweep: 75 passing, zero regressions. Forecasting + registry consumers of
config_hash: 188 passing.

No service.py edits, no routes.py edits, no DB migration.

Lands the 5-slice PRP set defining the parallel execution plan for
issue #109 (wire phase 2 seeder columns into time-safe featuresets):

- PRP-3.1A: pydantic configs + phase 2 fixtures (schema-only,
  IMPLEMENTED by the preceding commit).
- PRP-3.1B: lifecycle compute method (days_since_launch /
  days_since_discontinue, continuous-only encoding).
- PRP-3.1C: replenishment compute method (days_since_last_replenishment +
  rolling event count via async event-table JOIN).
- PRP-3.1D: promotion compute method (per-kind active + intensity
  features, chain-wide vs store-specific JOIN, NULL-discount handling).
- PRP-3.1E: end-to-end integration test + docs update
  (PHASE/3-FEATURE_ENGINEERING.md, _base/DOMAIN_MODEL.md, examples).
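
As one possible reading of the time-safe replenishment features in PRP-3.1C, the sketch below computes days_since_last_replenishment and a rolling event count from an in-memory list of dates. The function name, the defaults, and the exact cutoff semantics (features may only see events up to as_of minus lag_days) are assumptions; the PRP itself describes an async event-table JOIN, which is not shown here.

```python
from datetime import date, timedelta


def replenishment_features(as_of, event_dates, lag_days=1, count_window_days=14):
    """Return (days_since_last, rolling_count) using only events visible at the cutoff."""
    # Latest date the feature is allowed to observe (assumed lag semantics).
    cutoff = as_of - timedelta(days=lag_days)
    visible = [d for d in event_dates if d <= cutoff]

    # None when no replenishment has been observed yet.
    days_since_last = (cutoff - max(visible)).days if visible else None

    # Count visible events inside a window of count_window_days ending at the cutoff.
    window_start = cutoff - timedelta(days=count_window_days - 1)
    rolling_count = sum(1 for d in visible if d >= window_start)
    return days_since_last, rolling_count


events = [date(2026, 1, 1), date(2026, 1, 10), date(2026, 1, 20)]

# The event dated 2026-01-20 falls after the cutoff (2026-01-19), so it is ignored.
features = replenishment_features(date(2026, 1, 20), events)
```

Here `features` is `(9, 1)`: nine days from the 2026-01-10 event to the cutoff, and one event inside the 14-day window.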

Recommended sequence: A → B → (C ∥ D) → E. Each downstream PRP cites
exact file paths + line numbers in the current repo, has executable
validation gates (Level 1-5: ruff / mypy+pyright / unit / leakage /
HTTP smoke), and targets a 9/10 confidence score.

Each PRP is self-contained: an implementing agent can read one PRP +
edit 2-3 files + run 5-6 commands and ship a green PR.

@sourcery-ai sourcery-ai Bot left a comment

Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai

coderabbitai Bot commented May 12, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8fb61ee-52e2-48a4-91a9-6e4f8ea2d226

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.
