feat(release): cut v0.2.9 with phase 2 e2e + strict-mode + main protection (#108) #124
Surface the existing GenerateParams knobs in the admin Data Seeder panel (scenario, date range, store/product counts, seed, sparsity) so operators no longer have to drop to the CLI to seed a different year. Form state persists in localStorage and a reset-to-defaults button is provided. Also fixes a latent service-layer bug: when overriding stores/products on a scenario preset, _build_config_from_params replaced the whole DimensionConfig and silently dropped scenario-customized store_regions, store_types, product_categories, and product_brands. Now uses dataclasses.replace so only the count fields change. Adds two regression tests covering holiday_rush + custom store/product counts.
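The service-layer fix described above can be illustrated with a minimal sketch. The `DimensionConfig` below is a stand-in written for this example: the count field names (`stores`, `products`), the default values, and the `override_counts` helper are assumptions, not the repository's actual definitions. Only the four scenario-customized field names come from the commit message.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DimensionConfig:
    # Count fields (names assumed for illustration).
    stores: int = 10
    products: int = 50
    # Scenario-customized fields named in the commit message.
    store_regions: tuple[str, ...] = ("north", "south")
    store_types: tuple[str, ...] = ("supermarket",)
    product_categories: tuple[str, ...] = ("grocery",)
    product_brands: tuple[str, ...] = ("generic",)


def override_counts(base, stores=None, products=None):
    """Return a copy of `base` in which only the requested counts change."""
    changes = {}
    if stores is not None:
        changes["stores"] = stores
    if products is not None:
        changes["products"] = products
    return replace(base, **changes)


# A hypothetical preset that customizes regions and categories.
holiday_rush = DimensionConfig(
    store_regions=("urban",),
    product_categories=("gifts", "toys"),
)

# Buggy pattern: building a fresh config falls back to the default lists.
buggy = DimensionConfig(stores=25, products=200)

# Fixed pattern: dataclasses.replace keeps every field that is not overridden.
fixed = override_counts(holiday_rush, stores=25, products=200)
```

Here `buggy.store_regions` falls back to the defaults, while `fixed.store_regions` keeps the preset's value.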
…87) Closes #86. Generated via the /w7_generating-claudemd skill in HEURISTIC_MODE (docs/_kB/repo-map/ KB not yet present).

Adds:
- CLAUDE.md (116 lines, 812 words; references .claude/rules/* and docs/_base/* via @imports; within the 150-line / 1800-word skill budget)
- docs/_base/ARCHITECTURE.md (system boundaries, components, comm patterns, deploy chain)
- docs/_base/API_CONTRACTS.md (HTTP surface across 12 slices + WebSocket + external integrations)
- docs/_base/RUNBOOKS.md (common incidents, release/rollback, WSL/pnpm traps from prior session HANDOFF)
- docs/_base/SECURITY.md (threat model, hard rules from security-patterns.md, scanning matrix)
- docs/_base/RULES.md (Change Authority Matrix + invariants + forbidden patterns, consolidated from .claude/rules/*)
- docs/_base/DOMAIN_MODEL.md (bounded contexts, aggregates, invariants, ubiquitous language)
- docs/_base/DEV_GUIDE.md (human-maintained stub with {FILL IN} markers for a maintainer to complete)
- docs/_base/REPO_MAP_INDEX.md (index across README, PHASE docs, ADRs, PRPs, .claude/, docs/_base/)
- docs/_base/PIPELINE_CONTRACT.md (CI/CD stages, merge gates, release flow)

.gitignore adjustments:
- Remove `CLAUDE.md` (was blocking the doc from being shared and from being read by Claude in fresh clones)
- Add `CLAUDE.local.md` (personal-prefs file, local-only by design)
- Stale `.claude` duplicates on lines 2 and 5 left for a separate cleanup PR (deduping won't change behavior since `.claude/` is already tracked)

Re-run the skill after a future mapping-repo-context run to drop the remaining 5 [UNVERIFIED] meta-flags.
…, changepoints, returns, substitution (#88)
Closes #84. Per .claude/rules/security-patterns.md: "Pin third-party GitHub Actions by full 40-char SHA"; first-party actions/* may use major-version.

Pinned (third-party):
- googleapis/release-please-action@v5 → @45996ed1f6d02564a971a2fa1b5860e934307cf7 # v5.0.0
- astral-sh/setup-uv@v7 (×8 across all five workflows) → @37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
- github/codeql-action/upload-sarif@v4 → @c6f931105cb2c34c8f901cc885ba1e2e259cf745 # v4.34.0

Left as major-tag (first-party actions/*, rule-permitted):
- actions/checkout@v6
- actions/upload-artifact@v7

Dependabot watches .github/workflows/ weekly and will bump these forward.
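The pinning rule above lends itself to a mechanical check. The following is a minimal sketch of such a check, not something this PR ships: the function name `unpinned_third_party` and the sample workflow text are invented for illustration, and the regex only handles single-line `uses:` entries.

```python
import re

# Matches "uses: owner/repo[/path]@ref", optionally preceded by a list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_third_party(workflow_text):
    """List third-party actions whose ref is not a full 40-char commit SHA."""
    offenders = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match is None:
            continue
        action, ref = match.groups()
        if action.startswith("actions/"):
            continue  # first-party: a major-version tag is permitted by the rule
        if SHA_RE.fullmatch(ref) is None:
            offenders.append(f"{action}@{ref}")
    return offenders


SAMPLE = """
steps:
  - uses: actions/checkout@v6
  - uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
  - uses: github/codeql-action/upload-sarif@v4
"""

print(unpinned_third_party(SAMPLE))
```

On the sample text, only the `upload-sarif@v4` entry is reported; the first-party action and the SHA-pinned action pass.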
…es (#95) (#96) every uv run invocation in ci.yml, schema-validation.yml, and phase-snapshot.yml now uses --frozen. without it, uv re-resolves the dependency graph at command time and crashes when a freshly published pydantic-ai-slim version's [mistral] extra requires a mistralai version that does not yet exist on PyPI — observed on PR #93's most recent push where all five blocking CI jobs went red 75 minutes after a green run on the same branch with the same lockfile. dependency-check.yml's pip-audit calls deliberately retain the re-resolve behavior; that workflow's purpose is to pick up newly published vulnerabilities. uv sync --frozen --all-extras --dev was already in place to install the lock; this patch propagates the same intent to every subsequent uv run.
…enerator (#92) (#93) * feat(data,db): seeder phase 2 chunk A — retail-depth schema + configs (#92) Lays the foundation for Phase 2 retail depth without changing any generator behaviour: - Alembic migration a8b9c0d1e234 adds sales_daily.channel (NOT NULL, server default 'in_store'), product lifecycle fields (lifecycle_stage, launch_date, discontinue_date, pack_size, subcategory), promotion kind discriminator with JSONB bundle_member_product_ids, and a new replenishment_event table. All additive; retail_standard rows are unchanged. - ORM mirrors the schema, including a load-bearing JSONB(none_as_null=True) so the bundle-members CHECK fires. - Five new config dataclasses (ChannelConfig, LifecycleConfig, BundleConfig, MarkdownConfig, LeadTimeConfig) wired to SeederConfig with disabled defaults so all existing scenarios produce byte-identical row counts. - 25 integration tests cover the new CHECK + nullability constraints; 8 unit tests guard the config defaults + regression invariant across every ScenarioPreset. * feat(data): seeder phase 2 chunk B (1/5) — product lifecycle generator (#92) First slice of Phase 2 generators. Strict regression invariant: with ``LifecycleConfig.enable=False`` (default) ProductGenerator's output and rng-draw sequence are byte-identical to pre-Phase-2. - ProductGenerator gains optional ``lifecycle_config`` + ``date_range`` parameters. When enabled, each product row carries ``subcategory``, ``pack_size``, ``launch_date``, ``discontinue_date``, ``lifecycle_stage``. - New ``LifecycleGenerator`` (pure compute, no DB) computes per-(product, date) demand multipliers across intro/growth/maturity/decline/ discontinued segments. Disabled path returns 1.0 without touching rng. - 14 new unit tests cover the regression invariant + each ramp segment + discontinue override + reproducibility under enabled mode. 
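A rough sketch of how a per-(product, date) lifecycle multiplier of this shape could be computed. The segment lengths, the `intro_level` and `decline_floor` values, the linear ramps, and the function name `lifecycle_multiplier` are all assumptions for illustration; the commit message does not state the real `LifecycleConfig` parameters. The disabled path returns 1.0 and never touches an rng, mirroring the stated invariant.

```python
from datetime import date


def lifecycle_multiplier(day, launch, discontinue, *, enabled=True,
                         intro_days=30, growth_days=60, decline_days=60,
                         intro_level=0.3, decline_floor=0.4):
    """Demand multiplier for one product on one date (illustrative only)."""
    if not enabled or launch is None:
        return 1.0  # disabled path or missing lifecycle attrs: neutral
    if day < launch:
        return 0.0  # not launched yet
    if discontinue is not None and day >= discontinue:
        return 0.0  # discontinued
    age = (day - launch).days
    if age < intro_days:
        ramp = intro_level  # intro: flat reduced demand
    elif age < intro_days + growth_days:
        progress = (age - intro_days) / growth_days
        ramp = intro_level + (1.0 - intro_level) * progress  # growth: linear
    else:
        ramp = 1.0  # maturity
    fade = 1.0
    if discontinue is not None:
        days_left = (discontinue - day).days
        if days_left <= decline_days:
            # decline: linear fade toward the floor as discontinuation nears
            fade = decline_floor + (1.0 - decline_floor) * days_left / decline_days
    return min(ramp, fade)


launch, end = date(2024, 1, 1), date(2024, 12, 31)
for d in (date(2023, 12, 1), date(2024, 1, 10), date(2024, 6, 1), date(2024, 12, 1)):
    print(d, lifecycle_multiplier(d, launch, end))
```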
Remaining chunk B work (next commits on this branch): - BundleGenerator (BOGO + bundle promotions) - MarkdownGenerator (clearance pricing) - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (2/5) — bundle/BOGO generator (#92) Second slice of Phase 2 generators. Same regression invariant as B 1/5: with ``BundleConfig.enable=False`` (default) ``BundleGenerator.apply`` leaves both the promotion list and the rng state byte-identical. - New ``BundleGenerator`` (pure compute, no DB) wraps ``PromotionGenerator``'s output: per-promo ``bundle_probability`` chance to convert to ``kind='bundle'`` or ``kind='bogo'`` (split by ``bogo_share_within_bundles``), drawing 2–``max_bundle_size`` member product IDs (host excluded) and a discount in ``[bundle_discount_pct_min, bundle_discount_pct_max]`` quantized to ``Numeric(5, 4)``. ``discount_amount`` is cleared on converted rows to keep the row internally consistent with the new ``discount_pct``. - Locked rng order per converted promo: ``random()`` (convert?) → ``random()`` (bogo?) → ``randint()`` (n_members) → ``sample()`` (members) → ``uniform()`` (discount). Per-host pool-too-small skip happens before any rng draw so the stream stays stable across runs where only the product pool shrinks. - 18 new unit tests cover the regression invariant (no mutation, no rng consumption) + kind allow-list + member-pool sourcing + count + discount range + BOGO/bundle split at extremes + reproducibility + best-effort skip for small pools + config validation. 
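The locked draw order described above can be sketched as follows. This is an approximation written for this discussion, not `BundleGenerator.apply` itself: the function name `maybe_convert`, the dict-shaped promo record, the keyword defaults, and the "fewer than two candidates" skip threshold are assumptions.

```python
import random
from decimal import Decimal


def maybe_convert(promo, product_ids, rng, *, bundle_probability=0.5,
                  bogo_share=0.3, max_bundle_size=4,
                  pct_min=0.05, pct_max=0.30):
    """Possibly convert one promo to kind='bundle' or kind='bogo'."""
    pool = [p for p in product_ids if p != promo["product_id"]]  # host excluded
    if len(pool) < 2:
        return promo  # pool too small: skip BEFORE any rng draw
    if rng.random() >= bundle_probability:  # draw 1: convert?
        return promo
    kind = "bogo" if rng.random() < bogo_share else "bundle"  # draw 2: bogo?
    n_members = rng.randint(2, min(max_bundle_size, len(pool)))  # draw 3
    members = rng.sample(pool, n_members)  # draw 4
    raw_pct = rng.uniform(pct_min, pct_max)  # draw 5
    # Quantize to four decimal places, matching a Numeric(5, 4) column.
    pct = Decimal(str(raw_pct)).quantize(Decimal("0.0001"))
    return {
        **promo,
        "kind": kind,
        "bundle_member_product_ids": members,
        "discount_pct": pct,
        "discount_amount": None,  # cleared so the row stays consistent
    }


promo = {"product_id": 1, "kind": "pct_off", "discount_amount": Decimal("2.00")}
converted = maybe_convert(promo, [1, 2, 3, 4, 5], random.Random(42),
                          bundle_probability=1.0)
```

Because the small-pool check runs first, shrinking the product pool for one host does not shift the random stream seen by later promos.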
Remaining chunk B work: - MarkdownGenerator (clearance pricing — needs Open Q on inventory age coupling resolved before starting) - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (3/5) — markdown generator (#92) Third slice of Phase 2 generators. Same regression invariant as B 1/5 and B 2/5: with ``MarkdownConfig.enable=False`` (default) the generator emits empty containers and consumes zero rng state. - New ``MarkdownGenerator`` (pure compute, no DB) emits ``Promotion(kind='markdown')`` rows + companion ``PriceHistory`` drop rows + a ``markdown_dates`` lookup keyed by ``(store_id, product_id)`` for the future ``SalesDailyGenerator`` lift integration in chunk B 5/5. - Two triggers ship in this slice: - ``lifecycle_decline`` — chain-wide markdown (``store_id=None``) starting on the first date a product enters the ``decline`` stage according to a passed-in ``LifecycleGenerator``. Skips products without lifecycle attrs; emits no rows when lifecycle is disabled. - ``stockout_risk`` — per-``(store, product)`` markdown ending the day before each observed stockout, lasting ``markdown_duration_days`` days, clamped to the seeded range start. Overlapping windows are deduped within each ``(store, product)`` series. - ``trigger='age_days'`` is deferred — raises ``NotImplementedError`` pointing at issue #94 (follow-up). The default trigger remains ``lifecycle_decline`` so scenarios that just flip the enable bit still produce meaningful output. - Even the enabled path is fully deterministic (no rng draws). The ``rng`` constructor parameter is kept for API consistency with peer Phase 2 generators in case future variants need randomness. 
- 21 new unit tests cover the regression invariant + lifecycle_decline correctness (chain-wide, skipping missing lifecycle, clamp-to-range, no decline = no output) + stockout_risk correctness (per-store, end-day-before-stockout, overlap dedupe, clamp-to-start, unknown product, dict-order independence) + age_days NotImplementedError + config validation (depth bounds, duration bounds). Remaining chunk B work: - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (4/5) — replenishment generator (#92) Fourth slice of Phase 2 generators. Same regression invariant: with ``LeadTimeConfig.enable=False`` (default) the generator returns ``[]`` and consumes zero rng state. - New ``ReplenishmentGenerator`` (pure compute, no DB) emits ``replenishment_event`` dicts. Per ``(store, product)`` it places a PO every ``order_frequency_days`` starting at ``dates[0]``. Each PO consumes two locked rng draws: ``gauss(mean_lead_time_days, lead_time_sigma_days)`` clamped to ``>= 0`` → ``gauss(fill_rate_mean, fill_rate_sigma)`` clamped to ``[0, 1]``. ``ordered_qty = base_demand * (order_frequency_days + safety_stock_days)``; ``received_qty = round(ordered_qty * fill_rate)`` defensively clamped to ``[0, ordered_qty]``. - Receipts whose ``date_received = date_placed + lead_time_days`` fall past ``dates[-1]`` are dropped to keep the FK to ``calendar`` valid. - Sorted iteration over ``(store_id, product_id)`` makes the rng stream stable regardless of input ordering. - 21 new unit tests cover the regression invariant + record shape + ordered_qty formula + dates-within-range + reproducibility + input-order independence + extreme fill rates (zero/full) + zero lead time + output sort order + 7 config-validation cases. 
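To make the purchase-order arithmetic above concrete, here is a small sketch under several assumptions: dates are contiguous daily values, lead time is rounded to whole days, a single `base_demand` applies to every pair, and out-of-range receipts are dropped only after both draws so the stream stays aligned. The function name `replenishment_events` and the dict keys are illustrative, not the repository's schema.

```python
import random
from datetime import date, timedelta


def replenishment_events(pairs, dates, rng, *, base_demand=10,
                         order_frequency_days=7, safety_stock_days=2,
                         mean_lead_time_days=3.0, lead_time_sigma_days=1.0,
                         fill_rate_mean=0.95, fill_rate_sigma=0.05):
    """Emit one event dict per purchase order that lands inside the range."""
    ordered_qty = base_demand * (order_frequency_days + safety_stock_days)
    events = []
    for store_id, product_id in sorted(pairs):  # sorted: input order is irrelevant
        for placed in dates[::order_frequency_days]:  # first PO at dates[0]
            # Draw 1: lead time, clamped to >= 0.
            lead = max(0, round(rng.gauss(mean_lead_time_days, lead_time_sigma_days)))
            # Draw 2: fill rate, clamped to [0, 1].
            fill = min(1.0, max(0.0, rng.gauss(fill_rate_mean, fill_rate_sigma)))
            received_on = placed + timedelta(days=lead)
            if received_on > dates[-1]:
                continue  # receipt falls past the seeded range: drop it
            received_qty = min(ordered_qty, max(0, round(ordered_qty * fill)))
            events.append({
                "store_id": store_id,
                "product_id": product_id,
                "date_placed": placed,
                "date_received": received_on,
                "ordered_qty": ordered_qty,
                "received_qty": received_qty,
            })
    return events


days = [date(2024, 1, 1) + timedelta(days=i) for i in range(30)]
forward = replenishment_events([(1, 1), (2, 1)], days, random.Random(7))
reverse = replenishment_events([(2, 1), (1, 1)], days, random.Random(7))
```

With the same seed, `forward` and `reverse` are equal, which is the input-order independence the tests in this slice check for.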
Downstream coupling: a follow-up commit will adjust ``InventorySnapshotGenerator`` to consume these events so realistic stockout windows emerge between scheduled receipts. This slice only emits the rows. Remaining chunk B work: - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (5a/6) — lifecycle multiplier into sales (#92) First half of B 5/5 (split per Open Q3 — channel integration deferred until semantics are confirmed). Wires the LifecycleGenerator multiplier into ``SalesDailyGenerator`` while preserving the byte-identical regression invariant. - ``SalesDailyGenerator.__init__`` gains optional ``lifecycle: LifecycleGenerator | None = None``. Defaults preserve pre-Phase-2 behavior for every existing caller. - ``SalesDailyGenerator.generate`` gains optional ``product_lifecycle_data: dict[int, tuple[date | None, date | None]] | None = None``. Missing or unspecified entries fall back to ``(None, None)`` so the multiplier evaluates to 1.0. - ``_compute_demand`` gains ``product_discontinue_date`` and applies the lifecycle multiplier guarded by ``self.lifecycle is not None and self.lifecycle.enabled``. The pre-Phase-2 ``new_product_ramp_days`` linear ramp is suppressed when lifecycle is enabled, preventing double-attenuation at launch. - 10 new tests cover the regression invariant (no kwargs / explicit None / disabled config / no rng consumption when disabled), enabled correctness (pre-launch zero, post-discontinue zero, intro < maturity, decline < maturity), legacy-ramp suppression (no double-apply when lifecycle on; still fires when lifecycle is None), and the lookup fallback (missing product_id evaluates to 1.0). The B 5b/6 channel integration is held until Open Q3 resolves between (b) dominant per row, (c) random per row from channel_mix weights, or (d) aggregated with primary channel column. 
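The guard and the legacy-ramp suppression described above might look roughly like this. `StubLifecycle` and the free function `compute_demand` stand in for the real `LifecycleGenerator` and `_compute_demand`; the single-argument `multiplier_for` signature and the linear legacy ramp formula are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StubLifecycle:
    """Stand-in for LifecycleGenerator with a constant multiplier."""
    enabled: bool
    value: float = 0.5

    def multiplier_for(self, age_days):
        return self.value


def compute_demand(base, age_days, *, lifecycle=None, new_product_ramp_days=30):
    """Apply either the lifecycle multiplier or the legacy ramp, never both."""
    if lifecycle is not None and lifecycle.enabled:
        # Lifecycle on: the legacy ramp is suppressed to avoid double attenuation.
        return base * lifecycle.multiplier_for(age_days)
    if new_product_ramp_days > 0 and age_days < new_product_ramp_days:
        # Legacy path: linear ramp over the first new_product_ramp_days days.
        return base * (age_days / new_product_ramp_days)
    return base


legacy = compute_demand(100, 15)
with_lifecycle = compute_demand(100, 15, lifecycle=StubLifecycle(True, 0.8))
disabled = compute_demand(100, 15, lifecycle=StubLifecycle(False, 0.8))
```

In this sketch `legacy` and `disabled` take the same path, so a disabled lifecycle object behaves exactly like passing `None`.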
Remaining Phase 2 work: - B 5b/6 — SalesDailyGenerator channel split (pending Q3) - Chunk C — DataSeeder orchestration + endpoints + integration tests * feat(data): seeder phase 2 chunk B (5b/6) — channel split into sales (#92) Second half of B 5/5. Resolves Open Q3 with semantic (c): each emitted ``sales_daily`` row gets its ``channel`` drawn from ``channel_mix`` via ``rng.choices``, preserving the existing ``(date, store, product)`` grain. - ``SalesDailyGenerator.__init__`` gains optional ``channels: ChannelConfig | None = None``. Disabled / unset path consumes zero new rng draws and emits rows without a ``channel`` key (DB ``server_default='in_store'`` applies), preserving the byte-identical regression invariant. - ``generate()`` runs ``_validate_channels()`` once at entry. Rejects channels outside the SQL allow-list, negative weights, all-zero mix, negative ``online_promo_uplift``, or ``online_substitution_to_instore`` outside ``[0, 1]``. - Per emitted row (after stockout-skip): ``_maybe_apply_channel`` builds the effective mix (``online_substitution_to_instore`` shifts weight from in_store → online during promos), draws a channel via ``rng.choices``, and applies ``online_promo_uplift`` to online rows on promo dates. One rng draw per emitted row. - 19 new tests cover regression invariant (no kwarg, disabled config, no rng consumption) + channel distribution (subset of mix keys, single-channel deterministic, dominant most common, zero-weight never chosen) + online promo uplift (fires for online + promo, not for in_store) + substitution shift (more online during promo, zero substitution = no shift) + 6 validation cases + row shape (channel key present/absent). Phase 2 chunk B complete (5/6 paired slices + 1/6 follow-up #94). Next: Chunk C — DataSeeder orchestration + new endpoints + integration tests + docs. 
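A minimal sketch of semantic (c), one weighted draw per emitted row. The two-channel mix, the `quantity` row key, the multiplicative form of the uplift, and the helper names `effective_mix` and `apply_channel` are assumptions for illustration; the real SQL allow-list and row shape may differ.

```python
import random


def effective_mix(channel_mix, on_promo, substitution):
    """Shift a share of in_store weight to online while a promo is active."""
    mix = dict(channel_mix)
    if on_promo and substitution > 0:
        shift = mix.get("in_store", 0.0) * substitution
        mix["in_store"] = mix.get("in_store", 0.0) - shift
        mix["online"] = mix.get("online", 0.0) + shift
    return mix


def apply_channel(row, rng, channel_mix, *, on_promo=False,
                  substitution=0.0, online_promo_uplift=0.0):
    """Attach a channel to one sales row using exactly one rng draw."""
    mix = effective_mix(channel_mix, on_promo, substitution)
    names = list(mix)
    channel = rng.choices(names, weights=[mix[n] for n in names], k=1)[0]
    quantity = row["quantity"]
    if channel == "online" and on_promo:
        quantity = round(quantity * (1 + online_promo_uplift))
    return {**row, "channel": channel, "quantity": quantity}


rng = random.Random(0)
row = {"store_id": 1, "product_id": 7, "quantity": 10}
sample = [apply_channel(row, rng, {"in_store": 1.0, "online": 0.0})
          for _ in range(200)]
```

A zero-weight channel is never selected by `random.choices`, and an all-zero mix raises `ValueError`, which is one reason to validate the mix once at entry as the commit describes.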
* feat(data,api): seeder phase 2 chunk c1 — orchestration + endpoints (#92) extend GenerateParams with 5 enable flags + channel_mix / lifecycle / bundle / markdown / lead-time fields; channel_mix validator enforces the SQL allow-list and at least one positive weight. Service layer translates the new params into ChannelConfig / LifecycleConfig / BundleConfig / MarkdownConfig / LeadTimeConfig overrides. DataSeeder.generate_full now wires LifecycleGenerator + BundleGenerator + MarkdownGenerator + ReplenishmentGenerator + ChannelConfig. Product lifecycle dates are fetched alongside base_price in a single query and threaded into SalesDailyGenerator. A new _normalize_promotion_records helper enforces a uniform key set across the mixed pct_off / bundle / bogo / markdown promo records so the bulk pg_insert builds a valid multi-row VALUES clause. delete_data drops replenishment_event first (leaf table). verify_data_integrity gains 3 Phase 2 invariants: bundle member-ID consistency, lifecycle date ordering, replenishment fill rate. append_data mirrors the new return signature and fetches lifecycle dates from existing products. new endpoints: GET /seeder/channels returns the SQL allow-list; GET /dimensions/products/{id}/lifecycle-curve returns the reference demand-multiplier curve via LifecycleGenerator.multiplier_for, using default LifecycleConfig ramp parameters and the product's own launch_date / discontinue_date. SeederStatus + SeederResult both grow a replenishment_events count. disabled-path regression invariant preserved: every Phase 2 flag defaults off and consumes zero rng when off. 
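The key-normalization step mentioned above can be sketched in a few lines. This is an illustration of the idea, not the repository's `_normalize_promotion_records`: the function name, the choice of `None` as the fill value, and the sample records are assumptions.

```python
def normalize_promotion_records(records):
    """Give every record the same key set, filling missing keys with None."""
    all_keys = set()
    for record in records:
        all_keys.update(record)
    ordered = sorted(all_keys)  # stable column order across rows
    return [{key: record.get(key) for key in ordered} for record in records]


mixed = [
    {"kind": "pct_off", "product_id": 1, "discount_pct": 0.10},
    {"kind": "bundle", "product_id": 2, "discount_pct": 0.15,
     "bundle_member_product_ids": [3, 4]},
    {"kind": "markdown", "product_id": 5, "store_id": None},
]
uniform = normalize_promotion_records(mixed)
```

After normalization every dict exposes the same columns, so a single multi-row insert statement can be built from the list without per-kind batching.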
* feat(data,docs): seeder phase 2 chunk c2 — integration tests + docs (#92) test_phase2_integration.py covers the disabled-path regression (no Phase 2 rows when toggles are off), per-feature enabled tests (lifecycle populates dates, bundles convert promotions with bundle_member_product_ids non-NULL, markdowns can emit rows when lifecycle is also on, replenishment respects received_qty <= ordered_qty, multichannel writes distinct channels), full-on verify_data_integrity returning an empty error list, and delete ordering that wipes replenishment_event without FK violations. Tests are marked @pytest.mark.integration so they only run against the real docker-compose Postgres. docs/DATA-SEEDER.md adds a Phase 2 retail-depth section documenting all five toggles with example JSON payloads, the two new endpoints (GET /seeder/channels, GET /dimensions/products/{id}/lifecycle-curve), and three new Data Integrity checks.
* chore(repo): merge dev to main — seeder phases 1+2 + ci/docs hardening (#92) (#97) * feat(api,ui): expose seeder date range and scale controls (#82) (#83) Surface the existing GenerateParams knobs in the admin Data Seeder panel (scenario, date range, store/product counts, seed, sparsity) so operators no longer have to drop to the CLI to seed a different year. Form state persists in localStorage and a reset-to-defaults button is provided. Also fixes a latent service-layer bug: when overriding stores/products on a scenario preset, _build_config_from_params replaced the whole DimensionConfig and silently dropped scenario-customized store_regions, store_types, product_categories, and product_brands. Now uses dataclasses.replace so only the count fields change. Adds two regression tests covering holiday_rush + custom store/product counts. * feat(docs,repo): land claude.md and docs/_base reference suite (#86) (#87) Closes #86. Generated via the /w7_generating-claudemd skill in HEURISTIC_MODE (docs/_kB/repo-map/ KB not yet present). 
Adds: - CLAUDE.md (116 lines, 812 words; references .claude/rules/* and docs/_base/* via @imports — within the 150-line / 1800-word skill budget) - docs/_base/ARCHITECTURE.md (system boundaries, components, comm patterns, deploy chain) - docs/_base/API_CONTRACTS.md (HTTP surface across 12 slices + WebSocket + external integrations) - docs/_base/RUNBOOKS.md (common incidents, release/rollback, WSL/pnpm traps from prior session HANDOFF) - docs/_base/SECURITY.md (threat model, hard rules from security-patterns.md, scanning matrix) - docs/_base/RULES.md (Change Authority Matrix + invariants + forbidden patterns, consolidated from .claude/rules/*) - docs/_base/DOMAIN_MODEL.md (bounded contexts, aggregates, invariants, ubiquitous language) - docs/_base/DEV_GUIDE.md (human-maintained stub — {FILL IN} markers for a maintainer to complete) - docs/_base/REPO_MAP_INDEX.md (index across README, PHASE docs, ADRs, PRPs, .claude/, docs/_base/) - docs/_base/PIPELINE_CONTRACT.md (CI/CD stages, merge gates, release flow) .gitignore adjustments: - Remove `CLAUDE.md` (was blocking the doc from being shared and from being read by Claude in fresh clones) - Add `CLAUDE.local.md` (personal-prefs file — local-only by design) - Stale `.claude` duplicates on lines 2 and 5 left for a separate cleanup PR (deduping won't change behavior since `.claude/` is already tracked) Re-run the skill after a future mapping-repo-context run to drop the remaining 5 [UNVERIFIED] meta-flags. * feat(data,api): seeder Phase 1 — exogenous signals, multi-seasonality, changepoints, returns, substitution (#88) * fix(ci): pin third-party github actions by sha (#84) Closes #84. Per .claude/rules/security-patterns.md: "Pin third-party GitHub Actions by full 40-char SHA"; first-party actions/* may use major-version. 
Pinned (third-party): - googleapis/release-please-action@v5 → @45996ed1f6d02564a971a2fa1b5860e934307cf7 # v5.0.0 - astral-sh/setup-uv@v7 (×8 across all five workflows) → @37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0 - github/codeql-action/upload-sarif@v4 → @c6f931105cb2c34c8f901cc885ba1e2e259cf745 # v4.34.0 Left as major-tag (first-party actions/* — rule-permitted): - actions/checkout@v6 - actions/upload-artifact@v7 Dependabot watches .github/workflows/ weekly and will bump these forward. * chore(repo): gitignore local session artifacts (#90) * fix(ci): pin uv run with --frozen to stop transient resolution failures (#95) (#96) every uv run invocation in ci.yml, schema-validation.yml, and phase-snapshot.yml now uses --frozen. without it, uv re-resolves the dependency graph at command time and crashes when a freshly published pydantic-ai-slim version's [mistral] extra requires a mistralai version that does not yet exist on PyPI — observed on PR #93's most recent push where all five blocking CI jobs went red 75 minutes after a green run on the same branch with the same lockfile. dependency-check.yml's pip-audit calls deliberately retain the re-resolve behavior; that workflow's purpose is to pick up newly published vulnerabilities. uv sync --frozen --all-extras --dev was already in place to install the lock; this patch propagates the same intent to every subsequent uv run. * feat(data,db): seeder phase 2 — retail-depth foundation + lifecycle generator (#92) (#93) * feat(data,db): seeder phase 2 chunk A — retail-depth schema + configs (#92) Lays the foundation for Phase 2 retail depth without changing any generator behaviour: - Alembic migration a8b9c0d1e234 adds sales_daily.channel (NOT NULL, server default 'in_store'), product lifecycle fields (lifecycle_stage, launch_date, discontinue_date, pack_size, subcategory), promotion kind discriminator with JSONB bundle_member_product_ids, and a new replenishment_event table. 
All additive; retail_standard rows are unchanged. - ORM mirrors the schema, including a load-bearing JSONB(none_as_null=True) so the bundle-members CHECK fires. - Five new config dataclasses (ChannelConfig, LifecycleConfig, BundleConfig, MarkdownConfig, LeadTimeConfig) wired to SeederConfig with disabled defaults so all existing scenarios produce byte-identical row counts. - 25 integration tests cover the new CHECK + nullability constraints; 8 unit tests guard the config defaults + regression invariant across every ScenarioPreset. * feat(data): seeder phase 2 chunk B (1/5) — product lifecycle generator (#92) First slice of Phase 2 generators. Strict regression invariant: with ``LifecycleConfig.enable=False`` (default) ProductGenerator's output and rng-draw sequence are byte-identical to pre-Phase-2. - ProductGenerator gains optional ``lifecycle_config`` + ``date_range`` parameters. When enabled, each product row carries ``subcategory``, ``pack_size``, ``launch_date``, ``discontinue_date``, ``lifecycle_stage``. - New ``LifecycleGenerator`` (pure compute, no DB) computes per-(product, date) demand multipliers across intro/growth/maturity/decline/ discontinued segments. Disabled path returns 1.0 without touching rng. - 14 new unit tests cover the regression invariant + each ramp segment + discontinue override + reproducibility under enabled mode. Remaining chunk B work (next commits on this branch): - BundleGenerator (BOGO + bundle promotions) - MarkdownGenerator (clearance pricing) - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (2/5) — bundle/BOGO generator (#92) Second slice of Phase 2 generators. Same regression invariant as B 1/5: with ``BundleConfig.enable=False`` (default) ``BundleGenerator.apply`` leaves both the promotion list and the rng state byte-identical. 
- New ``BundleGenerator`` (pure compute, no DB) wraps ``PromotionGenerator``'s output: per-promo ``bundle_probability`` chance to convert to ``kind='bundle'`` or ``kind='bogo'`` (split by ``bogo_share_within_bundles``), drawing 2–``max_bundle_size`` member product IDs (host excluded) and a discount in ``[bundle_discount_pct_min, bundle_discount_pct_max]`` quantized to ``Numeric(5, 4)``. ``discount_amount`` is cleared on converted rows to keep the row internally consistent with the new ``discount_pct``. - Locked rng order per converted promo: ``random()`` (convert?) → ``random()`` (bogo?) → ``randint()`` (n_members) → ``sample()`` (members) → ``uniform()`` (discount). Per-host pool-too-small skip happens before any rng draw so the stream stays stable across runs where only the product pool shrinks. - 18 new unit tests cover the regression invariant (no mutation, no rng consumption) + kind allow-list + member-pool sourcing + count + discount range + BOGO/bundle split at extremes + reproducibility + best-effort skip for small pools + config validation. Remaining chunk B work: - MarkdownGenerator (clearance pricing — needs Open Q on inventory age coupling resolved before starting) - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (3/5) — markdown generator (#92) Third slice of Phase 2 generators. Same regression invariant as B 1/5 and B 2/5: with ``MarkdownConfig.enable=False`` (default) the generator emits empty containers and consumes zero rng state. - New ``MarkdownGenerator`` (pure compute, no DB) emits ``Promotion(kind='markdown')`` rows + companion ``PriceHistory`` drop rows + a ``markdown_dates`` lookup keyed by ``(store_id, product_id)`` for the future ``SalesDailyGenerator`` lift integration in chunk B 5/5. 
- Two triggers ship in this slice: - ``lifecycle_decline`` — chain-wide markdown (``store_id=None``) starting on the first date a product enters the ``decline`` stage according to a passed-in ``LifecycleGenerator``. Skips products without lifecycle attrs; emits no rows when lifecycle is disabled. - ``stockout_risk`` — per-``(store, product)`` markdown ending the day before each observed stockout, lasting ``markdown_duration_days`` days, clamped to the seeded range start. Overlapping windows are deduped within each ``(store, product)`` series. - ``trigger='age_days'`` is deferred — raises ``NotImplementedError`` pointing at issue #94 (follow-up). The default trigger remains ``lifecycle_decline`` so scenarios that just flip the enable bit still produce meaningful output. - Even the enabled path is fully deterministic (no rng draws). The ``rng`` constructor parameter is kept for API consistency with peer Phase 2 generators in case future variants need randomness. - 21 new unit tests cover the regression invariant + lifecycle_decline correctness (chain-wide, skipping missing lifecycle, clamp-to-range, no decline = no output) + stockout_risk correctness (per-store, end-day-before-stockout, overlap dedupe, clamp-to-start, unknown product, dict-order independence) + age_days NotImplementedError + config validation (depth bounds, duration bounds). Remaining chunk B work: - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (4/5) — replenishment generator (#92) Fourth slice of Phase 2 generators. Same regression invariant: with ``LeadTimeConfig.enable=False`` (default) the generator returns ``[]`` and consumes zero rng state. - New ``ReplenishmentGenerator`` (pure compute, no DB) emits ``replenishment_event`` dicts. Per ``(store, product)`` it places a PO every ``order_frequency_days`` starting at ``dates[0]``. 
Each PO consumes two locked rng draws: ``gauss(mean_lead_time_days, lead_time_sigma_days)`` clamped to ``>= 0`` → ``gauss(fill_rate_mean, fill_rate_sigma)`` clamped to ``[0, 1]``. ``ordered_qty = base_demand * (order_frequency_days + safety_stock_days)``; ``received_qty = round(ordered_qty * fill_rate)`` defensively clamped to ``[0, ordered_qty]``. - Receipts whose ``date_received = date_placed + lead_time_days`` fall past ``dates[-1]`` are dropped to keep the FK to ``calendar`` valid. - Sorted iteration over ``(store_id, product_id)`` makes the rng stream stable regardless of input ordering. - 21 new unit tests cover the regression invariant + record shape + ordered_qty formula + dates-within-range + reproducibility + input-order independence + extreme fill rates (zero/full) + zero lead time + output sort order + 7 config-validation cases. Downstream coupling: a follow-up commit will adjust ``InventorySnapshotGenerator`` to consume these events so realistic stockout windows emerge between scheduled receipts. This slice only emits the rows. Remaining chunk B work: - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (5a/6) — lifecycle multiplier into sales (#92) First half of B 5/5 (split per Open Q3 — channel integration deferred until semantics are confirmed). Wires the LifecycleGenerator multiplier into ``SalesDailyGenerator`` while preserving the byte-identical regression invariant. - ``SalesDailyGenerator.__init__`` gains optional ``lifecycle: LifecycleGenerator | None = None``. Defaults preserve pre-Phase-2 behavior for every existing caller. - ``SalesDailyGenerator.generate`` gains optional ``product_lifecycle_data: dict[int, tuple[date | None, date | None]] | None = None``. Missing or unspecified entries fall back to ``(None, None)`` so the multiplier evaluates to 1.0. 
- ``_compute_demand`` gains ``product_discontinue_date`` and applies the lifecycle multiplier guarded by ``self.lifecycle is not None and self.lifecycle.enabled``. The pre-Phase-2 ``new_product_ramp_days`` linear ramp is suppressed when lifecycle is enabled, preventing double-attenuation at launch. - 10 new tests cover the regression invariant (no kwargs / explicit None / disabled config / no rng consumption when disabled), enabled correctness (pre-launch zero, post-discontinue zero, intro < maturity, decline < maturity), legacy-ramp suppression (no double-apply when lifecycle on; still fires when lifecycle is None), and the lookup fallback (missing product_id evaluates to 1.0). The B 5b/6 channel integration is held until Open Q3 resolves between (b) dominant per row, (c) random per row from channel_mix weights, or (d) aggregated with primary channel column. Remaining Phase 2 work: - B 5b/6 — SalesDailyGenerator channel split (pending Q3) - Chunk C — DataSeeder orchestration + endpoints + integration tests * feat(data): seeder phase 2 chunk B (5b/6) — channel split into sales (#92) Second half of B 5/5. Resolves Open Q3 with semantic (c): each emitted ``sales_daily`` row gets its ``channel`` drawn from ``channel_mix`` via ``rng.choices``, preserving the existing ``(date, store, product)`` grain. - ``SalesDailyGenerator.__init__`` gains optional ``channels: ChannelConfig | None = None``. Disabled / unset path consumes zero new rng draws and emits rows without a ``channel`` key (DB ``server_default='in_store'`` applies), preserving the byte-identical regression invariant. - ``generate()`` runs ``_validate_channels()`` once at entry. Rejects channels outside the SQL allow-list, negative weights, all-zero mix, negative ``online_promo_uplift``, or ``online_substitution_to_instore`` outside ``[0, 1]``. 
- Per emitted row (after stockout-skip): ``_maybe_apply_channel`` builds the effective mix (``online_substitution_to_instore`` shifts weight from in_store → online during promos), draws a channel via ``rng.choices``, and applies ``online_promo_uplift`` to online rows on promo dates. One rng draw per emitted row. - 19 new tests cover regression invariant (no kwarg, disabled config, no rng consumption) + channel distribution (subset of mix keys, single-channel deterministic, dominant most common, zero-weight never chosen) + online promo uplift (fires for online + promo, not for in_store) + substitution shift (more online during promo, zero substitution = no shift) + 6 validation cases + row shape (channel key present/absent). Phase 2 chunk B complete (5/6 paired slices + 1/6 follow-up #94). Next: Chunk C — DataSeeder orchestration + new endpoints + integration tests + docs. * feat(data,api): seeder phase 2 chunk c1 — orchestration + endpoints (#92) extend GenerateParams with 5 enable flags + channel_mix / lifecycle / bundle / markdown / lead-time fields; channel_mix validator enforces the SQL allow-list and at least one positive weight. Service layer translates the new params into ChannelConfig / LifecycleConfig / BundleConfig / MarkdownConfig / LeadTimeConfig overrides. DataSeeder.generate_full now wires LifecycleGenerator + BundleGenerator + MarkdownGenerator + ReplenishmentGenerator + ChannelConfig. Product lifecycle dates are fetched alongside base_price in a single query and threaded into SalesDailyGenerator. A new _normalize_promotion_records helper enforces a uniform key set across the mixed pct_off / bundle / bogo / markdown promo records so the bulk pg_insert builds a valid multi-row VALUES clause. delete_data drops replenishment_event first (leaf table). verify_data_integrity gains 3 Phase 2 invariants: bundle member-ID consistency, lifecycle date ordering, replenishment fill rate. 
append_data mirrors the new return signature and fetches lifecycle dates from existing products. new endpoints: GET /seeder/channels returns the SQL allow-list; GET /dimensions/products/{id}/lifecycle-curve returns the reference demand-multiplier curve via LifecycleGenerator.multiplier_for, using default LifecycleConfig ramp parameters and the product's own launch_date / discontinue_date. SeederStatus + SeederResult both grow a replenishment_events count. disabled-path regression invariant preserved: every Phase 2 flag defaults off and consumes zero rng when off. * feat(data,docs): seeder phase 2 chunk c2 — integration tests + docs (#92) test_phase2_integration.py covers the disabled-path regression (no Phase 2 rows when toggles are off), per-feature enabled tests (lifecycle populates dates, bundles convert promotions with bundle_member_product_ids non-NULL, markdowns can emit rows when lifecycle is also on, replenishment respects received_qty <= ordered_qty, multichannel writes distinct channels), full-on verify_data_integrity returning an empty error list, and delete ordering that wipes replenishment_event without FK violations. Tests are marked @pytest.mark.integration so they only run against the real docker-compose Postgres. docs/DATA-SEEDER.md adds a Phase 2 retail-depth section documenting all five toggles with example JSON payloads, the two new endpoints (GET /seeder/channels, GET /dimensions/products/{id}/lifecycle-curve), and three new Data Integrity checks. * feat(release): trigger v0.2.8 release for seeder phases 1+2 (#98) (#99) * feat(api,ui): expose seeder date range and scale controls (#82) (#83) Surface the existing GenerateParams knobs in the admin Data Seeder panel (scenario, date range, store/product counts, seed, sparsity) so operators no longer have to drop to the CLI to seed a different year. Form state persists in localStorage and a reset-to-defaults button is provided. 
Also fixes a latent service-layer bug: when overriding stores/products on a scenario preset, _build_config_from_params replaced the whole DimensionConfig and silently dropped scenario-customized store_regions, store_types, product_categories, and product_brands. Now uses dataclasses.replace so only the count fields change. Adds two regression tests covering holiday_rush + custom store/product counts. * feat(docs,repo): land claude.md and docs/_base reference suite (#86) (#87) Closes #86. Generated via the /w7_generating-claudemd skill in HEURISTIC_MODE (docs/_kB/repo-map/ KB not yet present). Adds: - CLAUDE.md (116 lines, 812 words; references .claude/rules/* and docs/_base/* via @imports — within the 150-line / 1800-word skill budget) - docs/_base/ARCHITECTURE.md (system boundaries, components, comm patterns, deploy chain) - docs/_base/API_CONTRACTS.md (HTTP surface across 12 slices + WebSocket + external integrations) - docs/_base/RUNBOOKS.md (common incidents, release/rollback, WSL/pnpm traps from prior session HANDOFF) - docs/_base/SECURITY.md (threat model, hard rules from security-patterns.md, scanning matrix) - docs/_base/RULES.md (Change Authority Matrix + invariants + forbidden patterns, consolidated from .claude/rules/*) - docs/_base/DOMAIN_MODEL.md (bounded contexts, aggregates, invariants, ubiquitous language) - docs/_base/DEV_GUIDE.md (human-maintained stub — {FILL IN} markers for a maintainer to complete) - docs/_base/REPO_MAP_INDEX.md (index across README, PHASE docs, ADRs, PRPs, .claude/, docs/_base/) - docs/_base/PIPELINE_CONTRACT.md (CI/CD stages, merge gates, release flow) .gitignore adjustments: - Remove `CLAUDE.md` (was blocking the doc from being shared and from being read by Claude in fresh clones) - Add `CLAUDE.local.md` (personal-prefs file — local-only by design) - Stale `.claude` duplicates on lines 2 and 5 left for a separate cleanup PR (deduping won't change behavior since `.claude/` is already tracked) Re-run the skill after a future 
mapping-repo-context run to drop the remaining 5 [UNVERIFIED] meta-flags. * feat(data,api): seeder Phase 1 — exogenous signals, multi-seasonality, changepoints, returns, substitution (#88) * fix(ci): pin third-party github actions by sha (#84) Closes #84. Per .claude/rules/security-patterns.md: "Pin third-party GitHub Actions by full 40-char SHA"; first-party actions/* may use major-version. Pinned (third-party): - googleapis/release-please-action@v5 → @45996ed1f6d02564a971a2fa1b5860e934307cf7 # v5.0.0 - astral-sh/setup-uv@v7 (×8 across all five workflows) → @37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0 - github/codeql-action/upload-sarif@v4 → @c6f931105cb2c34c8f901cc885ba1e2e259cf745 # v4.34.0 Left as major-tag (first-party actions/* — rule-permitted): - actions/checkout@v6 - actions/upload-artifact@v7 Dependabot watches .github/workflows/ weekly and will bump these forward. * chore(repo): gitignore local session artifacts (#90) * fix(ci): pin uv run with --frozen to stop transient resolution failures (#95) (#96) every uv run invocation in ci.yml, schema-validation.yml, and phase-snapshot.yml now uses --frozen. without it, uv re-resolves the dependency graph at command time and crashes when a freshly published pydantic-ai-slim version's [mistral] extra requires a mistralai version that does not yet exist on PyPI — observed on PR #93's most recent push where all five blocking CI jobs went red 75 minutes after a green run on the same branch with the same lockfile. dependency-check.yml's pip-audit calls deliberately retain the re-resolve behavior; that workflow's purpose is to pick up newly published vulnerabilities. uv sync --frozen --all-extras --dev was already in place to install the lock; this patch propagates the same intent to every subsequent uv run. 
* feat(data,db): seeder phase 2 — retail-depth foundation + lifecycle generator (#92) (#93) * feat(data,db): seeder phase 2 chunk A — retail-depth schema + configs (#92) Lays the foundation for Phase 2 retail depth without changing any generator behaviour: - Alembic migration a8b9c0d1e234 adds sales_daily.channel (NOT NULL, server default 'in_store'), product lifecycle fields (lifecycle_stage, launch_date, discontinue_date, pack_size, subcategory), promotion kind discriminator with JSONB bundle_member_product_ids, and a new replenishment_event table. All additive; retail_standard rows are unchanged. - ORM mirrors the schema, including a load-bearing JSONB(none_as_null=True) so the bundle-members CHECK fires. - Five new config dataclasses (ChannelConfig, LifecycleConfig, BundleConfig, MarkdownConfig, LeadTimeConfig) wired to SeederConfig with disabled defaults so all existing scenarios produce byte-identical row counts. - 25 integration tests cover the new CHECK + nullability constraints; 8 unit tests guard the config defaults + regression invariant across every ScenarioPreset. * feat(data): seeder phase 2 chunk B (1/5) — product lifecycle generator (#92) First slice of Phase 2 generators. Strict regression invariant: with ``LifecycleConfig.enable=False`` (default) ProductGenerator's output and rng-draw sequence are byte-identical to pre-Phase-2. - ProductGenerator gains optional ``lifecycle_config`` + ``date_range`` parameters. When enabled, each product row carries ``subcategory``, ``pack_size``, ``launch_date``, ``discontinue_date``, ``lifecycle_stage``. - New ``LifecycleGenerator`` (pure compute, no DB) computes per-(product, date) demand multipliers across intro/growth/maturity/decline/ discontinued segments. Disabled path returns 1.0 without touching rng. - 14 new unit tests cover the regression invariant + each ramp segment + discontinue override + reproducibility under enabled mode. 
Remaining chunk B work (next commits on this branch): - BundleGenerator (BOGO + bundle promotions) - MarkdownGenerator (clearance pricing) - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (2/5) — bundle/BOGO generator (#92) Second slice of Phase 2 generators. Same regression invariant as B 1/5: with ``BundleConfig.enable=False`` (default) ``BundleGenerator.apply`` leaves both the promotion list and the rng state byte-identical. - New ``BundleGenerator`` (pure compute, no DB) wraps ``PromotionGenerator``'s output: per-promo ``bundle_probability`` chance to convert to ``kind='bundle'`` or ``kind='bogo'`` (split by ``bogo_share_within_bundles``), drawing 2–``max_bundle_size`` member product IDs (host excluded) and a discount in ``[bundle_discount_pct_min, bundle_discount_pct_max]`` quantized to ``Numeric(5, 4)``. ``discount_amount`` is cleared on converted rows to keep the row internally consistent with the new ``discount_pct``. - Locked rng order per converted promo: ``random()`` (convert?) → ``random()`` (bogo?) → ``randint()`` (n_members) → ``sample()`` (members) → ``uniform()`` (discount). Per-host pool-too-small skip happens before any rng draw so the stream stays stable across runs where only the product pool shrinks. - 18 new unit tests cover the regression invariant (no mutation, no rng consumption) + kind allow-list + member-pool sourcing + count + discount range + BOGO/bundle split at extremes + reproducibility + best-effort skip for small pools + config validation. 
Remaining chunk B work: - MarkdownGenerator (clearance pricing — needs Open Q on inventory age coupling resolved before starting) - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (3/5) — markdown generator (#92) Third slice of Phase 2 generators. Same regression invariant as B 1/5 and B 2/5: with ``MarkdownConfig.enable=False`` (default) the generator emits empty containers and consumes zero rng state. - New ``MarkdownGenerator`` (pure compute, no DB) emits ``Promotion(kind='markdown')`` rows + companion ``PriceHistory`` drop rows + a ``markdown_dates`` lookup keyed by ``(store_id, product_id)`` for the future ``SalesDailyGenerator`` lift integration in chunk B 5/5. - Two triggers ship in this slice: - ``lifecycle_decline`` — chain-wide markdown (``store_id=None``) starting on the first date a product enters the ``decline`` stage according to a passed-in ``LifecycleGenerator``. Skips products without lifecycle attrs; emits no rows when lifecycle is disabled. - ``stockout_risk`` — per-``(store, product)`` markdown ending the day before each observed stockout, lasting ``markdown_duration_days`` days, clamped to the seeded range start. Overlapping windows are deduped within each ``(store, product)`` series. - ``trigger='age_days'`` is deferred — raises ``NotImplementedError`` pointing at issue #94 (follow-up). The default trigger remains ``lifecycle_decline`` so scenarios that just flip the enable bit still produce meaningful output. - Even the enabled path is fully deterministic (no rng draws). The ``rng`` constructor parameter is kept for API consistency with peer Phase 2 generators in case future variants need randomness. 
- 21 new unit tests cover the regression invariant + lifecycle_decline correctness (chain-wide, skipping missing lifecycle, clamp-to-range, no decline = no output) + stockout_risk correctness (per-store, end-day-before-stockout, overlap dedupe, clamp-to-start, unknown product, dict-order independence) + age_days NotImplementedError + config validation (depth bounds, duration bounds). Remaining chunk B work: - ReplenishmentGenerator (lead-time-driven replenishment_event rows) - SalesDailyGenerator channel + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (4/5) — replenishment generator (#92) Fourth slice of Phase 2 generators. Same regression invariant: with ``LeadTimeConfig.enable=False`` (default) the generator returns ``[]`` and consumes zero rng state. - New ``ReplenishmentGenerator`` (pure compute, no DB) emits ``replenishment_event`` dicts. Per ``(store, product)`` it places a PO every ``order_frequency_days`` starting at ``dates[0]``. Each PO consumes two locked rng draws: ``gauss(mean_lead_time_days, lead_time_sigma_days)`` clamped to ``>= 0`` → ``gauss(fill_rate_mean, fill_rate_sigma)`` clamped to ``[0, 1]``. ``ordered_qty = base_demand * (order_frequency_days + safety_stock_days)``; ``received_qty = round(ordered_qty * fill_rate)`` defensively clamped to ``[0, ordered_qty]``. - Receipts whose ``date_received = date_placed + lead_time_days`` fall past ``dates[-1]`` are dropped to keep the FK to ``calendar`` valid. - Sorted iteration over ``(store_id, product_id)`` makes the rng stream stable regardless of input ordering. - 21 new unit tests cover the regression invariant + record shape + ordered_qty formula + dates-within-range + reproducibility + input-order independence + extreme fill rates (zero/full) + zero lead time + output sort order + 7 config-validation cases. 
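The per-PO arithmetic in the B 4/5 message can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the repository's ``ReplenishmentGenerator``: the function name ``simulate_po``, the default parameter values, the integer rounding of the lead time, and the dict layout are all hypothetical. The draw order (lead time, then fill rate), the clamps, and the ``ordered_qty`` / ``received_qty`` formulas follow the commit text.

```python
from __future__ import annotations

import random
from datetime import date, timedelta


def simulate_po(
    rng: random.Random,
    date_placed: date,
    last_date: date,
    base_demand: int,
    order_frequency_days: int = 7,      # hypothetical default
    safety_stock_days: int = 2,         # hypothetical default
    mean_lead_time_days: float = 3.0,   # hypothetical default
    lead_time_sigma_days: float = 1.0,  # hypothetical default
    fill_rate_mean: float = 0.95,       # hypothetical default
    fill_rate_sigma: float = 0.05,      # hypothetical default
) -> dict | None:
    # Draw 1: lead time, clamped to >= 0 (rounding to whole days is an assumption).
    lead_time_days = max(0, round(rng.gauss(mean_lead_time_days, lead_time_sigma_days)))
    # Draw 2: fill rate, clamped to [0, 1].
    fill_rate = min(1.0, max(0.0, rng.gauss(fill_rate_mean, fill_rate_sigma)))

    ordered_qty = base_demand * (order_frequency_days + safety_stock_days)
    # Defensive clamp to [0, ordered_qty], as described in the commit message.
    received_qty = min(ordered_qty, max(0, round(ordered_qty * fill_rate)))

    date_received = date_placed + timedelta(days=lead_time_days)
    if date_received > last_date:
        # Receipts past the seeded range are dropped; both draws were still consumed.
        return None
    return {
        "date_placed": date_placed,
        "date_received": date_received,
        "ordered_qty": ordered_qty,
        "received_qty": received_qty,
    }
```

Because both draws happen before the range check, every PO consumes exactly two rng values whether or not its receipt is kept, which is one way to keep the stream stable across date ranges.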
Downstream coupling: a follow-up commit will adjust ``InventorySnapshotGenerator`` to consume these events so realistic stockout windows emerge between scheduled receipts. This slice only emits the rows. Remaining chunk B work: - SalesDailyGenerator channel split + lifecycle multiplier integration * feat(data): seeder phase 2 chunk B (5a/6) — lifecycle multiplier into sales (#92) First half of B 5/5 (split per Open Q3 — channel integration deferred until semantics are confirmed). Wires the LifecycleGenerator multiplier into ``SalesDailyGenerator`` while preserving the byte-identical regression invariant. - ``SalesDailyGenerator.__init__`` gains optional ``lifecycle: LifecycleGenerator | None = None``. Defaults preserve pre-Phase-2 behavior for every existing caller. - ``SalesDailyGenerator.generate`` gains optional ``product_lifecycle_data: dict[int, tuple[date | None, date | None]] | None = None``. Missing or unspecified entries fall back to ``(None, None)`` so the multiplier evaluates to 1.0. - ``_compute_demand`` gains ``product_discontinue_date`` and applies the lifecycle multiplier guarded by ``self.lifecycle is not None and self.lifecycle.enabled``. The pre-Phase-2 ``new_product_ramp_days`` linear ramp is suppressed when lifecycle is enabled, preventing double-attenuation at launch. - 10 new tests cover the regression invariant (no kwargs / explicit None / disabled config / no rng consumption when disabled), enabled correctness (pre-launch zero, post-discontinue zero, intro < maturity, decline < maturity), legacy-ramp suppression (no double-apply when lifecycle on; still fires when lifecycle is None), and the lookup fallback (missing product_id evaluates to 1.0). The B 5b/6 channel integration is held until Open Q3 resolves between (b) dominant per row, (c) random per row from channel_mix weights, or (d) aggregated with primary channel column. 
Remaining Phase 2 work: - B 5b/6 — SalesDailyGenerator channel split (pending Q3) - Chunk C — DataSeeder orchestration + endpoints + integration tests * feat(data): seeder phase 2 chunk B (5b/6) — channel split into sales (#92) Second half of B 5/5. Resolves Open Q3 with semantic (c): each emitted ``sales_daily`` row gets its ``channel`` drawn from ``channel_mix`` via ``rng.choices``, preserving the existing ``(date, store, product)`` grain. - ``SalesDailyGenerator.__init__`` gains optional ``channels: ChannelConfig | None = None``. Disabled / unset path consumes zero new rng draws and emits rows without a ``channel`` key (DB ``server_default='in_store'`` applies), preserving the byte-identical regression invariant. - ``generate()`` runs ``_validate_channels()`` once at entry. Rejects channels outside the SQL allow-list, negative weights, all-zero mix, negative ``online_promo_uplift``, or ``online_substitution_to_instore`` outside ``[0, 1]``. - Per emitted row (after stockout-skip): ``_maybe_apply_channel`` builds the effective mix (``online_substitution_to_instore`` shifts weight from in_store → online during promos), draws a channel via ``rng.choices``, and applies ``online_promo_uplift`` to online rows on promo dates. One rng draw per emitted row. - 19 new tests cover regression invariant (no kwarg, disabled config, no rng consumption) + channel distribution (subset of mix keys, single-channel deterministic, dominant most common, zero-weight never chosen) + online promo uplift (fires for online + promo, not for in_store) + substitution shift (more online during promo, zero substitution = no shift) + 6 validation cases + row shape (channel key present/absent). Phase 2 chunk B complete (5/6 paired slices + 1/6 follow-up #94). Next: Chunk C — DataSeeder orchestration + new endpoints + integration tests + docs. 
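The channel-split behaviour in the B 5b/6 message can be sketched as follows. This is an illustration only: the free function ``maybe_apply_channel``, its argument names, the exact way the substitution weight is shifted, and the multiplicative form of the uplift are assumptions, not the code in ``SalesDailyGenerator``. What it does take from the commit text is the one ``rng.choices`` draw per emitted row and the zero-draw, no-``channel``-key disabled path.

```python
import random


def maybe_apply_channel(
    rng: random.Random,
    row: dict,
    channel_mix: dict,
    enabled: bool,
    is_promo: bool = False,
    online_substitution_to_instore: float = 0.0,
    online_promo_uplift: float = 0.0,
) -> dict:
    if not enabled:
        # Disabled path: no rng draw and no "channel" key, so the DB default applies.
        return row

    mix = dict(channel_mix)
    if is_promo and online_substitution_to_instore > 0:
        # Assumed shift rule: move a fraction of the in_store weight to online.
        shifted = mix.get("in_store", 0.0) * online_substitution_to_instore
        mix["in_store"] = mix.get("in_store", 0.0) - shifted
        mix["online"] = mix.get("online", 0.0) + shifted

    # Sorted keys keep the draw independent of dict insertion order.
    channels = sorted(mix)
    channel = rng.choices(channels, weights=[mix[c] for c in channels], k=1)[0]

    out = {**row, "channel": channel}
    if channel == "online" and is_promo:
        # Assumed uplift form: multiplicative on quantity.
        out["quantity"] = round(row["quantity"] * (1 + online_promo_uplift))
    return out
```

Comparing ``rng.getstate()`` before and after a disabled call is a direct way to test the "no rng consumption" half of the regression invariant.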
* feat(data,api): seeder phase 2 chunk c1 — orchestration + endpoints (#92) extend GenerateParams with 5 enable flags + channel_mix / lifecycle / bundle / markdown / lead-time fields; channel_mix validator enforces the SQL allow-list and at least one positive weight. Service layer translates the new params into ChannelConfig / LifecycleConfig / BundleConfig / MarkdownConfig / LeadTimeConfig overrides. DataSeeder.generate_full now wires LifecycleGenerator + BundleGenerator + MarkdownGenerator + ReplenishmentGenerator + ChannelConfig. Product lifecycle dates are fetched alongside base_price in a single query and threaded into SalesDailyGenerator. A new _normalize_promotion_records helper enforces a uniform key set across the mixed pct_off / bundle / bogo / markdown promo records so the bulk pg_insert builds a valid multi-row VALUES clause. delete_data drops replenishment_event first (leaf table). verify_data_integrity gains 3 Phase 2 invariants: bundle member-ID consistency, lifecycle date ordering, replenishment fill rate. append_data mirrors the new return signature and fetches lifecycle dates from existing products. new endpoints: GET /seeder/channels returns the SQL allow-list; GET /dimensions/products/{id}/lifecycle-curve returns the reference demand-multiplier curve via LifecycleGenerator.multiplier_for, using default LifecycleConfig ramp parameters and the product's own launch_date / discontinue_date. SeederStatus + SeederResult both grow a replenishment_events count. disabled-path regression invariant preserved: every Phase 2 flag defaults off and consumes zero rng when off. 
* feat(data,docs): seeder phase 2 chunk c2 — integration tests + docs (#92) test_phase2_integration.py covers the disabled-path regression (no Phase 2 rows when toggles are off), per-feature enabled tests (lifecycle populates dates, bundles convert promotions with bundle_member_product_ids non-NULL, markdowns can emit rows when lifecycle is also on, replenishment respects received_qty <= ordered_qty, multichannel writes distinct channels), full-on verify_data_integrity returning an empty error list, and delete ordering that wipes replenishment_event without FK violations. Tests are marked @pytest.mark.integration so they only run against the real docker-compose Postgres. docs/DATA-SEEDER.md adds a Phase 2 retail-depth section documenting all five toggles with example JSON payloads, the two new endpoints (GET /seeder/channels, GET /dimensions/products/{id}/lifecycle-curve), and three new Data Integrity checks. * feat(release): trigger v0.2.8 release for seeder phases 1+2 (#98) * chore(main): release 0.2.8 (#100)
Closes #102. Adds a new "Common Incidents" entry to docs/_base/RUNBOOKS.md covering the trap hit during the v0.2.8 release: gh pr merge --merge uses the PR title verbatim as the merge-commit subject, so a chore(...) PR title makes release-please skip the bump. Also adds a warning callout to the "Cut a release" block. Symptom + diagnosis + prevention (web UI / --subject / feat: title) + recovery (empty feat(release): trigger commit, ref PRs #99→#100→#101).
Closes #104. The Settings tests documented in RUNBOOKS.md as ".env-bleed cases" now pass with or without a local .env present.
test_settings_has_defaults: pass _env_file=None to Settings() so asserted defaults are not overridden by a developer's .env.
test_validate_*_key_missing (×3): patch app.features.agents.agents.base.get_settings via monkeypatch.setattr because get_settings is @lru_cache'd in app/core/config.py -- the cached singleton was already built with .env values at import time, so monkeypatch.setenv(..., "") had no effect on the cached attribute. The replacement lambda returns Settings(_env_file=None, <provider>_api_key="") so the missing-key branch fires deterministically regardless of host env.
Verified: 15/15 targeted tests pass with .env present AND with .env absent (CI parity). Full unit suite goes from 897 -> 901 pass; mypy + pyright clean.
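The caching trap behind the ``test_validate_*_key_missing`` fix can be reproduced with the standard library. The sketch below is a simplified analogue, not the project's code: ``Settings``, ``validate_key``, ``demo``, and the ``DEMO_API_KEY`` variable are hypothetical stand-ins, and rebinding the module-level name plays the role of ``monkeypatch.setattr``.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for the application's Settings class."""

    api_key: str


@lru_cache
def get_settings() -> Settings:
    # Built once; later environment changes are invisible to callers.
    return Settings(api_key=os.environ.get("DEMO_API_KEY", ""))


def validate_key() -> bool:
    # Resolves get_settings at call time, so rebinding the name takes effect.
    return bool(get_settings().api_key)


def demo() -> tuple:
    global get_settings
    original = get_settings
    os.environ["DEMO_API_KEY"] = "secret"
    original.cache_clear()
    original()  # the cache now holds api_key="secret"
    os.environ["DEMO_API_KEY"] = ""  # analogous to monkeypatch.setenv(..., "")
    stale = validate_key()  # the cached value still wins
    try:
        # Analogous to monkeypatch.setattr on the consuming module's name.
        get_settings = lambda: Settings(api_key="")
        patched = validate_key()
    finally:
        get_settings = original
        original.cache_clear()
        os.environ.pop("DEMO_API_KEY", None)
    return stale, patched
```

Clearing the cache inside the test would also work in this toy version; replacing the function, as the commit does, has the advantage of not depending on whatever the host environment or a local .env contains.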
…107) Spot-verified each [UNVERIFIED] claim against the actual code and the GitHub API, then replaced the placeholders with cited prose:
- ARCHITECTURE.md: header note, hosted-demo claim (app_env="production" is config-only), observability table (no Prometheus/OTEL/Sentry)
- API_CONTRACTS.md: full /agents/stream contract -- 6 event_type literals with per-event data schemas, cited to schemas.py:229
- SECURITY.md: API bind default cited to config.py:32-33; secrets-detection row notes detect-private-key pre-commit hook
- PIPELINE_CONTRACT.md: branch protection from gh api -- dev has 4 required status checks, main has only approval requirement, both block force-push/delete, enforce_admins=false
- DOMAIN_MODEL.md: header meta-note (no body tags)
DEV_GUIDE.md {FILL IN} stubs left untouched (human-maintained per file header). [ASSUMPTION] compliance-scope tags left as-is (confirmed assumptions, not unverified).
…#109) (#110)
* feat(features): add pydantic configs and phase 2 fixtures for featuresets (#109)
Schema-only, additive PR landing the foundation for PRP-3.1B/C/D parallel implementation:
- LifecycleConfig: include_days_since_launch, include_days_since_discontinue, lag_days (1-30). Continuous-only encoding per PRP-3.1 decisions log §1.
- ReplenishmentConfig: include_days_since_last, include_count_window, lag_days, count_window_days (7-60).
- PromotionConfig: kinds_to_track (tuple[Literal["pct_off","bogo","bundle", "markdown"], ...]) with non-empty/unique validator, include_active, include_intensity, lag_days. Generalized from MarkdownConfig per decisions §3.
FeatureSetConfig gains three optional sub-config slots (all `T | None = None`, positioned between exogenous_config and imputation_config) and get_enabled_features() emits "lifecycle","replenishment","promotion" tokens.
Phase 2 DataFrame fixtures (phase2_product_attrs_df, phase2_replenishment_events_df, phase2_promotion_rows_df) added to conftest.py with sequential / derivable values so downstream leakage tests can mathematically detect contamination.
Behavioral note: FeatureConfigBase.config_hash() now uses model_dump_json(exclude_none=True) so that adding new optional fields is hash-invariant for callers that don't set them (additive-contract guarantee required by PRD §6/§11; previously violated by Pydantic's default null-key serialization). New baseline hash for FeatureSetConfig(name="x") is 6c12b1a783eccdd4 and pinned via a snapshot test.
Tests: 46 passing in test_schemas.py (incl. 18 new cases across the three new Configs + 3 new TestFeatureSetConfig cases). Full featuresets module sweep: 75 passing, zero regressions. Forecasting + registry consumers of config_hash: 188 passing.
No service.py edits, no routes.py edits, no DB migration.
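The hash-invariance point in the behavioral note can be illustrated without Pydantic. The sketch below is a standard-library analogue under loud assumptions: the two dataclasses, the ``config_hash`` helper, and the choice of SHA-256 truncated to 16 hex characters are hypothetical and are not claimed to match ``FeatureConfigBase.config_hash()``. It only shows why dropping ``None`` values before hashing keeps old hashes stable when an optional field is added.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class ConfigBefore:
    """Hypothetical config before the new optional slot exists."""

    name: str


@dataclass
class ConfigAfter:
    """Same config after an optional slot is added with a None default."""

    name: str
    lifecycle_config: dict | None = None


def config_hash(cfg, exclude_none: bool) -> str:
    payload = asdict(cfg)
    if exclude_none:
        # Unset optional fields vanish from the serialized form.
        payload = {k: v for k, v in payload.items() if v is not None}
    blob = json.dumps(payload, sort_keys=True)
    # Hash algorithm and truncation length are assumptions for this sketch.
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()[:16]


before = ConfigBefore(name="x")
after = ConfigAfter(name="x")

# With None keys serialized, the new field changes the hash of an unchanged config.
changed = config_hash(before, exclude_none=False) != config_hash(after, exclude_none=False)
# With None keys dropped, callers that never set the field keep their old hash.
stable = config_hash(before, exclude_none=True) == config_hash(after, exclude_none=True)
```

A snapshot test that pins one known hash, as the commit describes, turns this property into a regression guard.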
* docs(docs): add PRP-3.1A through 3.1E for phase 2 feature wiring (#109)
Lands the 5-slice PRP set defining the parallel execution plan for issue #109 (wire phase 2 seeder columns into time-safe featuresets):
- PRP-3.1A (3.1A): pydantic configs + phase 2 fixtures (schema-only, IMPLEMENTED by the preceding commit).
- PRP-3.1B: lifecycle compute method (days_since_launch / days_since_discontinue, continuous-only encoding).
- PRP-3.1C: replenishment compute method (days_since_last_replenishment + rolling event count via async event-table JOIN).
- PRP-3.1D: promotion compute method (per-kind active + intensity features, chain-wide vs store-specific JOIN, NULL-discount handling).
- PRP-3.1E: end-to-end integration test + docs update (PHASE/3-FEATURE_ENGINEERING.md, _base/DOMAIN_MODEL.md, examples).
Recommended sequence: A → B → (C ∥ D) → E. Each downstream PRP cites exact file paths + line numbers in the current repo, has executable validation gates (Level 1-5: ruff / mypy+pyright / unit / leakage / HTTP smoke), and targets a 9/10 confidence score.
Each PRP is self-contained: an implementing agent can read one PRP + edit 2-3 files + run 5-6 commands and ship a green PR.
* feat(features): implement replenishment compute method (#109)
Land PRP-3.1C -- the replenishment compute slice for time-safe featuresets. Add `FeatureEngineeringService._compute_replenishment_features` plus a sidecar events attribute, a one-line orchestrator branch after the exogenous family, and `FeatureDataLoader.load_replenishment_events` with a SQL-side `date <= cutoff_date` filter for time-safety before pandas sees the rows.
Replenishment features land as two columns when the config is set:
  * `days_since_last_replenishment_lag{N}` (float64, NaN sentinel)
  * `replenishment_count_w{W}_lag{N}` (int64, 0 for no-events)
Cross-series isolation is enforced by `merge_asof(..., by=["store_id", "product_id"])` plus per-entity `groupby.shift(...)` -- belt-and-braces. The rolling count uses `shift(1).rolling(W).sum()` (NEVER `rolling().shift()`), and `lag_days > 1` layers an additional shift on top of the canonical safety boundary (PRP-3.1C §15 Decision C).
Tests:
  * `TestReplenishmentLeakage` (4 cases) asserts shift(N) invariance, shift(1).rolling(W) ordering, cross-series isolation, and the on-cutoff inclusion semantics implied by the SQL `<=` filter.
  * `TestReplenishmentFeatures` (5 cases) covers happy path, zero-event entity, single-event entity, cutoff alignment, and dtype contracts.
No schema, route, or migration changes -- additive only. The PRP-3.1A `config_hash_unchanged_when_phase2_omitted` snapshot still passes.
* chore(features): apply ruff format to replenishment compute (#109)
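The time-safety property of the shift-then-roll window can be checked on a single series with plain lists. This is a pure-Python sketch of the windowing idea, assuming ``min_periods=1`` semantics; ``shift_then_rolling_sum`` is a hypothetical helper, not the pandas call used in ``_compute_replenishment_features``, and it ignores the per-entity grouping.

```python
def shift_then_rolling_sum(values, window, lag=1):
    """Trailing sum over `window` rows that ends `lag` rows before the current row.

    For lag=1 the window for row i covers rows [i - window, i), so the value
    observed at row i never contributes to the feature at row i.
    """
    out = []
    for i in range(len(values)):
        end = i - lag + 1  # exclusive upper bound of the window
        start = max(0, end - window)
        # No history yet: emit None, the analogue of NaN at the first rows.
        out.append(sum(values[start:end]) if end > 0 else None)
    return out


# Daily replenishment-event counts for one (store, product) series.
events = [1, 0, 2, 0, 1]
feature = shift_then_rolling_sum(events, window=3)

# Leakage probe: changing today's value must not change today's feature.
tampered = list(events)
tampered[4] = 100
feature_after_tamper = shift_then_rolling_sum(tampered, window=3)
```

The tamper probe mirrors the style of invariant the leakage tests describe: perturb the current row and assert that the feature at that row is unchanged.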
Closes the PRP-3.1 umbrella. Ships:
* New app/features/featuresets/tests/test_phase2_integration.py
(@pytest.mark.integration) -- exercises POST /featuresets/compute
against a real Postgres with lifecycle + replenishment + promotion
configs ALL set, and pins the additive-contract snapshot at the HTTP
boundary (config_hash="6c12b1a783eccdd4", the same value the
PRP-3.1A schema-layer twin pins).
* New TestPhase2CrossConfigLeakage in test_leakage.py (two cases)
composing all three Phase 2 configs with lag_config, asserting
groupby([store,product]).shift(1) invariance survives the compose
and that Phase 2 columns at row 0 are NaN where applicable.
* docs/PHASE/3-FEATURE_ENGINEERING.md -- new "Phase 2 Features
(Retail-Depth)" section documenting the three families, their config
classes, output columns, and the cross-cutting time-safety guarantee.
* docs/_base/DOMAIN_MODEL.md -- three new Ubiquitous Language rows
(days_since_launch, replenishment event, promotion (kind)).
* examples/compute_features_demo.py -- create_phase2_config() helper
and --phase2 CLI flag printing Phase 2 columns from the response.
One small production-code change beyond the PRP's docs+tests scope:
the cutoff_date field on ComputeFeaturesRequest / PreviewFeaturesRequest
gains strict=False. With model-level strict=True, FastAPI's
validate_python rejects every JSON ISO date string ("2024-02-29") with
a 422 -- meaning the endpoints were unreachable from any JSON client.
The override is field-scoped, additive, and unblocks the PRP's stated
integration-test entry point.
Validation: ruff/format clean; mypy/pyright 0 errors; 106 featuresets
unit tests pass; 3 phase2 integration tests pass; 952 app-wide unit
tests pass.
Refs PRPs/PRP-3.1E-phase2-e2e-integration-and-docs.md.
…118) Extends FeatureDataLoader so the lifecycle feature family lights up at the HTTP boundary -- closes the last open gap from the PRP-3.1 umbrella (was a PRP-3.1E §16 Open Question 2 follow-up).
Changes:
* FeatureDataLoader.load_product_attrs(db, product_ids) -- new method loading product.id, launch_date, discontinue_date. No cutoff filter needed (launch/discontinue are timeless product attributes, not facts). Returns an empty DataFrame with the correct columns when no products match, mirroring the load_calendar_data / load_replenishment_events shape.
* compute_features_for_series() calls load_product_attrs when config.lifecycle_config is set and merges onto the sales frame via product_id BEFORE constructing the FeatureEngineeringService. The merge attaches the same per-product values to every sales row; _compute_lifecycle_features derives the per-row delta + shift(N) downstream.
* test_phase2_integration.py: PHASE2_EXPECTED_COLUMNS expanded to include days_since_launch_lag1 and days_since_discontinue_lag1. The seeded product has launch_date=date(2023, 6, 1) and discontinue_date=None, so launch resolves to a real integer; the discontinue column appears with all-NaN values (header still emitted).
* docs/PHASE/3-FEATURE_ENGINEERING.md: removed the "at the HTTP boundary, the current FeatureDataLoader does not yet join..." note and replaced with a positive statement about the end-to-end wiring.
Validation:
* uv run ruff check . && uv run ruff format --check . -- clean.
* uv run mypy app/ -- 0 errors.
* uv run pyright app/features/featuresets/ -- 0 errors.
* uv run pytest app/features/featuresets/ -v -m "not integration" -- 106 passed.
* uv run pytest app/features/featuresets/tests/test_phase2_integration.py -v -m integration -- 3 passed (now exercising the lifecycle path end-to-end against real Postgres).
* uv run pytest app/ -m "not integration" -- 952 passed (no regression outside featuresets).
… bodies (#117) (#119) Audits ConfigDict(strict=True) request models across app/features/**/schemas.py and lands the repo-wide policy from issue #117. Without the override, FastAPI's _compat/v2.py:175 calls validate_python on the JSON-parsed dict and Pydantic strict-mode rejects every ISO date string -- every HTTP caller would 422.
* TrainRequest.train_start_date / train_end_date now ship with field-level strict=False (mirrors PR #115 on ComputeFeaturesRequest / PreviewFeaturesRequest).
* PredictRequest has no JSON-non-native fields; test pins this expectation so a future contributor adding a date field is forced to update the override.
* New TestRequestJSONDateGotcha in featuresets/test_schemas.py and forecasting/test_schemas.py exercise Model.model_validate(dict_with_iso_string) -- the exact path FastAPI takes -- so regressions land in the fast unit pass.
* New test_routes.py per slice POSTs JSON with ISO date strings and asserts no date_type 422 errors (integration-marked, runs against real Postgres in CI).
* docs/_base/SECURITY.md grows a "Pydantic v2 strict mode on FastAPI request bodies" subsection under Input Validation documenting the policy and rationale.
Closes #117.
#120) (#121) Adds app/core/tests/test_strict_mode_policy.py -- an AST-walker pytest invariant that fails CI when any field on a ConfigDict(strict=True) request model under app/features/**/schemas.py is typed date | datetime | time | UUID | Decimal without the matching Field(strict=False, ...) override.
Same regression class shipped twice in 14 days (PR #115 closing #109, PR #119 closing #117). Policy lived only in docs/_base/SECURITY.md, enforced by human reviewers; this codifies it as a never-weaken executable invariant, mirroring the test_leakage.py precedent.
- 4 known-good baseline fields PASS on commit one (no schema churn)
- Baseline-guard test prevents vacuous-green silent-pass
- Negative-fixture self-test proves the walker classifies correctly
- AST-only, zero new dependencies
- docs/_base/SECURITY.md gains an "Enforced by:" cross-reference
- PRPs/PRP-14-strict-mode-policy-linter.md is the full design rationale
Depends on the SECURITY.md "Pydantic v2 strict mode on FastAPI request bodies" subsection introduced on fix/api-features-strict-mode-audit (commit 3eb3d4e). This branch is based on that fix branch; the PR should land after the #117 fix lands on dev.
Closes #120.
#113) (#122)

PRP-3.1C's pseudocode originally referenced W=3 as the smallest meaningful trailing window for replenishment-count features, but PRP-3.1A's schema set `ge=7` with no recorded rationale. The mismatch forced `test_count_window_uses_shift_then_rolling` to use W=7 instead of the original W=3 probe.

This change aligns the schema with PRP-3.1C's intent:

- `count_window_days = Field(default=14, ge=3, le=60)` (was `ge=7`)
- Update boundary test to probe the new floor (count_window_days=2)
- Refresh leakage-test docstring and docs/PHASE/3 range comment

All existing call sites already use W=7 or W=14, so no behavioral regression. `shift(1).rolling(W, min_periods=1).sum()` works for any W>=1; W=3 is the smallest window that can detect a cadence trend.

Closes #113.
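To see why the `shift(1).rolling(W, min_periods=1).sum()` expression quoted in this commit is leakage-safe at W=3, here is a small worked example. It assumes pandas is available; the `events` series is made-up data, not taken from the repository.

```python
import pandas as pd

# Hypothetical daily replenishment-event counts for one store/product pair.
events = pd.Series([2, 0, 1, 0, 0, 3])

W = 3  # the new schema floor for count_window_days

# shift(1) drops the current day from its own window, so each row only
# sees strictly earlier days; rolling(W) then sums the last W of those.
trailing = events.shift(1).rolling(W, min_periods=1).sum()

# Day 0 has no history, so its feature is NaN rather than a leaked value.
first_is_missing = bool(pd.isna(trailing.iloc[0]))
later_values = trailing.iloc[1:].tolist()
```

The final day's count of 3 never contributes to any window in this sample, which is the property the leakage test is meant to pin.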
…#123)

Mirrors `dev`'s four required status checks onto `main` (Lint & Format, Type Check, Test, Migration Check; strict=true). Documents the new protection in three places so future contributors and the release-trigger runbook stay accurate:

- `docs/_base/PIPELINE_CONTRACT.md`: Merge Conditions table now shows the four `main` gates explicitly; verification-date refreshed to 2026-05-14; enforce_admins note expanded to call out that the empty-`feat:` trigger PR must now wait for the same four checks before admin-merge.
- `docs/GIT-GITHUB-GUIDE.md`: branch summary + release-flow step 2 reflect the enforced gates.
- `docs/_base/RUNBOOKS.md`: release-please recovery procedure step 4 now spells out the new wait-for-CI requirement explicitly.

`enforce_admins: false` preserved per #108 out-of-scope, so the release-trigger admin-merge workaround still works (it just has to wait for CI like every other PR to `main`).

The protection itself is applied out-of-band via `gh api -X PUT repos/w7-mgfcode/ForecastLabAI/branches/main/protection` because branch-protection writes are not reviewable in a PR diff; this commit documents the post-write state.

Closes #108.
Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters
📝 Walkthrough
Adds Phase 2 lifecycle/replenishment/promotion features across DB, services, and tests; extends dimensions and featuresets APIs; expands seeder with new generators and endpoints; introduces migrations and ORM updates; pins CI actions and uses uv --frozen; updates docs, examples, frontend admin form, and bumps version to 0.2.8.
Changes
Phase 2 Retail-Depth and Platform Extensions
Sequence Diagram(s)
sequenceDiagram
participant Client
participant API as FastAPI /featuresets
participant Loader as FeatureDataLoader
participant Service as FeatureEngineeringService
participant DB
participant Sidecars as Promo/Repl frames
Client->>API: POST /featuresets/compute (Phase2 configs)
API->>Loader: load product attrs / replenishment events
Loader->>DB: SELECT attrs/events (cutoff <= date)
DB-->>Loader: DataFrames
API->>Service: compute_features(df, sidecars)
Service->>Sidecars: read promo/repl frames
Service-->>API: features + columns + config_hash
API-->>Client: 200 OK JSON
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~150 minutes
#108)

Merge base between dev and main was v0.2.7 (c43d6cd): the v0.2.8 release commits on main (PRs #97 → #99 → #100) were never absorbed by dev. The next dev → main PR (#124) cannot merge cleanly until those commits flow back.

Conflict resolution: 6 `docs/_base/*.md` files conflicted as add/add where dev had the verified content (PR #107 closed the `[UNVERIFIED]` tags) and main still carried the older `[UNVERIFIED]` markers. Took dev's version for all six: the verified content is correct; the `[UNVERIFIED]` version is stale.

After this lands, PR #124 (dev → main, `feat(release):` titled to trigger the v0.2.9 release-please bump) becomes cleanly mergeable.

Resolved files:

- docs/_base/API_CONTRACTS.md
- docs/_base/ARCHITECTURE.md
- docs/_base/DOMAIN_MODEL.md
- docs/_base/PIPELINE_CONTRACT.md
- docs/_base/RUNBOOKS.md
- docs/_base/SECURITY.md
…e-0-2-9 chore(repo): back-merge main into dev to absorb v0.2.8 release commits (#108)
Summary
Cut a release tag (`v0.2.9` expected per pre-1.0 PATCH config) bundling everything merged to `dev` since v0.2.8. The PR is titled `feat(release):` so the merge-commit subject bumps release-please, which avoids the trap PRs #97 → #99 recovered from (RUNBOOKS.md: "release-please skipped the bump after a dev → main merge").

References #108 (the most recently-touched issue, release-boundary tightening, is also the headline change in this bundle).

What's bundled (`origin/dev` ahead of `origin/main` by 14 squash commits)
`[UNVERIFIED]` removal)

The `main` branch protection write referenced by #108 has not been applied yet (verified via `gh api .../main/protection` 2026-05-14 → `required_status_checks: null`). Without it, the four new `main` gates from #123 are aspirational: they won't actually block this merge.

To make this PR exercise the new gates, run before merging:

Payload at `/tmp/main-protection.json` (local to the agent host). After the PUT, this PR's CI checks (`Lint & Format`, `Type Check`, `Test`, `Migration Check`) become required at the protection layer; the merge button stays disabled until they go green.

If the PUT is deferred, the PR can still be merged via admin override. The change shipping in this bundle still goes out; the new gates just kick in for the next release boundary.
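The contents of `/tmp/main-protection.json` are not shown in this PR. As a rough sketch of what such a payload might contain, the following builds a body from the facts stated here (four required checks, strict=true, `enforce_admins: false`). The two `None` fields and the use of the `contexts` key are assumptions based on the general shape of GitHub's branch-protection API, and `build_protection_payload` is a hypothetical helper.

```python
import json

# The four gates named in this PR.
REQUIRED_CHECKS = ["Lint & Format", "Type Check", "Test", "Migration Check"]


def build_protection_payload(
    checks: list[str], strict: bool = True, enforce_admins: bool = False
) -> dict:
    """Assemble a branch-protection request body (illustrative only)."""
    return {
        "required_status_checks": {"strict": strict, "contexts": list(checks)},
        "enforce_admins": enforce_admins,
        # Assumption: no review or push restrictions are configured here.
        "required_pull_request_reviews": None,
        "restrictions": None,
    }


payload_json = json.dumps(build_protection_payload(REQUIRED_CHECKS), indent=2)
```

Generating the payload from a single list of check names keeps the `main` gates in step with whatever list is documented for `dev`; whether the real payload was produced this way is unknown.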
Merge instructions (read before clicking)
- `gh pr merge <N> --merge` (NO `--squash`): keeps the 14 underlying commits intact for release-please to traverse.
- `--subject`: the PR title is already a valid bumping `feat:` subject, so the default behavior is correct.
- Expect a `chore(main): release 0.2.9` Release PR to appear shortly after merge, opened by release-please. Merge that to tag `v0.2.9`.

Post-merge verification
Summary by CodeRabbit
Release Notes
New Features
`GET /dimensions/products/{id}/lifecycle-curve`)
Bug Fixes / Improvements