chore(repo): merge dev to main — seeder phases 1+2 + ci/docs hardening (#92) by w7-mgfcode · Pull Request #97 · w7-mgfcode/ForecastLabAI

w7-mgfcode · 2026-05-12T02:51:01Z

Summary

Release prep merge: dev → main. 7 commits since v0.2.7. release-please will open a follow-up Release PR (pre-1.0 PATCH bump: 0.2.7 → 0.2.8).

Highlights

feat(data,db): seeder Phase 2 — retail-depth foundation + lifecycle generator (#92, #93)
feat(data,api): seeder Phase 1 — exogenous signals, multi-seasonality, changepoints, returns, substitution (#88)
feat(docs,repo): land CLAUDE.md + docs/_base/ reference suite (#86, #87)
feat(api,ui): expose seeder date range and scale controls (#82, #83)
fix(ci): pin uv run --frozen to stop transient resolution failures (#95, #96)
fix(ci): pin third-party GitHub Actions by SHA (#84)
chore(repo): gitignore local session artifacts (#90)

CI status on `dev`

All recent runs ✅ — CI, Schema Validation green on 4357ddf (HEAD of dev).

Test plan

ruff check + ruff format --check (203 files clean)
mypy --strict app/ (188 files, 0 errors)
pyright --strict app/ (0 errors, 50 type-warnings only)
pytest -m "not integration" (897 passed locally; 4 known .env-bleed cases documented in docs/_base/RUNBOOKS.md — CI is green because CI has no .env)
CI re-runs on the PR
After merge: release-please opens the tagging PR for 0.2.8
After tag: verify cd-release.yml uploads the wheel to the GitHub Release

Summary by CodeRabbit

Release Notes

New Features
- Phase 1 seeding: Exogenous signals (weather, macro, events), returns generation, demand substitution, yearly seasonality, and demand changepoints.
- Phase 2 seeding: Product lifecycles, bundle/BOGO promotions, multichannel sales, clearance markdowns, and replenishment events.
- New API endpoints for product lifecycle curves, exogenous signal queries, and sales channel enumeration.
- Enhanced seeder UI with persistent form state, date range picker, and expanded parameter controls.
Documentation
- Added comprehensive architecture, domain model, API contracts, developer guide, and operational runbooks.
Infrastructure
- Pinned GitHub Actions to specific commit hashes for enhanced security and reproducibility.

Surface the existing GenerateParams knobs in the admin Data Seeder panel (scenario, date range, store/product counts, seed, sparsity) so operators no longer have to drop to the CLI to seed a different year. Form state persists in localStorage and a reset-to-defaults button is provided. Also fixes a latent service-layer bug: when overriding stores/products on a scenario preset, _build_config_from_params replaced the whole DimensionConfig and silently dropped scenario-customized store_regions, store_types, product_categories, and product_brands. Now uses dataclasses.replace so only the count fields change. Adds two regression tests covering holiday_rush + custom store/product counts.
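The DimensionConfig fix described above can be illustrated with a minimal sketch. The dataclass definition, its defaults, and the helper name `override_counts` are assumptions made for illustration; only the use of `dataclasses.replace` and the field names `store_regions` and `product_categories` come from the commit message.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DimensionConfig:
    # Hypothetical stand-in for the real config; fields and defaults are assumed.
    stores: int = 10
    products: int = 50
    store_regions: tuple[str, ...] = ("North", "South")
    product_categories: tuple[str, ...] = ("Grocery",)


def override_counts(base, stores=None, products=None):
    """Return a copy of `base` in which only the supplied count fields change."""
    changes = {}
    if stores is not None:
        changes["stores"] = stores
    if products is not None:
        changes["products"] = products
    # replace() copies every other field, so scenario customizations survive.
    return replace(base, **changes)


# A scenario preset with customized regions and categories (values invented).
preset = DimensionConfig(store_regions=("Urban",), product_categories=("Gifts", "Toys"))
custom = override_counts(preset, stores=3)
```

Constructing a fresh `DimensionConfig(stores=3)` instead would silently reset `store_regions` and `product_categories` to their defaults, which is the shape of the bug the commit describes.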


…87) Closes #86. Generated via the /w7_generating-claudemd skill in HEURISTIC_MODE (docs/_kB/repo-map/ KB not yet present).

Adds:
- CLAUDE.md (116 lines, 812 words; references .claude/rules/* and docs/_base/* via @imports — within the 150-line / 1800-word skill budget)
- docs/_base/ARCHITECTURE.md (system boundaries, components, comm patterns, deploy chain)
- docs/_base/API_CONTRACTS.md (HTTP surface across 12 slices + WebSocket + external integrations)
- docs/_base/RUNBOOKS.md (common incidents, release/rollback, WSL/pnpm traps from prior session HANDOFF)
- docs/_base/SECURITY.md (threat model, hard rules from security-patterns.md, scanning matrix)
- docs/_base/RULES.md (Change Authority Matrix + invariants + forbidden patterns, consolidated from .claude/rules/*)
- docs/_base/DOMAIN_MODEL.md (bounded contexts, aggregates, invariants, ubiquitous language)
- docs/_base/DEV_GUIDE.md (human-maintained stub — {FILL IN} markers for a maintainer to complete)
- docs/_base/REPO_MAP_INDEX.md (index across README, PHASE docs, ADRs, PRPs, .claude/, docs/_base/)
- docs/_base/PIPELINE_CONTRACT.md (CI/CD stages, merge gates, release flow)

.gitignore adjustments:
- Remove `CLAUDE.md` (was blocking the doc from being shared and from being read by Claude in fresh clones)
- Add `CLAUDE.local.md` (personal-prefs file — local-only by design)
- Stale `.claude` duplicates on lines 2 and 5 left for a separate cleanup PR (deduping won't change behavior since `.claude/` is already tracked)

Re-run the skill after a future mapping-repo-context run to drop the remaining 5 [UNVERIFIED] meta-flags.

…, changepoints, returns, substitution (#88)

Closes #84. Per .claude/rules/security-patterns.md: "Pin third-party GitHub Actions by full 40-char SHA"; first-party actions/* may use major-version.

Pinned (third-party):
- googleapis/release-please-action@v5 → @45996ed1f6d02564a971a2fa1b5860e934307cf7 # v5.0.0
- astral-sh/setup-uv@v7 (×8 across all five workflows) → @37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
- github/codeql-action/upload-sarif@v4 → @c6f931105cb2c34c8f901cc885ba1e2e259cf745 # v4.34.0

Left as major-tag (first-party actions/* — rule-permitted):
- actions/checkout@v6
- actions/upload-artifact@v7

Dependabot watches .github/workflows/ weekly and will bump these forward.
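The pinning rule quoted above lends itself to a small lint. The following is a rough sketch, not part of the PR: the function name `unpinned_third_party`, the regular expressions, and the sample workflow text are all assumptions, and the check treats only `actions/*` as first-party, as the rule describes.

```python
import re

# Matches workflow lines such as "- uses: owner/repo@ref  # comment".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_third_party(workflow_text: str) -> list[str]:
    """List third-party action references that are not pinned to a 40-char SHA."""
    offenders = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        action, ref = match.groups()
        if action.startswith("actions/"):
            continue  # first-party actions may stay on a major-version tag
        if not FULL_SHA_RE.match(ref):
            offenders.append(f"{action}@{ref}")
    return offenders


SAMPLE = """
steps:
  - uses: actions/checkout@v6
  - uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
  - uses: googleapis/release-please-action@v5
"""

offenders = unpinned_third_party(SAMPLE)
```

In this sample only the `release-please-action@v5` reference is reported, since the other third-party action already carries a full SHA.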

…es (#95) (#96)

Every uv run invocation in ci.yml, schema-validation.yml, and phase-snapshot.yml now uses --frozen. Without it, uv re-resolves the dependency graph at command time and crashes when a freshly published pydantic-ai-slim version's [mistral] extra requires a mistralai version that does not yet exist on PyPI. This was observed on PR #93's most recent push, where all five blocking CI jobs went red 75 minutes after a green run on the same branch with the same lockfile.

The pip-audit calls in dependency-check.yml deliberately retain the re-resolve behavior; that workflow's purpose is to pick up newly published vulnerabilities. uv sync --frozen --all-extras --dev was already in place to install the lock; this patch propagates the same intent to every subsequent uv run.

…enerator (#92) (#93)

* feat(data,db): seeder phase 2 chunk A — retail-depth schema + configs (#92)

Lays the foundation for Phase 2 retail depth without changing any generator behaviour:
- Alembic migration a8b9c0d1e234 adds sales_daily.channel (NOT NULL, server default 'in_store'), product lifecycle fields (lifecycle_stage, launch_date, discontinue_date, pack_size, subcategory), promotion kind discriminator with JSONB bundle_member_product_ids, and a new replenishment_event table. All additive; retail_standard rows are unchanged.
- ORM mirrors the schema, including a load-bearing JSONB(none_as_null=True) so the bundle-members CHECK fires.
- Five new config dataclasses (ChannelConfig, LifecycleConfig, BundleConfig, MarkdownConfig, LeadTimeConfig) wired to SeederConfig with disabled defaults so all existing scenarios produce byte-identical row counts.
- 25 integration tests cover the new CHECK + nullability constraints; 8 unit tests guard the config defaults + regression invariant across every ScenarioPreset.

* feat(data): seeder phase 2 chunk B (1/5) — product lifecycle generator (#92)

First slice of Phase 2 generators. Strict regression invariant: with ``LifecycleConfig.enable=False`` (default) ProductGenerator's output and rng-draw sequence are byte-identical to pre-Phase-2.
- ProductGenerator gains optional ``lifecycle_config`` + ``date_range`` parameters. When enabled, each product row carries ``subcategory``, ``pack_size``, ``launch_date``, ``discontinue_date``, ``lifecycle_stage``.
- New ``LifecycleGenerator`` (pure compute, no DB) computes per-(product, date) demand multipliers across intro/growth/maturity/decline/discontinued segments. Disabled path returns 1.0 without touching rng.
- 14 new unit tests cover the regression invariant + each ramp segment + discontinue override + reproducibility under enabled mode.
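A per-(product, date) lifecycle multiplier of the kind described for ``LifecycleGenerator`` might look roughly like the sketch below. The function name, the linear ramp shapes, and every numeric default are assumptions; the commit messages in this PR only establish that the disabled path returns 1.0, that demand is zero before launch and after discontinuation, and that intro and decline sit below maturity.

```python
from datetime import date
from typing import Optional


def lifecycle_multiplier(
    on_date: date,
    launch_date: Optional[date],
    discontinue_date: Optional[date],
    *,
    enabled: bool = True,
    intro_days: int = 30,        # assumed length of the flat intro segment
    growth_days: int = 60,       # assumed length of the linear growth ramp
    decline_days: int = 45,      # assumed length of the linear decline ramp
    intro_level: float = 0.3,    # assumed demand level during intro
    decline_floor: float = 0.4,  # assumed demand level just before discontinue
) -> float:
    if not enabled:
        return 1.0  # disabled path: neutral multiplier, no rng involved
    if launch_date is not None and on_date < launch_date:
        return 0.0  # not yet launched
    if discontinue_date is not None and on_date >= discontinue_date:
        return 0.0  # discontinued (boundary handling is an assumption)

    ramp_up = 1.0
    if launch_date is not None:
        age = (on_date - launch_date).days
        if age < intro_days:
            ramp_up = intro_level
        elif age < intro_days + growth_days:
            progress = (age - intro_days) / growth_days
            ramp_up = intro_level + (1.0 - intro_level) * progress

    ramp_down = 1.0
    if discontinue_date is not None:
        days_left = (discontinue_date - on_date).days
        if days_left <= decline_days:
            ramp_down = decline_floor + (1.0 - decline_floor) * days_left / decline_days

    return min(ramp_up, ramp_down)


launch, discontinue = date(2025, 1, 1), date(2025, 12, 31)
maturity = lifecycle_multiplier(date(2025, 6, 1), launch, discontinue)
```

Taking the minimum of the two ramps is one simple way to let a nearby discontinue date override the launch ramp for short-lived products; the real generator may combine the segments differently.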
Remaining chunk B work (next commits on this branch):
- BundleGenerator (BOGO + bundle promotions)
- MarkdownGenerator (clearance pricing)
- ReplenishmentGenerator (lead-time-driven replenishment_event rows)
- SalesDailyGenerator channel split + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (2/5) — bundle/BOGO generator (#92)

Second slice of Phase 2 generators. Same regression invariant as B 1/5: with ``BundleConfig.enable=False`` (default) ``BundleGenerator.apply`` leaves both the promotion list and the rng state byte-identical.
- New ``BundleGenerator`` (pure compute, no DB) wraps ``PromotionGenerator``'s output: per-promo ``bundle_probability`` chance to convert to ``kind='bundle'`` or ``kind='bogo'`` (split by ``bogo_share_within_bundles``), drawing 2–``max_bundle_size`` member product IDs (host excluded) and a discount in ``[bundle_discount_pct_min, bundle_discount_pct_max]`` quantized to ``Numeric(5, 4)``. ``discount_amount`` is cleared on converted rows to keep the row internally consistent with the new ``discount_pct``.
- Locked rng order per converted promo: ``random()`` (convert?) → ``random()`` (bogo?) → ``randint()`` (n_members) → ``sample()`` (members) → ``uniform()`` (discount). Per-host pool-too-small skip happens before any rng draw so the stream stays stable across runs where only the product pool shrinks.
- 18 new unit tests cover the regression invariant (no mutation, no rng consumption) + kind allow-list + member-pool sourcing + count + discount range + BOGO/bundle split at extremes + reproducibility + best-effort skip for small pools + config validation.
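The locked draw order for bundle conversion can be sketched as follows. This is an illustration under assumptions, not the repository's ``BundleGenerator``: the function name `maybe_convert`, the record keys, the default values, and the minimum pool size of two are invented, and a promo that is not converted is assumed to consume only the first draw.

```python
import random
from decimal import Decimal, ROUND_HALF_UP


def maybe_convert(promo, product_ids, rng, *, bundle_probability=0.3,
                  bogo_share=0.5, max_bundle_size=4, pct_min=0.05, pct_max=0.25):
    """Possibly convert one promo record into a bundle or BOGO record."""
    pool = [pid for pid in product_ids if pid != promo["product_id"]]  # host excluded
    if len(pool) < 2:
        return promo  # pool too small: skip before touching the rng
    if rng.random() >= bundle_probability:                   # draw 1: convert?
        return promo
    kind = "bogo" if rng.random() < bogo_share else "bundle"  # draw 2: bogo?
    n_members = rng.randint(2, min(max_bundle_size, len(pool)))  # draw 3: size
    members = rng.sample(pool, n_members)                     # draw 4: members
    raw_pct = rng.uniform(pct_min, pct_max)                   # draw 5: discount
    # Quantize to four decimal places, matching a Numeric(5, 4) column.
    pct = Decimal(str(raw_pct)).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)
    return {**promo, "kind": kind, "bundle_member_product_ids": members,
            "discount_pct": pct, "discount_amount": None}


promo = {"product_id": 1, "kind": "pct_off", "discount_amount": 2.0}
converted = maybe_convert(promo, [1, 2, 3, 4, 5], random.Random(0), bundle_probability=1.0)
```

Keeping the pool-size check ahead of the first draw means a shrinking product pool changes which promos are skipped without shifting the random stream for the promos that remain eligible.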
Remaining chunk B work:
- MarkdownGenerator (clearance pricing — needs Open Q on inventory age coupling resolved before starting)
- ReplenishmentGenerator (lead-time-driven replenishment_event rows)
- SalesDailyGenerator channel split + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (3/5) — markdown generator (#92)

Third slice of Phase 2 generators. Same regression invariant as B 1/5 and B 2/5: with ``MarkdownConfig.enable=False`` (default) the generator emits empty containers and consumes zero rng state.
- New ``MarkdownGenerator`` (pure compute, no DB) emits ``Promotion(kind='markdown')`` rows + companion ``PriceHistory`` drop rows + a ``markdown_dates`` lookup keyed by ``(store_id, product_id)`` for the future ``SalesDailyGenerator`` lift integration in chunk B 5/5.
- Two triggers ship in this slice:
  - ``lifecycle_decline`` — chain-wide markdown (``store_id=None``) starting on the first date a product enters the ``decline`` stage according to a passed-in ``LifecycleGenerator``. Skips products without lifecycle attrs; emits no rows when lifecycle is disabled.
  - ``stockout_risk`` — per-``(store, product)`` markdown ending the day before each observed stockout, lasting ``markdown_duration_days`` days, clamped to the seeded range start. Overlapping windows are deduped within each ``(store, product)`` series.
- ``trigger='age_days'`` is deferred — raises ``NotImplementedError`` pointing at issue #94 (follow-up). The default trigger remains ``lifecycle_decline`` so scenarios that just flip the enable bit still produce meaningful output.
- Even the enabled path is fully deterministic (no rng draws). The ``rng`` constructor parameter is kept for API consistency with peer Phase 2 generators in case future variants need randomness.
- 21 new unit tests cover the regression invariant + lifecycle_decline correctness (chain-wide, skipping missing lifecycle, clamp-to-range, no decline = no output) + stockout_risk correctness (per-store, end-day-before-stockout, overlap dedupe, clamp-to-start, unknown product, dict-order independence) + age_days NotImplementedError + config validation (depth bounds, duration bounds).

Remaining chunk B work:
- ReplenishmentGenerator (lead-time-driven replenishment_event rows)
- SalesDailyGenerator channel + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (4/5) — replenishment generator (#92)

Fourth slice of Phase 2 generators. Same regression invariant: with ``LeadTimeConfig.enable=False`` (default) the generator returns ``[]`` and consumes zero rng state.
- New ``ReplenishmentGenerator`` (pure compute, no DB) emits ``replenishment_event`` dicts. Per ``(store, product)`` it places a PO every ``order_frequency_days`` starting at ``dates[0]``. Each PO consumes two locked rng draws: ``gauss(mean_lead_time_days, lead_time_sigma_days)`` clamped to ``>= 0`` → ``gauss(fill_rate_mean, fill_rate_sigma)`` clamped to ``[0, 1]``. ``ordered_qty = base_demand * (order_frequency_days + safety_stock_days)``; ``received_qty = round(ordered_qty * fill_rate)`` defensively clamped to ``[0, ordered_qty]``.
- Receipts whose ``date_received = date_placed + lead_time_days`` fall past ``dates[-1]`` are dropped to keep the FK to ``calendar`` valid.
- Sorted iteration over ``(store_id, product_id)`` makes the rng stream stable regardless of input ordering.
- 21 new unit tests cover the regression invariant + record shape + ordered_qty formula + dates-within-range + reproducibility + input-order independence + extreme fill rates (zero/full) + zero lead time + output sort order + 7 config-validation cases.
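The replenishment rules above can be condensed into a short sketch. The function name, the record keys, the default values, and the choice to round the lead time to whole days are assumptions; the draw order, clamps, quantity formula, out-of-range drop, and sorted iteration follow the commit message. A contiguous, sorted list of daily dates is assumed.

```python
import random
from datetime import date, timedelta


def replenishment_events(dates, pairs, rng, *, base_demand=10,
                         order_frequency_days=7, safety_stock_days=3,
                         mean_lead_time_days=3.0, lead_time_sigma_days=1.0,
                         fill_rate_mean=0.95, fill_rate_sigma=0.05):
    """Emit replenishment_event-style dicts for each (store_id, product_id) pair."""
    ordered_qty = base_demand * (order_frequency_days + safety_stock_days)
    events = []
    # Sorted iteration keeps the rng stream independent of input ordering.
    for store_id, product_id in sorted(pairs):
        for placed in dates[::order_frequency_days]:
            # Draw 1: lead time, clamped to >= 0 (whole-day rounding is assumed).
            lead_days = max(0, round(rng.gauss(mean_lead_time_days, lead_time_sigma_days)))
            # Draw 2: fill rate, clamped to [0, 1].
            fill_rate = min(1.0, max(0.0, rng.gauss(fill_rate_mean, fill_rate_sigma)))
            received = placed + timedelta(days=lead_days)
            if received > dates[-1]:
                continue  # receipt falls outside the seeded calendar range
            received_qty = min(ordered_qty, max(0, round(ordered_qty * fill_rate)))
            events.append({"store_id": store_id, "product_id": product_id,
                           "date_placed": placed, "date_received": received,
                           "ordered_qty": ordered_qty, "received_qty": received_qty})
    return events


dates = [date(2025, 1, 1) + timedelta(days=i) for i in range(30)]
events = replenishment_events(dates, [(2, 1), (1, 1)], random.Random(42))
```

Both draws happen before the range check in this sketch, so dropping a late receipt does not shift the random stream for later purchase orders; whether the real generator orders these steps the same way is not stated in the PR.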
Downstream coupling: a follow-up commit will adjust ``InventorySnapshotGenerator`` to consume these events so realistic stockout windows emerge between scheduled receipts. This slice only emits the rows.

Remaining chunk B work:
- SalesDailyGenerator channel split + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (5a/6) — lifecycle multiplier into sales (#92)

First half of B 5/5 (split per Open Q3 — channel integration deferred until semantics are confirmed). Wires the LifecycleGenerator multiplier into ``SalesDailyGenerator`` while preserving the byte-identical regression invariant.
- ``SalesDailyGenerator.__init__`` gains optional ``lifecycle: LifecycleGenerator | None = None``. Defaults preserve pre-Phase-2 behavior for every existing caller.
- ``SalesDailyGenerator.generate`` gains optional ``product_lifecycle_data: dict[int, tuple[date | None, date | None]] | None = None``. Missing or unspecified entries fall back to ``(None, None)`` so the multiplier evaluates to 1.0.
- ``_compute_demand`` gains ``product_discontinue_date`` and applies the lifecycle multiplier guarded by ``self.lifecycle is not None and self.lifecycle.enabled``. The pre-Phase-2 ``new_product_ramp_days`` linear ramp is suppressed when lifecycle is enabled, preventing double-attenuation at launch.
- 10 new tests cover the regression invariant (no kwargs / explicit None / disabled config / no rng consumption when disabled), enabled correctness (pre-launch zero, post-discontinue zero, intro < maturity, decline < maturity), legacy-ramp suppression (no double-apply when lifecycle on; still fires when lifecycle is None), and the lookup fallback (missing product_id evaluates to 1.0).

The B 5b/6 channel integration is held until Open Q3 resolves between (b) dominant per row, (c) random per row from channel_mix weights, or (d) aggregated with primary channel column.
Remaining Phase 2 work:
- B 5b/6 — SalesDailyGenerator channel split (pending Q3)
- Chunk C — DataSeeder orchestration + endpoints + integration tests

* feat(data): seeder phase 2 chunk B (5b/6) — channel split into sales (#92)

Second half of B 5/5. Resolves Open Q3 with semantic (c): each emitted ``sales_daily`` row gets its ``channel`` drawn from ``channel_mix`` via ``rng.choices``, preserving the existing ``(date, store, product)`` grain.
- ``SalesDailyGenerator.__init__`` gains optional ``channels: ChannelConfig | None = None``. Disabled / unset path consumes zero new rng draws and emits rows without a ``channel`` key (DB ``server_default='in_store'`` applies), preserving the byte-identical regression invariant.
- ``generate()`` runs ``_validate_channels()`` once at entry. Rejects channels outside the SQL allow-list, negative weights, all-zero mix, negative ``online_promo_uplift``, or ``online_substitution_to_instore`` outside ``[0, 1]``.
- Per emitted row (after stockout-skip): ``_maybe_apply_channel`` builds the effective mix (``online_substitution_to_instore`` shifts weight from in_store → online during promos), draws a channel via ``rng.choices``, and applies ``online_promo_uplift`` to online rows on promo dates. One rng draw per emitted row.
- 19 new tests cover regression invariant (no kwarg, disabled config, no rng consumption) + channel distribution (subset of mix keys, single-channel deterministic, dominant most common, zero-weight never chosen) + online promo uplift (fires for online + promo, not for in_store) + substitution shift (more online during promo, zero substitution = no shift) + 6 validation cases + row shape (channel key present/absent).

Phase 2 chunk B complete (5/6 paired slices + 1/6 follow-up #94). Next: Chunk C — DataSeeder orchestration + new endpoints + integration tests + docs.
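The per-row channel draw might be sketched as below. The function names, the exact way substitution moves weight (a fraction of the in_store weight added to online), and the example mix are assumptions; the use of ``rng.choices``, the one-draw-per-row rule, and the validation cases come from the commit message.

```python
import random


def validate_mix(channel_mix, online_substitution_to_instore=0.0):
    """Reject mixes that could not produce a valid weighted draw."""
    if any(weight < 0 for weight in channel_mix.values()):
        raise ValueError("channel weights must be non-negative")
    if sum(channel_mix.values()) <= 0:
        raise ValueError("at least one channel weight must be positive")
    if not 0.0 <= online_substitution_to_instore <= 1.0:
        raise ValueError("online_substitution_to_instore must be in [0, 1]")


def pick_channel(channel_mix, rng, *, is_promo=False, online_substitution_to_instore=0.0):
    """Draw one channel for a sales row; exactly one rng draw per call."""
    mix = dict(channel_mix)  # work on a copy so the configured mix stays untouched
    if is_promo and "in_store" in mix and "online" in mix:
        # Assumed semantics: move a fraction of in_store weight to online.
        shifted = mix["in_store"] * online_substitution_to_instore
        mix["in_store"] -= shifted
        mix["online"] += shifted
    channels = list(mix)
    weights = [mix[name] for name in channels]
    return rng.choices(channels, weights=weights, k=1)[0]


mix = {"in_store": 0.7, "online": 0.3}
validate_mix(mix, online_substitution_to_instore=0.2)
rng = random.Random(1)
promo_draws = [pick_channel(mix, rng, is_promo=True, online_substitution_to_instore=1.0)
               for _ in range(50)]
```

With full substitution every promo-day draw lands on online, because the in_store weight drops to zero; intermediate values only tilt the mix.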
* feat(data,api): seeder phase 2 chunk c1 — orchestration + endpoints (#92)

Extend GenerateParams with 5 enable flags + channel_mix / lifecycle / bundle / markdown / lead-time fields; the channel_mix validator enforces the SQL allow-list and at least one positive weight. The service layer translates the new params into ChannelConfig / LifecycleConfig / BundleConfig / MarkdownConfig / LeadTimeConfig overrides.

DataSeeder.generate_full now wires LifecycleGenerator + BundleGenerator + MarkdownGenerator + ReplenishmentGenerator + ChannelConfig. Product lifecycle dates are fetched alongside base_price in a single query and threaded into SalesDailyGenerator. A new _normalize_promotion_records helper enforces a uniform key set across the mixed pct_off / bundle / bogo / markdown promo records so the bulk pg_insert builds a valid multi-row VALUES clause. delete_data drops replenishment_event first (leaf table). verify_data_integrity gains 3 Phase 2 invariants: bundle member-ID consistency, lifecycle date ordering, replenishment fill rate. append_data mirrors the new return signature and fetches lifecycle dates from existing products.

New endpoints: GET /seeder/channels returns the SQL allow-list; GET /dimensions/products/{id}/lifecycle-curve returns the reference demand-multiplier curve via LifecycleGenerator.multiplier_for, using default LifecycleConfig ramp parameters and the product's own launch_date / discontinue_date. SeederStatus + SeederResult both grow a replenishment_events count.

Disabled-path regression invariant preserved: every Phase 2 flag defaults off and consumes zero rng when off.
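The idea behind a uniform key set for mixed promotion records can be shown with a minimal sketch. The function name `normalize_records` and the sample records are hypothetical, and filling missing keys with `None` is an assumption about how the real `_normalize_promotion_records` helper behaves.

```python
def normalize_records(records):
    """Give every record the same keys, filling absent ones with None."""
    # Union of all keys seen across the batch, in a stable order.
    all_keys = sorted({key for record in records for key in record})
    return [{key: record.get(key) for key in all_keys} for record in records]


mixed = [
    {"product_id": 1, "kind": "pct_off", "discount_pct": 0.10},
    {"product_id": 2, "kind": "bundle", "bundle_member_product_ids": [3, 4]},
]
uniform = normalize_records(mixed)
```

After normalization each dict exposes the same columns, so a single multi-row insert can treat the batch as one rectangular set of values.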
* feat(data,docs): seeder phase 2 chunk c2 — integration tests + docs (#92)

test_phase2_integration.py covers the disabled-path regression (no Phase 2 rows when toggles are off), per-feature enabled tests (lifecycle populates dates, bundles convert promotions with bundle_member_product_ids non-NULL, markdowns can emit rows when lifecycle is also on, replenishment respects received_qty <= ordered_qty, multichannel writes distinct channels), full-on verify_data_integrity returning an empty error list, and delete ordering that wipes replenishment_event without FK violations. Tests are marked @pytest.mark.integration so they only run against the real docker-compose Postgres.

docs/DATA-SEEDER.md adds a Phase 2 retail-depth section documenting all five toggles with example JSON payloads, the two new endpoints (GET /seeder/channels, GET /dimensions/products/{id}/lifecycle-curve), and three new Data Integrity checks.

sourcery-ai

Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

coderabbitai · 2026-05-12T02:51:37Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

Implements Phase 1/2 seeder features with new configs, generators, orchestration, and schema; adds lifecycle-curve API; extends seeder API; updates Admin UI with persisted parameters; adds extensive tests and docs; and pins CI GitHub Actions by SHA with uv --frozen.

Changes

Seeder Phase 1/2 features, lifecycle curve, and infra

| Layer / File(s) | Summary |
| --- | --- |
| CI workflows: pin actions and freeze uv runs `.github/workflows/*` | Pins actions by SHA; executes ruff/mypy/pyright/pytest/alembic via uv --frozen. |
| Repo ignores and CLAUDE operating index `.gitignore`, `CLAUDE.md` | Adds CLAUDE.md and ignores local session artifacts. |
| Alembic migrations for Phase 1/2 tables and columns `alembic/versions/*` | Creates exogenous_signal, sales_returns, replenishment_event; extends product/promotion/sales_daily with constraints. |
| ORM model updates and DB constraint tests `app/features/data_platform/*` | Extends ORM models and adds integration tests for new constraints. |
| Dimensions: lifecycle curve API, schemas, service `app/features/dimensions/*` | Adds schemas, service method, and GET lifecycle-curve route. |
| Seeder API routes, schemas, service (Phase1 I/O) `app/features/seeder/{routes,schemas,service}.py` | Adds channels and exogenous endpoints; expands params/status; implements query_exogenous. |
| Seeder API/service unit tests `app/features/seeder/tests/*` | Covers exogenous route/service, config preservation, and status fields. |
| Seeder config dataclasses (Phase 1/2) `app/shared/seeder/config.py` | Introduces extensive config dataclasses and wires into SeederConfig. |
| Seeder generators (Phase 1/2) and exports `app/shared/seeder/generators/*` | Implements/extends exogenous, lifecycle, channels, bundles, markdowns, replenishment, returns, and sales daily. |
| Generator/config tests and Phase 1 integration `app/shared/seeder/tests/*` | Adds comprehensive unit/integration tests for Phase 1 behavior and invariants. |
| Phase 2 integration and lifecycle/markdowns/replenishment tests `app/shared/seeder/tests/*` | Adds Phase 2 DB integrations, lifecycle sales integration, markdowns, replenishment suites. |
| Seeder core orchestration and integrity checks `app/shared/seeder/core.py`, tests | Extends orchestration; adds counts, lifecycle mapping, exogenous weather, integrity checks, delete ordering. |
| Docs: DATA-SEEDER and base docs set `docs/*` | Documents Phase 1/2 features, APIs, architecture, domain, pipeline, rules, runbooks, security, index. |
| Frontend Admin seeder UI enhancements `frontend/src/pages/admin.tsx` | Adds persisted multi-parameter form, date-range picker, and updated generate flow. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

fix(ci): pin third-party github actions by sha #84: Pins third‑party GitHub Actions by SHA across workflows.
feat(data): seeder Phase 1 — exogenous signals, multi-seasonality, changepoints, returns, substitution #88: Implements Phase 1 seeder features across config, generators, routes, migrations, and tests.
feat(data): seeder Phase 2 — retail depth (channels, lifecycle, bundles/BOGO, markdowns, lead-time) #92: Implements Phase 2 retail-depth features (channels, lifecycle, bundles, markdowns, lead-time).
feat(docs): land CLAUDE.md and docs/_base/ reference suite #86: Adds CLAUDE.md and base docs consistent with this PR.
chore(repo): gitignore local session artifacts #90: Updates .gitignore for local session artifacts.
feat(ui): expose date range and scale controls in seeder admin panel #82: Extends Admin seeder UI and backend param handling.

Possibly related PRs

w7-mgfcode/ForecastLabAI#67: Touches the same seeder generators, core, and tests.
w7-mgfcode/ForecastLabAI#68: Modifies seeder routes/service/schemas and Admin seeder UI.
w7-mgfcode/ForecastLabAI#13: Updates data-platform schema and migrations in related areas.

Suggested labels

autorelease: pending

Suggested reviewers

w7-learn
w7-l7ab

Poem

A rabbit seeds storms and sunny skies,
Bundles and markdowns in data arise.
Channels choose paths, lifecycles flow,
Returns hop back with a gentle glow.
Pipelines pinned, the burrow secure—
Harvest of facts, crisp and pure. 🥕


- ``_compute_demand`` gains ``product_discontinue_date`` and applies the lifecycle multiplier guarded by ``self.lifecycle is not None and self.lifecycle.enabled``. The pre-Phase-2 ``new_product_ramp_days`` linear ramp is suppressed when lifecycle is enabled, preventing double-attenuation at launch. - 10 new tests cover the regression invariant (no kwargs / explicit None / disabled config / no rng consumption when disabled), enabled correctness (pre-launch zero, post-discontinue zero, intro < maturity, decline < maturity), legacy-ramp suppression (no double-apply when lifecycle on; still fires when lifecycle is None), and the lookup fallback (missing product_id evaluates to 1.0). The B 5b/6 channel integration is held until Open Q3 resolves between (b) dominant per row, (c) random per row from channel_mix weights, or (d) aggregated with primary channel column. Remaining Phase 2 work: - B 5b/6 — SalesDailyGenerator channel split (pending Q3) - Chunk C — DataSeeder orchestration + endpoints + integration tests * feat(data): seeder phase 2 chunk B (5b/6) — channel split into sales (#92) Second half of B 5/5. Resolves Open Q3 with semantic (c): each emitted ``sales_daily`` row gets its ``channel`` drawn from ``channel_mix`` via ``rng.choices``, preserving the existing ``(date, store, product)`` grain. - ``SalesDailyGenerator.__init__`` gains optional ``channels: ChannelConfig | None = None``. Disabled / unset path consumes zero new rng draws and emits rows without a ``channel`` key (DB ``server_default='in_store'`` applies), preserving the byte-identical regression invariant. - ``generate()`` runs ``_validate_channels()`` once at entry. Rejects channels outside the SQL allow-list, negative weights, all-zero mix, negative ``online_promo_uplift``, or ``online_substitution_to_instore`` outside ``[0, 1]``. 
- Per emitted row (after stockout-skip): ``_maybe_apply_channel`` builds the effective mix (``online_substitution_to_instore`` shifts weight from in_store → online during promos), draws a channel via ``rng.choices``, and applies ``online_promo_uplift`` to online rows on promo dates. One rng draw per emitted row. - 19 new tests cover regression invariant (no kwarg, disabled config, no rng consumption) + channel distribution (subset of mix keys, single-channel deterministic, dominant most common, zero-weight never chosen) + online promo uplift (fires for online + promo, not for in_store) + substitution shift (more online during promo, zero substitution = no shift) + 6 validation cases + row shape (channel key present/absent). Phase 2 chunk B complete (5/6 paired slices + 1/6 follow-up #94). Next: Chunk C — DataSeeder orchestration + new endpoints + integration tests + docs. * feat(data,api): seeder phase 2 chunk c1 — orchestration + endpoints (#92) extend GenerateParams with 5 enable flags + channel_mix / lifecycle / bundle / markdown / lead-time fields; channel_mix validator enforces the SQL allow-list and at least one positive weight. Service layer translates the new params into ChannelConfig / LifecycleConfig / BundleConfig / MarkdownConfig / LeadTimeConfig overrides. DataSeeder.generate_full now wires LifecycleGenerator + BundleGenerator + MarkdownGenerator + ReplenishmentGenerator + ChannelConfig. Product lifecycle dates are fetched alongside base_price in a single query and threaded into SalesDailyGenerator. A new _normalize_promotion_records helper enforces a uniform key set across the mixed pct_off / bundle / bogo / markdown promo records so the bulk pg_insert builds a valid multi-row VALUES clause. delete_data drops replenishment_event first (leaf table). verify_data_integrity gains 3 Phase 2 invariants: bundle member-ID consistency, lifecycle date ordering, replenishment fill rate. 
append_data mirrors the new return signature and fetches lifecycle dates from existing products.

new endpoints: GET /seeder/channels returns the SQL allow-list; GET /dimensions/products/{id}/lifecycle-curve returns the reference demand-multiplier curve via LifecycleGenerator.multiplier_for, using default LifecycleConfig ramp parameters and the product's own launch_date / discontinue_date.

SeederStatus + SeederResult both grow a replenishment_events count. disabled-path regression invariant preserved: every Phase 2 flag defaults off and consumes zero rng when off.

* feat(data,docs): seeder phase 2 chunk c2 — integration tests + docs (#92)

test_phase2_integration.py covers the disabled-path regression (no Phase 2 rows when toggles are off), per-feature enabled tests (lifecycle populates dates, bundles convert promotions with bundle_member_product_ids non-NULL, markdowns can emit rows when lifecycle is also on, replenishment respects received_qty <= ordered_qty, multichannel writes distinct channels), full-on verify_data_integrity returning an empty error list, and delete ordering that wipes replenishment_event without FK violations. Tests are marked @pytest.mark.integration so they only run against the real docker-compose Postgres.

docs/DATA-SEEDER.md adds a Phase 2 retail-depth section documenting all five toggles with example JSON payloads, the two new endpoints (GET /seeder/channels, GET /dimensions/products/{id}/lifecycle-curve), and three new Data Integrity checks.

* feat(release): trigger v0.2.8 release for seeder phases 1+2 (#98) (#99)

* feat(api,ui): expose seeder date range and scale controls (#82) (#83)

Surface the existing GenerateParams knobs in the admin Data Seeder panel (scenario, date range, store/product counts, seed, sparsity) so operators no longer have to drop to the CLI to seed a different year. Form state persists in localStorage and a reset-to-defaults button is provided.

Also fixes a latent service-layer bug: when overriding stores/products on a scenario preset, _build_config_from_params replaced the whole DimensionConfig and silently dropped scenario-customized store_regions, store_types, product_categories, and product_brands. Now uses dataclasses.replace so only the count fields change. Adds two regression tests covering holiday_rush + custom store/product counts.

* feat(docs,repo): land claude.md and docs/_base reference suite (#86) (#87)

Closes #86. Generated via the /w7_generating-claudemd skill in HEURISTIC_MODE (docs/_kB/repo-map/ KB not yet present).

Adds:
- CLAUDE.md (116 lines, 812 words; references .claude/rules/* and docs/_base/* via @imports — within the 150-line / 1800-word skill budget)
- docs/_base/ARCHITECTURE.md (system boundaries, components, comm patterns, deploy chain)
- docs/_base/API_CONTRACTS.md (HTTP surface across 12 slices + WebSocket + external integrations)
- docs/_base/RUNBOOKS.md (common incidents, release/rollback, WSL/pnpm traps from prior session HANDOFF)
- docs/_base/SECURITY.md (threat model, hard rules from security-patterns.md, scanning matrix)
- docs/_base/RULES.md (Change Authority Matrix + invariants + forbidden patterns, consolidated from .claude/rules/*)
- docs/_base/DOMAIN_MODEL.md (bounded contexts, aggregates, invariants, ubiquitous language)
- docs/_base/DEV_GUIDE.md (human-maintained stub — {FILL IN} markers for a maintainer to complete)
- docs/_base/REPO_MAP_INDEX.md (index across README, PHASE docs, ADRs, PRPs, .claude/, docs/_base/)
- docs/_base/PIPELINE_CONTRACT.md (CI/CD stages, merge gates, release flow)

.gitignore adjustments:
- Remove `CLAUDE.md` (was blocking the doc from being shared and from being read by Claude in fresh clones)
- Add `CLAUDE.local.md` (personal-prefs file — local-only by design)
- Stale `.claude` duplicates on lines 2 and 5 left for a separate cleanup PR (deduping won't change behavior since `.claude/` is already tracked)

Re-run the skill after a future mapping-repo-context run to drop the remaining 5 [UNVERIFIED] meta-flags.

* feat(data,api): seeder Phase 1 — exogenous signals, multi-seasonality, changepoints, returns, substitution (#88)

* fix(ci): pin third-party github actions by sha (#84)

Closes #84. Per .claude/rules/security-patterns.md: "Pin third-party GitHub Actions by full 40-char SHA"; first-party actions/* may use major-version.

Pinned (third-party):
- googleapis/release-please-action@v5 → @45996ed1f6d02564a971a2fa1b5860e934307cf7 # v5.0.0
- astral-sh/setup-uv@v7 (×8 across all five workflows) → @37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
- github/codeql-action/upload-sarif@v4 → @c6f931105cb2c34c8f901cc885ba1e2e259cf745 # v4.34.0

Left as major-tag (first-party actions/* — rule-permitted):
- actions/checkout@v6
- actions/upload-artifact@v7

Dependabot watches .github/workflows/ weekly and will bump these forward.

* chore(repo): gitignore local session artifacts (#90)

* fix(ci): pin uv run with --frozen to stop transient resolution failures (#95) (#96)

every uv run invocation in ci.yml, schema-validation.yml, and phase-snapshot.yml now uses --frozen. without it, uv re-resolves the dependency graph at command time and crashes when a freshly published pydantic-ai-slim version's [mistral] extra requires a mistralai version that does not yet exist on PyPI — observed on PR #93's most recent push where all five blocking CI jobs went red 75 minutes after a green run on the same branch with the same lockfile.

dependency-check.yml's pip-audit calls deliberately retain the re-resolve behavior; that workflow's purpose is to pick up newly published vulnerabilities.

uv sync --frozen --all-extras --dev was already in place to install the lock; this patch propagates the same intent to every subsequent uv run.
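
The two CI rules in the fix(ci) commits above (every `uv run` carries `--frozen`; third-party actions are pinned by a full 40-char SHA while first-party `actions/*` may stay on a major tag) lend themselves to a mechanical check. The following is only a sketch: `lint_workflow` is a hypothetical helper, not something in this repository, it scans raw workflow text line by line instead of parsing YAML, and a real check would need to exempt dependency-check.yml, which re-resolves on purpose.

```python
import re

# Hypothetical lint helper; the names and regexes are illustrative assumptions.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def lint_workflow(text: str) -> list[str]:
    """Return human-readable findings for the text of one workflow file."""
    findings: list[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Rule 1: a `uv run` on this line must also carry --frozen.
        if re.search(r"\buv run\b", line) and "--frozen" not in line:
            findings.append(f"line {lineno}: uv run without --frozen")
        # Rule 2: third-party actions must be pinned by a 40-char SHA.
        match = USES_RE.match(line)
        if match:
            action, ref = match.groups()
            first_party = action.startswith("actions/")
            if not first_party and not SHA_RE.match(ref):
                findings.append(f"line {lineno}: {action} not pinned by SHA")
    return findings


sample = """\
steps:
  - uses: actions/checkout@v6
  - uses: astral-sh/setup-uv@v7
  - uses: astral-sh/setup-uv@37802adc94f370d6bfd71619e3f0bf239e1f3b78 # v7.6.0
  - run: uv run pytest
  - run: uv run --frozen pytest
"""
for finding in lint_workflow(sample):
    print(finding)
```

On the sample, only the tag-pinned `setup-uv` step and the bare `uv run pytest` step are reported; the first-party checkout and the SHA-pinned step pass.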
* feat(data,db): seeder phase 2 — retail-depth foundation + lifecycle generator (#92) (#93)

* feat(data,db): seeder phase 2 chunk A — retail-depth schema + configs (#92)

Lays the foundation for Phase 2 retail depth without changing any generator behaviour:

- Alembic migration a8b9c0d1e234 adds sales_daily.channel (NOT NULL, server default 'in_store'), product lifecycle fields (lifecycle_stage, launch_date, discontinue_date, pack_size, subcategory), promotion kind discriminator with JSONB bundle_member_product_ids, and a new replenishment_event table. All additive; retail_standard rows are unchanged.
- ORM mirrors the schema, including a load-bearing JSONB(none_as_null=True) so the bundle-members CHECK fires.
- Five new config dataclasses (ChannelConfig, LifecycleConfig, BundleConfig, MarkdownConfig, LeadTimeConfig) wired to SeederConfig with disabled defaults so all existing scenarios produce byte-identical row counts.
- 25 integration tests cover the new CHECK + nullability constraints; 8 unit tests guard the config defaults + regression invariant across every ScenarioPreset.

* feat(data): seeder phase 2 chunk B (1/5) — product lifecycle generator (#92)

First slice of Phase 2 generators. Strict regression invariant: with ``LifecycleConfig.enable=False`` (default) ProductGenerator's output and rng-draw sequence are byte-identical to pre-Phase-2.

- ProductGenerator gains optional ``lifecycle_config`` + ``date_range`` parameters. When enabled, each product row carries ``subcategory``, ``pack_size``, ``launch_date``, ``discontinue_date``, ``lifecycle_stage``.
- New ``LifecycleGenerator`` (pure compute, no DB) computes per-(product, date) demand multipliers across intro/growth/maturity/decline/discontinued segments. Disabled path returns 1.0 without touching rng.
- 14 new unit tests cover the regression invariant + each ramp segment + discontinue override + reproducibility under enabled mode.
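
To make the per-(product, date) multiplier concrete, here is a minimal sketch of a piecewise lifecycle curve. The stage names and the zero-before-launch / zero-after-discontinue behaviour come from the commit messages; everything else is an assumption for illustration: the `LifecycleSketchConfig` fields, the ramp lengths and levels, the free-function signature, and treating the discontinue date itself as already discontinued.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LifecycleSketchConfig:
    """Hypothetical stand-in for LifecycleConfig; fields and values are assumptions."""
    enable: bool = False
    intro_days: int = 30        # flat low-demand introduction
    growth_days: int = 60       # linear ramp from intro level up to 1.0
    decline_days: int = 45      # linear ramp down ahead of discontinuation
    intro_level: float = 0.3
    decline_floor: float = 0.4


def multiplier_for(cfg, day, launch, discontinue):
    """Demand multiplier for one (product, date); deterministic, no rng."""
    if not cfg.enable or launch is None:
        return 1.0  # disabled path, or product without lifecycle attributes
    if day < launch:
        return 0.0  # not launched yet
    if discontinue is not None and day >= discontinue:
        return 0.0  # discontinue overrides every other segment
    age = (day - launch).days
    if age < cfg.intro_days:                                  # intro
        mult = cfg.intro_level
    elif age < cfg.intro_days + cfg.growth_days:              # growth
        frac = (age - cfg.intro_days) / cfg.growth_days
        mult = cfg.intro_level + frac * (1.0 - cfg.intro_level)
    else:                                                     # maturity
        mult = 1.0
    if discontinue is not None:                               # decline
        days_left = (discontinue - day).days
        if days_left <= cfg.decline_days:
            ramp = days_left / cfg.decline_days
            mult = min(mult, cfg.decline_floor + ramp * (1.0 - cfg.decline_floor))
    return mult


cfg = LifecycleSketchConfig(enable=True)
launch, discontinue = date(2025, 1, 1), date(2025, 12, 31)
for day in (date(2024, 12, 31), date(2025, 1, 10), date(2025, 6, 1), date(2025, 12, 20)):
    print(day, round(multiplier_for(cfg, day, launch, discontinue), 3))
```

Because the function takes no rng, the disabled path trivially satisfies the "returns 1.0 without touching rng" invariant described above.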

Remaining chunk B work (next commits on this branch):
- BundleGenerator (BOGO + bundle promotions)
- MarkdownGenerator (clearance pricing)
- ReplenishmentGenerator (lead-time-driven replenishment_event rows)
- SalesDailyGenerator channel split + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (2/5) — bundle/BOGO generator (#92)

Second slice of Phase 2 generators. Same regression invariant as B 1/5: with ``BundleConfig.enable=False`` (default) ``BundleGenerator.apply`` leaves both the promotion list and the rng state byte-identical.

- New ``BundleGenerator`` (pure compute, no DB) wraps ``PromotionGenerator``'s output: per-promo ``bundle_probability`` chance to convert to ``kind='bundle'`` or ``kind='bogo'`` (split by ``bogo_share_within_bundles``), drawing 2–``max_bundle_size`` member product IDs (host excluded) and a discount in ``[bundle_discount_pct_min, bundle_discount_pct_max]`` quantized to ``Numeric(5, 4)``. ``discount_amount`` is cleared on converted rows to keep the row internally consistent with the new ``discount_pct``.
- Locked rng order per converted promo: ``random()`` (convert?) → ``random()`` (bogo?) → ``randint()`` (n_members) → ``sample()`` (members) → ``uniform()`` (discount). Per-host pool-too-small skip happens before any rng draw so the stream stays stable across runs where only the product pool shrinks.
- 18 new unit tests cover the regression invariant (no mutation, no rng consumption) + kind allow-list + member-pool sourcing + count + discount range + BOGO/bundle split at extremes + reproducibility + best-effort skip for small pools + config validation.
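
The locked draw order and the skip-before-draw rule above can be sketched as follows. This is an illustration under assumptions, not the repository's `BundleGenerator`: the `BundleSketchConfig` defaults, the dict-shaped promo records, the free function `apply_bundles`, counting members without the host, and sorting the member IDs are all hypothetical choices.

```python
import random
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BundleSketchConfig:
    """Hypothetical stand-in for BundleConfig; the default values are assumptions."""
    enable: bool = False
    bundle_probability: float = 0.2
    bogo_share_within_bundles: float = 0.5
    max_bundle_size: int = 4
    bundle_discount_pct_min: float = 0.10
    bundle_discount_pct_max: float = 0.30


def apply_bundles(promos, product_ids, cfg, rng):
    """Return promos with some converted to bundle/BOGO; input dicts are not mutated."""
    if not cfg.enable:
        return promos  # disabled path: same list back, zero rng draws
    out = []
    for promo in promos:
        pool = [pid for pid in product_ids if pid != promo["product_id"]]
        if len(pool) < 2:
            out.append(promo)  # pool too small: skip before any rng draw
            continue
        if rng.random() >= cfg.bundle_probability:  # draw 1: convert?
            out.append(promo)
            continue
        is_bogo = rng.random() < cfg.bogo_share_within_bundles  # draw 2: bogo?
        n_members = rng.randint(2, min(cfg.max_bundle_size, len(pool)))  # draw 3
        members = rng.sample(pool, n_members)  # draw 4
        pct = rng.uniform(cfg.bundle_discount_pct_min, cfg.bundle_discount_pct_max)  # draw 5
        out.append({
            **promo,
            "kind": "bogo" if is_bogo else "bundle",
            "bundle_member_product_ids": sorted(members),
            # Four decimal places, matching a Numeric(5, 4) column.
            "discount_pct": Decimal(str(pct)).quantize(Decimal("0.0001")),
            "discount_amount": None,  # cleared so the row agrees with discount_pct
        })
    return out


promos = [{"product_id": 1, "kind": "pct_off", "discount_pct": None,
           "discount_amount": Decimal("2.00")}]
cfg = BundleSketchConfig(enable=True, bundle_probability=1.0)
print(apply_bundles(promos, [1, 2, 3, 4], cfg, random.Random(42)))
```

Doing the pool-size check before the first `random()` call is what keeps the stream aligned for every other promo when only the product pool shrinks.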

Remaining chunk B work:
- MarkdownGenerator (clearance pricing — needs Open Q on inventory age coupling resolved before starting)
- ReplenishmentGenerator (lead-time-driven replenishment_event rows)
- SalesDailyGenerator channel split + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (3/5) — markdown generator (#92)

Third slice of Phase 2 generators. Same regression invariant as B 1/5 and B 2/5: with ``MarkdownConfig.enable=False`` (default) the generator emits empty containers and consumes zero rng state.

- New ``MarkdownGenerator`` (pure compute, no DB) emits ``Promotion(kind='markdown')`` rows + companion ``PriceHistory`` drop rows + a ``markdown_dates`` lookup keyed by ``(store_id, product_id)`` for the future ``SalesDailyGenerator`` lift integration in chunk B 5/5.
- Two triggers ship in this slice:
  - ``lifecycle_decline`` — chain-wide markdown (``store_id=None``) starting on the first date a product enters the ``decline`` stage according to a passed-in ``LifecycleGenerator``. Skips products without lifecycle attrs; emits no rows when lifecycle is disabled.
  - ``stockout_risk`` — per-``(store, product)`` markdown ending the day before each observed stockout, lasting ``markdown_duration_days`` days, clamped to the seeded range start. Overlapping windows are deduped within each ``(store, product)`` series.
- ``trigger='age_days'`` is deferred — raises ``NotImplementedError`` pointing at issue #94 (follow-up). The default trigger remains ``lifecycle_decline`` so scenarios that just flip the enable bit still produce meaningful output.
- Even the enabled path is fully deterministic (no rng draws). The ``rng`` constructor parameter is kept for API consistency with peer Phase 2 generators in case future variants need randomness.
- 21 new unit tests cover the regression invariant + lifecycle_decline correctness (chain-wide, skipping missing lifecycle, clamp-to-range, no decline = no output) + stockout_risk correctness (per-store, end-day-before-stockout, overlap dedupe, clamp-to-start, unknown product, dict-order independence) + age_days NotImplementedError + config validation (depth bounds, duration bounds).

Remaining chunk B work:
- ReplenishmentGenerator (lead-time-driven replenishment_event rows)
- SalesDailyGenerator channel + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (4/5) — replenishment generator (#92)

Fourth slice of Phase 2 generators. Same regression invariant: with ``LeadTimeConfig.enable=False`` (default) the generator returns ``[]`` and consumes zero rng state.

- New ``ReplenishmentGenerator`` (pure compute, no DB) emits ``replenishment_event`` dicts. Per ``(store, product)`` it places a PO every ``order_frequency_days`` starting at ``dates[0]``. Each PO consumes two locked rng draws: ``gauss(mean_lead_time_days, lead_time_sigma_days)`` clamped to ``>= 0`` → ``gauss(fill_rate_mean, fill_rate_sigma)`` clamped to ``[0, 1]``. ``ordered_qty = base_demand * (order_frequency_days + safety_stock_days)``; ``received_qty = round(ordered_qty * fill_rate)`` defensively clamped to ``[0, ordered_qty]``.
- Receipts whose ``date_received = date_placed + lead_time_days`` fall past ``dates[-1]`` are dropped to keep the FK to ``calendar`` valid.
- Sorted iteration over ``(store_id, product_id)`` makes the rng stream stable regardless of input ordering.
- 21 new unit tests cover the regression invariant + record shape + ordered_qty formula + dates-within-range + reproducibility + input-order independence + extreme fill rates (zero/full) + zero lead time + output sort order + 7 config-validation cases.
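
The purchase-order loop described above can be sketched in a few lines. The formulas, clamps, draw order, and record field names follow the commit message; the rest is assumed for illustration: `cfg` is a plain dict standing in for `LeadTimeConfig`, `base_demand` is a per-pair units-per-day mapping, `dates` is a gap-free daily range, the fractional lead time is rounded to whole days, and both draws are taken before a late receipt is dropped.

```python
import random
from datetime import date, timedelta


def generate_replenishment(dates, base_demand, cfg, rng):
    """Sketch of the PO loop; `cfg` is a plain dict standing in for LeadTimeConfig."""
    if not cfg["enable"]:
        return []  # disabled path: no rows and zero rng draws
    cycle = cfg["order_frequency_days"]
    events = []
    # Sorted keys keep the rng stream independent of dict insertion order.
    for store_id, product_id in sorted(base_demand):
        ordered = base_demand[(store_id, product_id)] * (cycle + cfg["safety_stock_days"])
        for placed in dates[::cycle]:  # one PO per cycle, starting at dates[0]
            # Two draws per PO, always in this order.
            lead = max(0.0, rng.gauss(cfg["mean_lead_time_days"], cfg["lead_time_sigma_days"]))
            fill = min(1.0, max(0.0, rng.gauss(cfg["fill_rate_mean"], cfg["fill_rate_sigma"])))
            lead_days = round(lead)  # whole-day rounding is an assumption
            received_on = placed + timedelta(days=lead_days)
            if received_on > dates[-1]:
                continue  # receipt would fall outside the seeded calendar
            received = min(ordered, max(0, round(ordered * fill)))
            events.append({
                "store_id": store_id,
                "product_id": product_id,
                "date_placed": placed,
                "date_received": received_on,
                "lead_time_days": lead_days,
                "ordered_qty": ordered,
                "received_qty": received,
            })
    return events


dates = [date(2025, 1, 1) + timedelta(days=i) for i in range(60)]
cfg = {"enable": True, "order_frequency_days": 7, "safety_stock_days": 3,
       "mean_lead_time_days": 5, "lead_time_sigma_days": 2,
       "fill_rate_mean": 0.9, "fill_rate_sigma": 0.1}
events = generate_replenishment(dates, {(2, 1): 10, (1, 1): 4}, cfg, random.Random(7))
print(len(events), events[0])
```

Taking both draws before the range check means a dropped receipt still advances the stream by exactly two draws, so later POs see the same values whether or not an earlier one was discarded.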

Downstream coupling: a follow-up commit will adjust ``InventorySnapshotGenerator`` to consume these events so realistic stockout windows emerge between scheduled receipts. This slice only emits the rows.

Remaining chunk B work:
- SalesDailyGenerator channel split + lifecycle multiplier integration

* feat(data): seeder phase 2 chunk B (5a/6) — lifecycle multiplier into sales (#92)

First half of B 5/5 (split per Open Q3 — channel integration deferred until semantics are confirmed). Wires the LifecycleGenerator multiplier into ``SalesDailyGenerator`` while preserving the byte-identical regression invariant.

- ``SalesDailyGenerator.__init__`` gains optional ``lifecycle: LifecycleGenerator | None = None``. Defaults preserve pre-Phase-2 behavior for every existing caller.
- ``SalesDailyGenerator.generate`` gains optional ``product_lifecycle_data: dict[int, tuple[date | None, date | None]] | None = None``. Missing or unspecified entries fall back to ``(None, None)`` so the multiplier evaluates to 1.0.
- ``_compute_demand`` gains ``product_discontinue_date`` and applies the lifecycle multiplier guarded by ``self.lifecycle is not None and self.lifecycle.enabled``. The pre-Phase-2 ``new_product_ramp_days`` linear ramp is suppressed when lifecycle is enabled, preventing double-attenuation at launch.
- 10 new tests cover the regression invariant (no kwargs / explicit None / disabled config / no rng consumption when disabled), enabled correctness (pre-launch zero, post-discontinue zero, intro < maturity, decline < maturity), legacy-ramp suppression (no double-apply when lifecycle on; still fires when lifecycle is None), and the lookup fallback (missing product_id evaluates to 1.0).

The B 5b/6 channel integration is held until Open Q3 resolves between (b) dominant per row, (c) random per row from channel_mix weights, or (d) aggregated with primary channel column.
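
The guard that chooses between the lifecycle multiplier and the legacy launch ramp can be illustrated in isolation. This is a sketch, not the repository's `_compute_demand`: `FlatLifecycle` is a hypothetical stub exposing only `enabled` and `multiplier_for`, and the legacy ramp formula `(age + 1) / new_product_ramp_days` is an assumed shape for a linear ramp.

```python
from datetime import date


class FlatLifecycle:
    """Hypothetical stub exposing the two members the guard relies on."""

    def __init__(self, enabled, value):
        self.enabled = enabled
        self.value = value

    def multiplier_for(self, day, launch, discontinue):
        return self.value  # a real generator would compute a stage-dependent curve


def launch_adjusted_demand(base, day, launch, discontinue,
                           lifecycle=None, new_product_ramp_days=30):
    """Apply either the lifecycle multiplier or the legacy ramp, never both."""
    if lifecycle is not None and lifecycle.enabled:
        # Lifecycle on: the legacy ramp is suppressed to avoid double attenuation.
        return base * lifecycle.multiplier_for(day, launch, discontinue)
    if launch is not None and new_product_ramp_days > 0:
        age = (day - launch).days
        if 0 <= age < new_product_ramp_days:
            return base * (age + 1) / new_product_ramp_days  # assumed ramp shape
    return base


day, launch = date(2025, 1, 10), date(2025, 1, 1)
print(launch_adjusted_demand(100.0, day, launch, None))                             # legacy ramp
print(launch_adjusted_demand(100.0, day, launch, None, FlatLifecycle(True, 0.5)))   # lifecycle only
print(launch_adjusted_demand(100.0, day, launch, None, FlatLifecycle(False, 0.5)))  # legacy ramp again
```

Passing `lifecycle=None` and passing a disabled lifecycle take the same branch, which is the property the regression tests listed above pin down.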

Remaining Phase 2 work:
- B 5b/6 — SalesDailyGenerator channel split (pending Q3)
- Chunk C — DataSeeder orchestration + endpoints + integration tests

* feat(data): seeder phase 2 chunk B (5b/6) — channel split into sales (#92)

Second half of B 5/5. Resolves Open Q3 with semantic (c): each emitted ``sales_daily`` row gets its ``channel`` drawn from ``channel_mix`` via ``rng.choices``, preserving the existing ``(date, store, product)`` grain.

- ``SalesDailyGenerator.__init__`` gains optional ``channels: ChannelConfig | None = None``. Disabled / unset path consumes zero new rng draws and emits rows without a ``channel`` key (DB ``server_default='in_store'`` applies), preserving the byte-identical regression invariant.
- ``generate()`` runs ``_validate_channels()`` once at entry. Rejects channels outside the SQL allow-list, negative weights, all-zero mix, negative ``online_promo_uplift``, or ``online_substitution_to_instore`` outside ``[0, 1]``.
- Per emitted row (after stockout-skip): ``_maybe_apply_channel`` builds the effective mix (``online_substitution_to_instore`` shifts weight from in_store → online during promos), draws a channel via ``rng.choices``, and applies ``online_promo_uplift`` to online rows on promo dates. One rng draw per emitted row.
- 19 new tests cover regression invariant (no kwarg, disabled config, no rng consumption) + channel distribution (subset of mix keys, single-channel deterministic, dominant most common, zero-weight never chosen) + online promo uplift (fires for online + promo, not for in_store) + substitution shift (more online during promo, zero substitution = no shift) + 6 validation cases + row shape (channel key present/absent).

Phase 2 chunk B complete (5/6 paired slices + 1/6 follow-up #94). Next: Chunk C — DataSeeder orchestration + new endpoints + integration tests + docs.
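
A minimal sketch of the per-row channel draw and its validation follows. The validation rules and the use of `rng.choices` come from the commit message; the details are assumptions: the allow-list here contains only the two channel names the text mentions, substitution moves a fraction of the in_store weight to online, uplift multiplies quantity by `1 + online_promo_uplift`, and the function names are hypothetical.

```python
import random

# Assumption: the real SQL allow-list may contain more channels than these two.
ALLOWED_CHANNELS = {"in_store", "online"}


def validate_channels(mix, online_promo_uplift, substitution):
    """Raise ValueError for any configuration the draw below cannot handle."""
    if not mix or set(mix) - ALLOWED_CHANNELS:
        raise ValueError("channel_mix must use only allow-listed channels")
    if any(weight < 0 for weight in mix.values()):
        raise ValueError("channel weights must be non-negative")
    if sum(mix.values()) <= 0:
        raise ValueError("at least one channel weight must be positive")
    if online_promo_uplift < 0:
        raise ValueError("online_promo_uplift must be non-negative")
    if not 0.0 <= substitution <= 1.0:
        raise ValueError("online_substitution_to_instore must be in [0, 1]")


def pick_channel(rng, mix, is_promo, substitution):
    """Draw one channel from the effective mix; consumes exactly one rng draw."""
    weights = dict(mix)
    if is_promo and "in_store" in weights and "online" in weights:
        moved = weights["in_store"] * substitution  # shift in_store -> online
        weights["in_store"] -= moved
        weights["online"] += moved
    names = sorted(weights)  # fixed order keeps the draw reproducible
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def apply_channel(row, rng, mix, is_promo, online_promo_uplift, substitution):
    """Return a copy of `row` with a channel and, where applicable, the uplift."""
    channel = pick_channel(rng, mix, is_promo, substitution)
    quantity = row["quantity"]
    if channel == "online" and is_promo:
        quantity = round(quantity * (1.0 + online_promo_uplift))
    return {**row, "channel": channel, "quantity": quantity}


mix = {"in_store": 0.7, "online": 0.3}
validate_channels(mix, online_promo_uplift=0.5, substitution=0.2)
rng = random.Random(0)
print([apply_channel({"quantity": 10}, rng, mix, True, 0.5, 0.2)["channel"] for _ in range(5)])
```

Keeping validation separate from the draw mirrors the "validate once at entry" structure described above, so the per-row path stays at one rng call.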

* feat(data,api): seeder phase 2 chunk c1 — orchestration + endpoints (#92)

extend GenerateParams with 5 enable flags + channel_mix / lifecycle / bundle / markdown / lead-time fields; channel_mix validator enforces the SQL allow-list and at least one positive weight. Service layer translates the new params into ChannelConfig / LifecycleConfig / BundleConfig / MarkdownConfig / LeadTimeConfig overrides.

DataSeeder.generate_full now wires LifecycleGenerator + BundleGenerator + MarkdownGenerator + ReplenishmentGenerator + ChannelConfig. Product lifecycle dates are fetched alongside base_price in a single query and threaded into SalesDailyGenerator. A new _normalize_promotion_records helper enforces a uniform key set across the mixed pct_off / bundle / bogo / markdown promo records so the bulk pg_insert builds a valid multi-row VALUES clause.

delete_data drops replenishment_event first (leaf table). verify_data_integrity gains 3 Phase 2 invariants: bundle member-ID consistency, lifecycle date ordering, replenishment fill rate. append_data mirrors the new return signature and fetches lifecycle dates from existing products.

new endpoints: GET /seeder/channels returns the SQL allow-list; GET /dimensions/products/{id}/lifecycle-curve returns the reference demand-multiplier curve via LifecycleGenerator.multiplier_for, using default LifecycleConfig ramp parameters and the product's own launch_date / discontinue_date.

SeederStatus + SeederResult both grow a replenishment_events count. disabled-path regression invariant preserved: every Phase 2 flag defaults off and consumes zero rng when off.
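
The key-normalization step mentioned above is easy to picture with plain dicts. A minimal sketch, assuming a hypothetical column tuple `PROMO_KEYS`, a public function name without the leading underscore, and `pct_off` as the fill-in for records that carry no `kind`; the real helper and table columns may differ.

```python
# Hypothetical column set; the real promotion table may have other columns.
PROMO_KEYS = ("store_id", "product_id", "kind", "discount_pct",
              "discount_amount", "bundle_member_product_ids")


def normalize_promotion_records(records, keys=PROMO_KEYS, default_kind="pct_off"):
    """Return copies of `records` that all share exactly `keys`, in the same order."""
    normalized = []
    for record in records:
        extra = set(record) - set(keys)
        if extra:
            raise KeyError(f"unexpected promotion keys: {sorted(extra)}")
        row = {key: record.get(key) for key in keys}  # missing keys become None
        if row["kind"] is None:
            row["kind"] = default_kind  # assumption: untagged rows are plain pct_off promos
        normalized.append(row)
    return normalized


mixed = [
    {"store_id": 1, "product_id": 7, "discount_pct": 0.1},
    {"store_id": None, "product_id": 8, "kind": "bundle",
     "bundle_member_product_ids": [2, 3]},
]
for row in normalize_promotion_records(mixed):
    print(row)
```

With every dict carrying the same keys in the same order, a multi-row insert can render one column list and one tuple per row.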

* feat(data,docs): seeder phase 2 chunk c2 — integration tests + docs (#92)

test_phase2_integration.py covers the disabled-path regression (no Phase 2 rows when toggles are off), per-feature enabled tests (lifecycle populates dates, bundles convert promotions with bundle_member_product_ids non-NULL, markdowns can emit rows when lifecycle is also on, replenishment respects received_qty <= ordered_qty, multichannel writes distinct channels), full-on verify_data_integrity returning an empty error list, and delete ordering that wipes replenishment_event without FK violations. Tests are marked @pytest.mark.integration so they only run against the real docker-compose Postgres.

docs/DATA-SEEDER.md adds a Phase 2 retail-depth section documenting all five toggles with example JSON payloads, the two new endpoints (GET /seeder/channels, GET /dimensions/products/{id}/lifecycle-curve), and three new Data Integrity checks.

* feat(release): trigger v0.2.8 release for seeder phases 1+2 (#98)

* chore(main): release 0.2.8 (#100)

w7-mgfcode added 7 commits May 11, 2026 20:42

feat(data,api): seeder Phase 1 — exogenous signals, multi-seasonality, changepoints, returns, substitution (#88)

37e3a2b

chore(repo): gitignore local session artifacts (#90)

1df38a1

sourcery-ai Bot reviewed May 12, 2026

w7-mgfcode merged commit 7f97c5e into main May 12, 2026
13 of 14 checks passed

This was referenced May 12, 2026

release: cut v0.2.8 for seeder phases 1+2 + ci/docs hardening #98

Closed

feat(release): trigger v0.2.8 release for seeder phases 1+2 (#98) #99

Merged

This was referenced May 18, 2026

feat: release v0.2.10 — demo showcase page + e2e pipeline #134

Merged

feat: release v0.2.11 — visualization fixes, job picker, demo showcase #158

Merged

w7-mgfcode merged 7 commits into main from dev

w7-mgfcode commented May 12, 2026 • edited by coderabbitai Bot

sourcery-ai Bot left a comment

coderabbitai Bot commented May 12, 2026 • edited

Review failed
