
release: promote dev to main for v0.2.20#326

Merged
w7-mgfcode merged 41 commits into main from dev on May 31, 2026

Conversation

@w7-mgfcode
Owner

Summary

Promotes the #324 showcase_rich safer-promote → scenario-replay cascade fix and the showcase manual demo guide from dev to main for the next release cut (v0.2.20). dev is 41 commits ahead of main; this PR cuts the accumulated work, with the #324 release-blocker as the headline.

Key landed work

  • Scenario replay now resolves the champion via ctx.winning_run_id instead of the swapped demo-production alias.
  • safer_promote_flow writes real-shape, parseable artifact_uri values (no more unparseable placeholder).
  • Alias-restore safeguard (_restore_demo_alias_after_failure) added on pipeline failure so demo-production is never left on the worse-WAPE run.
  • scenario_simulate_and_save E2E tolerance removed — the step must now pass.
  • Showcase manual demo guide added (docs/user-guide/showcase-manual-demo-guide.md).
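
The alias-restore safeguard can be pictured roughly as follows. This is a minimal sketch, not the code in pipeline.py: `run_with_alias_restore`, the in-memory `aliases` dict, and the step callables are hypothetical stand-ins for the real registry calls. Only the alias name demo-production and the restore-on-failure behaviour come from the summary above.

```python
from collections.abc import Callable

DEMO_ALIAS = "demo-production"

# Hypothetical step signature: each step may mutate the alias table.
Step = Callable[[dict[str, str]], None]


def run_with_alias_restore(aliases: dict[str, str], steps: list[Step]) -> None:
    """Run steps in order; if any step fails, restore the demo alias.

    The alias is pointed back at the target it had before the run started,
    then the original exception is re-raised so the failure stays visible.
    """
    original_target = aliases.get(DEMO_ALIAS)
    try:
        for step in steps:
            step(aliases)
    except Exception:
        if original_target is None:
            # The alias did not exist before the run, so remove it again.
            aliases.pop(DEMO_ALIAS, None)
        else:
            aliases[DEMO_ALIAS] = original_target
        raise
```

Re-raising after the restore keeps the pipeline failure visible to the caller while still guaranteeing the alias never stays on the swapped run.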

Validation

  • ruff check — passed
  • ruff format --check — passed
  • demo unit tests — passed (58)
  • non-integration tests (pytest -m "not integration") — passed
  • integration test_run_demo_showcase_rich_full_epic — passed (live docker Postgres)
  • mypy app/ — only pre-existing xgboost stub baseline errors (none in changed files)

Issue

Closes #324 once this PR merges to main through the release flow (GitHub auto-closes linked issues on merge to the default branch).

⚠️ Merge instruction

Merge this PR via the GitHub web UI — NOT gh pr merge --merge — to avoid the release-please merge-commit-subject trap (a conventional non-bumping subject makes release-please skip the bump). See docs/_base/RUNBOOKS.md § "release-please skipped the bump after a dev → main merge".

w7-mgfcode and others added 30 commits May 25, 2026 15:33
chore(repo): back-merge main → dev (v0.2.19)
Lands the docs-only Forecast Intelligence roadmap — 4 INITIAL docs + 3
PRPs, no production code. Dependency-chained execution: PRP-35 first,
PRP-36 + PRP-37 follow. Tracked by epic issue #295.

- INITIAL roadmap (A/B/C + index)
- PRP-35 Feature Frame V2 — V1 frozen, V2 ships as sibling builders,
  dispatch at service layer only, load-bearing leakage spec
- PRP-36 Model Zoo + Backtesting — new baselines, per-horizon-bucket
  metrics, comparable-runs with feature_frame_version key
- PRP-37 Interactive UI — partial-execution gates, shadcn@4.7.0 pin,
  per-component @radix-ui/react-X imports
docs(prp): add forecast intelligence planning docs
Bundles three carryover concerns from prior local demo work into one PR.

* fix(data) — PriceHistoryGenerator could emit a row with valid_to <
  valid_from when a change roll fired on the window's first day. That
  violates ck_price_history_valid_dates and crashed the seeder during
  ingest. The fix skips the degenerate row.

* feat(data) — three new local-host scripts that drive the public API
  to enrich the demo DB without raw SQL writes:
  - seed_phase2_only: re-runs Phase 2 generators (replenishment,
    exogenous, returns, lifecycle) against existing dimensions
  - seed_historical_activity: submits varied train/predict/backtest
    jobs across 2024-Q4 -> 2026-Q1 cutoffs through /jobs
  - seed_registry_from_jobs: walks completed train jobs, runs the
    canonical pending -> running -> success transition + alias stamps

* chore(repo) — uv.lock refreshes forecastlabai 0.2.18 -> 0.2.19
  to match the release-please-merged version bump.
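
The price-history fix in the first bullet can be illustrated with a small sketch. It assumes validity windows are built by closing each segment the day before the next change. `PriceRow` and `build_price_rows` are hypothetical names, not the generator's real API. The point is only that a change on the window's first day would otherwise produce a row with valid_to < valid_from.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class PriceRow:
    price: float
    valid_from: date
    valid_to: date


def build_price_rows(
    window_start: date,
    window_end: date,
    initial_price: float,
    changes: list[tuple[date, float]],
) -> list[PriceRow]:
    """Turn price-change events into non-overlapping validity windows.

    Assumes every change date lies inside [window_start, window_end].
    """
    rows: list[PriceRow] = []
    seg_start, price = window_start, initial_price
    for change_day, new_price in sorted(changes):
        seg_end = change_day - timedelta(days=1)
        # A change on seg_start would yield valid_to < valid_from; skip it.
        if seg_end >= seg_start:
            rows.append(PriceRow(price, seg_start, seg_end))
        seg_start, price = change_day, new_price
    rows.append(PriceRow(price, seg_start, window_end))
    return rows
```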

Excluded intentionally: alembic/a2b3c4d5e6f7 + rag/models.py — the
migration is self-marked "local-only demo" (truncates document_chunk,
drops HNSW index, hardcodes 2560 for qwen3) and would wipe any non-
qwen3 user's RAG corpus on upgrade. Stays uncommitted locally.
Three corrections to register_one and fetch_completed_train_jobs:

* pagination — `page * len(jobs) >= total` stops too early when the
  last page is partial. Switch to accumulated-count + short-page
  detection (exit when len(jobs) < page_size or len(out) >= total).
* model_path validation — empty / directory paths slipped through
  because Path("") resolves to cwd and Path.exists() returns True for
  directories. Require non-empty path and Path.is_file() for both the
  raw and cwd-relative candidates.
* duplicate detection — `r.status_code >= 400` blanket-swallowed
  registry downtime and validation errors as idempotent skips. Narrow
  the skip to HTTP 409 (the actual DuplicateRunError code per
  registry/routes.py:113) and raise RuntimeError on other 4xx / 5xx
  with the response body for diagnostics.

Python 3.12-only `def chunked[U](...)` syntax in seed_phase2_only.py
is intentional — `pyproject.toml:6` already pins `requires-python =
">=3.12"`.
…ools

feat(data,repo): local demo tooling + seeder price-history fix
Lands V2 feature-frame contract as additive, opt-in surface alongside frozen V1.
Training + scenarios + shared builders complete; backtesting V2 dispatch deferred
to follow-up tracked in #299. V1 callers unchanged.

- Shared layer: V2 manifest (38 default / 53 max columns), sidecars, row builders
- Training: TrainRequest gains feature_frame_version + feature_groups (opt-in)
- Scenarios: build_future_frame dispatches V1/V2 via bundle metadata
- 3 LOAD-BEARING leakage specs land alongside the V1 spec
- No Alembic migration (V2 reads existing tables, writes nothing)
- V1 bundles load/predict/scenario-simulate/backtest unchanged
Contract Refresh gate (PRP-36 Task 1) executed against dev @ f2bf7c8.
Patches PRP-36 to match what PRP-35 actually shipped vs what it assumed.

- Add PRPs/ai_docs/prp-35-final-contract-snapshot.md (one-off, authoritative)
- Item 7: backtesting V2 dispatch deferred to #299 (not in PRP-35 final scope)
- Task 8: re-scoped to V1 fold-path bucket metrics only; V2 lands with #299
- Item 10: DEFAULT_V2_GROUPS order corrected (CALENDAR at position 2)

No implementation code touched. PRP-36 execution remains gated until this lands.
docs(prp): refresh PRP-36 after Feature Frame V2
PRP-36 (Forecast Intelligence — Slice B). Promote the model layer from
"a regression model + three baselines" to a disciplined model zoo with
fair, leakage-safe comparison on top of the PRP-35 Feature Frame V2
contract.

Models (under model_factory + _MODEL_FAMILY_MAP):
- weighted_moving_average — linear or exponential weighting (always-on)
- seasonal_average — average of last N seasonal cycles, optional trim (always-on)
- trend_regression_baseline — Ridge over elapsed-day + dow/month one-hots (always-on)
- random_forest — sklearn RandomForestRegressor, n_jobs=1, gated by forecast_enable_random_forest
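
To make the linear versus exponential weighting of weighted_moving_average concrete, here is a minimal sketch. The exact formulas are assumptions (linear weights 1..n, exponential weights decay^age, both normalized to sum to 1), and the function names are illustrative rather than the ones in forecasting/models.py.

```python
def weighted_average_weights(
    window: int, strategy: str = "linear", decay: float = 0.5
) -> list[float]:
    """Return normalized weights, oldest observation first."""
    if window < 1:
        raise ValueError("window must be >= 1")
    if strategy == "linear":
        # The most recent observation gets the largest weight.
        raw = [float(i) for i in range(1, window + 1)]
    elif strategy == "exponential":
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must be in (0, 1)")
        # Weight shrinks by `decay` for every step further into the past.
        raw = [decay ** (window - 1 - i) for i in range(window)]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    total = sum(raw)
    return [w / total for w in raw]


def weighted_moving_average(
    history: list[float], window: int, strategy: str = "linear", decay: float = 0.5
) -> float:
    """Forecast the next value as a weighted mean of the last `window` points."""
    recent = history[-window:]
    weights = weighted_average_weights(len(recent), strategy, decay)
    return sum(w * y for w, y in zip(weights, recent))
```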

Backtesting metrics (additive — V1 fold path only):
- aggregated_metrics gains rmse alongside mae/smape/wape/bias
- FoldResult.horizon_bucket_metrics — per-bucket dict keyed by h_1_7 / h_8_14 / h_15_28 / h_29_plus (empty buckets dropped)
- ModelBacktestResult.bucketed_aggregated_metrics — per-bucket means across folds
- V2 backtesting dispatch remains DEFERRED to #299

Registry comparable-run rule:
- RegistryService._find_duplicate now distinguishes V1 vs V2 (runs with different feature_frame_version are NOT duplicates)
- New find_comparable_runs(grain, overlapping window, same V, status==SUCCESS)
- RunCreate.runtime_info_extras lets callers pin feature_frame_version + feature_groups
- RunResponse.feature_frame_version + feature_groups computed from runtime_info (legacy runs surface None)

Ops staleness:
- New StaleReason enum value FEATURE_FRAME_VERSION_MISMATCH — a V1 alias with a newer V2 comparable run reports this instead of NEWER_SUCCESS_RUN
- AliasHealth and ModelHealthEntry expose alias_feature_frame_version + comparable_run_feature_frame_version

Explainability:
- New explainers: WeightedMovingAverageExplainer, SeasonalAverageExplainer, TrendRegressionBaselineExplainer
- Factory + service plumb weight_strategy / decay / lookback_cycles / trim_outliers
- HGBR keeps raising FeatureImportanceUnavailableError (422 path unchanged)

Other:
- examples/forecasting/model_zoo_compare.py — read-only diagnostic that backtests every available model on a single grain and prints aggregate + per-bucket WAPE
- docs/_base/API_CONTRACTS.md, DOMAIN_MODEL.md, docs/optional-features/05 + 09 updated

Validation:
- ruff check / format clean
- mypy --strict / pyright --strict clean (3 mypy + 8 pyright pre-existing xgboost/lightgbm errors only; CI runs --all-extras)
- 1574 non-integration tests pass; load-bearing leakage specs unchanged
- alembic check — NO new migration (all new state rides existing JSONB columns)

Out of scope (deferred):
- V2 backtesting fold dispatch — #299
- PRP-37 UI / dashboard — Slice C
- /explain/forecast handler for random_forest — needs bundle reload, separate PRP
CodeRabbit review on PR #303 surfaced one bug-risk + one consistency
issue + one missing test + one doc typo + an overall refactor request.
All five addressed.

1. BUG-RISK — _run_feature_frame_version returned None for missing
   JSONB keys while _feature_frame_version_filter treats them as V=1.
   _alias_staleness compared None != 1 and spuriously surfaced
   FEATURE_FRAME_VERSION_MISMATCH for a legacy alias against an
   explicit-V=1 comparable run. Normalized the ops helper to return
   V=1 for missing keys (matches the registry filter contract). The
   schema-side RunResponse.feature_frame_version still surfaces None
   so UIs can distinguish "no V info" from "V=1".

2. REFACTOR — Extracted shared pure helpers in forecasting/models.py:
   - compute_weighted_average_weights
   - compute_seasonal_average_for_offset
   - build_trend_baseline_design_row
   The forecasters' fit/predict + the three new explainers now call
   them as the single source of truth. No more two-place drift risk
   when a default changes.

3. CONSISTENCY — SeasonalAverageExplainer.sample_dispersion now
   measures the same array the forecast was averaged from
   (post-trim when trim_outliers is on; raw otherwise). Description
   updated to match.

4. TESTING — Added test_invalid_min_samples_leaf_raises to round
   out RandomForestForecaster's constructor-validation branches.

5. TYPO — docs/optional-features/09-…governance.md uses the
   `stale_reason` field-name form (no hyphen) to match
   DOMAIN_MODEL.md / API_CONTRACTS.md.

Plus: two new ops tests pin the new V=1 normalization contract
(`_run_feature_frame_version_rejects_unsupported_value`,
`_alias_staleness_legacy_run_treated_as_v1_no_spurious_mismatch`).

Validation: ruff / mypy --strict / pyright --strict clean (same 3+8
pre-existing xgboost/lightgbm errors only). 1577 non-integration
tests pass (+3 new). Leakage specs unchanged.
…acktesting

feat(forecast): add model zoo and backtesting comparison
docs(prp): refresh prp37 after model zoo contracts
PRP-37 — Forecast Intelligence Slice C. Operator-facing surface for the
PRP-35 V2 feature contract and the PRP-36 model zoo, backtest buckets,
and V-aware ops fields. Backend untouched (per PRP-37); every visible
value is read from an existing backend response.

Surfaces:
- /visualize/forecast: family Tabs + model-type Select + V1/V2 Select +
  conditional feature-pack toggles + train submission.
- /visualize/backtest: per-horizon-bucket table + chart, RMSE tile,
  baseline-vs-feature-aware comparison table.
- /visualize/planner: scenario method badge.
- /visualize/batch: 5 sweep presets + multi-model x V1/V2 matrix picker.
- /explorer/run-detail: Feature frame panel (V1/V2 + per-group columns +
  per-column safety chips).
- /explorer/run-compare: Champion compatibility badge + Feature frame
  version row.
- /ops: stale-alias V mismatch chip, model-health explainer columns,
  safer Promote dialog (artifact verify + worse-WAPE ack + V-mismatch ack).

Adds 11 components under components/forecast-intelligence/, 1 chart
under components/charts/, 2 lib modules under lib/, with colocated
vitest tests for every component and helper. api.ts extended with
PRP-35/PRP-36 wire types (all Optional, additive). use-runs gains an
optional feature_frame_version param (not forwarded to the backend list
endpoint; no server-side filter exists).

Validation: pnpm tsc --noEmit + pnpm lint + pnpm test --run all clean
(202 frontend tests). Backend regression suite (forecasting +
backtesting + registry + ops, non-integration) 518 passed.
feat(ui): add interactive forecast intelligence UI (PRP-37) (#305)
PRP-37 introduced `trainFamily` as the train-card form state (useState at
L68), shadowing an existing PRP-31 derived `trainFamily` at L120 sourced
from `useJobFeatureMetadata`. Babel/Vite reject the file with `Cannot
redeclare block-scoped variable 'trainFamily'`; tsc reports TS2451 at
both sites. The forecast page does not mount.

Rename L120 to `loadedTrainFamily` (distinct semantics: derived from a
loaded predict job's training metadata, used by the `ModelFamilyBadge`
in the Model details collapsible). The L68 form state keeps the
`trainFamily` name.
…licate

fix(ui): rename duplicate trainFamily binding in forecast page
…g-lifecycle

feat(api,ui): showcase pipeline richer data and v2 foundation (#309)
…ing-artifacts

docs(docs): add rich showcase planning artifacts
PRP-39 — extend the showcase_rich demo pipeline with three new decision-
phase steps (champion_compat_compare, stale_alias_trigger,
safer_promote_flow) and a new portfolio phase (batch_preset). The
decision lifecycle now demonstrates V1-vs-V2 champion-compat verdicts,
the stale-alias V-mismatch chip on /ops, and the safer-Promote dialog
gates. The portfolio phase runs the quick_baseline_sweep preset (3
stores x 2 products x 3 baselines = 18 items) via /batch/forecasting.

Backend:
- app/features/demo/pipeline.py — 4 new step functions, PHASE_PORTFOLIO
  constant, BATCH_PRESET_QUICK_BASELINE_SWEEP_MODELS module constant,
  DemoContext additive fields (compat_compare_result,
  stale_alias_run_id, original_demo_alias_run_id, batch_id,
  batch_status), step_cleanup extension that restores the
  demo-production alias to its pre-swap target (R15).
- app/features/demo/tests/test_pipeline.py — 8 new unit tests (4 step
  functions, 2 skip paths, 2 cleanup scenarios) + extended canned
  responses for /ops/summary, /batch/forecasting, /registry/runs?...,
  /registry/aliases/{name}, /registry/compare/{a}/{b}; lockstep
  test_phase_table_showcase_rich expanded to 18 rows.
- tests/test_e2e_demo.py — new test_run_demo_showcase_rich_decision_
  portfolio integration test asserting the four new step events fire
  and R15 alias restoration completes.

Frontend:
- PHASE_DEFS.ts — appends 3 decision-phase rows + portfolio phase row;
  PHASE_ORDER + PHASE_LABEL extend with 'portfolio'.
- showcase.tsx — resolveInspectHref gains 4 new case arms targeting
  /explorer/runs/compare, /ops, and /visualize/batch/{batch_id}.
- demo-step-card.tsx — 4 new mini-summary chip-line components.
- demo-step-card.test.tsx (new) — 6 render tests covering chip-lines
  and Inspect button behaviour.
- PHASE_DEFS.test.ts + use-demo-pipeline.test.ts — extended to assert
  the new 18-step showcase_rich layout.

Docs:
- docs/_base/RUNBOOKS.md — 8 new failure-mode entries under the
  /showcase pipeline section covering the 4 new steps (skip / fail
  diagnostics, R15 cleanup recovery).

Drift resolutions (per PRPs/ai_docs/prp-39-contract-probe-report.md):
- D1 (compare envelope): champion_compat_compare derives compatible +
  comparable_reason client-side; mirrors the frontend
  computeCompatibility predicate.
- D2 (quick_baseline_sweep): preset expansion stays in the demo slice
  (Option A); no preset_id on BatchSubmitRequest.
- D3 (sync settle): /batch/forecasting normally returns terminal status
  on submit; the 90 s poll loop is a safety net.

WebSocket schema additive only — no StepEvent / DemoRunRequest field
changes. Relative-anchor phase insertion (PHASE_PORTFOLIO between
PHASE_DECISION and PHASE_VERIFY) keeps the slice merge-order
independent of PRP-40.
…tfolio-lifecycle

feat(api,ui): showcase pipeline decision + portfolio lifecycle (#316)
Adds two new phases to the in-process /showcase demo pipeline on the
showcase_rich scenario (PRP-40): planning (2 steps — scenario_simulate_and_save,
multi_plan_compare) and knowledge (3 steps — embedding_provider_probe,
rag_index_subset, rag_retrieve_probe). Both phases insert BEFORE the verify
phase via a relative anchor so PRP-39 (sibling, parallel) rebases cleanly.

Backend additive contracts:

- IndexProjectDocsRequest.path_prefix: str | None = None on
  app/features/rag/schemas.py — restricts the docs/ root scan to a sub-path
  with a path-traversal guard. Default None preserves wholesale-scan behavior.
- _parse_artifact_key + _embedding_provider_reachable helpers on
  app/features/demo/pipeline.py — R16 (scenarios.run_id is the artifact
  key, not model_run.run_id) and provider-presence-only probe per
  security-patterns.md.
- DemoContext gains scenario_artifact_key / price_cut_scenario_id /
  holiday_scenario_id / embedding_unreachable fields (None on demo_minimal).

Showcase_rich step count: 14 → 19. The knowledge phase SKIPs gracefully when
no embedding provider is reachable; pipeline still goes green.

Frontend mirrors the lockstep contract: PHASE_DEFS.ts ALL_STEPS gains five
new entries + PHASE_ORDER + PHASE_LABEL add planning / knowledge;
showcase.tsx resolveInspectHref switch deep-links the new step cards into
/visualize/planner (with optional ?scenario_id), /knowledge, and /admin.
Five new mini-summary helpers on demo-step-card.tsx render the per-step
detail strips.

Vertical-slice rule preserved — demo never imports scenarios / rag /
config / registry; every call goes through httpx.ASGITransport.

Docs: docs/_base/API_CONTRACTS.md notes the new path_prefix field and the
two new phases; docs/_base/RUNBOOKS.md adds five new step failure-mode
entries to the "Showcase page pipeline fails at step X" section.
…wledge-lifecycle

feat(api,ui): PRP-40 showcase planning + knowledge lifecycle (#315)
PRP-41 drift items D1-D7 + fold in issue #311:

- D1-D4: KPI strip counters now read the actual emitted step.data keys
  (completed_items, total_chunks, total_aliases, scenario_id +
  winner_scenario_id) — never invented names.
- D5: GET /ops/model-health takes only ?limit (no grain param).
- D6: ops_snapshot derives stale_aliases_count + total_aliases from
  OpsSummaryResponse.aliases (no flat keys on the response).
- D7: replaced stale line citations with symbol refs so the PRP does not
  re-stale on the next pipeline.py expansion.
- Folded issue #311 (phase accordion lock after completion) into scope
  as polish item 7 + acceptance criterion D10 — load-bearing for the
  Inspect-Artifacts post-run UX.
- Noted issue #312 (phase2 enrichment idempotency) as a dogfood
  prerequisite (out of PRP-41 scope, must land before the manual
  checklist runs).

Docs-only change. No code touched.
docs(docs): refresh initial 41 after prp 39 40 (#313)
A second POST /seeder/phase2-enrichment against an already-enriched scope
no longer raises IntegrityError (previously: uq_exogenous_signal_per_store
surfaced as HTTP 500, blocking PRP-41's manual showcase_rich dogfood).

- exogenous_signal: pg_insert(...).on_conflict_do_nothing() — target-free
  ON CONFLICT covers both partial unique indexes (global + per-store)
- replenishment_event / sales_returns (no natural-key unique constraint):
  section-level existence check inside the seeded date window; skip the
  whole insert when rows already exist
- product lifecycle UPDATE: already idempotent under a fixed seed; left
  in place
- Phase2EnrichmentResponse gains additive records_skipped: dict[str, int]
- Defensive IntegrityError -> ConflictError(409) wrap as the belt-and-braces
  net (the idempotency guards above should make it unreachable)

Adds a non-destructive integration test (app/features/seeder/tests/
test_phase2_idempotency.py) that calls the endpoint twice against the
live Postgres and asserts records_skipped > 0 + records_created == 0 on
the second pass for all three insert tables.
w7-mgfcode and others added 11 commits May 26, 2026 17:05
CI's freshly-migrated DB has only stray data from earlier tests'
fixtures — too sparse for the ReplenishmentGenerator or ReturnsGenerator
to produce any records. The original assertion required
records_skipped > 0 for all three insert tables, which fails when
nothing was generated.

Relaxed: assert created == 0 on the second call for all three insert
tables (the canonical idempotency proof — no new rows). Sanity guard
soft-skips when none of the three tables exercised the idempotency
path (otherwise the test would be meaningless). The exogenous_signal
ON CONFLICT DO NOTHING path — the original IntegrityError surface — is
exercised by any DB with at least one Store + one date.
…idempotent

fix(data): make phase2 enrichment idempotent (#312)
…t-ops-polish

docs(docs): add prp 41 showcase agent ops polish
PRP-41 — fourth and FINAL slice of the /showcase upgrade epic
(PRP-38..41). Adds two new pipeline phases on scenario=showcase_rich
plus cross-cutting UI polish that closes issue #311.

Pipeline (backend / app/features/demo/pipeline.py)

- step_agent_hitl_flow: HITL approval round-trip on the experiment
  agent. Drives POST /agents/sessions + /chat + /approve via
  ASGITransport; surfaces an intermediate step_complete
  (status=running, awaiting_approval=true) for the FE to render the
  Approve button; absorbs 400 "No pending action" when the FE
  pre-empts; 90 s hard timeout falls back to skip so a hung agent
  never wedges the run.
- step_ops_snapshot: 3 GET calls to /ops/summary +
  /ops/retraining-candidates + /ops/model-health, derives a 5-key
  KPI payload (stale_aliases_count, retraining_candidates_count,
  total_runs, total_aliases, degrading_health_count). warn (never
  fail) on all-three-failed.
- _phase_table() — design Z: unified `agents` phase id for BOTH
  scenarios; SHOWCASE_RICH swaps step_agent for step_agent_hitl_flow
  and appends an ops phase carrying ops_snapshot before cleanup.
  SHOWCASE_RICH = 24 rows / 10 phases; DEMO_MINIMAL = 11 rows
  (unchanged shape under the new agents phase id).
- _Client.yield_event hook + run_pipeline event-sink drain. The
  orchestrator stamps step_index / total_steps / phase_index /
  phase_total / phase_name on every drained intermediate event.
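
The "hard timeout falls back to skip" behaviour of step_agent_hitl_flow can be sketched with `asyncio.wait_for`. This is a simplified illustration: `run_with_skip_on_timeout` and the result-dict shape are hypothetical, and the status strings are assumed rather than taken from the pipeline's event schema.

```python
import asyncio
from collections.abc import Awaitable
from typing import Any


async def run_with_skip_on_timeout(
    work: Awaitable[dict[str, Any]], timeout_s: float = 90.0
) -> dict[str, Any]:
    """Await `work`; if it exceeds the deadline, report a skip, not a failure."""
    try:
        data = await asyncio.wait_for(work, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the pending work, so a hung agent cannot wedge the run.
        return {"status": "skip", "reason": f"timed out after {timeout_s:g} s"}
    return {"status": "pass", "data": data}
```

Mapping the timeout to a skip rather than an error lets the remaining phases run, which matches the intent that a hung agent never blocks the pipeline.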

Frontend (UI)

- PHASE_DEFS.ts — design Z restructure: BOTH the legacy `agent` step
  and the new `agent_hitl_flow` live under the unified `agents`
  phase id; new DEMO_MINIMAL_ONLY_STEP_NAMES set complements
  SHOWCASE_RICH_STEP_NAMES so the filter selects the right step per
  scenario (lockstep test pins 24 tuples / 10 phases).
- DemoPhasePanel.tsx — adds onValueChange handler + local useState
  (closes issue #311 / D10): post-pipeline-complete the operator
  can finally expand any phase without snapping back to the
  fallback.
- demo-step-card.tsx — HitlFlowSummary chip-line + OpsSnapshotMiniGrid
  + one-click ApproveButton (only renders when status=running AND
  awaiting_approval=true).
- showcase.tsx — five new chrome additions:
  - ShowcaseKpiStrip — 5-tile KPI strip above the controls card.
  - RunHistoryStrip — localStorage FIFO 5 with Replay button.
  - Stop button (visible mid-run) — closes the WS so the backend's
    WebSocketDisconnect releases the pipeline lock.
  - InspectArtifactsPanel — 10 deep-link cards rendered after
    pipeline_complete.
  - resolveInspectHref switch extended with agent_hitl_flow → CHAT,
    ops_snapshot → OPS.
- use-demo-pipeline.ts — stop() callback exposed via
  UseDemoPipelineResult; DemoSummary.v2RunId added (mapped from
  pipeline_complete event.data.v2_run_id).

Docs

- docs/user-guide/showcase-walkthrough.md — drops 7 "planned"
  markers across PRP-38/39/40/41 phases; adds concrete prose for
  Agents (HITL) + Ops snapshot + the 5 polish items + performance
  budgets table refresh + screenshot placeholders.
- docs/_base/RUNBOOKS.md — 5 new failure-mode entries (23-27):
  agent_hitl_flow no-key / timeout / no-trigger, ops_snapshot
  all-failed, Stop button mid-run.

Tests

- Backend: 9 new tests in test_pipeline.py (HITL: happy / no-key /
  session-fail / no-tool / 4xx-absorb / timeout + Ops: happy / warn /
  empty); lockstep test rewrite 23 → 24 tuples; 5 new canned-response
  fixtures for /ops/* endpoints.
- Frontend: 22 new vitest cases across 5 test files
  (DemoPhasePanel onValueChange, ShowcaseKpiStrip 5-tile derivation,
  InspectArtifactsPanel 10-card grid, RunHistoryStrip localStorage
  FIFO, demo-step-card HITL + Approve + Ops mini-grid).
- E2E: test_run_demo_showcase_rich_full_epic asserts PRP-41 contract
  shapes hold when the steps execute; tolerates a pre-existing
  PRP-39/40 cascade (scenario_simulate_and_save can fail to parse
  the safer_promote_flow placeholder artifact_uri) documented in
  RUNBOOKS.md entry 18.

Validation

- ruff + format clean; mypy + pyright strict (only pre-existing
  xgboost/lightgbm stub gaps remain — documented in PRP body).
- 1635 unit tests pass; 249 frontend tests pass.
- Vertical-slice guard empty: zero imports from agents/ops/registry/
  scenarios/rag in app/features/demo/.

Out of scope (explicit)

- No new backend endpoints, no new schemas, no Alembic migrations.
- No widening of agent_require_approval (save_scenario already
  listed; HITL step consumes it).
- No CRLF/LF line-ending normalisation bundled in.

Contract probe report: PRPs/ai_docs/prp-41-contract-probe-report.md
…lish

feat(api,ui): showcase pipeline agent ops final polish (PRP-41)
…tifact-uri

fix(api): repair showcase safer promote cascade
Contributor

@sourcery-ai sourcery-ai Bot left a comment


Sorry @w7-mgfcode, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai

coderabbitai Bot commented May 31, 2026

Important

Review skipped

Too many files!

This PR contains 152 files, which is 2 over the limit of 150.

To get a review, narrow the scope:
• coderabbit review --type committed # exclude uncommitted changes
• coderabbit review --dir # limit to a subdirectory
• coderabbit review --base # compare against a closer base

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7214510b-2938-4eb3-bb34-b0f8a2bf99f7

📥 Commits

Reviewing files that changed from the base of the PR and between 09e1efd and 6dd1708.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (152)
  • .env.example
  • .gitignore
  • PR-BODY-DRAFT.md
  • PRPs/INITIAL/INITIAL-forecast-intelligence-A-feature-frame-v2.md
  • PRPs/INITIAL/INITIAL-forecast-intelligence-B-model-zoo-backtesting.md
  • PRPs/INITIAL/INITIAL-forecast-intelligence-C-interactive-ui.md
  • PRPs/INITIAL/INITIAL-forecast-intelligence-index.md
  • PRPs/INITIAL/INITIAL-showcase-38-data-modeling-lifecycle.md
  • PRPs/INITIAL/INITIAL-showcase-39-decision-portfolio-lifecycle.md
  • PRPs/INITIAL/INITIAL-showcase-40-planning-knowledge-lifecycle.md
  • PRPs/INITIAL/INITIAL-showcase-41-agent-ops-polish.md
  • PRPs/INITIAL/INITIAL-showcase-rich-demo-control-center.md
  • PRPs/INITIAL/INITIAL-showcase-rich-demo-index.md
  • PRPs/PRP-35-forecast-intelligence-A-feature-frame-v2.md
  • PRPs/PRP-36-forecast-intelligence-B-model-zoo-backtesting.md
  • PRPs/PRP-37-forecast-intelligence-C-interactive-ui.md
  • PRPs/PRP-38-showcase-data-modeling-lifecycle.md
  • PRPs/PRP-39-showcase-decision-portfolio-lifecycle.md
  • PRPs/PRP-40-showcase-planning-knowledge-lifecycle.md
  • PRPs/PRP-41-showcase-agent-ops-polish.md
  • PRPs/ai_docs/prp-35-final-contract-snapshot.md
  • PRPs/ai_docs/prp-37-contract-probe-report.md
  • PRPs/ai_docs/prp-38-contract-probe-report.md
  • PRPs/ai_docs/prp-39-contract-probe-report.md
  • PRPs/ai_docs/prp-40-contract-probe-report.md
  • PRPs/ai_docs/prp-41-contract-probe-report.md
  • app/core/config.py
  • app/features/backtesting/metrics.py
  • app/features/backtesting/schemas.py
  • app/features/backtesting/service.py
  • app/features/backtesting/tests/test_metrics.py
  • app/features/backtesting/tests/test_service.py
  • app/features/demo/pipeline.py
  • app/features/demo/schemas.py
  • app/features/demo/tests/test_pipeline.py
  • app/features/demo/tests/test_schemas.py
  • app/features/explainability/explainers.py
  • app/features/explainability/schemas.py
  • app/features/explainability/service.py
  • app/features/explainability/tests/test_explainers.py
  • app/features/forecasting/feature_metadata.py
  • app/features/forecasting/models.py
  • app/features/forecasting/routes.py
  • app/features/forecasting/schemas.py
  • app/features/forecasting/service.py
  • app/features/forecasting/tests/test_feature_metadata.py
  • app/features/forecasting/tests/test_random_forest_forecaster.py
  • app/features/forecasting/tests/test_regression_features_v2_leakage.py
  • app/features/forecasting/tests/test_seasonal_average_forecaster.py
  • app/features/forecasting/tests/test_trend_regression_baseline_forecaster.py
  • app/features/forecasting/tests/test_weighted_moving_average_forecaster.py
  • app/features/forecasting/v2_loaders.py
  • app/features/ops/schemas.py
  • app/features/ops/service.py
  • app/features/ops/tests/test_service.py
  • app/features/rag/schemas.py
  • app/features/rag/service.py
  • app/features/rag/tests/test_service.py
  • app/features/registry/schemas.py
  • app/features/registry/service.py
  • app/features/registry/tests/test_schemas.py
  • app/features/registry/tests/test_service.py
  • app/features/scenarios/feature_frame.py
  • app/features/scenarios/service.py
  • app/features/scenarios/tests/test_future_frame_v2_leakage.py
  • app/features/seeder/routes.py
  • app/features/seeder/schemas.py
  • app/features/seeder/service.py
  • app/features/seeder/tests/test_phase2_idempotency.py
  • app/features/seeder/tests/test_routes.py
  • app/shared/feature_frames/__init__.py
  • app/shared/feature_frames/contract_v2.py
  • app/shared/feature_frames/rows_v2.py
  • app/shared/feature_frames/sidecar.py
  • app/shared/feature_frames/tests/test_contract_v2.py
  • app/shared/feature_frames/tests/test_leakage_v2.py
  • app/shared/seeder/config.py
  • app/shared/seeder/generators/facts.py
  • app/shared/seeder/tests/test_config.py
  • docs/_base/API_CONTRACTS.md
  • docs/_base/DOMAIN_MODEL.md
  • docs/_base/RUNBOOKS.md
  • docs/optional-features/05-advanced-ml-model-zoo.md
  • docs/optional-features/09-model-champion-challenger-governance.md
  • docs/optional-features/10-baseforecaster-feature-contract.md
  • docs/user-guide/advanced-forecasting-guide.md
  • docs/user-guide/dashboard-guide.md
  • docs/user-guide/showcase-manual-demo-guide.md
  • docs/user-guide/showcase-walkthrough.md
  • examples/forecasting/feature_frame_v2_preview.py
  • examples/forecasting/model_zoo_compare.py
  • frontend/src/components/charts/backtest-horizon-buckets-chart.test.tsx
  • frontend/src/components/charts/backtest-horizon-buckets-chart.tsx
  • frontend/src/components/demo/DemoPhasePanel.test.tsx
  • frontend/src/components/demo/DemoPhasePanel.tsx
  • frontend/src/components/demo/HorizonBucketsMini.test.tsx
  • frontend/src/components/demo/HorizonBucketsMini.tsx
  • frontend/src/components/demo/InspectArtifactsPanel.test.tsx
  • frontend/src/components/demo/InspectArtifactsPanel.tsx
  • frontend/src/components/demo/PHASE_DEFS.test.ts
  • frontend/src/components/demo/PHASE_DEFS.ts
  • frontend/src/components/demo/RunHistoryStrip.test.tsx
  • frontend/src/components/demo/RunHistoryStrip.tsx
  • frontend/src/components/demo/ScenarioPicker.test.tsx
  • frontend/src/components/demo/ScenarioPicker.tsx
  • frontend/src/components/demo/ShowcaseKpiStrip.test.tsx
  • frontend/src/components/demo/ShowcaseKpiStrip.tsx
  • frontend/src/components/demo/demo-step-card.test.tsx
  • frontend/src/components/demo/demo-step-card.tsx
  • frontend/src/components/forecast-intelligence/batch-matrix-picker.test.tsx
  • frontend/src/components/forecast-intelligence/batch-matrix-picker.tsx
  • frontend/src/components/forecast-intelligence/batch-preset-select.test.tsx
  • frontend/src/components/forecast-intelligence/batch-preset-select.tsx
  • frontend/src/components/forecast-intelligence/batch-preset-utils.ts
  • frontend/src/components/forecast-intelligence/champion-compatibility-badge.test.tsx
  • frontend/src/components/forecast-intelligence/champion-compatibility-badge.tsx
  • frontend/src/components/forecast-intelligence/champion-compatibility-utils.ts
  • frontend/src/components/forecast-intelligence/feature-frame-panel.test.tsx
  • frontend/src/components/forecast-intelligence/feature-frame-panel.tsx
  • frontend/src/components/forecast-intelligence/feature-frame-select.test.tsx
  • frontend/src/components/forecast-intelligence/feature-frame-select.tsx
  • frontend/src/components/forecast-intelligence/feature-groups-toggle.test.tsx
  • frontend/src/components/forecast-intelligence/feature-groups-toggle.tsx
  • frontend/src/components/forecast-intelligence/horizon-bucket-table.test.tsx
  • frontend/src/components/forecast-intelligence/horizon-bucket-table.tsx
  • frontend/src/components/forecast-intelligence/model-family-tabs.test.tsx
  • frontend/src/components/forecast-intelligence/model-family-tabs.tsx
  • frontend/src/components/forecast-intelligence/model-type-select.test.tsx
  • frontend/src/components/forecast-intelligence/model-type-select.tsx
  • frontend/src/components/forecast-intelligence/model-type-utils.ts
  • frontend/src/components/forecast-intelligence/promote-confirmation-dialog.test.tsx
  • frontend/src/components/forecast-intelligence/promote-confirmation-dialog.tsx
  • frontend/src/hooks/use-demo-pipeline.test.ts
  • frontend/src/hooks/use-demo-pipeline.ts
  • frontend/src/hooks/use-runs.ts
  • frontend/src/lib/feature-frame-utils.test.ts
  • frontend/src/lib/feature-frame-utils.ts
  • frontend/src/lib/horizon-bucket-utils.test.ts
  • frontend/src/lib/horizon-bucket-utils.ts
  • frontend/src/pages/explorer/run-compare.tsx
  • frontend/src/pages/explorer/run-detail.tsx
  • frontend/src/pages/ops.tsx
  • frontend/src/pages/showcase.tsx
  • frontend/src/pages/visualize/backtest.tsx
  • frontend/src/pages/visualize/batch.tsx
  • frontend/src/pages/visualize/forecast.tsx
  • frontend/src/pages/visualize/planner.tsx
  • frontend/src/types/api.ts
  • scripts/seed_historical_activity.py
  • scripts/seed_phase2_only.py
  • scripts/seed_registry_from_jobs.py
  • tests/test_e2e_demo.py

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@w7-mgfcode w7-mgfcode merged commit bae2df0 into main May 31, 2026
13 checks passed

Development

Successfully merging this pull request may close these issues.

fix(showcase): make safer promote artifact URI compatible with scenario replay

1 participant