Merged
2 changes: 1 addition & 1 deletion docs/00_overview/DASHBOARD.md
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ _Top-level index across MVP1 → GA v1+ as of **2026-06-04**. Click a release na
| Release | Theme | Progress | Status |
|---|---|---|---|
| [MVP1 / v0.1](MVP1_DASHBOARD.md) | The Loop | 94 / 94 scoped done | **Complete** |
| [MVP2 / v0.2](MVP2_DASHBOARD.md) | Three-Engine + Real Signals | 14 / 25 scoped done · 26 remaining | **In progress** |
| [MVP2 / v0.2](MVP2_DASHBOARD.md) | Three-Engine + Real Signals | 15 / 25 scoped done · 25 remaining | **In progress** |
| MVP3 / v0.3 | Observable | — | **Not yet scoped** |
| GA v1 / v1.0 | Production-ready | — | **Not yet scoped** |

88 changes: 45 additions & 43 deletions docs/00_overview/MVP2_DASHBOARD.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/00_overview/dashboard.html
Original file line number Diff line number Diff line change
@@ -392,7 +392,7 @@ <h2>Releases</h2>
<div class="roadmap-row">
<div class="release-name"><a href="mvp2_dashboard.html">MVP2 / v0.2</a></div>
<div class="theme">Three-Engine + Real Signals</div>
<div class="progress">14 / 25 scoped done · 26 remaining</div>
<div class="progress">15 / 25 scoped done · 25 remaining</div>
<span class="state-pill in_progress">In progress</span>
</div>

Original file line number Diff line number Diff line change
@@ -34,7 +34,7 @@
| Digest worker | [`backend/workers/digest.py`](../../../../backend/workers/digest.py) | Line 1289 reads `auto_followup_depth = study.config.get("auto_followup_depth")` and, if not None, enqueues `enqueue_followup_study` via Arq with deterministic `_job_id=f"enqueue_followup_study:{study_id}"`. The digest also persists `suggested_followups` as JSONB on the `digests` row before this dispatch. |
| Digest model | [`backend/app/db/models/digest.py`](../../../../backend/app/db/models/digest.py) | `Digest.suggested_followups: Mapped[list[dict[str, Any]]]` — NOT NULL, JSONB, server_default `'[]'::jsonb`. 1:1 with `studies` via UNIQUE FK on `study_id`. Consumers read via `parse_followup_list()` per spec D-defensive-ingest. |
| Study model | [`backend/app/db/models/study.py`](../../../../backend/app/db/models/study.py) | `studies.config: JSONB` carries `auto_followup_depth`. Self-FK `parent_study_id`. `parent_proposal_id` + `parent_proposal_followup_index` ([lines 86-97](../../../../backend/app/db/models/study.py#L86-L97)) are the lineage columns the manual "Run this followup" path uses — DB CHECK `studies_parent_proposal_pair_check` requires both-set-or-both-NULL. |
| Proposal model | [`backend/app/db/models/proposal.py`](../../../../backend/app/db/models/proposal.py) | `Proposal.status` CHECK constraint — `status IN ('pending', 'pr_opened', 'pr_merged', 'rejected')` ([line 42](../../../../backend/app/db/models/proposal.py#L42)). **No `superseded` value today** — adding one requires a migration (deferred to Phase 3 `phase3_idea.md`). |
| Proposal model | [`backend/app/db/models/proposal.py`](../../../../backend/app/db/models/proposal.py) | `Proposal.status` CHECK constraint — `status IN ('pending', 'pr_opened', 'pr_merged', 'rejected')` ([line 42](../../../../backend/app/db/models/proposal.py#L42)). **No `superseded` value today** — adding one requires a migration (deferred to Phase 3 `feat_overnight_final_solution_phase3/idea.md`). |
| Chain endpoint | [`backend/app/api/v1/studies.py:856-867`](../../../../backend/app/api/v1/studies.py#L856-L867) + Pydantic `StudyChainLink` at [`schemas.py:867-885`](../../../../backend/app/api/v1/schemas.py#L867-L885) | `GET /api/v1/studies/{id}/chain` returns `links: list[StudyChainLink]` with the rolled-up `best_link_id` + `cumulative_lift` + `stop_reason` + `proposal_id_for_best_link`. The `StudyChainLink` shape is explicitly extensible (the convergence-indicator spec FR-7 added `convergence_verdict` as a soft-contract additive field — see [`convergence.py:77-89`](../../../../backend/app/domain/study/convergence.py#L77-L89)). |
| Chain panel | [`ui/src/components/studies/auto-followup-chain-panel.tsx`](../../../../../ui/src/components/studies/auto-followup-chain-panel.tsx) | Calls `useStudyChain(studyId)` and renders the ordered link list + cumulative-lift + stop-reason + best-config CTA per feat_overnight_autopilot FR-4. Reusing this panel — no replacement. |
| Wizard depth selector | [`ui/src/components/studies/create-study-modal.tsx:1460-1468`](../../../../../ui/src/components/studies/create-study-modal.tsx#L1460-L1468) | The `🌙 Run overnight (compound automatically)` label + `cs-auto-followup` testid + `InfoTooltip glossaryKey="overnight_autopilot"` + `Select` writing `auto_followup_depth: 0..5` into `config`. This feature ADDS a strategy toggle immediately below it. |
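
The digest-worker row above describes a dispatch keyed on a deterministic job id. A minimal sketch of that idea follows, under stated assumptions: `InMemoryQueue` and `dispatch_followup` are hypothetical names used only for illustration, the queue is a plain in-memory stand-in rather than Arq, and the real worker code in `backend/workers/digest.py` may differ.

```python
from typing import Any, Mapping, Optional


def followup_job_id(study_id: str) -> str:
    # Job id format quoted in the digest-worker row of the table above.
    return f"enqueue_followup_study:{study_id}"


class InMemoryQueue:
    """Hypothetical stand-in for a job queue that ignores duplicate job ids."""

    def __init__(self) -> None:
        self.jobs: dict[str, tuple[str, tuple[Any, ...]]] = {}

    def enqueue(self, function_name: str, *args: Any, job_id: str) -> Optional[str]:
        # A second enqueue with the same id is a no-op and returns None.
        if job_id in self.jobs:
            return None
        self.jobs[job_id] = (function_name, args)
        return job_id


def dispatch_followup(
    config: Mapping[str, Any], study_id: str, queue: InMemoryQueue
) -> Optional[str]:
    # Only a missing/None depth skips the dispatch; a depth of 0 still enqueues,
    # matching the "if not None" wording in the table row.
    depth = config.get("auto_followup_depth")
    if depth is None:
        return None
    return queue.enqueue(
        "enqueue_followup_study", study_id, job_id=followup_job_id(study_id)
    )
```

Because the id depends only on `study_id`, a retried digest run for the same study cannot produce a second follow-up job in this sketch.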
@@ -87,8 +87,8 @@ No URL changes. The chain panel and the wizard mount at their existing positions
### Out of scope

- Any change to `evaluate_chain_gate`, the budget peek, the depth decrement, the cancel cascade, or the layer-1/layer-2 idempotency contract. The strategy dispatch happens AFTER all of these.
- A `superseded` value on `proposals.status` (Phase 3 → `phase3_idea.md`). MVP2 leans on the existing `/chain` endpoint's `best_link_id` + `proposal_id_for_best_link` to give the operator a single morning artifact; marking non-winning links' proposals `superseded` is a separate UX decision + migration that's not required for the core "explore + roll up" capability.
- A standalone morning summary card on the `/studies` list (Phase 2 → `phase2_idea.md`, coordinates with the existing `feat_overnight_studies_summary_card` sibling idea).
- A `superseded` value on `proposals.status` (Phase 3 → `feat_overnight_final_solution_phase3/idea.md`). MVP2 leans on the existing `/chain` endpoint's `best_link_id` + `proposal_id_for_best_link` to give the operator a single morning artifact; marking non-winning links' proposals `superseded` is a separate UX decision + migration that's not required for the core "explore + roll up" capability.
- A standalone morning summary card on the `/studies` list (Phase 2 → `feat_overnight_final_solution_phase2/idea.md`, coordinates with the existing `feat_overnight_studies_summary_card` sibling idea).
Comment on lines +90 to +91
medium

The references to feat_overnight_final_solution_phase3/idea.md and feat_overnight_final_solution_phase2/idea.md are plain backticked strings with incorrect relative paths. Since this file was moved to implemented_features/, these should be updated to clickable relative links with the correct paths:

  • For Phase 3: [Phase 3](../../planned_features/02_mvp2/feat_overnight_final_solution_phase3/idea.md)
  • For Phase 2: [Phase 2](../../planned_features/02_mvp2/feat_overnight_final_solution_phase2/idea.md)

Please update these lines to:

- A `superseded` value on `proposals.status` ([Phase 3](../../planned_features/02_mvp2/feat_overnight_final_solution_phase3/idea.md)). MVP2 leans on the existing `/chain` endpoint's `best_link_id` + `proposal_id_for_best_link` to give the operator a single morning artifact; marking non-winning links' proposals `superseded` is a separate UX decision + migration that's not required for the core "explore + roll up" capability.
- A standalone morning summary card on the `/studies` list ([Phase 2](../../planned_features/02_mvp2/feat_overnight_final_solution_phase2/idea.md), coordinates with the existing `feat_overnight_studies_summary_card` sibling idea).

- A new follow-up kind, a change to the digest LLM prompt, or a change to the digest's structured-output schema.
- Multi-child fan-out per parent. The shipped engine's linear-chain invariant (D-7 of `feat_overnight_autopilot`) holds — strategy selection picks ONE follow-up per link.
- Operator-pickable mid-chain strategy switching. Strategy is set at study create and inherited verbatim by descendants.
@@ -106,8 +106,8 @@ No URL changes. The chain panel and the wizard mount at their existing positions
### Phase boundaries

- **Phase 1 (this spec, MVP2):** FR-1 through FR-9 — the strategy wire contract, the wizard toggle, the worker dispatch, the cycle guard, the chain endpoint additive field, the panel badge, telemetry, tutorial, glossary key. Ships the autonomous cross-knob/cross-template exploration capability behind an opt-in toggle.
- **Phase 2 (deferred to [`phase2_idea.md`](phase2_idea.md)):** Dedicated morning summary card surfacing the rolled-up winner + the explored path + total lift, separate from the chain panel. Coordinates with [`feat_overnight_studies_summary_card`](../feat_overnight_studies_summary_card/idea.md). Rationale for deferral: the existing `/chain` endpoint already exposes the data needed; a polished morning card is a UX add-on that should follow rather than block the capability.
- **Phase 3 (deferred to [`phase3_idea.md`](phase3_idea.md)):** Proposal `superseded` status value + state-transition logic that marks non-winning chain links' proposals `superseded` so the morning artifact is unambiguously *one* answer. Rationale for deferral: requires a migration that reopens shipped schema (CHECK constraint on `proposals.status`) and a UX decision on whether superseded proposals appear in the `/proposals` index at all. Phase 1 delivers cross-knob exploration; Phase 3 polishes the rollup. Build it when an incident or design partner asks for the cleaner index.
- **Phase 2 (deferred to [`feat_overnight_final_solution_phase2/idea.md`](../../planned_features/02_mvp2/feat_overnight_final_solution_phase2/idea.md)):** Dedicated morning summary card surfacing the rolled-up winner + the explored path + total lift, separate from the chain panel. Coordinates with [`feat_overnight_studies_summary_card`](../feat_overnight_studies_summary_card/idea.md). Rationale for deferral: the existing `/chain` endpoint already exposes the data needed; a polished morning card is a UX add-on that should follow rather than block the capability.
medium

The relative link to feat_overnight_studies_summary_card is broken:
[feat_overnight_studies_summary_card](../feat_overnight_studies_summary_card/idea.md)

Since this file was moved to implemented_features/2026_06_04_feat_overnight_final_solution/, the relative path needs to go up two levels to reach docs/00_overview/ and then down to planned_features/02_mvp2/feat_overnight_studies_summary_card/feature_spec.md.

Please update the link to:
[feat_overnight_studies_summary_card](../../planned_features/02_mvp2/feat_overnight_studies_summary_card/feature_spec.md)
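
To see why the old link breaks after the move, a relative markdown link can be resolved against the directory of the file that contains it. The sketch below assumes the spec now lives at `docs/00_overview/implemented_features/2026_06_04_feat_overnight_final_solution/feature_spec.md` (inferred from this comment, not confirmed against the repository), and `resolve_markdown_link` is a hypothetical helper. It only normalizes paths; it does not check that the target exists.

```python
import posixpath


def resolve_markdown_link(source_file: str, link: str) -> str:
    """Resolve a relative markdown link against the file that contains it."""
    target = link.split("#", 1)[0]  # drop any #fragment before resolving
    base_dir = posixpath.dirname(source_file)
    return posixpath.normpath(posixpath.join(base_dir, target))


# Assumed post-move location of the spec (see the lead-in above).
SPEC = (
    "docs/00_overview/implemented_features/"
    "2026_06_04_feat_overnight_final_solution/feature_spec.md"
)

old_target = resolve_markdown_link(
    SPEC, "../feat_overnight_studies_summary_card/idea.md"
)
new_target = resolve_markdown_link(
    SPEC,
    "../../planned_features/02_mvp2/feat_overnight_studies_summary_card/feature_spec.md",
)

print(old_target)
print(new_target)
```

Under that assumption, the old link lands inside `implemented_features/`, while the suggested link lands under `planned_features/02_mvp2/`.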

- **Phase 3 (deferred to [`feat_overnight_final_solution_phase3/idea.md`](../../planned_features/02_mvp2/feat_overnight_final_solution_phase3/idea.md)):** Proposal `superseded` status value + state-transition logic that marks non-winning chain links' proposals `superseded` so the morning artifact is unambiguously *one* answer. Rationale for deferral: requires a migration that reopens shipped schema (CHECK constraint on `proposals.status`) and a UX decision on whether superseded proposals appear in the `/proposals` index at all. Phase 1 delivers cross-knob exploration; Phase 3 polishes the rollup. Build it when an incident or design partner asks for the cleaner index.

---

@@ -678,15 +678,15 @@ If the chain produces an unexpected swap_template result the operator wants to a
- [ ] Coverage gate ≥ 80% holds.
- [ ] Rollout gates from §16 satisfied (no schema change, no migration, no flag).
- [ ] `docs/01_architecture/api-conventions.md` + `data-model.md` + `ui-architecture.md` + `tutorial-first-study.md` updated.
- [ ] Phase 2 + Phase 3 deferred-work tracking files (`phase2_idea.md`, `phase3_idea.md`) exist alongside this spec.
- [ ] Phase 2 + Phase 3 deferred-work tracking files exist as their own planned_features folders (`feat_overnight_final_solution_phase2/`, `feat_overnight_final_solution_phase3/`).
- [ ] No open questions remain in §19.

## 19) Open questions and decision log

### Open questions

- **OQ-1 (resolved at GPT-5.5 cycle 1, finding C1-B1)** — How does the chain-panel badge resolve the "short template name" for a `swap_template` link's display? **Resolved as D-11**: per-link `GET /api/v1/query-templates/{id}` fetch from the frontend (FR-7 updated). Rationale: at most 0–5 extra small fetches per chain, already TanStack-Query-cached client-side, keeps `/chain`'s response shape stable.
- **OQ-2 (resolved at GPT-5.5 cycle 2, finding C2-B3)** — Should the strategy toggle ALSO show as a read-only line on the study detail page? **Resolved as D-15**: deferred to Phase 2 (`phase2_idea.md`). The chain-panel badges per link (FR-7) already surface the strategy a chain link followed; an extra detail-page line would be a redundant secondary surface. If operator feedback during MVP2 says the chain panel is too far down the page to spot quickly, Phase 2 picks it up as part of the morning summary card scope.
- **OQ-2 (resolved at GPT-5.5 cycle 2, finding C2-B3)** — Should the strategy toggle ALSO show as a read-only line on the study detail page? **Resolved as D-15**: deferred to Phase 2 (`feat_overnight_final_solution_phase2/idea.md`). The chain-panel badges per link (FR-7) already surface the strategy a chain link followed; an extra detail-page line would be a redundant secondary surface. If operator feedback during MVP2 says the chain panel is too far down the page to spot quickly, Phase 2 picks it up as part of the morning summary card scope.

_No open questions remain — §18's "no open questions" gate is satisfied._

@@ -709,4 +709,4 @@ _No open questions remain — §18's "no open questions" gate is satisfied._
- Rationale: a single, unambiguous contract everywhere (FR-3, FR-5, FR-6, AC-3, AC-6, AC-9, AC-12, AC-18 all reconcile). The earlier draft had FR-3 and FR-5/AC-12 contradicting each other on the legacy-path persistence — D-12 resolves in favor of the clean-legacy contract.
- **D-13 (2026-06-03, GPT-5.5 cycle 1 finding C1-A3 accept)** — `auto_followup_strategy` field type is `str | None` (NOT `Literal[...]`). The pair-and-value check happens in the `_validate_auto_followup_strategy` model_validator with the message-prefix path so the canonical `AUTO_FOLLOWUP_STRATEGY_INVALID` error code reaches the response envelope. Mirrors the existing `_validate_auto_followup_depth` pattern — a Pydantic `Literal[...]` at field-level would surface generic `VALIDATION_ERROR` for unknown values, violating §8.6's error-code contract.
- **D-14 (2026-06-03, GPT-5.5 cycle 1 finding C1-A4 accept)** — The wizard does NOT write `auto_followup_visited_template_ids`. The worker is the sole writer. The anchor's missing key is treated as `[anchor.template_id]` by the worker. The create-study contract test asserts a wizard-submitted `auto_followup_visited_template_ids` is 422-rejected. Rationale: single-writer rule eliminates the "two writers must agree on the seed value" coordination surface.
- **D-15 (2026-06-03, GPT-5.5 cycle 2 finding C2-B3 accept)** — Strategy read-only line on the study detail page (OQ-2) is deferred to Phase 2 (`phase2_idea.md`). The FR-7 per-link chain-panel badges are sufficient for MVP2; an extra detail-page line is redundant and would crowd the existing detail-page layout. Phase 2 picks it up if operator feedback says the chain panel is too far down to spot quickly during morning review.
- **D-15 (2026-06-03, GPT-5.5 cycle 2 finding C2-B3 accept)** — Strategy read-only line on the study detail page (OQ-2) is deferred to Phase 2 (`feat_overnight_final_solution_phase2/idea.md`). The FR-7 per-link chain-panel badges are sufficient for MVP2; an extra detail-page line is redundant and would crowd the existing detail-page layout. Phase 2 picks it up if operator feedback says the chain panel is too far down to spot quickly during morning review.
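
The single-writer rule in D-14 can be illustrated with a short sketch. It is written under assumptions: `WizardPayloadRejected`, `check_create_payload`, and `resolve_visited_template_ids` are hypothetical names, the exception merely stands in for the 422 response, and the shipped validator and worker may be structured differently.

```python
from typing import Any, Mapping

VISITED_KEY = "auto_followup_visited_template_ids"


class WizardPayloadRejected(ValueError):
    """Hypothetical stand-in for the 422 rejection described in D-14."""


def check_create_payload(config: Mapping[str, Any]) -> None:
    # The worker is the sole writer, so a create-study payload that carries
    # the key is rejected outright.
    if VISITED_KEY in config:
        raise WizardPayloadRejected(f"{VISITED_KEY} is worker-owned")


def resolve_visited_template_ids(
    config: Mapping[str, Any], anchor_template_id: str
) -> list[str]:
    # On the anchor study the key is absent; the worker treats that as a
    # visited list seeded with the anchor's own template id.
    if VISITED_KEY not in config:
        return [anchor_template_id]
    return list(config[VISITED_KEY])
```

With only one writer, the seed value is derived in a single place instead of being agreed on between the wizard and the worker.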
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Implementation Plan — Overnight → final solution (autonomous cross-knob tuning)

**Date:** 2026-06-03
**Status:** Ready for Execution
**Status:** Complete (PR #440, squash-merged `1e9522a0` 2026-06-04)
**Primary spec:** [`feature_spec.md`](feature_spec.md)
**Policy source(s):** [`CLAUDE.md`](../../../../CLAUDE.md) (Absolute Rules), [`docs/01_architecture/api-conventions.md`](../../../../01_architecture/api-conventions.md)

@@ -30,7 +30,7 @@
| FR-9 (glossary key) | Epic 1 / Story 1.2 | `overnight_strategy` glossary key ships with the wizard toggle |
| FR-9 (tutorial + runbook) | Epic 4 / Story 4.1 | Tutorial Step 12 sub-section + autopilot runbook event section |

All spec FRs covered. No deferred FRs in Phase 1 (Phase 2 + Phase 3 tracked in `phase2_idea.md` + `phase3_idea.md`).
All spec FRs covered. No deferred FRs in Phase 1 (Phase 2 + Phase 3 tracked in `feat_overnight_final_solution_phase2/idea.md` + `feat_overnight_final_solution_phase3/idea.md`).

## 2) Delivery structure

Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# Pipeline Status — feat_overnight_final_solution

**Release:** mvp2

## Idea
- Status: Complete
- File: idea.md

## Spec
- Status: Approved
- Date: 2026-06-03
- File: feature_spec.md
- Cross-model review: GPT-5.5 passed (2 cycles to convergence; 0 High-severity findings at cycle 2)
- Cycle 1: 11 findings (6 High, 5 Medium, 0 Low) — all 11 accepted and applied
- Cycle 2: 6 findings (0 High, 5 Medium, 1 Low) — all 6 accepted and applied (internal-consistency cleanups from cycle 1 edits)
- Phases: 3 total (Phase 1 covered by this spec; Phase 2 + Phase 3 deferred with `feat_overnight_final_solution_phase2/idea.md` + `feat_overnight_final_solution_phase3/idea.md`)

## Plan
- Status: Approved
- Date: 2026-06-03
- File: implementation_plan.md
- Cross-model review: GPT-5.5 passed (2 cycles; cycle 1: 10 findings (5 High, 5 Medium) all accepted+applied; cycle 2: 0 findings — converged)
- Stories: 7 across 4 epics (Epic 1 schema+wizard, Epic 2 worker dispatch, Epic 3 chain surface, Epic 4 docs)
- Phases covered: Phase 1 (Phase 2 + 3 split out to their own planned_features folders `feat_overnight_final_solution_phase2/` + `feat_overnight_final_solution_phase3/` at finalization)

## Implementation
- Status: Complete
- Date: 2026-06-04
- PR: #440 (squash-merged `1e9522a0`)
- CI: green (all 17 `pr.yml` checks)
- Stories: 7/7 complete across 4 epics
- Cross-model review: Gemini 1 finding (rejected — hunk-isolated `child_id` false positive); GPT-5.5 final review 3 findings (0 High; 2 Medium + 1 Low all accepted + applied in `ac2fdc8a`)
- Tests: 17 domain unit + 10 worker integration + 11 contract + 4 schema unit + 6 wizard vitest + 2 chain-panel vitest + 4 enum source-of-truth + 1 glossary value-lock
- Deferred: Phase 2 (`feat_overnight_final_solution_phase2/idea.md` — morning summary card) + Phase 3 (`feat_overnight_final_solution_phase3/idea.md` — proposal `superseded` status) remain tracked; tangential `chore_e2e_overnight_strategy_radix_select_timing` + adjacent `feat_proposal_full_param_space_view` ideas filed