feat(tui): v2 PR 5 — full screen-tree redesign + IA + render-cost teardown (stacks on #693)#696
feat(tui): v2 PR 5 — full screen-tree redesign + IA + render-cost teardown (stacks on #693)#696kelsonpw wants to merge 42 commits into
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Primary tasks render glyph twice (double glyph)
- Removed the
StateBadgecomponent (which rendered its own glyph) from the primary tasks section and inlined just the label text, matching the pattern used by background tasks to render the glyph exactly once per row.
- Removed the
Or push these changes by commenting:
@cursor push 97259f1ad4
Preview (97259f1ad4)
diff --git a/src/ui/tui/screens/StatusOverlayScreen.tsx b/src/ui/tui/screens/StatusOverlayScreen.tsx
--- a/src/ui/tui/screens/StatusOverlayScreen.tsx
+++ b/src/ui/tui/screens/StatusOverlayScreen.tsx
@@ -56,24 +56,13 @@
import { ChoiceStatus } from '../../../lib/orchestration/checkpoints/choices.js';
import { VerificationStatus } from '../../../lib/orchestration/checkpoints/verifications.js';
import { resolveMode } from '../utils/mode-badge.js';
-import {
- lifecycleDisplay,
- type LifecycleDisplay,
-} from '../utils/lifecycle-display.js';
+import { lifecycleDisplay } from '../utils/lifecycle-display.js';
import { TaskLifecycle } from '../../../lib/orchestration/lifecycle.js';
interface StatusOverlayScreenProps {
store: WizardStore;
}
-/** Compact "glyph + label" badge used in every section. */
-const StateBadge = ({ display }: { display: LifecycleDisplay }) => (
- <Text color={display.color} bold>
- {display.glyph}{' '}
- <Text color={display.color}>{display.label}</Text>
- </Text>
-);
-
/**
* Section header — bold, secondary color, with a count badge.
* Pulled out so the operator overview's many sections share the same
@@ -262,7 +251,7 @@
{display.glyph}{' '}
</Text>
<Text color={Colors.body}>
- <StateBadge display={display} /> — {t.label}
+ <Text color={display.color}>{display.label}</Text> — {t.label}
</Text>
</Box>
);You can send follow-ups to the cloud agent here.
19ef9dd to
b4d4e59
Compare
ff44bb5 to
db7a96c
Compare
2fb5920 to
5f4273d
Compare
bd3bee6 to
ca03480
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed:
shallowObjectEqualfails for objects with different keys- Added
Object.prototype.hasOwnProperty.call(b, k)check before the value comparison so objects with the same key count but different key names are correctly detected as unequal.
- Added
Or push these changes by commenting:
@cursor push 931ca6d84a
Preview (931ca6d84a)
diff --git a/src/ui/tui/hooks/useWizardSelector.ts b/src/ui/tui/hooks/useWizardSelector.ts
--- a/src/ui/tui/hooks/useWizardSelector.ts
+++ b/src/ui/tui/hooks/useWizardSelector.ts
@@ -107,7 +107,8 @@
const bk = Object.keys(b);
if (ak.length !== bk.length) return false;
for (const k of ak) {
- if (!Object.is(a[k], b[k])) return false;
+ if (!Object.prototype.hasOwnProperty.call(b, k) || !Object.is(a[k], b[k]))
+ return false;
}
return true;
}You can send follow-ups to the cloud agent here.
…e task label (#695) PR #688 made the inline status pill flush with the content area, sitting directly under the Tasks list. Tier 6 of the resolver returns the in-progress canonical task's `activeForm` — which is the SAME string ProgressList already renders for that row above the pill. The result was a visible duplicate: the Tasks list showed `› Detecting project setup` and the pill below showed `◇ Detecting project setup`. Fix: in `resolveRunStatusPill`, suppress tier 6 by returning `undefined` whenever any canonical task is in_progress. Higher-priority tiers (file writes, tool activity, event-plan-await, currentActivity, post-agent steps) keep firing because they carry signal the Tasks list does NOT show. Tier 7 (`pushStatus` cold-start fallback) is also skipped while a canonical task is in_progress to prevent stale narration leaking in once tier 6 stops covering for it. Tests: pinned the new contract in `run-status-pill.test.ts` (suppression, no over-suppression of tiers 1-5, the screenshot scenario, and the "no tier 7 leak" guard), updated `RunScreen.statusPill.test.tsx` and `RunScreen.spacing.test.tsx` to match.
a45ea7d to
4551d44
Compare
* perf(build): bundle wizard via tsup for faster cold-start
Replaces the per-file `tsc` JS emit with a single tsup-driven bundle so
cold-start parses one file (`dist/bin.js`) instead of resolving and
loading 343 individual modules from `dist/src/`. Type declarations are
still emitted via a separate `tsc --emitDeclarationOnly` pass so the
package's `.d.ts` surface is unchanged.
Lazy-loads the heaviest externals on the cold-start path:
- axios in src/utils/urls.ts (only used in detectRegionFromToken)
- axios in src/utils/oauth.ts (only used in OAuth exchanges)
- axios + apiClient in src/lib/api.ts (cached promise, first GraphQL
call pays the import cost)
- fast-glob in src/utils/environment.ts (only used in
detectEnvVarPrefix during framework detection)
Profile-instrumented numbers: cumulative require time drops from
~1.5 s / 1034 calls to ~0.25 s / 626 calls. Wall-clock --version
median drops 260ms -> 250ms on a fast Mac (Node startup is the
fixed-cost floor); savings are larger on slower hardware where IO
dominates.
Smoke tests cover --version, status --json, and mcp serve against
the bundled artifact so regressions in the publish path are caught
in vitest.
Build is deterministic — two consecutive `pnpm build:bundle` runs
produce byte-identical bin.js and bin.js.map.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(bugbot): clear cached lazy-import promises on rejection
The new lazy-load patterns for `axiosModulePromise`/`apiClientPromise`
(in `lib/api.ts`), `axiosPromise` (in `utils/urls.ts`), and `fgPromise`
(in `utils/environment.ts`) cached the dynamic-import promise via `??=`
but never cleared the cache on rejection. A transient `import()` failure
(broken install, partial filesystem, transient I/O) would poison every
subsequent caller in the process with the same stale rejection forever.
Switch to the same null-on-catch pattern already used by
`loadDefaultDriver` in `agent-driver.ts`: store the cached promise, wire
a `.catch()` that nulls the cache and re-throws so callers still see the
original error, and let the next call retry the import cleanly.
* test(bugbot): pin lazy-import rejection-clearing contract on detectRegionFromToken
Add a regression test that mocks axios's `.default` getter to throw on
the first call and asserts `detectRegionFromToken` re-attempts the
import after a working axios is doMock'd in. Without the
rejection-clearing branch in `loadAxios`, the second call would replay
the cached rejection instead of returning a region.
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4551d44 to
8a800c0
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: StatusOverlayScreen shows mode badge in interactive mode
- Added a
showBadgeguard (mode.key !== 'interactive') to conditionally render the mode badge in StatusOverlayScreen, matching the existing suppression logic in HeaderBar.
- Added a
Or push these changes by commenting:
@cursor push a111fe2fd9
Preview (a111fe2fd9)
diff --git a/src/ui/tui/screens/StatusOverlayScreen.tsx b/src/ui/tui/screens/StatusOverlayScreen.tsx
--- a/src/ui/tui/screens/StatusOverlayScreen.tsx
+++ b/src/ui/tui/screens/StatusOverlayScreen.tsx
@@ -157,6 +157,7 @@
const lsp = data.status.lastStoppingPoint;
const mode = resolveMode();
+ const showBadge = mode.key !== 'interactive';
// Split active tasks into "primary" (running/waiting/blocked) and
// "background" (everything else among the active set — supervisor's
@@ -205,10 +206,14 @@
<Text bold color={Colors.accent}>
{Icons.diamond} Operator overview
</Text>
- <Text color={Colors.subtle}> {Icons.dot} </Text>
- <Text color={mode.color} bold>
- [{mode.label}]
- </Text>
+ {showBadge && (
+ <>
+ <Text color={Colors.subtle}> {Icons.dot} </Text>
+ <Text color={mode.color} bold>
+ [{mode.label}]
+ </Text>
+ </>
+ )}
</Box>
<Text color={Colors.body}>{summary}</Text>
<Text color={Colors.muted}>You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit 8a800c0. Configure here.
…d the revised plan renders verbatim (#701) The user reported typing "lowercase the event names" on the event-plan approval screen — but the revised plan came back with the same TitleCase names. Two root causes: 1. `confirm_event_plan` always force-Title-Cased every name via `normalizeEventName`. When the LLM dutifully revised with "user signed up", the wizard slammed it back to "User Signed Up" before the user ever saw the revised plan. The same Title-Cased names were then persisted to `.amplitude/events.json` and shipped into the eventual `track()` calls — exactly the display-vs-implementation drift the user noticed ("when they get implemented they don't show the same as this"). 2. After Enter, EventPlanFullScreen vanished and RunScreen rendered as if nothing had happened. No "Revising plan with your feedback…" beat, so the user concluded their note was dropped on the floor. Fix: - `normalizeEventName` is now non-destructive on already-multi-word inputs (lowercase, UPPERCASE, Sentence case all pass through). It still repairs schema-violating shapes (snake_case, kebab-case, camelCase, dotted, single-token). - Tool schema description and `confirm-event-plan-contract.md` now say "default to Title Case, but honor explicit user feedback that asks for a different casing convention". - New `revisingEventPlan` flag on `WizardStore`, set by `resolveEventPlan({decision:'revised'})` and cleared by the next `promptEventPlan`. New tier 3b in `resolveRunStatusPill` surfaces "Revising plan with your feedback…" until the LLM lands the next prompt. Tests: - `normalizeEventName` — preserves intentional lowercase/UPPERCASE/Sentence case on already-multi-word inputs; still repairs snake/kebab/camel/dotted. - `confirm_event_plan feedback round-trip` — feedback string returned verbatim to the agent ("feedback: lowercase the event names"); LLM payloads pass through to `promptEventPlan` and to `events.json` without re-casing. - `run-status-pill` tier 3b — "Revising plan with your feedback…" lights up after revised, clears on next prompt, doesn't fire on approve/skip. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…assification of mid-stream 400s (#705) The `[legacy] DEBUG Agent result with error: API Error: 400 ...` log line was dumping the full failing SSE response body — hundreds of `event: message_start` / `data: {"type":"content_block_delta",...}` framing lines plus `partial_json` `tool_use` deltas — into the user-visible TUI Logs tab when the Anthropic gateway terminated a streaming response with a 4xx (most commonly `400 terminated` mid-stream; Sentry #7442894144). Two leaks: the SDK Result-message branch in `agent-interface.ts` called `logToFile('Agent result with error:', message.result)` with no truncation, AND `agent-runner.ts`'s `abortOnApiError` / `GATEWAY_DOWN` / soft-error branches interpolated the raw `rawMessage` straight into user-facing copy and Sentry context. Both surfaced the SSE body verbatim — past sessions surfaced 50KB+ `log.message` strings polluting orchestrator context. Add `suppressSseFrames(message)` and `sanitizeErrorMessageForLog` helpers to `agent-events.ts` that: - detect runs of Anthropic SSE protocol frames (event:/data:/bare-JSON forms for the eight known stream-event subtypes) - collapse each run into a single `[N SSE frames suppressed]` marker - cap the result at `MAX_LOG_MESSAGE_LENGTH` (existing 2KB budget) - preserve any non-frame content (real errors / stack traces riding alongside the protocol noise survive — same defense as the existing `stripStreamEventNoise` / `partitionHookBridgeRace` pair) Apply the sanitizer at every callsite that logs / interpolates an agent error string: the two `logToFile('Agent result with error:', message.result)` paths in `agent-interface.ts`, and the GATEWAY_DOWN / GATEWAY_INVALID_REQUEST / API_ERROR / RATE_LIMIT branches plus the soft-error pushStatus path in `agent-runner.ts`. Classification still runs against the raw form (the `400 terminated` regex matches the head of the message, not the SSE body) — only logging / user-surface paths take the sanitized form. Tests: `agent-events-sse-suppression.test.ts` (10 cases — fast-path no-op, contiguous-block collapse, inline-prefix split, real-error preservation, singular vs plural wording, bare-JSON form, unknown event-type passthrough, oversized-input pipeline) covers the matcher end-to-end and the truncation cap. Verdict: pre-existing leak in `agent-interface.ts:4220` going back to the original SDK Result handler; unrelated to #698 (which only adds TUI per-event status; doesn't touch error logging). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nv-selection instead of silent defer (#703) * fix(self-heal): unstall second run after .amplitude/ wipe — surface env-selection instead of silent defer After `git reset --hard` wipes `<installDir>/.amplitude/project-binding.json`, the second wizard run gets to credential resolution, finds 2 environments via `fetchAmplitudeUser`, and "defers" — populates `pendingOrgs` and returns `Promise<void>`. To any non-TUI caller (or anyone tailing the per-project log) the void return is indistinguishable from "credentials ready"; the wizard parks at "Detecting your project setup" with no diagnostic and no in-band signal a downstream surface can branch on to route the user to the env picker / emit `auth_required: env_selection_failed`. Make the resolver return a discriminated `ResolveCredentialsResult` so the deferred path is load-bearing instead of silent: - 'resolved' — credentials populated. - 'needs_user_choice' — pendingOrgs set; carries kind + envsWithKey count so the agent-mode rejection envelope can quote it. - 'api_key_notice' — fetch succeeded but no envs had keys. - 'unauthenticated' — no usable token; caller routes to fresh OAuth. - 'ci_env_token' — WIZARD_OAUTH_TOKEN env-var path won. The TUI bin path now logs the outcome at INFO so a tail of `log.txt` shows a concrete reason ("needs_user_choice / environment_selection / envsWithKey=2") instead of just the previous deferring log line. Existing callers in `commands/helpers.ts` (CI / agent path) and `commands/dashboard.ts` ignore the return value — they already branch on `session.credentials` / `session.pendingOrgs`, so behaviour is unchanged for them; the new contract is purely additive. Regression tests in `credential-resolution.test.ts`: - Multi-env defer scenario races resolveCredentials against a 1s timeout to prove no hang, asserts the `'needs_user_choice'` outcome with envsWithKey=2, and locks down that pendingOrgs + pending tokens are populated for the env picker. Pre-fix the void return would have been `undefined` and the await would never have surfaced the deferred state explicitly. - Sanity siblings cover the 'resolved' (cached API key) and 'unauthenticated' (no stored token) and 'api_key_notice' (admin-only project) outcomes. Updated the cli.test.ts mock to return `{ outcome: 'unauthenticated' }` (the closest analogue to the pre-fix `undefined`) so 17 test-timeout regressions on the TUI-auth-task / feature-discovery suites stay green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: stamp deferredEnvCount on filter-mismatch path to avoid unauthenticated misclassification Applied via @cursor push command --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com>
…/resume commands (PR 1 of 3) Introduce src/lib/orchestration/ — a durable, file-backed orchestration store that becomes the source of truth for sessions, tasks, subagents, ownership, and last-stopping-point. Adds six new read-only CLI commands (tasks/task/sessions/session/resume/orchestration status), each emitting Zod-validated JSON envelopes for outer agents. Foundation only. Legacy WizardSession remains the live in-memory surface; PR 2 wires checkpoints + MCP-app lifecycle, PR 3 retires duplicate state and ships the TUI redesign + MCP-server tool parity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
resolveCommonOpts now passes argv.installDir through resolveInstallDir so quoted/env-sourced `~` actually expands instead of being treated as a literal directory name. The resume --execute path now imports spawn from utils/cross-platform- spawn so the npm-installed `amplitude-wizard` .cmd shim resolves on Windows. Node's built-in spawn does not consult PATHEXT and would fail with ENOENT for every Windows user invoking `wizard resume --execute`.
`wizard resume <session-id>` validated the requested session existed but then called `computeLastStoppingPoint(installDir)` without scoping, which always derived its next action from the *most recently created active session* — not the one the user asked about. The envelope's `sessionId` field still echoed the requested ID, so the command/description shown could describe a different session entirely. `computeLastStoppingPoint` now accepts an optional `sessionId` that restricts both session metadata and task buckets to that session. The resume command threads the resolved session id through, and an added test pins the scoping behavior against a two-session fixture.
…kpoints + MCP-app lifecycle Stacks on PR 1 (#689). Adds three typed checkpoint surfaces on top of the v2 orchestration foundation: - Choice — typed user-choice records with stable promptId for de-dup, requiresHuman automation gate, and full status transitions (pending → answered/expired/cancelled/superseded). - Verification — manual out-of-band verification records with status transitions (pending → passed/failed/skipped, skipped/failed may recover to passed; passed/skipped/failed may supersede). - McpAppCapability — durable lifecycle for every MCP-app capability with an anti-nag invariant: install_skipped → needs_user_choice REQUIRES a non-empty lastStateChangeReason. New CLI commands: - wizard choice list / show / answer (with --confirm-human gate) - wizard verification list / show / mark Wires last-stopping-point's pendingChoices / pendingMcpActions / pendingManualVerifications arrays to read real records (was [] in PR 1). Two callsites instrumented as the PR 2 wiring beachhead: - env-selection in src/commands/helpers.ts (Choice mirror + answer) - event-plan-approval in src/lib/wizard-tools.ts (Verification mirror) Adds 42 tests across choices/verifications/mcp-app-lifecycle/last-stopping-point/CLI. No TUI changes (deferred to PR 3); no MCP-server tool changes (deferred to PR 3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The validation array did not include 'all', so passing the documented '--status all' opt-out exited with INVALID_ARGS before reaching the 'statusRaw === all' branch on line 132. Mirror the verification list guard: skip enum validation when statusRaw === 'all' and skip the cast so it stays undefined for the listChoices call.
…sionAt `TERMINAL_VERIFICATION_STATUSES` claimed `Failed` and `Skipped` were terminal, but the allowed-transitions table explicitly permits `Failed → Passed | Superseded` and `Skipped → Passed | Failed | Superseded`. That contradicts the Choice convention (terminal = no forward transitions other than re-supersede) and `last-stopping-point` already treats `Failed` as actionable via `pendingManualVerifications`. Reduce the terminal set to `Passed` and `Superseded`. `transitionMcpCapability` set `userDecision = 'pending'` on a transition back to `NeedsUserChoice` but left a stale `userDecisionAt` from the previous installed/skipped state. Consumers checking `userDecisionAt !== null` would incorrectly conclude a decision had been made. Null it out alongside the pending decision.
- `resume --execute` now attaches `child.on('error', …)` before `exit`.
Previously a synchronous spawn failure (ENOENT, EACCES, missing PATH
entry on Windows) fired an unhandled `error` event, which Node's
EventEmitter rethrows — crashing the CLI with a stack trace instead
of producing a clean message + GENERAL_ERROR exit.
- `saveStore` was calling `ensureDir(dirname(path))` and then
`ensureDir(getRunDir(installDir))` — both resolve to the same run
directory because `getOrchestrationStoreFile()` is defined as
`join(getRunDir(installDir), 'orchestration.json')`. Drop the second
call and the now-unused `getRunDir` import.
- `computeLastStoppingPoint` already filtered tasks by `options.sessionId` but read the full unfiltered `file.choices` / `file.mcpCapabilities` / `file.verifications` arrays. `wizard resume <session-id>` could surface pending checkpoints belonging to a different (more recently active) session, producing a misleading `nextAction`. Filter each by the session-link field on the record (`linkedSessionId` for choices and MCP capabilities, `blockingSessionId` for verifications) so all four buckets stay consistent with the requested session. - `choice.ts` and `verification.ts` had inline `resolveCommonOpts` / `emitJson` / `emitJsonError` that omitted the `resolveInstallDir` call done correctly in `orchestration.ts`. A user passing `--install-dir ~/myapp` would resolve to `<cwd>/~/myapp` instead of the home-relative path, silently writing to the wrong store. Extract the helpers to a shared `orchestration-common.ts` and switch all three command modules to it so the `resolveInstallDir` fix applies uniformly and future drift is impossible.
`deriveNextAction` builds an `inspect_failure` next-action when the
most recent task has stopped. The structured `command` array uses
the configurable `cliPrefix` (sourced from `args.invocation`, which
flows from `options.cliInvocation` on `computeLastStoppingPoint`),
but the inline shell hint embedded in `description` was templating
the hardcoded module-level `CLI_INVOCATION` constant. A custom
invocation (e.g. an alternate `wizard` symlink, or a test harness
overriding the binary name) would surface a description that says
\`amplitude-wizard task <id>\` while the JSON payload's `command`
points at the configured executable. Use `cliPrefix.join(' ')` for
both so the human and machine views always agree.
`resumeCommand` is the human-facing copy-pasteable form of
`nextAction.command`. It was built with `nextAction.command.join(' ')`,
which silently corrupts paths containing spaces (e.g. an `installDir`
of `/Users/me/my project` would land in the shell as two separate
words). The structured `command` array stayed correct, but the string
the user is invited to paste into a terminal would fail.
Add a small `shellJoin` / `shellQuote` helper that wraps tokens with
shell metacharacters or whitespace in single quotes (with the standard
`'\''` close/escape/reopen dance for embedded single quotes). Tokens
that are already shell-safe stay unquoted so the common case stays
readable.
`HeaderBar` already gates the mode badge on `resolved.key !== 'interactive'`, so the default interactive run never sees a stray `[interactive]` chip. `StatusOverlayScreen` rendered the badge unconditionally in its header, so opening `/status` during a normal interactive run printed `[interactive]` next to "Operator overview" — which the brief explicitly says is noise. Mirror HeaderBar's gate and add tests covering both branches (suppressed in interactive, visible in agent mode). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
781c24b to
2c93edd
Compare
The Progress tab on `feat/v2-tui-redesign` used a single-line
`wrap="truncate-end"` row to display every planned event name
joined by commas:
◆ Events: Nf User Signed Up, Nf User Signed In, Nf Use…
Once the agent fills in 10+ events, the line truncates to "…"
even though the screen has a full column of empty rows below it
— the active task list collapses to ~5 lines once Wiring is the
focused step. The user can't audit the plan they just approved
without flipping to /events.
Render one event per row with name + description, soft-wrapped.
The bullet (`·`) lines up with the existing diamond glyph
column. The `(N events)` count gives an at-a-glance scale check.
No data-model change — `PlannedEvent` is still `{name, description}`.
Per-event lifecycle status (queued → in_progress → done) is the
shape of #698 against `main` and is a separate change.
27 RunScreen tests still pass; no test asserted on the comma-join
shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous ReportViewer wrapped each visible line in `<Text wrap="truncate">`, which silently dropped the right edge of any line wider than the content area — events-table rows and code blocks with long paths or JSON payloads got a stray "…" decoration and the user had no way to read what was clipped. Add ANSI-aware horizontal panning so the user can shift the visible window left/right with `h`/`l` (or arrow keys), keep colour codes intact across the slice, and surface the full LogViewer-style key hint footer (↑↓/jk scroll · h/l pan · g/G top/bottom · 0 reset · Esc close). The pan offset is clamped to the widest line so users can't scroll into empty whitespace. Also include a regression test covering: (a) horizontal pan offset shifts visible content, (b) lines wider than the content area are no longer clipped at the right edge, (c) the key-hint footer surfaces all documented controls, and (d) `0` resets the pan offset. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…added/removed (#707) The event-plan approval flow used to feel like list replacement. Each time the user pressed `[F] give feedback`, the agent's revised plan silently overwrote the prior list — the user lost the context of what they had just said and what the AI changed in response. The recent "keep event names snake cased and prefixed" feedback round made this visible: the AI applied the prefix but the user couldn't tell from the new screen alone whether snake_case had also landed (it had not — the display normalizer was Title-Casing — but the lack of an explicit "AI: revised plan +N -M" signal meant nothing about the conversation was legible). Add a round-history layer so the screen can render the back-and-forth: * Store now keeps `EventPlanRound[]`, one entry per `promptEventPlan` call. Each round carries the AI's plan + the user feedback (if any) that produced it. Cleared on `approved` / `skipped`; persists across `revised` rounds. * `pendingPlanFeedback` (instance field) buffers feedback typed in one `[F]` decision and pairs it with the next `promptEventPlan` from the agent. Single-pair carry — no leakage across runs. * EventPlanFullScreen renders a conversational header on rounds ≥ 2: - "You: <quoted feedback>" - "AI: revised plan +N added · −M removed" (with green/red counts) * Per-row diff markers when a prior round exists: - `+` (green) for events new to this round - `−` (red, struck-through) for events the AI dropped - bullet (`·`) for unchanged events Diff is by name (description regen on revision is expected). Round 1 still renders the original "Suggested events for your app" title — the convo affordance only appears once there's actually a conversation to render. Tests: * 3 new cases in EventPlanFullScreen.test.tsx — round-2 quote+delta rendering, round-1 fallback, history clear on approve. * All 174 existing store tests + 6 existing screen tests stay green. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
V2 PR 5 (full TUI v2 screen-tree redesign) — replaced by V3 polish work landed and pending across the TUI v3 chain. Closing per direction change. Audit by subagent a209c551541d85df3. |


TL;DR
/status, mode badge inHeaderBar,/helpslash command with mid-run grouping, and theuseWizardSelectorslice-hook infrastructure for render-cost discipline.src/ui/tui/screens/__tests__/StatusOverlayScreen.glyphs.test.tsx— pins the canonical glyph palette, the "what's blocking" summary headline, and the mode-badge + Blocked-vs-Running distinction.src/ui/tui/utils/lifecycle-display.ts. Silent drift trips a unit test.How to review without running
The diff is large. Below are the 7 files where most reviewable behavior lives, with what each one shows. Skim these in order to get the substance of the PR without checking out the branch.
src/ui/tui/utils/lifecycle-display.ts— Canonical glyph + label + color vocabulary mappingTaskLifecycle→{glyph, color, label}. This is the source-of-truth that all primary surfaces (operator overview, progress list, header) consume. The exhaustive switch keeps drift caught at compile time.src/ui/tui/utils/__tests__/mode-badge.test.ts— Spec-as-test for mode-badge resolution priority (agent>ci>nested>mcp-server> suppressed in plain interactive). Reads top-to-bottom as the resolution rules; coversCLAUDECODE=1andCLAUDE_CODE_ENTRYPOINTenvs.src/ui/tui/screens/StatusOverlayScreen.tsx— The Operator Overview itself. Sectioned by Session / Primary work / Background / Pending choices / Pending verifications / MCP capabilities / Owned artifacts / Next action. Includes the new "what's blocking the run?" headline summary.src/ui/tui/screens/__tests__/StatusOverlayScreen.glyphs.test.tsx— Regression pins for the glyph palette, the summary headline ("Waiting on N choices…", "N primary task(s) in flight", etc.), the Blocked vs. Running distinction (⏸red vs.›violet), and mode-badge rendering.src/ui/tui/hooks/useWizardSelector.ts+src/ui/tui/hooks/__tests__/useWizardSelector.test.tsx— Slice-hook infrastructure withshallowArrayEqual/shallowObjectEqual. The render-cost ceiling test (src/ui/tui/__tests__/render-cost.test.tsx) demonstrates the ~60% render-budget cut for slice-subscribed components versus whole-store subscribers.src/ui/tui/components/HeaderBar.tsx+src/ui/tui/components/__tests__/HeaderBar.modeBadge.test.tsx— Mode badge placement and conditional suppression in plain interactive mode.src/ui/tui/__tests__/console-commands-help.test.ts— Locks down the/helptext contract: which commands are listed, how they're grouped (always-available vs. paused-during-run), and the hanging-indent multi-line feedback rendering.Recommended runtime check:
pnpm try --install-dir=<some-test-app>and press/status— confirms the Operator Overview renders with the new glyphs, summary headline, and mode badge in the header.Stacks on #693 → #691 → #690 → #689. Merge after PRs 1+2+3+4.
Problem
PRs 1–4 fixed the substrate (durable orchestration store, lifecycle, choice/verification primitives, supervisor with PID + heartbeats, live file-watcher refresh). The TUI surface was solid but unfinished:
/statuswas usable but not refreshable while open.PR 5 turns the TUI from "screens that mostly work" into a serious operator interface: coherent IA, shared glyph vocabulary, actionable operator overview, and render-cost discipline.
IA redesign
Three-zone layout (top → bottom):
The mode badge surfaces
[agent]/[ci]/[nested]/[mcp-server](suppressed in plain interactive mode). Resolution priority is documented indocs/tui-v2.md.ASCII layout — 3 viewports
Wide (142×41)
Standard (100×30)
Narrow (80×24)
Glyph palette (canonical vocabulary)
Every primary surface shares one vocabulary:
○›…⏸✓✗⊘⮕Lives in
src/ui/tui/utils/lifecycle-display.ts, sourced fromTaskLifecycle. Pinned by unit tests so silent drift trips a test.Screen tree changes
StatusOverlayScreen→ "Operator Overview". Sectioned by Session / Primary work / Background / Pending choices / Pending verifications / MCP capabilities / Owned artifacts / Next action. Live-refresh via PR 4'suseOrchestrationStorehook so a sibling shell runningwizard choice answer …updates the open overlay without close+re-open.Prompt UX contract
Every choice prompt renders the full contract:
Choice.whyAskingChoice.options[]Choice.recommendedOptionIdChoice.safeDefaultOptionIdChoice.reversibleChoice.requiresHumanChoice.consequenceIfSkippedChoice.resumeCommandChoiceCheckpointBannerrenders the full block; the operator overview renders an inline condensed version that still includes every field.Slash command coherence
New
/helpcommand lists every registered command grouped by "available anytime" vs "available before/after a setup run". When a run is active, the second group renames itself "paused while a setup run is active (Ctrl+C to cancel, then retry)" so the user knows exactly why a command can't fire and what to do.Multi-line command feedback (e.g.
/help,/diagnostics) now renders with a hanging indent so it reads as one coherent block, not several disconnected lines.Render-cost teardown
New
useWizardSelector(store, selector, isEqual?)slice hook. Components subscribed to a slice no longer rerender for unrelated store ticks.shallowArrayEqualandshallowObjectEqualexported alongside.Render-cost benchmark fixture (
src/ui/tui/__tests__/render-cost.test.tsx):Slicing cuts each subscriber's render budget by ~60% in this scenario. The infrastructure ships in PR 5; migrating individual subscribers is incremental.
Bugs addressed (with regression-test refs)
CLAUDECODE=1/CLAUDE_CODE_ENTRYPOINTnow surface as[nested]in the header and in the operator overview. Test:src/ui/tui/utils/__tests__/mode-badge.test.ts,src/ui/tui/screens/__tests__/StatusOverlayScreen.glyphs.test.tsx.Blockedstate now renders the distinct⏸glyph + red color, separate from running›+ violet. Test:StatusOverlayScreen.glyphs.test.tsx.console-commands-help.test.ts(text contract) and existing ConsoleView tests.StatusOverlayScreen.glyphs.test.tsx::summary headline.The brief's other bug categories (success-while-pending, prompt re-ask after durable answer, prompts that disappear) are already covered by PR 3's
ManualVerificationRibbonintegration and the existingChoiceCheckpoint.test.tsx/OutroScreen.verificationRibbon.test.tsxregression tests, which pass on PR 5.Tests added
40 new tests over the base 3949 (3989/3989 vitest):
/helptext generationBackward compatibility
/helpis additive./statusoverlay's data shape is unchanged from PR 3; only the rendering reorganized.--agent,--ci,--json,manifest,plan,apply,verify, MCP server,v: 1envelope, exit codes — all unchanged.ProgressListstill uses a blank gutter forpendingrows rather than the canonical○glyph (deliberate UX trade-off — see comment inProgressList.tsx).Known limitations & follow-ups
useWizardSelector); migrating every subscriber over is out of scope for PR 5. The infrastructure is in place; the migration is incremental.Test plan
pnpm exec vitest run --pool=forks --maxWorkers=1— 277 files / 3989 tests passpnpm test:bdd— 100/100 scenarios passpnpm build(TypeScript + smoke test) — cleanpnpm lint— clean (only pre-existing warning unchanged)🤖 Generated with Claude Code
Note
Medium Risk
Adds new orchestration-related CLI commands and MCP tool surfaces plus changes to credential-resolution signaling, which can affect automation and operator workflows if schemas/exit codes drift. Most changes are additive and covered by new schema-validated smoke tests, but they touch user-facing command routing and external contracts.
Overview
Introduces a durable orchestration inspection surface across the CLI and the external MCP server, adding
tasks/task/sessions/session/resume/orchestration statuspluschoiceandverificationsubcommands with documented extended exit codes and JSON envelopes.Wires a first “beachhead” mirror for agent-mode environment selection into the orchestration store (records an
environment_selectionchoice and marks it answered), and updates credential resolution to return a discriminated outcome that is logged bybin.tsfor debuggability.Expands docs/README with
/statusand TUI v2/operator concepts, adds shared orchestration CLI helpers (install-dir resolution, JSON error envelopes), and adds extensive new tests (CLI smoke tests against real binary, MCP-server orchestration tool parity, SSE frame suppression, and per-run cache memoization).Reviewed by Cursor Bugbot for commit 6210b95. Bugbot is set up for automated code reviews on this repo. Configure here.