fix(tui): lock journey stepper to canonical 5 tasks #449

Open
kelsonpw wants to merge 1 commit into main from kelsonpw/journey-stepper-order

@kelsonpw (Collaborator) commented Apr 30, 2026

Summary

  • The progress checklist would un-check itself and appear out of order whenever the agent drifted on a label ("Install Amplitude SDK" vs "Install Amplitude"), renamed a step on retry, or marked a later step in_progress before an earlier one finished. The exact-string monotonic guard in syncTodos couldn't catch any of those cases.
  • The canonical 5 steps are now the source of truth in code (src/lib/canonical-tasks.ts), pre-populated on RunPhase.Running so the user sees 0 done · 5 to go from frame 1.
  • syncTodos buckets each agent TodoWrite into a canonical step (exact-label first, keyword fallback), applies a per-step monotonic guard, enforces single-in-progress + ordering, and drops any extra todos that match no canonical step. The agent's latest activeForm still surfaces on the correct row, so users continue to see real-time progress text like "Installing project dependencies".
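
To make the seeding and the per-step monotonic guard concrete, here is a minimal sketch. It is not the code in this PR: the names `seedRows`, `advance`, and `summarize` are hypothetical, and every label other than "Install Amplitude" is a placeholder (its position in the list is also a guess).

```typescript
// Sketch only. Names and placeholder labels are assumptions for illustration,
// not the contents of src/lib/canonical-tasks.ts.
type Status = 'pending' | 'in_progress' | 'completed';

interface Row {
  label: string;
  status: Status;
  activeForm?: string;
}

const PLACEHOLDER_LABELS = [
  'Placeholder step A',
  'Install Amplitude',
  'Placeholder step C',
  'Placeholder step D',
  'Placeholder step E',
] as const;

// Higher rank means further along; used to compare two statuses.
const RANK: Record<Status, number> = { pending: 0, in_progress: 1, completed: 2 };

// Seed every row as pending so the full list exists before any agent output.
function seedRows(labels: readonly string[]): Row[] {
  return labels.map((label) => ({ label, status: 'pending' as Status }));
}

// Per-step monotonic guard: a row may advance but never move backward.
function advance(current: Status, incoming: Status): Status {
  return RANK[incoming] > RANK[current] ? incoming : current;
}

// Builds the "N done · M to go" summary from the current rows.
function summarize(rows: Row[]): string {
  const done = rows.filter((r) => r.status === 'completed').length;
  return `${done} done · ${rows.length - done} to go`;
}
```

With this shape, a retry that re-sends a completed step as pending is a no-op: `advance('completed', 'pending')` keeps `'completed'`.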

Why

The contract was enforced only by system-prompt text in commandments.ts. LLMs drift on long runs and after retries — the user reported seeing "Installing project dependencies" / "Inspecting project" / tracking-events updates appear before "Install Amplitude" started, then watching "Install Amplitude" un-check itself across an agent retry. Code-side enforcement is the only durable fix.

Test plan

  • pnpm test — 2544 passing (5 new), 14 skipped, 0 failing
  • pnpm lint — clean
  • pnpm build — clean (smoke test passes)
  • Manual: pnpm try against a Next.js test app, verify the 5-row list shows from frame 1, labels match canonical wording, activeForm text surfaces during work, "Install Amplitude" stays ✓ across an agent retry.

🤖 Generated with Claude Code


Note

Medium Risk
Touches core TUI run-progress aggregation logic (setRunPhase/syncTodos), changing how agent TodoWrite output maps to UI state; bugs here could misreport progress or active step during runs.

Overview
Locks the run progress checklist to a canonical set of 5 steps. Introduces canonical-tasks.ts as the single source of truth for step labels/order plus a bucketTodoToCanonicalStep matcher (exact label, then keyword fallback) to map drifted agent TodoWrite items into the right row.
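
The two-stage match described above (exact label, then keyword fallback) could look roughly like the sketch below. `bucketTodo` is a hypothetical stand-in for bucketTodoToCanonicalStep, whose real signature is not shown in this PR description; the keyword lists and the second step are invented for illustration, and normalizing case and whitespace before the exact comparison is an assumption.

```typescript
// Sketch only. The step shape, keywords, and function name are assumptions.
interface CanonicalStep {
  label: string;
  keywords: string[]; // lower-case substrings, consulted only if the exact match fails
}

const STEPS: CanonicalStep[] = [
  { label: 'Install Amplitude', keywords: ['install'] },
  { label: 'Placeholder step B', keywords: ['placeholder-b'] },
];

function normalize(text: string): string {
  return text.trim().toLowerCase();
}

// Returns the index of the matching step, or -1 when the todo fits no bucket.
function bucketTodo(content: string, steps: CanonicalStep[] = STEPS): number {
  const needle = normalize(content);

  // Stage 1: exact label match (after normalization).
  const exact = steps.findIndex((s) => normalize(s.label) === needle);
  if (exact !== -1) return exact;

  // Stage 2: keyword fallback; the first step with any matching keyword wins.
  return steps.findIndex((s) => s.keywords.some((k) => needle.includes(k)));
}
```

Under these assumptions the drifted label "Install Amplitude SDK" misses the exact match but still lands on the "Install Amplitude" row through the keyword fallback, while a todo that matches nothing returns -1 and can be dropped by the caller.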

Updates the TUI store so entering RunPhase.Running pre-populates these five tasks immediately, and rewrites syncTodos to always render exactly five rows, enforce monotonic completion, collapse multiple in_progress states to a single latest step, auto-complete earlier steps when the agent skips ahead, and drop unbucketed/extra todos while preserving the latest per-step activeForm.
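
The rules in the paragraph above can be sketched as one pure function. This is an illustration under stated assumptions, not the syncTodos implementation: `reconcile` and `BucketedTodo` are hypothetical names, and the input is assumed to be already bucketed to a canonical index (or -1 for no match).

```typescript
// Sketch only. Types and the reconcile function are assumptions for illustration.
type Status = 'pending' | 'in_progress' | 'completed';

interface Row {
  label: string;
  status: Status;
  activeForm?: string;
}

// An agent todo after bucketing; step is a canonical index, or -1 for no match.
interface BucketedTodo {
  step: number;
  status: Status;
  activeForm?: string;
}

const RANK: Record<Status, number> = { pending: 0, in_progress: 1, completed: 2 };

function reconcile(prev: Row[], todos: BucketedTodo[]): Row[] {
  // Copy the previous rows so the row count and labels never change.
  const next: Row[] = prev.map((row) => ({ ...row }));

  for (const todo of todos) {
    const row = next[todo.step];
    if (!row) continue; // unbucketed (-1) or out-of-range extras are dropped
    if (RANK[todo.status] > RANK[row.status]) row.status = todo.status; // monotonic guard
    if (todo.activeForm) row.activeForm = todo.activeForm; // keep the latest wording
  }

  // The furthest row the agent has touched is the only one allowed to be active.
  let latest = -1;
  next.forEach((row, i) => {
    if (row.status !== 'pending') latest = i;
  });

  // Every row before it is auto-completed, which also collapses
  // multiple in_progress rows down to at most one.
  return next.map((row, i) =>
    i < latest ? { ...row, status: 'completed' as Status } : row,
  );
}
```

Keeping the function pure (previous rows in, next rows out) makes each rule easy to cover in a store test: feed it a drifted or skipped-ahead batch and assert on the five returned rows.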

Adjusts tests to consume CANONICAL_LABELS and adds/updates store tests to cover seeding, bucketing, drift/rename handling, ordering, and empty/unknown-status behaviors.

Reviewed by Cursor Bugbot for commit ab11e94. Bugbot is set up for automated code reviews on this repo.

… regress or reorder

The progress checklist was driven directly by agent TodoWrite output with
only an exact-label monotonic guard. When the LLM drifted ("Install
Amplitude SDK" vs "Install Amplitude"), renamed a step on retry, or
marked a later step in_progress while an earlier one was still pending,
tasks would un-check themselves and appear out of order.

Canonical 5 are now the source of truth in code (src/lib/canonical-tasks.ts),
pre-populated on RunPhase.Running so the user sees the journey from frame 1.
syncTodos buckets agent output into canonical steps via exact-label then
keyword fallback, applies a per-step monotonic guard, enforces single
in_progress + ordering, and drops unbucketed extras. ActiveForm still
surfaces the agent's latest wording on the correct row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kelsonpw kelsonpw requested a review from a team as a code owner April 30, 2026 06:39
@github-actions
Contributor

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

  • /wizard-ci all

Test all apps in a directory:

  • /wizard-ci django
  • /wizard-ci fastapi
  • /wizard-ci flask
  • /wizard-ci javascript-node
  • /wizard-ci javascript-web
  • /wizard-ci next-js
  • /wizard-ci python
  • /wizard-ci react-router
  • /wizard-ci vue

Test an individual app:

  • /wizard-ci django/django3-saas
  • /wizard-ci fastapi/fastapi3-ai-saas
  • /wizard-ci flask/flask3-social-media
  • /wizard-ci javascript-node/express-todo
  • /wizard-ci javascript-node/fastify-blog
  • /wizard-ci javascript-node/hono-links
  • /wizard-ci javascript-node/koa-notes
  • /wizard-ci javascript-node/native-http-contacts
  • /wizard-ci javascript-web/saas-dashboard
  • /wizard-ci next-js/15-app-router-saas
  • /wizard-ci next-js/15-app-router-todo
  • /wizard-ci next-js/15-pages-router-saas
  • /wizard-ci next-js/15-pages-router-todo
  • /wizard-ci python/meeting-summarizer
  • /wizard-ci react-router/react-router-v7-project
  • /wizard-ci react-router/rrv7-starter
  • /wizard-ci react-router/saas-template
  • /wizard-ci react-router/shopper
  • /wizard-ci vue/movies

Results will be posted here when complete.
