fix: force comparison board as default variant chooser (v0.14.1.0) #658

Merged
garrytan merged 2 commits into main from garrytan/force-comparison-board on Mar 30, 2026
@garrytan
Owner

Summary

The design comparison board ($D compare --serve) was being skipped: variants were shown inline and the user was asked "which do you prefer?" via AskUserQuestion. That is a degraded experience, missing the board's rating controls, comments, and remix/regenerate buttons.

Comparison board enforcement:

  • Replaced "show variants inline" instruction with "do NOT show inline, proceed to comparison board" in plan-design-review/SKILL.md.tmpl
  • Added CRITICAL RULE: never use AskUserQuestion as the variant chooser, only as the blocking wait

AskUserQuestion-first wait mechanism (shared resolver):

  • Changed DESIGN_SHOTGUN_LOOP in scripts/resolvers/design.ts from polling-first to AskUserQuestion-first
  • Board URL included in AskUserQuestion so user can click through if tab is lost
  • Polling demoted to serve-failure fallback only
  • All 3 consumer skills get this fix: /plan-design-review, /design-shotgun, /design-consultation
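
The ordering described above can be modeled as a small decision function. This is only an illustrative sketch: the actual change is prompt template text in DESIGN_SHOTGUN_LOOP, not runtime logic, and every name below (`ServeResult`, `WaitStep`, `nextWaitStep`) is hypothetical.

```typescript
// Hypothetical model of the AskUserQuestion-first wait.
// None of these names come from scripts/resolvers/design.ts.
interface ServeResult {
  ok: boolean;       // did `$D compare --serve` start the board?
  boardUrl?: string; // assumed to be known once the board is serving
}

type WaitStep =
  | { kind: "ask-user"; prompt: string } // blocking wait via AskUserQuestion
  | { kind: "fallback"; reason: string }; // serve failed: fall back to polling

function nextWaitStep(serve: ServeResult): WaitStep {
  if (serve.ok && serve.boardUrl) {
    return {
      kind: "ask-user",
      // The URL is repeated in the question so a lost tab can be reopened.
      prompt:
        `The comparison board is open at ${serve.boardUrl}. ` +
        `Rate, comment, or remix there, then reply here when you are done.`,
    };
  }
  // Polling is no longer the primary wait; it is used only on this path.
  return { kind: "fallback", reason: "comparison board failed to serve" };
}
```

Note that the question only blocks until the user reports back; it never asks which variant they prefer, since that choice is made on the board.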

Codex adversarial review fixes:

  • Fixed board URL from /design-board.html (404) to / (where the server actually serves)
  • Improved serve-failure fallback to show variants inline via Read tool instead of choosing blind
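
As a minimal sketch of the URL fix, assuming the board is reached over plain HTTP on a local port: the helper name `boardUrl`, the default host, and the port argument are all hypothetical; only the root path `/` and the rejected `/design-board.html` come from this PR.

```typescript
// Hypothetical helper: host and port handling are assumptions, not taken from the PR.
function boardUrl(port: number, host: string = "localhost"): string {
  // The server serves the board at the root path; "/design-board.html" returned 404.
  return `http://${host}:${port}/`;
}
```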

Test Coverage

Step 3.4: All new code paths have test coverage. Template text changes tested via gen-skill-docs.test.ts integration.

Pre-Landing Review

No issues found. Prompt template text and template resolver string only — no SQL, concurrency, or security boundaries.

Adversarial Review

Codex found 4 issues (medium tier, 83 source lines):

  • [FIXED] Board URL would 404 — corrected from /design-board.html to /
  • [FIXED] Serve-failure fallback asked user to choose blind — now shows variants inline
  • [BY DESIGN] Regenerate deadlock concern — user explicitly chose AskUserQuestion-first over polling
  • [FALSE POSITIVE] Race condition on feedback.json — user acts on board before telling agent

Plan Completion

8/8 plan items DONE — all implementation and verification items addressed.

Test plan

  • bun test passes (pre-existing test-2 FAIL unchanged)
  • bun run gen:skill-docs regenerates all SKILL.md files
  • Verified "do NOT show inline" text in plan-design-review/SKILL.md
  • Verified AskUserQuestion-first flow in all 3 consumer SKILL.md files
  • Verified no duplicate $D compare --serve commands
  • Verified board URL uses / not /design-board.html
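
The text-level checks in this plan could be automated roughly as follows. This is a hedged sketch, not the contents of gen-skill-docs.test.ts: `checkSkillDoc` and `DocCheck` are hypothetical names, "no duplicate" is interpreted as at most one occurrence of the serve command, and the "do NOT show inline" check applies only to plan-design-review/SKILL.md.

```typescript
// Hypothetical checker for a generated SKILL.md; per the PR, the real
// coverage lives in the gen-skill-docs.test.ts integration test.
interface DocCheck {
  name: string;
  pass: boolean;
}

function checkSkillDoc(text: string): DocCheck[] {
  // Count non-overlapping occurrences of the serve command.
  const serveCount = text.split("$D compare --serve").length - 1;
  return [
    { name: "forbids inline variants", pass: text.includes("do NOT show inline") },
    { name: "no duplicate serve command", pass: serveCount <= 1 },
    { name: "no stale board path", pass: !text.includes("/design-board.html") },
  ];
}
```

Each entry reports one check by name, so a failing document shows which of the three conditions it violates.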

🤖 Generated with Claude Code

garrytan and others added 2 commits March 30, 2026 02:05
The comparison board ($D compare --serve) was being skipped in favor of
showing variants inline + AskUserQuestion "which do you prefer?" — a
degraded experience missing rating controls, comments, and remix buttons.

Changes:
- Replace "show inline" instruction with "do NOT show inline, proceed to
  comparison board" in plan-design-review/SKILL.md.tmpl
- Add CRITICAL RULE: never use AskUserQuestion as the variant chooser
- Change DESIGN_SHOTGUN_LOOP resolver to AskUserQuestion-first wait with
  polling fallback (affects all 3 consumer skills)
- Fix board URL from /design-board.html (404) to / (correct)
- Improve serve-failure fallback to show variants inline via Read tool

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

E2E Evals: ✅ PASS

9/9 tests passed | $1.34 total cost | 12 parallel runners

| Suite | Result | Status | Cost |
|---|---|---|---|
| e2e-design | 3/3 | ✅ | $0.6 |
| e2e-plan | 1/1 | ✅ | $0.1 |
| llm-judge | 2/2 | ✅ | $0.04 |
| e2e-design | 3/3 | ✅ | $0.6 |

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

garrytan merged commit 7911b7b into main on Mar 30, 2026
18 checks passed