feat(addie): render storyboard hint fix plans in MCP tool output#3084
Merged
feat(addie): render storyboard hint fix plans in MCP tool output#3084
Conversation
This was referenced Apr 25, 2026
Closes #3055. Turns `context_value_rejected` runner hints (from `@adcp/client/testing` 5.17.0) into a deterministic Diagnose / Locate / Fix / Verify build playbook in Addie's `run_storyboard` and `run_storyboard_step` outputs. The new formatter consumes the runner's structured fields and emits a plan that names the two tools that disagree, offers widen-vs-narrow fix paths, and cites the exact verify call. - New: `server/src/addie/services/storyboard-fix-plan.ts` (pure formatter) - Wired into both render sites in `server/src/addie/mcp/member-tools.ts` - Bumps `@adcp/client` 5.16.0 → 5.17.0 for runner-side hint emission - Trailing prompt instruction includes a refusal anchor against prompt-injection in seller-controlled hint fields - All seller-controlled fields (`rejected_value`, `accepted_values[]`, `error_code`, `request_field`) sanitized at the formatter boundary - Three test layers: field-level (16), inline snapshots (3), real-MCP e2e via `runAgainstLocalAgent` (1) Supersedes draft PR #3058. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…aseline The 5.17.0 bump in this PR tightens request-side schema validation in the framework dispatch path. 11 bundled storyboards send additional properties their own request schemas now reject (sync_plans, list_property_lists, delete_property_list). Filed upstream at adcontextprotocol/adcp-client#940 — restore floors once 5.17.1+ ships. Legacy dispatch is unaffected and stays at 52 / 380. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c8e4998 to
4c3cdb7
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #3055. Supersedes draft PR #3058.
Summary
Turns
context_value_rejectedrunner hints (from@adcp/client/testing5.17.0) into a deterministic Diagnose / Locate / Fix / Verify build playbook in Addie'srun_storyboardandrun_storyboard_stepMCP tool outputs. The new formatter consumes the runner's structured hint fields and emits a plan that names the two tools that disagree, offers widen-vs-narrow fix paths, and cites the exact verify call.The earlier draft (#3058) only rendered the runner's pre-formatted
Hint:line. This goes further — it consumes 11 structured fields and produces an actionable builder playbook the LLM can read back verbatim.Verbatim render — canonical catalog drift (real MCP transport)
Files
server/src/addie/services/storyboard-fix-plan.ts— new pure formatter (~250 lines, deterministic, fully tested)server/src/addie/mcp/member-tools.ts— wired into both render sites, with hardened anti-injection trailing promptserver/src/addie/config-version.ts—CODE_VERSION2026.04.3 → 2026.04.4package.json+package-lock.json—@adcp/client5.16.0 → 5.17.0server/tests/unit/:storyboard-fix-plan.test.ts— 20 field-level assertions (incl. injection regression tests)storyboard-fix-plan-snapshot.test.ts— 3 inline snapshots of canonical scenariosstoryboard-fix-plan-e2e.test.ts— real-transport test usingrunAgainstLocalAgent+createAdcpServeragainst a deliberately-broken signals sellerSecurity
The hint payload is partly seller-controlled —
rejected_value,accepted_values[],error_code, ANDrequest_field(the runner copies the seller'serrors[].fieldpointer onto this verbatim — surfaced in security review and fixed before merge). All four pass throughsanitizeAgentStringat the formatter boundary; sanitizer strips ASCII controls, backtick, NEL, LSEP, PSEP. Length caps documented as prompt-injection budgets. Trailing LLM instruction includes an explicit refusal anchor:Reviews addressed
code-reviewer: 1 Must Fix (sanitizer regex readability) + 2 Should Fix (pass-state hint suppression, dedup) + 3 Nits — all addressedsecurity-reviewer: 2 Must Fix (request_fieldwas seller-controlled but treated as runner-controlled; comment was wrong) + 2 Should Fix (cap tightening, refusal anchor) — all addressedFollow-up
Filed adcontextprotocol/adcp-client#935 proposing a hint taxonomy so this rendering pattern can extend beyond
context_value_rejectedto other runner-side diagnostics.Test plan
npm run test:server-unit— 2136 tests pass (no regressions)npm run typecheck— cleanevaluate_agent_qualityagainst a known-broken seller in production and verify the playbook surfaces🤖 Generated with Claude Code