fix: handle AdCP advisory errors by task #2020

Merged
bokelley merged 3 commits into main from accept-vendor-metrics-errors
May 26, 2026

@bokelley

Summary

  • accept AdCP 3.1 vendor_metric optimization_goals and vendor committed_metrics in create_media_buy validation
  • distinguish terminal AdCP errors from advisory errors[] on task-aware success/submitted payloads
  • add public task-aware helpers for advisory-success detection and terminal error detection
  • add regression coverage for get_products advisory errors, report_usage partial success, polled tasks/get completions, terminal payload statuses, and media-buy required success evidence

Root Cause

The SDK used structural errors[] detection as terminal failure detection. In AdCP 3.1, success/submitted payloads can carry advisory errors[], so callers need task-aware success evidence before treating errors[] as fatal. The fix keeps isAdcpError() structural for compatibility and routes operation success through isTerminalAdcpError(toolName).
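
To make the distinction concrete, here is a minimal sketch of the two kinds of check. The helper names (`hasErrorsArray`, `isTerminalForTool`), the sample payload, and the `PARTIAL_RESULTS` code are illustrative assumptions, not the SDK's implementation; only the idea that get_products success is evidenced by `products` or `unchanged` comes from this PR's discussion.

```typescript
// Illustrative sketch only: names and payload shape are assumptions, not SDK code.
type Payload = Record<string, unknown>;

// Structural check: any non-empty errors[] looks like a failure.
function hasErrorsArray(payload: Payload): boolean {
  const errors = payload["errors"];
  return Array.isArray(errors) && errors.length > 0;
}

// Task-aware check: errors[] is terminal only when no success evidence is present.
// `evidenceFields` stands in for a per-tool lookup (hypothetical).
function isTerminalForTool(payload: Payload, evidenceFields: string[]): boolean {
  const hasEvidence = evidenceFields.some((field) => payload[field] !== undefined);
  return hasErrorsArray(payload) && !hasEvidence;
}

// A success payload that also carries an advisory error (made-up error code).
const advisoryProducts: Payload = {
  products: [{ product_id: "p1" }],
  errors: [{ code: "PARTIAL_RESULTS", message: "one catalog timed out" }],
};

const structuralVerdict = hasErrorsArray(advisoryProducts);
const taskAwareVerdict = isTerminalForTool(advisoryProducts, ["products", "unchanged"]);
```

In this sketch the structural check flags the payload while the task-aware check does not, which is the gap between the old and new behavior described above.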

Validation

  • npm run build:lib
  • NODE_ENV=test node --test-timeout=60000 --test-force-exit --test test/lib/response-unwrapper.test.js test/lib/request-validation.test.js test/lib/poll-task-completion-terminal-states.test.js test/lib/protocol-response-parser-status.test.js
  • NODE_ENV=test node --test-timeout=120000 --test-force-exit --test test/lib/*.test.js

Expert Review

  • protocol review found missing success-evidence coverage and polled-completion gaps; fixed with field groups and tests
  • code review found terminal payload status and cancel_media_buy enum-collision edge cases; fixed and covered

Fixes #2013
Fixes #2014

@bokelley bokelley changed the title from "[codex] Handle AdCP advisory errors by task" to "fix: handle AdCP advisory errors by task" May 26, 2026

@aao-ipr-bot aao-ipr-bot Bot left a comment

LGTM. The advisory-vs-terminal split is the right shape: status: failed | rejected and success: false are the authoritative terminal signals, with the per-tool field-group table acting as the positive Success-arm witness for the errors[]-on-Success case AdCP 3.1 introduced. Pure classifier, no fabrication, no shape rewriting — the witness-not-translator invariant holds.

code-reviewer and ad-tech-protocol-expert both come back sound-with-caveats; the caveats are follow-ups, not blocks.

Things I checked

  • Field-group table at src/lib/utils/response-unwrapper.ts:97-158 against TOOL_RESPONSE_SCHEMAS (response-schemas.ts) — 58 of 59 tools covered; spot-checks against tools.generated.ts confirm create_media_buy [media_buy_id, packages], report_usage [accepted], update_content_standards [success, standards_id], delete_property_list [deleted, list_id], get_products [products | unchanged] all line up with the spec's Success arms.
  • Short-circuit ordering at isTerminalAdcpError (L190-198) — status: failed | rejected wins over field-group evidence, then adcp_error.code, then errors[] defers to hasAdvisorySuccessPayload. Correct precedence for the four call sites (TaskExecutor L735, L869, L1023, L1509 + unwrapper L256).
  • submitted + task_id branch at L179 is toolName-agnostic — safe, mirrors the universal AdCP envelope contract.
  • isAdcpError kept as the structural helper for back-compat; JSDoc now points callers to isTerminalAdcpError for AdCP 3.1.
  • A2A synthetic-failure envelope at L628 already emits status: failed, so MCP and A2A converge on the same terminal classifier.
  • Tests cover advisory-success polled completion, terminal payload (status: failed), envelope-only errors[] (no Success evidence → terminal), submitted + advisory errors[], vendor_metric optimization_goals round-trip, and create_media_buy missing media_buy_id fallback.
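
The precedence checked above can be sketched roughly as follows. This is not the code in response-unwrapper.ts: the function names, the "every field in a group, any group" semantics, and the placement of the submitted + task_id branch are assumptions made for illustration. The three table entries are the ones quoted in this review.

```typescript
// Illustrative sketch of the precedence described above; not the SDK source.
type Payload = Record<string, unknown>;

// Assumed semantics: a group counts as evidence when all of its fields are
// present, and any one group is sufficient.
const FIELD_GROUPS: Record<string, string[][]> = {
  create_media_buy: [["media_buy_id", "packages"]],
  get_products: [["products"], ["unchanged"]],
  report_usage: [["accepted"]],
};

function hasSuccessEvidence(toolName: string, payload: Payload): boolean {
  const groups = FIELD_GROUPS[toolName] ?? [];
  return groups.some((group) => group.every((field) => payload[field] !== undefined));
}

function isTerminal(toolName: string, payload: Payload): boolean {
  const status = payload["status"];
  // 1. Explicit terminal signals win over any field-group evidence.
  if (status === "failed" || status === "rejected" || payload["success"] === false) {
    return true;
  }
  // 2. A structured adcp_error carrying a code is terminal.
  const adcpError = payload["adcp_error"];
  if (typeof adcpError === "object" && adcpError !== null && "code" in adcpError) {
    return true;
  }
  // 3. errors[] is terminal only without a submitted handle or success evidence.
  const errors = payload["errors"];
  if (Array.isArray(errors) && errors.length > 0) {
    const submitted = status === "submitted" && typeof payload["task_id"] === "string";
    return !submitted && !hasSuccessEvidence(toolName, payload);
  }
  return false;
}
```

Note that a tool name with no table entry (for example the literal 'unknown') never yields success evidence in this sketch, so any errors[] on such a payload is classified as terminal.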

Follow-ups (non-blocking — file as issues)

  • get_content_standards is missing from SUCCESS_PAYLOAD_FIELD_GROUPS_BY_TOOL. Registered in TOOL_RESPONSE_SCHEMAS (response-schemas.ts:83) with standards_id required on Success. Add get_content_standards: [['standards_id']]. Not a regression (previous behavior also misclassified), but the only spec-registered tool the table misses.
  • taskName === 'unknown' defeats advisory-success detection on the poll path. mapTasksGetResponseToTaskInfo (TaskExecutor L104, L127) defaults taskType to 'unknown' when the seller's tasks/get envelope omits task_type. The pollTaskCompletion call at L1509 hands that to isOperationSuccess, which threads 'unknown' into hasAdvisorySuccessPayload — no entry → returns false → advisory errors[] becomes terminal. executeTask already knows the originating tool name; thread it through to pollTaskCompletion so polled completions inherit the same classification as inline completions.
  • report_usage: [['accepted']] is field-presence, not value-predicate. A 3.0.x seller emitting { accepted: 0, errors: [...all rows rejected...] } without envelope status (3.0.x didn't require it) hits advisory-success → SDK reports completed. Previous behavior correctly reported failure for this shape. 3.1.0-beta.2+ sellers carry status: failed so the short-circuit at L176 catches them; this is a 3.0.x back-compat edge case. Either require accepted > 0 (needs a value predicate, not just field presence) or document the limitation.
  • No changeset. CLAUDE.md is unambiguous that any src/lib/ change requires one, and this PR adds two new public exports (isTerminalAdcpError, hasAdvisorySuccessPayload) and changes the behavior of isAdcpSuccess. CI's Check for changeset passed because pre-mode counts the existing .changeset/pre.json entries — the gate isn't actually per-PR in pre mode. Run npm run changeset (minor — new exports + bug-fix behavior change) before merge so the next beta release notes name the fix.
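
For the report_usage follow-up above, one possible shape for a value predicate is sketched below. The `Evidence` type, `REPORT_USAGE_EVIDENCE`, and `satisfies` are hypothetical names, and the sketch assumes `accepted` is a numeric count, as in the `{ accepted: 0, ... }` example; the SDK's table currently records field presence only.

```typescript
// Hypothetical extension: an evidence entry may carry a value predicate.
type Payload = Record<string, unknown>;
type Evidence = { field: string; test?: (value: unknown) => boolean };

const REPORT_USAGE_EVIDENCE: Evidence[] = [
  // Presence alone is not enough; at least one row must have been accepted.
  { field: "accepted", test: (value) => typeof value === "number" && value > 0 },
];

// True when every entry's field is present and passes its predicate (if any).
function satisfies(payload: Payload, evidence: Evidence[]): boolean {
  return evidence.every(({ field, test }) => {
    const value = payload[field];
    if (value === undefined) return false;
    return test ? test(value) : true;
  });
}

const allRejected = satisfies({ accepted: 0, errors: [{ code: "X" }] }, REPORT_USAGE_EVIDENCE);
const partialSuccess = satisfies({ accepted: 3, errors: [{ code: "X" }] }, REPORT_USAGE_EVIDENCE);
```

Entries without a `test` keep the existing presence-only behavior, so other tools would not need to change.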

Minor nits (non-blocking)

  1. submitted branch accepts both task_id and taskId. src/lib/utils/response-unwrapper.ts:179. AdCP wire is snake_case only; the camelCase fallback silently tolerates non-conformant sellers. Either drop it or add a comment explaining which adopter shape forced it.
  2. isAdcpError could be @deprecated. response-unwrapper.ts:698. JSDoc points callers to isTerminalAdcpError but TS won't warn. The remaining direct caller at TaskExecutor L1028 is fine (it's checking iterability for message extraction), but new code should not reach for the structural helper.
  3. Table has no section comments. response-unwrapper.ts:97. TOOL_RESPONSE_SCHEMAS groups by domain (// Product discovery, // Creative, etc.); mirroring that here makes the next addition easier to audit against the schema map and would have caught the get_content_standards gap.

Notable that the [codex] PR shipped the test plan and root-cause section meticulously but skipped step 1 of the publishing flow — worth a changeset before merge.

Approving on the strength of the classifier shape plus the regression coverage. Land the changeset and the get_content_standards entry in the same follow-up.

@aao-ipr-bot aao-ipr-bot Bot left a comment

Right shape on the architectural question — AdCP 3.1 advisory errors[] on Success/Submitted arms needed task-aware terminal detection, and routing through isTerminalAdcpError(toolName) while keeping isAdcpError structural is the right witness-not-translator move. Two named coverage gaps before merge, plus follow-ups.

Things I checked

  • isOperationSuccess thread-through at all three call sites (TaskExecutor.ts:735, :869, :1519). Polling path passes status.taskType.
  • extractOperationError (TaskExecutor.ts:1028) correctly keeps the structural isAdcpError for message extraction — it only runs after the terminal classification already failed.
  • unwrapProtocolResponse early-exit at response-unwrapper.ts:256 now lets advisory-success payloads flow through schema validation instead of short-circuiting. The Zod schemas in response-schemas.ts model errors[] as co-optional with success fields (e.g. GetProductsResponseStrictSchema), so the validation still passes — confirmed by the new response-unwrapper.test.js advisory-success case.
  • 'submitted' + task_id discriminator matches the wire — test/fixtures/create_media_buy_async_submitted.yaml:149-151 pins snake_case task_id as the AdCP-payload-layer field.
  • Witness-not-translator preserved: errors[] flows through untouched, only the terminal-vs-advisory classification changed. No fabrication.

MUST FIX (one-line touches; safe to land in this PR)

  1. get_content_standards is missing from SUCCESS_PAYLOAD_FIELD_GROUPS_BY_TOOL. response-unwrapper.ts:97-158 lists every other tool from TOOL_RESPONSE_SCHEMAS but skips get_content_standards, which is present at response-schemas.ts:83 and whose Success arm requires standards_id (schemas.generated.ts:6691-6709). Without an entry, a get_content_standards response with (status: 'completed', standards_id: '...', errors: [advisory]) falls through hasAdvisorySuccessPayload, the errors[] branch trips, and the SDK treats advisory as terminal — the exact bug this PR exists to fix, leaking through for one tool. Add get_content_standards: [['standards_id']].
  2. status: 'canceled' is not in the terminal-status check. response-unwrapper.ts:176 and :192 check failed and rejected but not canceled. TaskStatusSchema (schemas.generated.ts:1274) lists canceled as a terminal envelope state alongside the other two. The envelope-level path in TaskExecutor.ts:822 handles canceled correctly so this doesn't bite in the common A2A flow, but a structured-content payload that surfaces status: 'canceled' with a matching field-group will currently be misclassified as advisory success. Add 'canceled' to both checks for symmetry with the rest of the SDK's terminal handling.
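
One way to keep the two status checks from drifting apart is a single shared set, sketched here. `TERMINAL_PAYLOAD_STATUSES` and `hasTerminalStatus` are hypothetical names, not identifiers from the SDK, and the sketch does not address tools whose domain-level status values might collide with envelope states.

```typescript
// Sketch only: one shared set so both checks stay in sync. Names are hypothetical.
const TERMINAL_PAYLOAD_STATUSES: ReadonlySet<string> = new Set([
  "failed",
  "rejected",
  "canceled",
]);

// True when the payload carries a status that should short-circuit to terminal,
// regardless of any success-evidence fields that are also present.
function hasTerminalStatus(payload: Record<string, unknown>): boolean {
  const status = payload["status"];
  return typeof status === "string" && TERMINAL_PAYLOAD_STATUSES.has(status);
}

const canceledWithEvidence = hasTerminalStatus({
  status: "canceled",
  media_buy_id: "mb_1",
  packages: [],
});
```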

Follow-ups (non-blocking — file as issues)

  • The hand-maintained field-groups table will drift. idempotency.ts:26 derives MUTATING_TASKS from the generated schemas at module-load — same problem, same solution. Most response schemas in schemas.generated.ts either compose .and(z.union([Success, Error])) (Success-arm required keys = field-group) or flatten everything optional with tool-specific required fields. Worth a follow-up to walk TOOL_RESPONSE_SCHEMAS and extract Success-arm required keys with a small override map for the flattened-shape tools. At minimum: a unit test that asserts Object.keys(SUCCESS_PAYLOAD_FIELD_GROUPS_BY_TOOL) ⊇ Object.keys(TOOL_RESPONSE_SCHEMAS) would have caught the get_content_standards miss automatically.
  • Polling path defeats the fix when taskType is 'unknown'. TaskExecutor.ts:104 / :127 normalize a missing task_type/taskType on tasks/get to the literal string 'unknown'. isOperationSuccess(status.result, 'unknown') then hits no field-group entry and falls back to terminal-on-any-errors[]. Pre-PR baseline behavior, so not a regression — but the fix silently doesn't apply on the polling path for any seller that omits task_type. Either document the requirement or thread the original taskName through the polling loop.
  • cancel_media_buy is a real tool (server/create-adcp-server.ts:2758) but isn't in TOOL_RESPONSE_SCHEMAS or the field-groups table. If/when it lands in schemas, add the entry too.
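
The superset assertion suggested in the first follow-up could look something like this. The two tables below are small stand-ins so the sketch is self-contained; in the repository they would be the real TOOL_RESPONSE_SCHEMAS and SUCCESS_PAYLOAD_FIELD_GROUPS_BY_TOOL, and `findUncoveredTools` is a hypothetical helper name.

```typescript
// Stand-in tables for illustration; the real ones live in the SDK.
const schemasStandIn: Record<string, unknown> = {
  get_products: {},
  create_media_buy: {},
  get_content_standards: {},
};
const fieldGroupsStandIn: Record<string, string[][]> = {
  get_products: [["products"], ["unchanged"]],
  create_media_buy: [["media_buy_id", "packages"]],
};

// Returns every tool that has a response schema but no field-group entry.
function findUncoveredTools(
  schemas: Record<string, unknown>,
  fieldGroups: Record<string, unknown>,
): string[] {
  return Object.keys(schemas)
    .filter((tool) => !Object.prototype.hasOwnProperty.call(fieldGroups, tool))
    .sort();
}

const uncovered = findUncoveredTools(schemasStandIn, fieldGroupsStandIn);
```

A unit test would then assert that the returned list is empty; extra field-group entries such as a forward-looking one do not fail the check, since only the schema-to-table direction is enforced.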

Minor nits (non-blocking)

  1. Changeset prose oversells the vendor_metric line. .changeset/advisory-errors-vendor-metrics.md reads "Accept AdCP 3.1 vendor_metric optimization goals in create_media_buy validation" as if a fix shipped — but the source diff has zero vendor_metric content. The only addition is the regression test at test/lib/request-validation.test.js:279-365, which verifies pre-existing schema behavior. Rewrite the prose to make clear this is regression coverage of existing acceptance, or drop the half-sentence entirely. Adopters scanning the changelog for the vendor-metric fix will land here and assume a code change shipped.
  2. taskId (camelCase) on the AdCP payload at response-unwrapper.ts:179 is speculative. The wire is snake_case task_id; the camelCase form is internal-only after TaskExecutor.ts:125 normalization, which runs after this helper. Either drop the taskId check or comment that it's a defensive belt-and-braces guard for legacy fixtures.
  3. validate_input in field-groups but not in TOOL_RESPONSE_SCHEMAS. One-line comment that the entry is forward-looking would save the next reader from "cleaning it up."
  4. Changeset type is patch but the PR adds two new public exports (isTerminalAdcpError, hasAdvisorySuccessPayload). New exports are conventionally minor. Non-blocking — repo precedent on bug-fix-adjacent helpers may differ — but worth a glance before the Release PR lands.

LGTM after the get_content_standards and canceled additions land. Both are one-line touches; safe to push into this PR rather than a follow-up.

@bokelley bokelley force-pushed the accept-vendor-metrics-errors branch from 23e0a53 to 281a6e0 May 26, 2026 10:49
@bokelley bokelley enabled auto-merge (squash) May 26, 2026 10:51
@bokelley bokelley closed this May 26, 2026
auto-merge was automatically disabled May 26, 2026 10:52

Pull request was closed

@bokelley bokelley reopened this May 26, 2026
@bokelley bokelley merged commit b8a08bb into main May 26, 2026
1 check passed
@bokelley bokelley deleted the accept-vendor-metrics-errors branch May 26, 2026 10:57