fix: distinguish task completion from operation success by bokelley · Pull Request #34 · adcontextprotocol/adcp-client

bokelley · 2025-10-06T10:48:01Z

Problem

When agents return status: completed with success: false in the response data, the library incorrectly reported the operation as successful. This caused silent failures:

Example: Wonderstruck Media Buy

Agent response:

{
  "success": false,
  "message": "Missing required parameters: ['product_ids', 'total_budget', 'flight_start_date', 'flight_end_date']"
}

Library returned:

{ success: true, status: 'completed', data: { success: false, ... } }

Result: Script printed "✅ SUCCESS! Media buy created" even though the operation failed and no media buy was created.

Root Cause

The library conflated two different concepts:

Task completion - Did the API call complete? (protocol level)
Operation success - Did the business operation succeed? (application level)

When A2A/MCP returns status: completed, it means "I successfully processed your request and returned a response." It does NOT mean "the operation succeeded."

Solution

Check response.data.success to determine actual operation success:

If data.success === false, set result.success = false and include error field
If data.success is undefined/missing, assume success (backward compatible)
Extract error message from data.message into result.error

Changes

Applied to all completion code paths in TaskExecutor.ts:

Immediate completion (ADCP_STATUS.COMPLETED)
Polling completion (waitForWorkingCompletion)
Task polling (pollTaskCompletion)
Default/unknown status handling

// Before
return { success: true, status: 'completed', data: completedData };

// After
const operationSuccess = completedData?.success !== false;
return {
  success: operationSuccess,
  status: 'completed',
  data: completedData,
  error: operationSuccess ? undefined : (completedData?.message || 'Operation failed')
};

Impact

Before:

✅ SUCCESS! Media buy created
🎉 Media buy successfully created on Wonderstruck!
   Your 300x250 creative is now running.

After:

❌ FAILED to create media buy

Error Details:
   Error: Missing required parameters: ['product_ids', 'total_budget', 'flight_start_date', 'flight_end_date']
   Status: completed
   Response Time: 75ms

Backward Compatibility

✅ Fully backward compatible - Agents that don't return a success field are unaffected. Only agents that explicitly return success: false will trigger the new failure handling.

Testing

Build passes
Tested with Wonderstruck agent - correctly reports failure
Error message properly surfaced to scripts
No changes to existing test expectations needed

🤖 Generated with Claude Code

## Problem When agents return `status: completed` with `success: false` in the response data, the library incorrectly reported the operation as successful. This caused silent failures where: - Task execution completed successfully - But the actual operation failed (validation error, rejection, etc.) - Library returned `success: true`, masking the failure Example: Wonderstruck agent returns: ```json { "success": false, "message": "Missing required parameters: ['product_ids', ...]" } ``` But library returned `success: true, status: 'completed'` ## Root Cause The library conflated two different concepts: 1. **Task completion** - Did the API call complete? (A2A/MCP level) 2. **Operation success** - Did the operation succeed? (Application level) ## Solution Check `response.data.success` to determine actual operation success: - If `data.success === false`, set `result.success = false` - If `data.success` is undefined/missing, assume success (backward compatible) - Include error message from `data.message` in `result.error` Applied to all completion paths: - Immediate completion (ADCP_STATUS.COMPLETED) - Polling completion (waitForWorkingCompletion) - Task polling (pollTaskCompletion) - Default/unknown status handling ## Impact - Scripts now correctly report failures: "❌ FAILED to create media buy" - Error messages surfaced: "Missing required parameters..." - Backward compatible: existing agents that don't return `success` field unaffected 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

gitguardian · 2025-10-08T02:42:31Z

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.

^{_{🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.}}

Block __proto__, constructor, and prototype keys in setPath() to prevent prototype-polluting assignments from crafted YAML paths. Resolves CodeQL alerts #33 and #34. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add storyboard-driven testing module with CLI Storyboards become the primary testing concept, replacing scenario-based comply retrofitting. Each YAML step maps directly to a SingleAgentClient method via a stateless execution engine designed for LLM consumption. - Add src/lib/testing/storyboard/ module (types, runner, loader, validations, context, task-map) - Add CLI: adcp storyboard {list, show, run, step} with --json output - Bundle 12 storyboard YAMLs from adcontextprotocol/adcp spec repo - Tag storyboards with platform_types for backwards compat with PlatformType - Add yaml as runtime dependency for YAML parsing Closes #423 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * style: format storyboard module with prettier Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: migrate comply() to storyboard-driven testing Storyboards are now the single testing surface for comply(). The compliance engine runs storyboard YAMLs instead of hand-written scenario functions, while maintaining the existing ComplianceResult interface for backwards compatibility. YAML format extensions: - expect_error: inverts pass/fail for negative testing - requires_tool: skip steps when agent lacks a tool - context_outputs/context_inputs: explicit data flow between steps - error_code validation: check error codes in error responses - track/required_tools: map storyboards to compliance tracks New compliance storyboards (10): - governance_property_lists, governance_content_standards - si_session, brand_rights, media_buy_state_machine - error_compliance, schema_validation, behavioral_analysis - audience_sync, deterministic_testing Deprecates SCENARIO_REQUIREMENTS, DEFAULT_SCENARIOS, testAllScenarios() in favor of storyboard execution. Old exports remain functional. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: guard setPath against prototype pollution Block __proto__, constructor, and prototype keys in setPath() to prevent prototype-polluting assignments from crafted YAML paths. Resolves CodeQL alerts #33 and #34. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add request builder for storyboard steps Build valid requests from discovered context (products, accounts, formats) instead of sending raw YAML sample_request payloads. Each task has a builder that mirrors how hand-written scenarios construct requests — selecting products with pricing options, building proper asset records, generating valid date ranges, etc. sample_request from YAML becomes documentation/fallback only. The runner priority is: --request override > request builder > sample_request. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: session init, schema registration, and validation paths - Initialize MCP session in runner before executing steps (fixes Streamable HTTP → SSE degradation after a few calls) - Close MCP connections after standalone storyboard/step runs - Register 20+ missing response schemas in TOOL_RESPONSE_SCHEMAS (sync_accounts, list_accounts, governance, SI, capabilities, etc.) - Fix get_media_buy_delivery validation path: media_buys → media_buy_deliveries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: bump MCP SDK to 1.29.0, retry session errors, fix validation paths - Bump @modelcontextprotocol/sdk from 1.27.1 to 1.29.0 - Retry StreamableHTTP on any StreamableHTTPError before falling back to SSE (workaround for SDK #1708 and #1852 — session expiry) - Fix sync_creatives validation path: results[0].action → creatives[0].action (matches actual SyncCreativesSuccess schema) Test agent media_buy_seller: 8/9 pass (only sync_governance fails due to missing spec schema — adcontextprotocol/adcp#1978) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: sync schemas and register sync_governance response schema Resync schemas from spec to pick up the new sync_governance response schema (adcontextprotocol/adcp#1978). Register it in TOOL_RESPONSE_SCHEMAS. Test agent media_buy_seller: 9/9 pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: regenerate types to include syncGovernance method Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use Object.defineProperty in setPath to satisfy CodeQL Replace bracket assignment with Object.defineProperty so CodeQL's static analysis recognizes the prototype pollution guard. The FORBIDDEN_KEYS check was already correct but CodeQL doesn't trace through Set.has(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: extract parsePath/resolvePath/setPath to shared path.ts Deduplicate parsePath (was in both context.ts and validations.ts). Move resolvePath and setPath to path.ts as the canonical location. Add isPlainObject type guard before Object.defineProperty in setPath. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: assign track and required_tools to all storyboards Every storyboard now has a track field mapping it to a ComplianceTrack, so comply() can discover and run storyboards for all 11 tracks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix: update release-please workflow to use config files

5661c68

bokelley force-pushed the bokelley/fix-media-buy-success-detection branch from e20a9cc to 5661c68 Compare October 8, 2025 02:45

chore: ignore test-local-agent.ts test script

d670ed1

bokelley merged commit 34b8d88 into main Oct 8, 2025
6 checks passed

This was referenced Oct 8, 2025

chore(main): release 0.3.1 #36

Closed

chore(main): release 0.4.1 #37

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: distinguish task completion from operation success#34

fix: distinguish task completion from operation success#34
bokelley merged 3 commits intomainfrom
bokelley/fix-media-buy-success-detection

bokelley commented Oct 6, 2025

Uh oh!

gitguardian bot commented Oct 8, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bokelley commented Oct 6, 2025

Problem

Example: Wonderstruck Media Buy

Root Cause

Solution

Changes

Impact

Backward Compatibility

Testing

Uh oh!

gitguardian bot commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

️✅ There are no secrets present in this pull request anymore.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

gitguardian bot commented Oct 8, 2025 •

edited

Loading