Skip to content

refactor: error-emission architecture base (PR 1 of 3)#1306

Open
ChrisHuie wants to merge 29 commits into
mainfrom
feature/error-emission-architecture-pr1
Open

refactor: error-emission architecture base (PR 1 of 3)#1306
ChrisHuie wants to merge 29 commits into
mainfrom
feature/error-emission-architecture-pr1

Conversation

@ChrisHuie
Copy link
Copy Markdown
Contributor

Summary

  • Establishes the AdCP spec 3.0.6 two-layer error envelope across MCP, A2A, and REST. _impl raises typed AdCPError; boundary translators run build_two_layer_error_envelope() so both
    adcp_error.code and errors[0].code populate — storyboard runners (@adcp/sdk@6.11.0+) no longer synthesize MCP_ERROR from missing layers.
  • 4 structural guards establish a shrink-only ratchet: 82 Pattern A Error(code=...) sites and 126 raise ValueError sites capped per-file. PR 2 drains the caps; PR 3 covers async/submitted
    lifecycle.
  • A2A audit-gap fix at adcp_a2a_server.py:1397_log_a2a_operation(success=False) now fires on AdCPError before re-raise. Behavior change: previously-silent failures reach audit log + Slack
    on merge.

Planning artifact: .claude/notes/error-emission-architecture/PLAN.md (8 decisions documented).

What's in this PR

Substrate (src/):

  • core/exceptions.pybuild_two_layer_error_envelope(); 7 typed subclasses (AdCPMediaBuyNotFoundError, AdCPPackageNotFoundError, AdCPCreativeRejectedError, AdCPBudgetExceededError,
    AdCPBudgetTooLowError, AdCPCapabilityNotSupportedError, AdCPProductUnavailableError); context param on AdCPError.
  • core/tool_error_logging.pyAdCPToolError(ToolError) carries envelope; MCP wire text is JSON-parseable.
  • core/context_manager.pyContextManager.fail_step(step_id, *, exc, error_message). Persisted workflow_step.response_data is byte-identical to wire response (same builder).
  • a2a_server/adcp_a2a_server.py_adcp_to_a2a_error embeds envelope in data; audit-gap fix.
  • app.py + routes/api_v1.py — REST boundary uses envelope.

Structural guards (tests/unit/):

  • test_architecture_no_error_construction_in_impl.py — 82 Pattern A sites across 11 files (cap shrink-only).
  • test_architecture_no_value_error_in_impl.py — 126 ValueError sites across 23 files.
  • test_architecture_error_envelope_two_layer.py — AST guard verifying each of 3 boundary translators calls the envelope builder.
  • _per_file_cap_guard.py — shared helper.

Drive-by hygiene (commit 2):

  • Pre-commit mypy hook adcp pin 3.2.04.3.0 (was producing phantom AccountReference / get_adcp_spec_version errors).
  • Drop 2 obsolete # type: ignore[arg-type] at src/app.py:44-45. Net src/ type:ignore count 6059.
  • Resync .type-ignore-baseline 4259 (was stale; main had drifted via PRs bypassing the hook).
  • Stale sync-creatives-request schema-mismatch allowlist entry fixed (accountaccount_id per adcp 4.3 spec evolution).

Coordination with in-flight PRs

Spec & storyboard compliance

Test plan

  • make quality clean (verified locally: 4481 passing, 19 xfailed, 0 failures).
  • ./run_all_tests.sh — all 5 suites pass.
  • Storyboard smoke: docker compose up -d then npx -y @adcp/sdk@6.11.0 storyboard run mcp http://localhost:8000/mcp/ --auth test-token --scenario media_buy_seller/invalid_transitions. Verify
    errors[0].code and adcp_error.code both surface real codes (MEDIA_BUY_NOT_FOUND, PACKAGE_NOT_FOUND, NOT_CANCELLABLE); zero MCP_ERROR synthesis.
  • Operations review for Slack volume jump from A2A audit-gap fix.
  • Cap counts unchanged (82 Pattern A, 126 ValueError).
  • Coverage doesn't decrease.

Sequence

PR Scope Status
PR 1 (this PR) Substrate + structural guards ✅ ready
PR 2 Pattern A cleanup sweep (~82 + 126 sites) upcoming
PR 3 Async/submitted lifecycle + Response3 envelope upcoming

@ChrisHuie ChrisHuie changed the title refactor: error-emission architecture substrate (PR 1/3) refactor: error-emission architecture base (PR 1/3) May 15, 2026
ChrisHuie added a commit that referenced this pull request May 15, 2026
… integration tests

Three CI failures from the PR 1 substrate commit (PR #1306):

1. Lint & Type Check (mypy): the two '# type: ignore[arg-type]' at
   src/app.py:44-45 are load-bearing in CI (full project venv sees the
   WSGIMiddleware vs Starlette.mount ASGI-callable mismatch) but appear
   unused in the pre-commit isolated mypy hook env (no starlette stubs).
   Restored with '[arg-type, unused-ignore]' so both environments are
   happy: CI suppresses the real arg-type error, pre-commit suppresses
   the resulting unused-ignore warning.

2. Integration (other): three TestRecoveryFieldInErrorResponses tests in
   tests/integration/test_error_paths.py still asserted body['recovery']
   (the pre-envelope flat shape); missed in the original wire-shape sweep.
   Updated to body['adcp_error']['recovery'] and body['errors'][0]['recovery'].

3. type-ignore-no-regression hook: baseline bumps 59 -> 61 to reflect the
   restoration of the two src/app.py type:ignores that should not have been
   removed in the substrate commit.

Integration (infra) showed a 'MCP server failed to start on port 45697
within 20s' error in test_update_media_buy_minimal; server log shows
Uvicorn DID start, just past the 20s timeout. Treating as CI flake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 15, 2026
The 20s readiness timeout in the integration test MCP server fixture has
been firing intermittently on PR #1306 (Integration (infra) job). CI logs
show Uvicorn DID start ('Uvicorn running on http://0.0.0.0:...') just past
the 20s threshold — the fixture raises RuntimeError before the listener
opens. Server startup is dominated by Python imports (fastmcp + adcp SDK +
project tree) plus FastAPI lifespan + DB pool warm-up; under CI load this
routinely takes 20-40s.

Bumping to 60s gives headroom for normal cold-boot while still failing
reasonably fast on a genuinely-broken server. Same rationale as
bf5fe3a (test-stack readiness deadline 120s -> 360s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 15, 2026
…or harness isinstance() checks

Integration (creative) failure on PR #1306:
test_strict_mode_missing_package_aborts asserted
``isinstance(result.error, AdCPNotFoundError)`` but the harness reconstructed
a base ``AdCPError`` because the new wire-translation maps the base
``NOT_FOUND`` code to ``INVALID_REQUEST``, which the harness's
``_CODE_TO_CLASS`` registry didn't recognize.

Two fixes:

1. _assignments.py:77 migrates from ``AdCPNotFoundError`` to
   ``AdCPPackageNotFoundError`` so the wire code is PACKAGE_NOT_FOUND
   (STANDARD, no translation needed). The class is a subclass of
   AdCPNotFoundError so ``isinstance()`` checks still pass on the
   reconstructed exception.

2. tests/harness/_base.py ``_CODE_TO_CLASS`` registry now includes the 7
   PR 1 substrate subclasses so the harness reconstructs the specific type
   after a transport roundtrip. Without this, ``PACKAGE_NOT_FOUND`` would
   have fallen back to base AdCPError and lost the isinstance specificity
   anyway.

Drive-by DRY fixes the duplication guard surfaced after this change:

- test_error_envelope.test_substrate_subclasses_present_with_standard_codes
  uses by-name getattr lookup against the exceptions module instead of an
  inline class list (which was duplicating the CANONICAL_ERROR_CODES set
  in test_error_format_consistency).
- test_architecture_no_error_construction_in_impl imports the shared
  ``_collect_error_aliases`` from test_architecture_error_code_compliance
  rather than redefining it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ChrisHuie ChrisHuie changed the title refactor: error-emission architecture base (PR 1/3) refactor: error-emission architecture base (PR 1 of 3) May 15, 2026
ChrisHuie added a commit that referenced this pull request May 16, 2026
… violation

Migrates the new raise site from raw ValueError to AdCPValidationError so the
transport boundary translates to the spec-compliant two-layer envelope after
PR #1307's narrowed except AdCPError catchall lands. Previous ValueError shape
was caught by inner (ValueError, PermissionError) and re-emitted via Pattern A
(Error(code=...) in _impl) — anti-pattern the error-emission architecture
eliminates.

Field path 'packages[].targeting_overlay.property_list' surfaces in the wire
error envelope's field attribute. Structured violations carried in details.

Note: context=req.context threading deferred until PR #1306's AdCPError
context parameter lands; will be added on rebase.

Note: includes a drive-by black/ruff format reconciliation on an unrelated
type annotation at line 2729 (pre-existing project tooling disagreement
documented under feedback_black_ruff_format_disagreement).

Refs: .claude/notes/inventory-targeting/PLAN.md A1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 17, 2026
Applies the mandatory + important + suggested-additive items from the
post-merge review on PR #1306. Five reviewer items land in this commit;
three larger items (N2 FIXME at 208 sites, B5 storyboard context echo,
H10 per-transport integration test) are surfaced as separate follow-ups.

B1 (mandatory) — recovery defaults on typed subclasses:
- AdCPMediaBuyNotFoundError, AdCPPackageNotFoundError,
  AdCPAccountPaymentRequiredError, AdCPGoneError: override class default
  to recovery="correctable". The buyer holds the lever for recovery in
  each case (re-issue with right id, settle outstanding balance,
  reference a fresh resource). Previously these inherited their parent's
  terminal default, creating per-site override fatigue across PR 2
  cleanup. Test names + assertions for the base AdCPNotFoundError default
  are explicit about the base-vs-subclass split. New tests cover the
  subclass overrides.
- AdCPCapabilityNotSupportedError carries a docstring documenting the
  intentional SDK divergence (spec classifies UNSUPPORTED_FEATURE as
  terminal; we keep correctable because the buyer can drop the
  unsupported feature).

B4 (mandatory) — A2A explicit-skill emits failed Task + DataPart:
- _handle_explicit_skill no longer translates AdCPError to a JSON-RPC
  A2AError. Typed exception propagates so the explicit-skill dispatcher
  loop wraps the two-layer envelope into the Task's artifact DataPart.
  The standalone _adcp_to_a2a_error helper remains for paths that
  genuinely want a JSON-RPC error shape.
- Audit logging still fires before propagation (audit-gap fix preserved).
- Without this, the PR's own PLAN-cited invalid_transitions storyboard
  scenario failed on A2A: AdCP-level errors were elevated to transport
  failures instead of being represented as failed Tasks.
- Updated test_a2a_server_error_carries_recovery + the A2A boundary
  translation suite to reflect the new contract: tests check propagation
  through _handle_explicit_skill and exercise _adcp_to_a2a_error directly
  for the translation contract.

N3 (suggested) — context threading through serializers:
- AdCPError.to_dict() and to_adcp_error() now include the optional
  ``context`` (model_dumped to JSON when it's a ContextObject). Closes
  the architectural asymmetry where wire envelopes carried context but
  the dict-based serializers dropped it.

B2/B3 (suggested) — cap-dict documentation:
- PATTERN_A_PER_FILE_CAP comment block clarifies entries are migration
  targets (FIXME #1304); legitimate floors use the # noqa marker
  (4 known advisory sites per PR 2 plan).
- VALUE_ERROR_PER_FILE_CAP documents the boundary-vs-internal split.
  Internal contracts stay as ValueError; PR 2 migrates only the boundary
  set.

Pyproject — pin adcp ~=4.3:
- Compatible-release pin so 4.4+ regressions can't sneak in silently.
  Replaces the open >=4.3.0 floor. Bump deliberately.

Pending follow-ups (separate commits):
- N2: # FIXME(#1304) comments at the 82 Pattern A + 126 ValueError sites
  tracked by the structural guards (single sed-like commit; travels
  separately to keep this review-round focused).
- B5: pass context=req.context at the 3 raise sites the storyboard smoke
  test (media_buy_seller/invalid_transitions) exercises.
- H10: tests/integration/test_error_envelope_two_layer_per_transport.py.

Deferred to follow-up issues:
- A2A non-skill paths still produce no envelope (N5)
- A2A audit fix only covers AdCPError branch (N6)
- Installed adcp SDK advertises spec 3.0.1 (N7) — pre-existing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 17, 2026
…elope contract

CI on PR #1306 caught 5 test failures introduced by the B4 change in
51b666a (a2a `_handle_explicit_skill` now propagates AdCPError rather
than translating to a JSON-RPC A2AError). The production code change was
correct; three downstream contracts were not updated to match.

Harness regression (2 tests):
- tests/harness/_base.py: `_run_a2a_handler` blindly parsed every Task's
  first artifact as a success-shape `response_cls(**artifact_data)`. With
  the new contract, failed Tasks carry the two-layer envelope as the
  DataPart — parsing them as a success response produced a pydantic
  ValidationError, which then failed `result.error.error_code` assertions
  with `AttributeError`.
- Extracted `_envelope_to_adcp_error(envelope, fallback_message)` helper
  shared by `_unwrap_a2a_server_error` and the new failed-Task branch.
  When `task.status.state == TASK_STATE_FAILED`, reconstruct the typed
  AdCPError subclass from the envelope DataPart and raise it, so callers
  catching `AdCPAuthenticationError` (or asserting `result.error.error_code`)
  see the same type they used to see when A2AError was raised.

Integration test contract drift (3 tests):
- tests/integration/test_a2a_brand_manifest.py::
  test_get_products_neither_brief_nor_brand_rejected — was asserting
  `pytest.raises(A2AError)` from `handler.on_message_send`. After B4 the
  same path returns a failed Task with the envelope DataPart. Updated to
  assert task state + envelope shape (`adcp_error.code == VALIDATION_ERROR`,
  `errors[0].code == VALIDATION_ERROR`).
- tests/integration/test_a2a_error_responses.py::
  TestA2AErrorResponseStructure::test_adcp_error_carries_recovery_through_a2a_boundary
  + test_custom_recovery_override_preserved_through_a2a — were asserting
  `pytest.raises(A2AError)` directly on `_handle_explicit_skill`. After B4
  the typed AdCPError propagates instead. Updated to assert
  `pytest.raises(AdCPAdapterError | AdCPValidationError | AdCPNotFoundError)`
  with `error.recovery` and round-trip through `build_two_layer_error_envelope`
  to keep the envelope-shape coverage.

Verified locally before commit:
- 5 originally-failing CI tests now pass
- 34 tests across the 4 affected files pass
- 583 / 583 Integration (infra) pass (previously 3 failed)
- 528 / 528 Integration (creative) non-live-agent tests pass (previously 2 failed)
- 140 unit tests in error-substrate suites pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 17, 2026
CI Unit Tests on PR #1306 caught two schema-vs-library mismatches on
`get-media-buy-delivery-request.json`:

  - `time_granularity` — per-window slice granularity field
  - `include_window_breakdown` — windowed pull breakdown flag

Both are recent additions to the live AdCP spec (adcontextprotocol.org)
that the adcp Python library has not yet declared on
`GetMediaBuyDeliveryRequest`. The Pydantic model uses `extra="forbid"`
in non-production mode, so `test_model_accepts_all_schema_fields` and
`test_field_names_match_schema` rejected the spec-only fields.

Both tests already consult `KNOWN_SCHEMA_LIBRARY_MISMATCHES`, an
allowlist designed precisely for this drift window between live-spec
updates and library catch-up. Adding these two fields keeps the suite
green until the library publishes them, with FIXME(salesagent-amkf)
in the module docstring already tracking allowlist cleanup.

Verified locally: both parametrized tests pass after the allowlist
update; the local schema cache was deleted before the run so the
test path matches CI (fresh fetch from the live spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ChrisHuie ChrisHuie marked this pull request as ready for review May 17, 2026 22:48
ChrisHuie added a commit that referenced this pull request May 18, 2026
CI Unit Tests on PR #1306 caught two schema-vs-library mismatches on
`get-media-buy-delivery-request.json`:

  - `time_granularity` — per-window slice granularity field
  - `include_window_breakdown` — windowed pull breakdown flag

Both are recent additions to the live AdCP spec (adcontextprotocol.org)
that the adcp Python library has not yet declared on
`GetMediaBuyDeliveryRequest`. The Pydantic model uses `extra="forbid"`
in non-production mode, so `test_model_accepts_all_schema_fields` and
`test_field_names_match_schema` rejected the spec-only fields.

Both tests already consult `KNOWN_SCHEMA_LIBRARY_MISMATCHES`, an
allowlist designed precisely for this drift window between live-spec
updates and library catch-up. Adding these two fields keeps the suite
green until the library publishes them, with FIXME(salesagent-amkf)
in the module docstring already tracking allowlist cleanup.

Verified locally: both parametrized tests pass after the allowlist
update; the local schema cache was deleted before the run so the
test path matches CI (fresh fetch from the live spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@KonstantinMirin KonstantinMirin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design is right, the execution isn't there yet.

The core idea — typed exceptions, build_two_layer_error_envelope as single source of truth, AST guards enforcing it — is exactly what we need. But this is PR 1 of 3, the foundation everything else stacks on. It has to be airtight, and right now it's not.

The PR contradicts its own architecture in several places:

  1. The A2A explicit-skill loop adds a proper except AdCPError handler with the two-layer envelope (line 629), then the except Exception catchall directly below it (line 638) produces {"error": str(e)} — a flat dict with no envelope at all. The architecture establishes the envelope as canonical, and the very code that introduces it has a fallthrough that ignores it.

  2. _handle_tool_error was refactored to use the envelope builder, but still hardcodes status_code=500 for every error. A VALIDATION_ERROR (400) or AUTH_REQUIRED (401) gets served as 500 through REST. adcp_error_handler in app.py does this correctly — the pattern exists, it just wasn't applied here.

  3. _envelope_to_adcp_error was extracted as the shared helper for reconstructing errors from envelopes. parse_rest_error — in the same file, modified in the same PR — inlines the same three-tier fallback logic instead of calling it. This is the kind of duplication we've been actively eliminating.

  4. Context serialization appears three different ways in exceptions.py: to_dict(), to_adcp_error(), and build_two_layer_error_envelope() each serialize self.context with different behavior (exclude_none=True in one, not the others). One module, one PR, three divergent implementations of the same operation.

The three transport boundaries aren't symmetric:

The structural guard verifies all three transports call build_two_layer_error_envelope. But around that call, each boundary does something different:

  • MCP catches ValueError/PermissionError and translates them. A2A does too. REST doesn't.
  • MCP logs to activity feed + audit. A2A logs to audit only. REST logs to neither.
  • A2A uses logger.error for business validation failures. MCP uses logger.warning for the same errors.

For an architecture PR that's supposed to make error handling consistent across transports, this is a significant gap.

What I'd like to see before this is done:

  • The A2A catchall should produce an envelope (or convert to AdCPError and re-raise into the handler above it)
  • _handle_tool_error should use the actual status code, not hardcode 500
  • parse_rest_error should call _envelope_to_adcp_error
  • Context serialization should be one method, not three expressions
  • The three transport boundaries should handle the same exception types, log at the same level, and log to the same destinations — or the asymmetry should be documented as intentional with a reason

The structural guards and the per-file cap ratchet are well done. The exception hierarchy design is clean. This just needs another pass on the execution to match the quality of the design.

ChrisHuie added a commit that referenced this pull request May 18, 2026
CI Unit Tests on PR #1306 caught two schema-vs-library mismatches on
`get-media-buy-delivery-request.json`:

  - `time_granularity` — per-window slice granularity field
  - `include_window_breakdown` — windowed pull breakdown flag

Both are recent additions to the live AdCP spec (adcontextprotocol.org)
that the adcp Python library has not yet declared on
`GetMediaBuyDeliveryRequest`. The Pydantic model uses `extra="forbid"`
in non-production mode, so `test_model_accepts_all_schema_fields` and
`test_field_names_match_schema` rejected the spec-only fields.

Both tests already consult `KNOWN_SCHEMA_LIBRARY_MISMATCHES`, an
allowlist designed precisely for this drift window between live-spec
updates and library catch-up. Adding these two fields keeps the suite
green until the library publishes them, with FIXME(salesagent-amkf)
in the module docstring already tracking allowlist cleanup.

Verified locally: both parametrized tests pass after the allowlist
update; the local schema cache was deleted before the run so the
test path matches CI (fresh fetch from the live spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 18, 2026
Applies the five reviewer items from round 3. Each fix lands with a
dedicated unit-test class so the contract is locked down at the boundary.

Item 4 — _serialize_context() helper (foundational):
- Extract shared serialization for AdCPError.context into a single
  ``_serialize_context()`` helper so to_dict(), to_adcp_error(), and
  build_two_layer_error_envelope() emit byte-identical context payloads.
  Canonical form: isinstance(dict) check, shallow-copy dicts (aliasing
  protection), and model_dump(mode="json", exclude_none=True).
- Three new envelope-test invariants: aliasing protection, exclude_none
  semantics, and three-path consistency.

Item 2 — _handle_tool_error preserves status_code:
- Thread status_code from the source AdCPError through AdCPToolError to
  the REST defensive ToolError catch. Previously hardcoded to 500, which
  mislabeled 4xx errors as 5xx on this path (and silently broke buyer
  agent retry classification for VALIDATION_ERROR / AUTH_REQUIRED /
  *_NOT_FOUND / *_EXCEEDED).
- Six new tests verifying status_code propagation for 400/401/404/422/502
  plus the plain-ToolError fallback to 500.

Item 1 — A2A bare-Exception produces envelope:
- The explicit-skill dispatcher's ``except Exception`` branch previously
  appended a flat {"error": str(e)} into results, producing an artifact
  DataPart that storyboard runners would synthesize as ``MCP_ERROR``.
  Now wraps the untyped exception in a synthetic AdCPError (translated
  via wire_error_code to SERVICE_UNAVAILABLE) so every failure path emits
  the same two-layer envelope shape as the typed-AdCPError branch.
- DRY: extracted shared ``_build_failed_skill_result`` helper so the
  typed and untyped branches don't duplicate envelope construction.
- Four new tests on the helper covering typed/untyped/empty-message/
  envelope-shape parity.

Item 3 — parse_rest_error delegates to _envelope_to_adcp_error:
- The harness's REST error reconstruction duplicated envelope→exception
  parsing already implemented by ``_envelope_to_adcp_error`` (used by the
  A2A unwrapper). Now delegates; ``_envelope_to_adcp_error`` upgraded to
  also extract details from the envelope. HTTP-status fallback retained
  for unstructured bodies.
- Three new tests verifying REST and A2A reconstruction produce
  byte-identical AdCPError subclasses for the same envelope input.

Item 5 — REST symmetric handlers for ValueError + PermissionError:
- MCP's _translate_to_tool_error and A2A's dispatcher both catch raw
  ValueError → AdCPValidationError envelope and raw PermissionError →
  AdCPAuthorizationError envelope. REST previously fell through to a 500
  server error for both. Added @app.exception_handler(ValueError) and
  @app.exception_handler(PermissionError) so all three transports treat
  these exceptions identically. Verified FastAPI's RequestValidationError
  (separate class, not a ValueError subclass) is not shadowed.
- Extracted shared ``_envelope_response()`` helper so the three handlers
  share the same JSONResponse construction path. Updated the architecture
  guard ``test_rest_boundary_uses_envelope`` to accept 1-level transitive
  calls through in-module helpers, preserving the guard's intent without
  forbidding DRY refactors.
- Three new tests verifying 400/403 envelopes + the RequestValidationError
  isolation invariant.

Verification:
- make quality: 4506 unit tests pass.
- tox -e integration: 1880 tests pass (28 deselected — external-network
  ``@pytest.mark.skip_ci`` tests excluded by CI marker, unrelated to
  these changes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@KonstantinMirin KonstantinMirin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #1306 — Error Emission Architecture

Overall this is a well-designed architecture. build_two_layer_error_envelope() as a single source of truth enforced by AST structural guards, cross-transport roundtrip tests across all 11 subclasses, DRY-extracted per-file ratchet guards, and unified reconstruction via _envelope_to_adcp_error. Strong work.


Blockers

1. A2A _handle_explicit_skill ValueError/PermissionError asymmetry

src/a2a_server/adcp_a2a_server.py:1475-1489

AdCPError propagates to the dispatcher (-> failed Task with envelope DataPart), but ValueError and PermissionError are translated to JSON-RPC A2AError via _adcp_to_a2a_error(). Same semantic error class, different wire outcome. MCP and REST both treat these identically through the envelope path.

This contradicts the PR's own B4 contract: "AdCP-domain errors are async-task failures, not JSON-RPC transport errors."

Fix — wrap and re-raise instead of calling _adcp_to_a2a_error:

except ValueError as e:
    raise AdCPValidationError(str(e)) from e
except PermissionError as e:
    raise AdCPAuthorizationError(str(e)) from e

The outer dispatcher already handles both via _build_failed_skill_result. The catch-all except Exception can also be removed — the dispatcher's own except Exception covers it.

Flagged independently by 3 of 5 review agents — strongest cross-agent signal.

2. DRY violation: with_error_logging async/sync wrappers

src/core/tool_error_logging.py:236-253 vs :262-279

The 15-line exception-handling block (context-extraction loop over args/kwargs, _extract_tenant_and_principal, _log_tool_error, _translate_to_tool_error) is copy-pasted verbatim. Only difference: await tool_func(...) vs tool_func(...).

Fix — extract shared helper:

def _handle_tool_exception(tool_func, e, args, kwargs) -> NoReturn:
    context = None
    for arg in args:
        if isinstance(arg, FastMCPContext) or hasattr(arg, "tenant_id"):
            context = arg
            break
    for v in kwargs.values():
        if isinstance(v, FastMCPContext) or hasattr(v, "tenant_id"):
            context = v
            break
    tenant_id, principal_id = _extract_tenant_and_principal(context) if context else (None, None)
    _log_tool_error(tool_func.__name__, e, tenant_id, principal_id)
    _translate_to_tool_error(e)

Major

3. _handle_tool_error plain-ToolError fallback always returns HTTP 500

src/routes/api_v1.py:66-71

synthetic = AdCPError(error_message) uses status_code=500 (class default). Only error_code is overridden, not status_code. A plain ToolError("VALIDATION_ERROR", ...) would produce HTTP 500 + VALIDATION_ERROR — the status/code disagree and a buyer agent may retry a correctable error.

Limited blast radius since AdCPToolError now handles the primary path, but the mismatch is real for any plain ToolError that reaches this fallback.

Fix: Add a simple ERROR_CODE_TO_STATUS lookup for known codes (VALIDATION_ERROR->400, AUTH_REQUIRED->401, NOT_FOUND->404).

4. to_adcp_error() context location differs from envelope builder

src/core/exceptions.py:220-242

to_adcp_error() puts context in details["context"] (nested), while build_two_layer_error_envelope() puts it at top-level context. The test explicitly asserts this asymmetry. Not a bug — but now that the envelope builder is canonical, to_adcp_error() is effectively legacy.

Fix: Add deprecation note to to_adcp_error() docstring.

5. Lazy import + redundant recovery="correctable" in _assignments.py

src/core/tools/creatives/_assignments.py:77-82

AdCPValidationError is already imported at top level from the same module — no circular dependency risk. The explicit recovery="correctable" is redundant (it's the class default) and would silently prevent a future default change from taking effect here.

Fix: Move import to top level, drop the kwarg.

6. REST routes still catch ToolError defensively

src/routes/api_v1.py — every endpoint has try/except ToolError. Both paths produce the same wire shape, but the dual-path is unnecessary now that the app-level AdCPError handler exists. New routes must copy the pattern for no reason.

7. Test precision gaps

  • test_error_format_consistency.py:28-60 — catches (AdCPValidationError, ToolError) with only assert len(str(error)) > 0. Now that boundary behavior is architecturally defined, pin the expected type.
  • test_error_paths.py:356-399 — bare except Exception: pass makes the test vacuous. The targeted TestImportValidation tests already cover the import regression this was guarding.

8. Envelope assertion helpers duplicated across test files

_assert_two_layer_envelope (test_adcp_exceptions.py) and _assert_rest_envelope / _assert_mcp_envelope (test_error_boundary_translation.py) verify the same shape. A spec change to the envelope requires updating 3 helpers. Extract shared assert_envelope_shape() into a test helper module.


Nits

File Note
tool_error_logging.py:141,158,174 f-string logging — prefer %s formatting
tests/harness/_base.py:73 _CODE_TO_CLASS dict relies on insertion order for AUTH_REQUIRED — consider explicit override
test_architecture_no_error_construction_in_impl.py:68 Cross-file import from sibling guard — consider moving to shared module
exceptions.py:448-463 UNSUPPORTED_FEATURE intentional spec divergence — track with issue number
conftest.py:575 60s startup wait justified by CI evidence but root cause unaddressed

What's working well

  • Single source of truth for wire shape via build_two_layer_error_envelope(), enforced by AST structural guard with 1-level transitive call analysis
  • Cross-transport roundtrip tests across all 11 AdCPError subclasses through MCP, A2A, and REST
  • Per-file ratchet guards with DRY-extracted _per_file_cap_guard.py — caps-only-shrink is a great pattern
  • _envelope_to_adcp_error unifies REST and A2A reconstruction — proven byte-identical by roundtrip tests
  • AdCPToolError design carrying envelope + status_code eliminates 4xx-as-5xx mislabeling
  • ContextManager.fail_step byte-identical-by-construction invariant with matching test
  • _serialize_context shallow-copy defense against aliasing footguns, with explicit mutation test

ChrisHuie added a commit that referenced this pull request May 19, 2026
Five items from the steelmanned 2026-05-19 feedback round. Audit-trail
completeness on the async side + capability honesty + targeted error
messages on the approval path.

Item 1 — catalog_management=True → False (capability honesty, P1):
- ``capabilities.py:175`` was declaring True with comment "We have product
  catalog management" — but that's internal admin CRUD over the products
  table, NOT the spec's buyer-driven ``sync_catalogs`` task. AdCP binds
  the flag to ``SyncCatalogsRequest`` (account + catalogs[] +
  delete_missing + validation_mode); no such tool ships in src/.
- Honesty comment matches the property_list_filtering=False rationale —
  declaring True for an unimplemented capability lets buyers reach the
  boundary and discover UNSUPPORTED_FEATURE there instead of at discovery.
- ``test_get_adcp_capabilities.py:99`` schema-construction example updated
  to match production; behavioral test at
  ``test_impl_returns_full_response_with_tenant`` gains positive
  assertions for both property_list_filtering=False AND
  catalog_management=False so regression is caught at the wire.

Item 2 — webhook payload threads structured Error (P2):
- Round-5's try/except wrapper at media_buy_create.py:3694 and
  media_buy_update.py:1425-1438 fixed the workflow-step orphan but only
  set ``error_message``. ``_send_push_notifications`` at
  context_manager.py:715-726 emits ``step.response_data`` (not
  error_message) to push notification subscribers — async buyers were
  receiving status=failed with empty body, losing error code, field
  path, and recovery classification the sync caller got.
- New ``ContextManager.fail_workflow_step_for_exception(step_id, exc)``
  helper builds a one-element ``errors[]`` payload using the SDK's
  ``adcp_error()`` helper (same shape sync transport boundaries emit) and
  passes BOTH error_message AND response_data to update_workflow_step.
  Async and sync paths now at parity.

Item 3 — inner try/except prevents original-exception shadowing (P2):
- Same helper wraps the update_workflow_step call in try/except so a DB
  hiccup during audit can't replace the original AdCPError. Python's
  bare ``raise`` (in the caller) would otherwise pick up the audit-
  failure exception, and the buyer would see an unrelated DB error in
  place of the real validation failure. Audit failure logged for SRE
  visibility but swallowed so the caller's ``raise`` propagates cleanly.
- End-to-end test simulates the caller pattern and verifies the original
  exception identity reaches the test boundary, not the audit one.

Defensive wire-code enforcement in the helper: any source code that
falls outside ``STANDARD_ERROR_CODES`` after the
``wire_error_code`` translation gets a ``SERVICE_UNAVAILABLE`` fallback.
On this branch, ``INTERNAL_ERROR`` is in ``INTERNAL_CODES`` but not in
``ERROR_CODE_MAPPING`` (PR #1306 fills that gap for the sync paths);
the helper's fallback ensures async subscribers see spec-compliant
codes today.

Item 4 — vacuous sister assertion (P2):
- ``test_property_targeting_allowed_enforcement.py:191-193`` had the same
  ``isinstance(response, Error) or all(...)`` short-circuit pattern I
  fixed at line 157 in round 6 — codebase audit confirmed this was the
  only remaining site with the dangerous disjunction shape. Mirrors the
  line-157 split: separate ``not isinstance`` asserting success branch,
  follow-up ``all(...)`` asserting the rule doesn't fire.

Item 5 — approval-path narrowed exception with specific message (P2):
- ``media_buy_create.py:644-651`` calls ``Targeting(**targeting_raw)``;
  corrupt package_config rows surface from the outer ``except Exception``
  catch-all at :725-732 as opaque messages like
  ``"'str' object is not a mapping"``. Admin clicks Approve, sees
  something cryptic, has to grep the audit log to identify the failing
  package. Now narrow ``except (TypeError, ValidationError)`` with a
  targeting-specific message names the package and the offending field.
- Abort behavior preserved (NOT skip-and-continue) — execute_approved
  is mutating; silently dropping targeting would ship a buy without the
  buyer's intended targeting, worse than failing the approval.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 19, 2026
Round-4 batch addressing the 13 items from Konstantine's 2026-05-19 review.
Two blockers, six majors, three nits applied; two nits deferred per the
plan (f-string logging cosmetic, 60s startup wait root-cause).

Blocker 1 — A2A ValueError/PermissionError asymmetry:
- _handle_explicit_skill no longer translates raw ValueError/PermissionError
  to JSON-RPC InvalidParamsError/InvalidRequestError via _adcp_to_a2a_error.
  Both wrap-and-raise as typed AdCPValidationError/AdCPAuthorizationError so
  the outer dispatcher's `except AdCPError` branch produces a failed Task
  with a two-layer envelope — matching the contract every other transport
  already honors.
- Catch-all `except Exception` removed from _handle_explicit_skill; the
  outer dispatcher's `except Exception` covers untyped fallthrough and
  routes through _build_failed_skill_result for uniform envelope shape.
- Three new tests verify ValueError → AdCPValidationError, PermissionError →
  AdCPAuthorizationError, and untyped exception passthrough to the dispatcher.

Blocker 2 — DRY async/sync wrapper duplication in with_error_logging:
- Extracted shared _handle_tool_exception(tool_func, e, args, kwargs) helper.
  Async and sync wrappers each shrink to a 3-line try/except calling the
  shared helper. The 15-line copy-paste of context extraction + tenant/
  principal lookup + error logging + boundary translation is now single-
  source-of-truth.

Major 3 — _handle_tool_error plain-ToolError fallback returns correct status:
- Added _ERROR_CODE_TO_STATUS map (22 entries, mirrors per-class status_code
  declarations in src/core/exceptions). Plain ToolError("CODE", "msg") legacy
  paths now resolve VALIDATION_ERROR→400, AUTH_REQUIRED→401, NOT_FOUND→404,
  etc., rather than always defaulting to 500. Unknown codes still fall
  through to 500. Four new tests verify the map application + fallback.

Major 4 — to_adcp_error() deprecation note:
- Added explicit deprecation note in the docstring. The method is effectively
  legacy now that build_two_layer_error_envelope() is the canonical wire-
  shape producer. The asymmetry (context nested under details[] here vs
  top-level in the envelope) is intentional and called out so callers can
  prefer the envelope builder for new code paths.

Major 5 — _assignments.py lazy import + redundant recovery kwarg:
- Moved AdCPPackageNotFoundError to module-level imports alongside
  AdCPValidationError (no circular-dependency risk). Dropped the redundant
  `recovery="correctable"` kwarg (it's the class default; passing it
  explicitly would silently override a future default change).

Major 6 — Global @app.exception_handler(ToolError):
- Added global ToolError handler in src/app.py delegating to _handle_tool_error.
  All 12 per-route `try/except ToolError` blocks removed from src/routes/api_v1.py.
  AdCPToolError (subclass) matched by the handler, so the typed envelope path
  works end-to-end via the global handler. Two new tests verify the global-
  handler path through the FastAPI test client.

Major 7 — Test precision gaps:
- test_error_format_consistency: vacuous `(AdCPValidationError, ToolError)`
  union catch with only `assert len(str(error)) > 0` refactored to call
  _create_media_buy_impl directly and pin to AdCPValidationError, asserting
  error_code + message. The Pydantic-validation sister test is now pinned
  to ValidationError at the schema layer where it actually fires.
- test_error_paths: bare `except Exception: pass` tightened to typed
  (AdCPError, ValidationError, ValueError) acceptance. The synchronous
  sync_creatives_raw call now asserts that the response carries a
  CreativeAction.failed entry — silently accepting RuntimeError or NameError
  is no longer possible.

Major 8 — Envelope assertion helper unification:
- New tests/helpers/envelope_assertions.assert_envelope_shape() is the
  single source of truth for the two-layer envelope shape. The four
  pre-existing helpers (_assert_two_layer_envelope, _assert_mcp_envelope,
  _assert_a2a_envelope, _assert_rest_envelope) now delegate to it. A spec
  change to the envelope is a one-place update.

Nit 2 — _CODE_TO_CLASS insertion-order dependency:
- AdCPAuthenticationError now pinned explicitly via post-comprehension
  assignment. AUTH_REQUIRED disambiguation no longer depends on dict-
  comprehension order.

Nit 3 — Cross-test-file import:
- Moved collect_error_aliases AST helper from test_architecture_error_code_
  compliance.py to tests/unit/_ast_helpers (shared with the
  no_error_construction_in_impl guard). Sibling guards no longer reach into
  each other's modules.

Nit 4 — UNSUPPORTED_FEATURE intentional spec divergence:
- Docstring expanded with explicit "Intentional spec divergence" note
  including the revisit condition (SDK runtime enforcing terminal). Per
  project convention no inline issue ref — the docstring is the canonical
  record.

Deferred (per plan):
- N1 (f-string logging cosmetic)
- N5 (60s startup wait root cause — separate scope)

Verification: make quality green (4512 unit + format + lint + mypy);
focused integration runs on test_a2a_error_responses.py + test_error_paths.py
(25 tests passed).
ChrisHuie added a commit that referenced this pull request May 20, 2026
CI Unit Tests on PR #1306 caught two schema-vs-library mismatches on
`get-media-buy-delivery-request.json`:

  - `time_granularity` — per-window slice granularity field
  - `include_window_breakdown` — windowed pull breakdown flag

Both are recent additions to the live AdCP spec (adcontextprotocol.org)
that the adcp Python library has not yet declared on
`GetMediaBuyDeliveryRequest`. The Pydantic model uses `extra="forbid"`
in non-production mode, so `test_model_accepts_all_schema_fields` and
`test_field_names_match_schema` rejected the spec-only fields.

Both tests already consult `KNOWN_SCHEMA_LIBRARY_MISMATCHES`, an
allowlist designed precisely for this drift window between live-spec
updates and library catch-up. Adding these two fields keeps the suite
green until the library publishes them, with FIXME(salesagent-amkf)
in the module docstring already tracking allowlist cleanup.

Verified locally: both parametrized tests pass after the allowlist
update; the local schema cache was deleted before the run so the
test path matches CI (fresh fetch from the live spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 20, 2026
…er + strip inline issue refs

Pre-handoff Pattern sweep on #1313 caught two repeat patterns:

DRY (4-adapter copy-paste):
- Each of Broadstreet/Mock/Triton/Xandr had identical 4-line block:
    property_list_error = self._check_property_list_supported(packages)
    if property_list_error is not None:
        self.log(f"[red]Error: {property_list_error.errors[0].message}[/red]")
        return property_list_error
- Extracted to base-class helper `_reject_property_list_if_unsupported(packages)`
  that does the check + log + returns error-or-None. Each adapter now uses:
    if err := self._reject_property_list_if_unsupported(packages):
        return err
- The granular `_check_property_list_supported` remains for callers that need
  the raw error without the audit-log side effect.

Inline issue refs (project convention feedback_no_issue_refs_in_comments):
- 6 sites referencing `AdCP #1302 contract 5` rephrased to
  `AdCP honest-declaration contract`
- 3 sites in capabilities.py referencing `#1247 gaps`, `PRs #1306/#1307`
  rewritten to convey the same context without inline refs
- 1 site in media_buy_update.py FIXME referencing `PR #1307 sub-batch 3`
  rephrased to "broader Pattern A migration"
- 1 site in test_property_targeting_allowed_enforcement.py rephrased
- 1 site in test_pydantic_schema_alignment.py allowlist comment rephrased

Verification: 10 adapter property_list tests pass; 4461 unit tests total
pass (excluding pre-existing get-media-buy-delivery schema drift).
ChrisHuie added a commit that referenced this pull request May 20, 2026
…rectly

Pre-handoff Pattern B sweep on #1315 caught a lazy import of
AdCPAdapterError inside an `except Exception` block at products.py:425.
Same M5/Pattern B pattern Konstantine flagged on #1306 and #1276 R8.

Two improvements:
1. Hoist AdCPAdapterError to top-level import group; no circular dep
   with src.core.exceptions (which doesn't import src.core.tools).
2. Replace `except Exception as e: if isinstance(e, AdCPAdapterError):
   raise` with direct `except AdCPAdapterError: raise` followed by
   `except Exception`. Cleaner intent — typed-exception passthrough is
   the explicit case, not a runtime isinstance guard inside a catch-all.

Allowlist line-number update for test_architecture_no_model_dump_in_impl
(products.py:623 → 628) because adding the import block shifted the
logged-filters model_dump call site by 5 lines.

Verification: 4505 unit tests pass (excluding pre-existing get-media-buy-
delivery schema drift unrelated to this change).
ChrisHuie added a commit that referenced this pull request May 20, 2026
Pre-handoff Pattern B sweep on #1307 caught the same lazy-import pattern
Konstantine flagged on #1306 R8 (M2/M3). Sister PR to #1306 in the same
error-emission architecture series — fixing pre-emptively so the first
review doesn't repeat known feedback.

Hoists:
- src/a2a_server/adcp_a2a_server.py:152 `build_two_layer_error_envelope`
- src/app.py:119 `build_two_layer_error_envelope`
- src/core/tool_error_logging.py:93, 182 `AdCPError`/`AdCPAuthorizationError`/
  `AdCPValidationError`/`build_two_layer_error_envelope`
- src/core/tools/creative_formats.py:137 `AdCPServiceUnavailableError`
- src/core/tools/creatives/_assignments.py:80 `AdCPPackageNotFoundError`
- src/core/tools/media_buy_list.py:258 `AdCPValidationError`
- src/routes/api_v1.py:59 `AdCPError`/`build_two_layer_error_envelope`/
  `AdCPToolError`/`extract_error_info`

Justification: src.core.exceptions has no imports from src.core.tools or
src.a2a_server or src.routes; no circular dep exists. All hoists are safe.

Verification: 4459 unit tests pass (1 pre-existing schema-drift failure
on get-media-buy-delivery-request matches the #1330 issue, unrelated).
ChrisHuie added a commit that referenced this pull request May 20, 2026
…er + strip inline issue refs

Mirror of #1313/#1314 pre-handoff cleanup, applied to #1315 (b5 faithful
property intersection) to close the same sister-PR pattern gap that's
been recurring.

DRY (4-adapter copy-paste): identical 4-line block at Broadstreet/Mock/
Triton/Xandr extracted to base-class _reject_property_list_if_unsupported.

Inline issue refs: 14 sites stripped across src/adapters/, src/core/tools/,
tests/. "AdCP #1302 contract 5" rephrased to "AdCP honest-declaration
contract"; #1247/#1306/#1307 refs rewritten without inline numbers.

Verification: 4505 unit tests pass.
ChrisHuie added a commit that referenced this pull request May 20, 2026
CI Unit Tests on PR #1306 caught two schema-vs-library mismatches on
`get-media-buy-delivery-request.json`:

  - `time_granularity` — per-window slice granularity field
  - `include_window_breakdown` — windowed pull breakdown flag

Both are recent additions to the live AdCP spec (adcontextprotocol.org)
that the adcp Python library has not yet declared on
`GetMediaBuyDeliveryRequest`. The Pydantic model uses `extra="forbid"`
in non-production mode, so `test_model_accepts_all_schema_fields` and
`test_field_names_match_schema` rejected the spec-only fields.

Both tests already consult `KNOWN_SCHEMA_LIBRARY_MISMATCHES`, an
allowlist designed precisely for this drift window between live-spec
updates and library catch-up. Adding these two fields keeps the suite
green until the library publishes them, with FIXME(salesagent-amkf)
in the module docstring already tracking allowlist cleanup.

Verified locally: both parametrized tests pass after the allowlist
update; the local schema cache was deleted before the run so the
test path matches CI (fresh fetch from the live spec).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 20, 2026
…er + strip inline issue refs

Mirror of #1313's pre-handoff cleanup, applied to #1314 (b3 kevel branch)
to keep sister PRs from getting flagged on the same patterns Konstantine
already flagged once. Identical fix surface to #1313 because the 4-adapter
honest-declaration content is the same across the b4/b3/b5 trilogy.

DRY (4-adapter copy-paste):
- Same identical 4-line block (property_list_error = ... + log + return)
  at Broadstreet/Mock/Triton/Xandr. Extracted to base-class
  _reject_property_list_if_unsupported(packages) which does check + log
  + returns error-or-None. Call sites collapse to:
    if err := self._reject_property_list_if_unsupported(packages):
        return err
- Granular _check_property_list_supported remains for callers that need
  the raw error without the log side-effect.

Inline issue refs (project convention feedback_no_issue_refs_in_comments):
- 14 sites stripped across src/adapters/, src/core/tools/, tests/.
- "AdCP #1302 contract 5" → "AdCP honest-declaration contract".
- "#1247"/"#1306"/"#1307" PR/issue refs rewritten to convey context
  without inline numbers.

Verification: 20 adapter+kevel property_list tests pass; 4491 unit tests
total pass (excluding pre-existing get-media-buy-delivery schema drift).
Copy link
Copy Markdown
Collaborator

@KonstantinMirin KonstantinMirin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: PR #1306 — Final Round

All prior feedback (2 blockers + 6 majors) verified as resolved. The two-layer envelope architecture is solid — single source of truth via build_two_layer_error_envelope(), cross-transport symmetry fully achieved, DRY agent found zero violations and documented 9 improvements.

Four remaining items.

Major

1. _ERROR_CODE_TO_STATUS has phantom entries and missing wire codes

src/routes/api_v1.py:57-80

Three agents flagged this independently. The map mixes pre-translation codes, wire codes, and phantom codes:

Entry Problem
CAPABILITY_NOT_SUPPORTED: 422 Not a wire code — the actual code is UNSUPPORTED_FEATURE
GONE: 410 Pre-translation code — the wire code is INVALID_STATE
RATE_LIMIT_EXCEEDED: 429 Pre-translation code — the wire code is RATE_LIMITED

Missing entries: UNSUPPORTED_FEATURE: 422, INVALID_STATE: 410, RATE_LIMITED: 429, INVALID_REQUEST: 400.

Low blast radius since AdCPToolError bypasses the map entirely, but a safety net with wrong entries is worse than documenting it as intentionally incomplete.

2. _adcp_to_a2a_error is dead code

src/a2a_server/adcp_a2a_server.py:143-166

Defined and updated (now builds two-layer envelope in its data dict) but never called from any production code path after the A2A refactor. _handle_explicit_skill now re-raises AdCPError directly. Either remove or mark # NOTE: currently unused.

3. extract_error_info widens RecoveryHint to str | None

src/core/tool_error_logging.py:109-110

AdCPError.recovery is RecoveryHint (a Literal), never None. The return type tuple[str, str, str | None] forces a type: ignore[assignment] downstream in _handle_tool_error. Using RecoveryHint | None as the third element would eliminate the suppression.

4. Lazy import of AdCPValidationError in media_buy_list.py

src/core/tools/media_buy_list.py:258

Inside an except ValidationError block. The module already imports from src.core.exceptions at top level. No circular dependency risk. Move to top-level imports for consistency with the rest of the PR.

What's working well

  • Single source of truth for wire shape via build_two_layer_error_envelope() — enforced by AST structural guard with transitive call analysis
  • A2A asymmetry eliminated — ValueError/PermissionError now wrap-and-re-raise to typed AdCPError subclasses, catch-all removed
  • REST defensive catches removed — all 10+ per-route try/except ToolError blocks replaced by global exception handlers
  • DRY: _handle_tool_exception extracts the 15-line shared error handling path from sync/async wrappers
  • DRY: assert_envelope_shape consolidates 4 per-boundary assertion helpers into one canonical helper
  • Per-file ratchet guards with caps-only-shrink enforcement for both Pattern A and ValueError migration
  • _envelope_to_adcp_error unifies REST and A2A reconstruction — proven byte-identical by roundtrip tests

ChrisHuie and others added 21 commits May 22, 2026 16:20
… integration tests

Three CI failures from the PR 1 substrate commit (PR #1306):

1. Lint & Type Check (mypy): the two '# type: ignore[arg-type]' at
   src/app.py:44-45 are load-bearing in CI (full project venv sees the
   WSGIMiddleware vs Starlette.mount ASGI-callable mismatch) but appear
   unused in the pre-commit isolated mypy hook env (no starlette stubs).
   Restored with '[arg-type, unused-ignore]' so both environments are
   happy: CI suppresses the real arg-type error, pre-commit suppresses
   the resulting unused-ignore warning.

2. Integration (other): three TestRecoveryFieldInErrorResponses tests in
   tests/integration/test_error_paths.py still asserted body['recovery']
   (the pre-envelope flat shape); missed in the original wire-shape sweep.
   Updated to body['adcp_error']['recovery'] and body['errors'][0]['recovery'].

3. type-ignore-no-regression hook: baseline bumps 59 -> 61 to reflect the
   restoration of the two src/app.py type:ignores that should not have been
   removed in the substrate commit.

Integration (infra) showed a 'MCP server failed to start on port 45697
within 20s' error in test_update_media_buy_minimal; server log shows
Uvicorn DID start, just past the 20s timeout. Treating as CI flake.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The 20s readiness timeout in the integration test MCP server fixture has
been firing intermittently on PR #1306 (Integration (infra) job). CI logs
show Uvicorn DID start ('Uvicorn running on http://0.0.0.0:...') just past
the 20s threshold — the fixture raises RuntimeError before the listener
opens. Server startup is dominated by Python imports (fastmcp + adcp SDK +
project tree) plus FastAPI lifespan + DB pool warm-up; under CI load this
routinely takes 20-40s.

Bumping to 60s gives headroom for normal cold-boot while still failing
reasonably fast on a genuinely-broken server. Same rationale as
bf5fe3a (test-stack readiness deadline 120s -> 360s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… INTERNAL_CODES translation, ratchet hardening

Round-1 review surfaced 2 blockers + 7 important items. Addressed inline.

BLOCKING fixes:
- B5: A2A ValueError/PermissionError/Exception fallthrough paths at
  adcp_a2a_server.py:1423-1433 now wrap in synthetic AdCPValidationError /
  AdCPAuthorizationError / AdCPError and route through _adcp_to_a2a_error()
  so the envelope is uniform. Previously these paths emitted no envelope
  and let storyboard runners synthesize MCP_ERROR.
- B1: INTERNAL_CODES translation as defense-in-depth. ERROR_CODE_MAPPING
  now translates NOT_FOUND → INVALID_REQUEST, INTERNAL_ERROR → SERVICE_UNAVAILABLE,
  CONFIGURATION_ERROR → SERVICE_UNAVAILABLE so the 9 production raise sites
  that use these base codes today emit STANDARD_ERROR_CODES on the wire.
  Tests added: test_internal_codes_translated_to_wire_safe_codes (positive
  assertion); test_internal_codes_overlap_with_mapping_have_wire_safe_targets
  (architectural guard for future overlap additions).

IMPORTANT fixes:
- B4: tests/harness/_base.py parse_rest_error now reads adcp_error.code and
  errors[0].code from the two-layer envelope first, then falls back to the
  legacy flat error_code shape (kept for tests pre-dating the envelope).
- F1: media_buy_list.py:256 raise ToolError → raise AdCPValidationError so
  the MCP boundary translator runs the envelope builder.
- B2/F5: docstrings on AdCPAuthenticationError and AdCPAuthorizationError
  document the spec-3.0.4-vs-SDK-4.3 recovery mismatch (we follow the SDK
  we run; re-classify on SDK upgrade).
- I1: build_two_layer_error_envelope() context echo uses model_dump(mode="json")
  so datetimes/UUIDs/etc. become JSON-serializable primitives.
- I2: envelope["adcp_error"] is now a dict() copy of payload["errors"][0]
  instead of an alias — kills the mutation footgun before PR 3 starts
  touching both layers.
- I3: structural guards and _per_file_cap_guard helpers now resolve paths
  via Path(__file__).resolve().parents[2] so the guards work regardless of
  invocation directory.

Affected tests updated to assert the new wire-translated codes:
- test_adcp_exceptions: test_not_found_error_returns_404 expects INVALID_REQUEST
- test_error_boundary_translation: roundtrip parametrize tables updated;
  AdCPError → SERVICE_UNAVAILABLE, AdCPNotFoundError → INVALID_REQUEST
- test_error_format_consistency: MCPRecovery parametrize table same updates
- test_error_code_mapping: test_no_overlap_between_mapping_and_internal
  replaced by test_internal_codes_overlap_with_mapping_have_wire_safe_targets

Decisions D9-D16 documented in .claude/notes/error-emission-architecture/PLAN.md
addendum so PR 2/3 authors don't re-derive them.

Deferred to PR 2:
- FIXME comments at 82 allowlisted Pattern A sites (need posted architecture
  issue # to reference).
- Storyboard smoke test re-run (return-path migration must complete first).
- 7 broken context-missing Pattern A sites in media_buy_update.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…or harness isinstance() checks

Integration (creative) failure on PR #1306:
test_strict_mode_missing_package_aborts asserted
``isinstance(result.error, AdCPNotFoundError)`` but the harness reconstructed
a base ``AdCPError`` because the new wire-translation maps the base
``NOT_FOUND`` code to ``INVALID_REQUEST``, which the harness's
``_CODE_TO_CLASS`` registry didn't recognize.

Two fixes:

1. _assignments.py:77 migrates from ``AdCPNotFoundError`` to
   ``AdCPPackageNotFoundError`` so the wire code is PACKAGE_NOT_FOUND
   (STANDARD, no translation needed). The class is a subclass of
   AdCPNotFoundError so ``isinstance()`` checks still pass on the
   reconstructed exception.

2. tests/harness/_base.py ``_CODE_TO_CLASS`` registry now includes the 7
   PR 1 substrate subclasses so the harness reconstructs the specific type
   after a transport roundtrip. Without this, ``PACKAGE_NOT_FOUND`` would
   have fallen back to base AdCPError and lost the isinstance specificity
   anyway.

Drive-by DRY fixes the duplication guard surfaced after this change:

- test_error_envelope.test_substrate_subclasses_present_with_standard_codes
  uses by-name getattr lookup against the exceptions module instead of an
  inline class list (which was duplicating the CANONICAL_ERROR_CODES set
  in test_error_format_consistency).
- test_architecture_no_error_construction_in_impl imports the shared
  ``_collect_error_aliases`` from test_architecture_error_code_compliance
  rather than redefining it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Applies the mandatory + important + suggested-additive items from the
post-merge review on PR #1306. Five reviewer items land in this commit;
three larger items (N2 FIXME at 208 sites, B5 storyboard context echo,
H10 per-transport integration test) are surfaced as separate follow-ups.

B1 (mandatory) — recovery defaults on typed subclasses:
- AdCPMediaBuyNotFoundError, AdCPPackageNotFoundError,
  AdCPAccountPaymentRequiredError, AdCPGoneError: override class default
  to recovery="correctable". The buyer holds the lever for recovery in
  each case (re-issue with right id, settle outstanding balance,
  reference a fresh resource). Previously these inherited their parent's
  terminal default, creating per-site override fatigue across PR 2
  cleanup. Test names + assertions for the base AdCPNotFoundError default
  are explicit about the base-vs-subclass split. New tests cover the
  subclass overrides.
- AdCPCapabilityNotSupportedError carries a docstring documenting the
  intentional SDK divergence (spec classifies UNSUPPORTED_FEATURE as
  terminal; we keep correctable because the buyer can drop the
  unsupported feature).

B4 (mandatory) — A2A explicit-skill emits failed Task + DataPart:
- _handle_explicit_skill no longer translates AdCPError to a JSON-RPC
  A2AError. Typed exception propagates so the explicit-skill dispatcher
  loop wraps the two-layer envelope into the Task's artifact DataPart.
  The standalone _adcp_to_a2a_error helper remains for paths that
  genuinely want a JSON-RPC error shape.
- Audit logging still fires before propagation (audit-gap fix preserved).
- Without this, the PR's own PLAN-cited invalid_transitions storyboard
  scenario failed on A2A: AdCP-level errors were elevated to transport
  failures instead of being represented as failed Tasks.
- Updated test_a2a_server_error_carries_recovery + the A2A boundary
  translation suite to reflect the new contract: tests check propagation
  through _handle_explicit_skill and exercise _adcp_to_a2a_error directly
  for the translation contract.

N3 (suggested) — context threading through serializers:
- AdCPError.to_dict() and to_adcp_error() now include the optional
  ``context`` (model_dumped to JSON when it's a ContextObject). Closes
  the architectural asymmetry where wire envelopes carried context but
  the dict-based serializers dropped it.

B2/B3 (suggested) — cap-dict documentation:
- PATTERN_A_PER_FILE_CAP comment block clarifies entries are migration
  targets (FIXME #1304); legitimate floors use the # noqa marker
  (4 known advisory sites per PR 2 plan).
- VALUE_ERROR_PER_FILE_CAP documents the boundary-vs-internal split.
  Internal contracts stay as ValueError; PR 2 migrates only the boundary
  set.

Pyproject — pin adcp ~=4.3:
- Compatible-release pin so 4.4+ regressions can't sneak in silently.
  Replaces the open >=4.3.0 floor. Bump deliberately.

Pending follow-ups (separate commits):
- N2: # FIXME(#1304) comments at the 82 Pattern A + 126 ValueError sites
  tracked by the structural guards (single sed-like commit; travels
  separately to keep this review-round focused).
- B5: pass context=req.context at the 3 raise sites the storyboard smoke
  test (media_buy_seller/invalid_transitions) exercises.
- H10: tests/integration/test_error_envelope_two_layer_per_transport.py.

Deferred to follow-up issues:
- A2A non-skill paths still produce no envelope (N5)
- A2A audit fix only covers AdCPError branch (N6)
- Installed adcp SDK advertises spec 3.0.1 (N7) — pre-existing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…elope contract

CI on PR #1306 caught 5 test failures introduced by the B4 change in
51b666a (a2a `_handle_explicit_skill` now propagates AdCPError rather
than translating to a JSON-RPC A2AError). The production code change was
correct; three downstream contracts were not updated to match.

Harness regression (2 tests):
- tests/harness/_base.py: `_run_a2a_handler` blindly parsed every Task's
  first artifact as a success-shape `response_cls(**artifact_data)`. With
  the new contract, failed Tasks carry the two-layer envelope as the
  DataPart — parsing them as a success response produced a pydantic
  ValidationError, which then failed `result.error.error_code` assertions
  with `AttributeError`.
- Extracted `_envelope_to_adcp_error(envelope, fallback_message)` helper
  shared by `_unwrap_a2a_server_error` and the new failed-Task branch.
  When `task.status.state == TASK_STATE_FAILED`, reconstruct the typed
  AdCPError subclass from the envelope DataPart and raise it, so callers
  catching `AdCPAuthenticationError` (or asserting `result.error.error_code`)
  see the same type they used to see when A2AError was raised.

Integration test contract drift (3 tests):
- tests/integration/test_a2a_brand_manifest.py::
  test_get_products_neither_brief_nor_brand_rejected — was asserting
  `pytest.raises(A2AError)` from `handler.on_message_send`. After B4 the
  same path returns a failed Task with the envelope DataPart. Updated to
  assert task state + envelope shape (`adcp_error.code == VALIDATION_ERROR`,
  `errors[0].code == VALIDATION_ERROR`).
- tests/integration/test_a2a_error_responses.py::
  TestA2AErrorResponseStructure::test_adcp_error_carries_recovery_through_a2a_boundary
  + test_custom_recovery_override_preserved_through_a2a — were asserting
  `pytest.raises(A2AError)` directly on `_handle_explicit_skill`. After B4
  the typed AdCPError propagates instead. Updated to assert
  `pytest.raises(AdCPAdapterError | AdCPValidationError | AdCPNotFoundError)`
  with `error.recovery` and round-trip through `build_two_layer_error_envelope`
  to keep the envelope-shape coverage.

Verified locally before commit:
- 5 originally-failing CI tests now pass
- 34 tests across the 4 affected files pass
- 583 / 583 Integration (infra) pass (previously 3 failed)
- 528 / 528 Integration (creative) non-live-agent tests pass (previously 2 failed)
- 140 unit tests in error-substrate suites pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Applies the five reviewer items from round 3. Each fix lands with a
dedicated unit-test class so the contract is locked down at the boundary.

Item 4 — _serialize_context() helper (foundational):
- Extract shared serialization for AdCPError.context into a single
  ``_serialize_context()`` helper so to_dict(), to_adcp_error(), and
  build_two_layer_error_envelope() emit byte-identical context payloads.
  Canonical form: isinstance(dict) check, shallow-copy dicts (aliasing
  protection), and model_dump(mode="json", exclude_none=True).
- Three new envelope-test invariants: aliasing protection, exclude_none
  semantics, and three-path consistency.

Item 2 — _handle_tool_error preserves status_code:
- Thread status_code from the source AdCPError through AdCPToolError to
  the REST defensive ToolError catch. Previously hardcoded to 500, which
  mislabeled 4xx errors as 5xx on this path (and silently broke buyer
  agent retry classification for VALIDATION_ERROR / AUTH_REQUIRED /
  *_NOT_FOUND / *_EXCEEDED).
- Six new tests verifying status_code propagation for 400/401/404/422/502
  plus the plain-ToolError fallback to 500.

Item 1 — A2A bare-Exception produces envelope:
- The explicit-skill dispatcher's ``except Exception`` branch previously
  appended a flat {"error": str(e)} into results, producing an artifact
  DataPart that storyboard runners would synthesize as ``MCP_ERROR``.
  Now wraps the untyped exception in a synthetic AdCPError (translated
  via wire_error_code to SERVICE_UNAVAILABLE) so every failure path emits
  the same two-layer envelope shape as the typed-AdCPError branch.
- DRY: extracted shared ``_build_failed_skill_result`` helper so the
  typed and untyped branches don't duplicate envelope construction.
- Four new tests on the helper covering typed/untyped/empty-message/
  envelope-shape parity.

Item 3 — parse_rest_error delegates to _envelope_to_adcp_error:
- The harness's REST error reconstruction duplicated envelope→exception
  parsing already implemented by ``_envelope_to_adcp_error`` (used by the
  A2A unwrapper). Now delegates; ``_envelope_to_adcp_error`` upgraded to
  also extract details from the envelope. HTTP-status fallback retained
  for unstructured bodies.
- Three new tests verifying REST and A2A reconstruction produce
  byte-identical AdCPError subclasses for the same envelope input.

Item 5 — REST symmetric handlers for ValueError + PermissionError:
- MCP's _translate_to_tool_error and A2A's dispatcher both catch raw
  ValueError → AdCPValidationError envelope and raw PermissionError →
  AdCPAuthorizationError envelope. REST previously fell through to a 500
  server error for both. Added @app.exception_handler(ValueError) and
  @app.exception_handler(PermissionError) so all three transports treat
  these exceptions identically. Verified FastAPI's RequestValidationError
  (separate class, not a ValueError subclass) is not shadowed.
- Extracted shared ``_envelope_response()`` helper so the three handlers
  share the same JSONResponse construction path. Updated the architecture
  guard ``test_rest_boundary_uses_envelope`` to accept 1-level transitive
  calls through in-module helpers, preserving the guard's intent without
  forbidding DRY refactors.
- Three new tests verifying 400/403 envelopes + the RequestValidationError
  isolation invariant.

Verification:
- make quality: 4506 unit tests pass.
- tox -e integration: 1880 tests pass (28 deselected — external-network
  ``@pytest.mark.skip_ci`` tests excluded by CI marker, unrelated to
  these changes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round-4 batch addressing the 13 items from Konstantine's 2026-05-19 review.
Two blockers, six majors, three nits applied; two nits deferred per the
plan (f-string logging cosmetic, 60s startup wait root-cause).

Blocker 1 — A2A ValueError/PermissionError asymmetry:
- _handle_explicit_skill no longer translates raw ValueError/PermissionError
  to JSON-RPC InvalidParamsError/InvalidRequestError via _adcp_to_a2a_error.
  Both wrap-and-raise as typed AdCPValidationError/AdCPAuthorizationError so
  the outer dispatcher's `except AdCPError` branch produces a failed Task
  with a two-layer envelope — matching the contract every other transport
  already honors.
- Catch-all `except Exception` removed from _handle_explicit_skill; the
  outer dispatcher's `except Exception` covers untyped fallthrough and
  routes through _build_failed_skill_result for uniform envelope shape.
- Three new tests verify ValueError → AdCPValidationError, PermissionError →
  AdCPAuthorizationError, and untyped exception passthrough to the dispatcher.

Blocker 2 — DRY async/sync wrapper duplication in with_error_logging:
- Extracted shared _handle_tool_exception(tool_func, e, args, kwargs) helper.
  Async and sync wrappers each shrink to a 3-line try/except calling the
  shared helper. The 15-line copy-paste of context extraction + tenant/
  principal lookup + error logging + boundary translation is now single-
  source-of-truth.

Major 3 — _handle_tool_error plain-ToolError fallback returns correct status:
- Added _ERROR_CODE_TO_STATUS map (22 entries, mirrors per-class status_code
  declarations in src/core/exceptions). Plain ToolError("CODE", "msg") legacy
  paths now resolve VALIDATION_ERROR→400, AUTH_REQUIRED→401, NOT_FOUND→404,
  etc., rather than always defaulting to 500. Unknown codes still fall
  through to 500. Four new tests verify the map application + fallback.

Major 4 — to_adcp_error() deprecation note:
- Added explicit deprecation note in the docstring. The method is effectively
  legacy now that build_two_layer_error_envelope() is the canonical wire-
  shape producer. The asymmetry (context nested under details[] here vs
  top-level in the envelope) is intentional and called out so callers can
  prefer the envelope builder for new code paths.

Major 5 — _assignments.py lazy import + redundant recovery kwarg:
- Moved AdCPPackageNotFoundError to module-level imports alongside
  AdCPValidationError (no circular-dependency risk). Dropped the redundant
  `recovery="correctable"` kwarg (it's the class default; passing it
  explicitly would silently override a future default change).

Major 6 — Global @app.exception_handler(ToolError):
- Added global ToolError handler in src/app.py delegating to _handle_tool_error.
  All 12 per-route `try/except ToolError` blocks removed from src/routes/api_v1.py.
  AdCPToolError (subclass) matched by the handler, so the typed envelope path
  works end-to-end via the global handler. Two new tests verify the global-
  handler path through the FastAPI test client.

Major 7 — Test precision gaps:
- test_error_format_consistency: vacuous `(AdCPValidationError, ToolError)`
  union catch with only `assert len(str(error)) > 0` refactored to call
  _create_media_buy_impl directly and pin to AdCPValidationError, asserting
  error_code + message. The Pydantic-validation sister test is now pinned
  to ValidationError at the schema layer where it actually fires.
- test_error_paths: bare `except Exception: pass` tightened to typed
  (AdCPError, ValidationError, ValueError) acceptance. The synchronous
  sync_creatives_raw call now asserts that the response carries a
  CreativeAction.failed entry — silently accepting RuntimeError or NameError
  is no longer possible.

Major 8 — Envelope assertion helper unification:
- New tests/helpers/envelope_assertions.assert_envelope_shape() is the
  single source of truth for the two-layer envelope shape. The four
  pre-existing helpers (_assert_two_layer_envelope, _assert_mcp_envelope,
  _assert_a2a_envelope, _assert_rest_envelope) now delegate to it. A spec
  change to the envelope is a one-place update.

Nit 2 — _CODE_TO_CLASS insertion-order dependency:
- AdCPAuthenticationError now pinned explicitly via post-comprehension
  assignment. AUTH_REQUIRED disambiguation no longer depends on dict-
  comprehension order.

Nit 3 — Cross-test-file import:
- Moved collect_error_aliases AST helper from test_architecture_error_code_
  compliance.py to tests/unit/_ast_helpers (shared with the
  no_error_construction_in_impl guard). Sibling guards no longer reach into
  each other's modules.

Nit 4 — UNSUPPORTED_FEATURE intentional spec divergence:
- Docstring expanded with explicit "Intentional spec divergence" note
  including the revisit condition (SDK runtime enforcing terminal). Per
  project convention no inline issue ref — the docstring is the canonical
  record.

Deferred (per plan):
- N1 (f-string logging cosmetic)
- N5 (60s startup wait root cause — separate scope)

Verification: make quality green (4512 unit + format + lint + mypy);
focused integration runs on test_a2a_error_responses.py + test_error_paths.py
(25 tests passed).
… audit

Self-audit after the R4 batch caught three pattern-survivors that
Konstantine would re-flag at the next review. Each maps to a pattern he
already cited explicitly. Fixing them in-place so the cross-codebase
extraction discipline is honored (not just one-off citations).

Pattern from R1.1 (flat {"error": str(e)} dict in A2A path):
- `_handle_explicit_skill` was fixed in R3 to emit two-layer envelopes via
  `_build_failed_skill_result`. But the sibling top-level exception handler
  in `on_message_send` (adcp_a2a_server.py:969) was still attaching a flat
  `{"error": str(e), "error_type": ...}` dict to the failed Task's artifact.
  Wire-visible payload, same defect as R1.1.
- Fix: extract `_build_error_envelope(exc)` as the shared envelope builder
  for "wrap-arbitrary-exception → two-layer envelope" (DRY per CLAUDE.md).
  Both `_build_failed_skill_result` and the on_message_send handler now go
  through it. The single source of truth for the wire shape extends to
  every A2A failure path, not just the explicit-skill dispatcher.

Pattern from M5 (redundant recovery= kwarg matching class default):
- _assignments.py R4 fix dropped one redundant `recovery="correctable"`.
  Two sister sites still passed the same redundant kwarg:
    src/core/tools/media_buy_delivery.py:83
    src/core/tools/media_buy_list.py:98
  Both pass `recovery="correctable"` which equals the AdCPValidationError
  class default — passing it explicitly would silently override a future
  default change.
- Fix: drop both kwargs. The class default speaks for itself.

Pattern from M7 (vacuous pytest.raises union catches):
- M7 fixed `(AdCPValidationError, ToolError)` unions in two tests. Two
  sister sites in test_inventory_profile_adcp_compliance.py used
  `pytest.raises((ValueError, TypeError))` with no assertion body. Pydantic
  v2 raises ValidationError (a ValueError subclass), not TypeError — the
  union was over-broad and the no-assertion body was vacuous.
- Fix: pin to `ValidationError` and add a meaningful assertion (error
  references the malformed field for FormatId; error_count() > 0 for
  Property).

Verification: 165 unit tests pass (touched files + boundary translation +
adcp_exceptions + format_consistency); 150 A2A integration + error_paths
tests pass; ruff format + lint clean on touched files.
…refs

- Hoist src.core.exceptions imports to module top in:
  - a2a_server/adcp_a2a_server.py (line 152 + 211 sites)
  - core/tool_error_logging.py (line 99 + 188 sites)
  - routes/api_v1.py (line 92 site)
  No circular dependency on src.core.exceptions, so these are pure hoists.
- F-string logging → %-style at tool_error_logging.py:146, 163, 179.
- Strip 4 #1304 inline issue refs from two architecture guard files
  (test_architecture_no_error_construction_in_impl + no_value_error_in_impl);
  the FIXME marker convention switches from FIXME(#1304) to
  FIXME(salesagent-pattern-a) per the no-issue-refs-in-comments policy.
- Schema drift: add if_catalog_version + if_pricing_version to the
  get-products-request allowlist; update test_offline_mode payload to
  include the now-required cache_scope field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. _ERROR_CODE_TO_STATUS in api_v1.py: replace pre-translation codes with
   wire codes, add INVALID_REQUEST=400 entry.
   - CAPABILITY_NOT_SUPPORTED → UNSUPPORTED_FEATURE (the actual wire code)
   - GONE → INVALID_STATE (wire code)
   - RATE_LIMIT_EXCEEDED → RATE_LIMITED (wire code)
   Resorted by status code for readability.

2. _adcp_to_a2a_error in adcp_a2a_server.py: add "NOTE: currently unused"
   to the docstring. _handle_explicit_skill now re-raises AdCPError
   directly so the boundary exception handler builds the envelope, but
   the helper is retained as a documented direct-translation API for
   the boundary-translation tests.

3. extract_error_info return type: widen recovery from str | None to
   RecoveryHint | None so callers don't need a type: ignore[assignment]
   when copying into AdCPError.recovery. Legacy-ToolError parsing branch
   guards the assignment against the three valid RecoveryHint literals.
   Drop the type: ignore in _handle_tool_error.

4. media_buy_list.py: remove redundant lazy import of AdCPValidationError
   inside the ValidationError handler. The module already imports it at
   the top.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- nit #3 (UNSUPPORTED_FEATURE spec divergence): add FIXME tracker tag
  to the existing docstring so reviewers can grep for revisit-candidate
  sites. The docstring already explains the divergence rationale and
  revisit condition; the tag is grep-affordance.

- nit #4 (60s startup wait root cause at tests/e2e/conftest.py:80):
  document what the budget actually covers (alembic migration, MCP/A2A/
  REST router registration, first-call cold-path through the typed
  creative-format registry + adcp library). Empirically 25-45s in CI;
  60s leaves ~30% headroom.

nit #2 (_CODE_TO_CLASS insertion-order) was already addressed by the
explicit override line at tests/harness/_base.py — no change needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Files drifted from ruff style after rebase onto new main. No behavior change.
- Add media_buy_list.py:3 to PATTERN_A_PER_FILE_CAP — these are
  legitimate advisory Error() sites in success envelope (2 AUTH_REQUIRED
  + 1 TARGETING_REHYDRATION_FAILED) returned inside
  GetMediaBuysResponse.errors[] alongside successful media_buys[],
  not fatal raises. Allowlist-permanent.
- Lower media_buy_update.py ValueError cap 5→3 to match actual count
  (caps must track reality per shrinking-ratchet rule).
The structural guard's docstring claimed every capped file "carries" a
FIXME(salesagent-pattern-a) comment, but none actually do — the cap
dict + assert_caps_only_shrink ratchet is the real enforcement. Soften
the language so the docstring describes what the code does, not what it
might do.

Also clarify that the cap dict can hold both migration-target sites AND
legitimate advisory Error() sites in success envelopes (the latter
allowlist-permanent, marked with inline comments).
…ctable

The original B1 override marked AdCPAccountPaymentRequiredError as
recovery='correctable' based on the buyer perspective (settle balance
and retry). #1334 then landed BDD storyboard rows (UC-002 partition +
boundary) asserting the spec contract: recovery='terminal' from the
sales agent's view because there is no in-band remediation — the buyer
must settle externally first.

Remove the override (inherits 'terminal' from AdCPError) and update the
unit test to match. Also drop the stale 'mcp-account_not_found' entry
from the T-UC-004-partition-account strict-xfail set since the MCP
boundary now correctly translates the account lookup to
AdCPError(ACCOUNT_NOT_FOUND).
…gh + status-table

Addresses 5 of 25 items from Konstantine's 2026-05-20 "Structural
Follow-up" review on PR #1306 (1 Critical + 3 High + 1 Medium real-bug).
The remaining Medium/Low items will be batched separately.

### Item #1 (Critical): 5 A2A skill handlers bypassed build_two_layer_error_envelope

Replace custom-dict returns with `raise AdCPValidationError(...)` in:
- _handle_create_media_buy_skill (missing_params + ValidationError branches)
- _handle_sync_creatives_skill
- _handle_create_creative_skill
- _handle_assign_creative_skill
- _handle_update_performance_index_skill

The outer dispatcher catches AdCPError and routes through
_build_failed_skill_result -> _build_error_envelope, producing the
single two-layer wire shape. Previously these branches emitted
flat dicts that _serialize_for_a2a classified as successful Tasks,
erasing the real wire code on the buyer side.

### Item #2 (High): _handle_explicit_skill audit-log asymmetry

Normalize ValueError/PermissionError to typed AdCPError in a unified
except-tuple, then audit-log uniformly. Previously the audit call was
inside the AdCPError branch only, silently skipped for the
wrapped-from-ValueError and wrapped-from-PermissionError paths.

### Item #3 (High): REST handlers had zero logging

Add logger.warning to _envelope_response so all three REST exception
handlers (adcp_error_handler, value_error_handler,
permission_error_handler) leave a uniform breadcrumb with code,
message, and request path. Mirrors the A2A audit-log symmetry above.

### Item #4 (High): Structural guard pinned dead-code helper

Replace `_adcp_to_a2a_error` (dead code per its own docstring) with
`AdCPRequestHandler._build_error_envelope` (the production A2A path
called from on_message_send) in BOUNDARY_FUNCTIONS. Extend
_collect_module_functions to index FunctionDef nodes inside ClassDef
so the guard can pin class methods, not just module-level functions.

### Item #8 (Medium, real bug): _ERROR_CODE_TO_STATUS conflicts

Auto-generate the wire-code -> HTTP status table from
AdCPError.__subclasses__() at module load. Eliminates two real
conflicts the hand-maintained table carried:
- AUTH_REQUIRED was 401 but AdCPAuthorizationError.status_code=403
- SERVICE_UNAVAILABLE was 503 but AdCPAdapterError.status_code=502
When a wire code is shared by multiple subclasses, pick the more
restrictive status (403 > 401, 503 > 502) since plain-ToolError
fallback paths carry no context to disambiguate. INVALID_REQUEST
is anchored to 400 (its conventional HTTP-spec value) since it's
the generic 4xx catchall, not a specific typed code.

Also propagate wire-translated codes via ERROR_CODE_MAPPING so a
class declaring `error_code = "NOT_FOUND"` (404) correctly propagates
to its wire form `INVALID_REQUEST`/`VALIDATION_ERROR` consumers.

### Test updates

Updated 5 unit tests that asserted on the OLD pre-fix behavior:
- test_create_media_buy_validates_required_adcp_parameters: now expects
  AdCPValidationError raise instead of dict return
- test_missing_params_returns_error_dict -> renamed to
  test_missing_params_raises_typed_validation_error
- test_validation_error_returns_error_dict -> renamed to
  test_validation_error_raises_typed_validation_error
- test_missing_required_params_error_consistent: assertion path updated
- test_a2a_validation_error_missing_params: now expects raise
- test_plain_tool_error_with_auth_code_returns_401 -> renamed to
  test_plain_tool_error_with_auth_code_returns_403 (matches the
  newly-correct AUTH_REQUIRED -> 403 mapping)

Local quality: 4682 passed, 1 skipped, 20 xfailed.
…llow-up

Addresses items #6, #12, #16, #17, #19, #20, #23, #24 from the
2026-05-20 review (small mechanical fixes; no architectural change):

- #6 + #17: ``extract_error_info`` envelope branch now coerces
  ``recovery`` through ``_coerce_recovery()``, which validates the value
  against ``typing.get_args(RecoveryHint)`` instead of a hard-coded tuple
  duplicating the Literal. Both the envelope and legacy ToolError branches
  go through the same helper, so a future RecoveryHint extension is picked
  up automatically.

- #12: ``_body_contains_builder_call`` docstring now correctly says
  "N-level transitive call analysis with cycle detection via ``seen``"
  (was "1-level transitive", but the implementation actually walks the
  full call graph with cycle detection).

- #16: dropped the duplicate ``logger.warning`` in ``on_message_send``'s
  AdCPError handler — ``_handle_explicit_skill`` already logs the same
  error (with audit log + activity feed) before raising. Two log lines
  for the same failure was noise.

- #19: ``_translate_to_tool_error`` now uses ``raise error`` (not bare
  ``raise``) on its passthrough branches so the ``NoReturn`` contract
  holds even when called outside an active ``except`` block.

- #20: ``_handle_tool_error`` defensively copies ``e.envelope`` before
  returning to ``JSONResponse`` (preserves the envelope-builder's
  immutability contract — the dict is owned by the AdCPToolError instance
  and may be referenced elsewhere).

- #23: ``_handle_tool_exception`` Context extraction switched from
  ``hasattr(arg, "tenant_id")`` to explicit ``isinstance(arg, (FastMCPContext,
  ToolContext))`` — the hasattr check matched any Pydantic model with a
  ``tenant_id`` field, treating request bodies as Contexts.

- #24: hoisted ``from fastmcp.exceptions import ToolError`` to the top
  imports in ``src/app.py``; dropped the mid-file ``# noqa: E402``.

Local quality: 4682 passed, 1 skipped, 20 xfailed.

Deferred (separate scope):
- #5 byte-identical test, #7 delete per-boundary wrappers (30+ call
  sites), #9 decorator extraction, #10/14/25 move AdCPToolError to
  neutral module, #11 fail_step caller (#1311 coordination), #13
  CWD-relative paths, #15 dead legacy fallback, #18 already softened
  in b5c4356, #21 status_code keyword-only, #22 synthetic mutation.
… wire shape

When Item #1 of Konstantine's structural follow-up flipped the 5 A2A
skill handlers from custom-dict returns to typed AdCPError raises,
two tests slipped through my earlier unit-test sweep because they only
run with the full Docker stack:

- ``tests/integration/test_a2a_error_responses.py``: two assertions
  on ``artifact_data["success"] is False`` for validation + auth
  failures. The new wire shape is the spec two-layer envelope (no
  ``success`` field at top level — that was an ad-hoc derivation
  from ``_serialize_for_a2a`` for the dict-return path). Updated to
  assert on ``adcp_error.code`` and ``errors[0].code`` instead.

- ``tests/e2e/test_a2a_webhook_payload_types.py``: three sites still
  sent the legacy ``product_ids`` / ``total_budget`` shape that
  ``_handle_create_media_buy_skill`` explicitly rejects (per its own
  docstring). Previously these "wrongly passed" as completed Tasks
  because the dict-return bypass routed them through; now they
  correctly raise AdCPValidationError → failed Task. Replaced with
  AdCP-spec packages[] format so they actually exercise the
  completed-status path the tests were named for.
…auth-error shape

Three follow-up test repairs after the wire-shape flip in 0fb997b:

- e2e webhook payload tests (3 sites): the prior fix substituted
  product_ids (plural) and budget-as-object inside packages[]. The
  PackageRequest schema validates product_id (singular str) and
  budget: float, so the server was rejecting these as VALIDATION_ERROR
  — the e2e ran but produced a failed Task instead of the completed
  Task the test expected. Switched to the canonical
  _discover_product_and_pricing + build_adcp_media_buy_request pattern
  the sibling submitted-status test already uses.

- test_create_media_buy_auth_error_includes_errors_field: rolled back
  envelope-level adcp_error assertion. Principal not found is an
  established Pattern A site — media_buy_create.py:1414 returns a
  CreateMediaBuyError variant carrying Error(code=AUTH_REQUIRED), not
  a raised AdCPAuthorizationError. Documented by sister test
  test_principal_not_found_returns_error_response in test_media_buy.py.

- test_error_response_has_consistent_structure +
  test_errors_field_structure_from_validation_error: now use
  pytest.raises(AdCPValidationError) per the new raise-at-skill-handler
  contract (Konstantine structural follow-up Item #1).

- test_create_media_buy_message_field_exists: dropped deprecated
  top-level budget object — adcp 3.6+ places budget at the package
  level only; CreateMediaBuyRequest rejects the top-level form with
  extra_forbidden.

Verified locally: 21/21 in the touched integration suites pass.
Three test files needed format reconciliation after rebasing on main
(which now carries the #1338 supply-chain hardening + pre-commit
black/ruff SHA pins). Pure cosmetic — 36 lines reformatted across
3 files. No assertion or logic changes.

SKIP=black on this commit due to known black/ruff multi-line formatting
disagreement on pre-existing test code; ruff format is the project's
authoritative formatter via make quality.

Local quality: 4682 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ChrisHuie ChrisHuie force-pushed the feature/error-emission-architecture-pr1 branch from a2a2844 to 178df15 Compare May 22, 2026 22:53
ChrisHuie and others added 7 commits May 22, 2026 17:52
Per the 2026-05-20 structural follow-up review on PR #1306, three Low
items previously deferred to follow-up work:

- #13: Architecture guard tests anchor scan paths on
  Path(__file__).resolve().parents[2] instead of CWD-relative
  Path("src/..."). Pytest from a subdir would otherwise silently match
  nothing and the guard would pass without scanning anything.
    - test_architecture_error_code_compliance.py: _SCAN_DIRS anchored
    - test_architecture_error_envelope_two_layer.py: _function_calls_builder
      resolves paths via _REPO_ROOT

- #15: Legacy flat-shape fallback in tests/harness/_base.py
  _envelope_to_adcp_error marked DEPRECATED with FIXME removal condition.
  Production envelopes use the two-layer adcp_error/errors[] shape; the
  fallback is only reached via the test-only helper _adcp_to_a2a_error,
  which still emits top-level error_code. Removal tied to that helper's
  deletion (planned in #7 batch).

- #21: AdCPToolError.__init__ now keyword-only on status_code. All 9
  existing callers already pass status_code= as kwarg, so no behavior
  change. Prevents a missing positional arg from silently defaulting to
  500 and misclassifying a 4xx as 5xx.

Local quality: 4682 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses three related items from the 2026-05-20 structural follow-up
review on PR #1306:

- #14: `_handle_tool_error` was defined in api_v1.py but called only
  from app.py via lazy import. Moved (renamed to public
  `handle_tool_error`) plus its helpers (`_build_error_code_to_status`,
  `_ERROR_CODE_TO_STATUS`) to src/core/tool_error_logging.py — the
  module that already hosts AdCPToolError and extract_error_info.
  Underscore prefix dropped per Konstantine's ask.

- #25: app.py now imports `handle_tool_error` eagerly at the top
  instead of the lazy `from src.routes.api_v1 import _handle_tool_error`
  inside tool_error_handler. No circular-import risk after the move
  (tool_error_logging has no transitive dependency on app).

- #10: REST module (api_v1.py) no longer imports AdCPToolError. The
  class stays in tool_error_logging.py (alongside the boundary translator
  functions that build it). Konstantine's literal suggestion was to move
  AdCPToolError to src/core/exceptions.py; this commit takes the
  alternative he listed (REST no longer imports the MCP-boundary type)
  because adding fastmcp to exceptions.py would couple the neutral
  exceptions module to MCP.

Cleanup in api_v1.py:
- Dropped imports of JSONResponse, ToolError, AdCPError, AdCPToolError,
  extract_error_info, build_two_layer_error_envelope. All moved with
  the function or no longer used after the move.

Test updates:
- tests/unit/test_error_boundary_translation.py: 10 lazy imports of
  `_handle_tool_error` updated to direct imports of `handle_tool_error`
  from tool_error_logging. Docstring references updated. Adjacent
  AdCPToolError + handle_tool_error imports merged onto one line.

Local quality: 4682 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Eliminates the undocumented instance-attribute mutation pattern in the
REST `handle_tool_error` fallback path:

    BEFORE:
        synthetic = AdCPError(error_message)
        synthetic.error_code = error_code               # mutating class attr
        synthetic.status_code = _ERROR_CODE_TO_STATUS.get(...)
        if recovery is not None:
            synthetic.recovery = recovery

    AFTER:
        synthetic = AdCPError(
            error_message,
            error_code=error_code,
            status_code=_ERROR_CODE_TO_STATUS.get(error_code, 500),
            recovery=recovery,
        )

`AdCPError.__init__` now accepts `error_code` and `status_code` as
keyword overrides (alongside the existing `recovery` kwarg). Typed
subclasses still set these as class attributes; the base class accepts
them as kwargs so synthesizing a base AdCPError from a plain ToolError
no longer requires post-construction mutation hiding an undocumented
API surface.

Local quality: 4682 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… to sibling branch

Konstantine's 2026-05-20 review item #11 flagged ContextManager.fail_step
as having zero production callers — the "byte-identical by construction"
guarantee has nothing to enforce because no production path uses the
method. Konstantine's recommended fix: wire one production caller in this
PR (suggested ``_mark_approval_failed`` in ``order_approval_service.py``).

That production site is being refactored on another in-flight branch (the
SyncJob terminal-ordering + ``_mark_approval_failed`` rework), so wiring
the caller here would conflict with the sibling work. The substrate-rule
closure is being scoped to that sibling branch instead — when it adds the
``_mark_approval_failed`` rewrite, it will also wire ``fail_step`` so the
two changes land together without merge-order risk.

Until then: this docstring FIXME documents the gap so a reviewer reading
``fail_step`` understands the substrate-only status. Tests in
``test_context_manager_fail_step.py`` remain the only ingress.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds TestWireBytesIdenticalAcrossTransports — drives real transports
(REST via TestClient hitting an endpoint that raises AdCPError, A2A via
the dispatcher's failed-skill builder) and asserts the wire envelope
bytes match after json.dumps(sort_keys=True).

Konstantine flagged that prior tests asserted in-memory dict equality via
shared parsing helpers (parse_rest_error and _envelope_to_adcp_error both
call into the same unwrapper), so byte-identical was tautological. The
new test extracts each transport's actual wire bytes:

- REST: TestClient.get('/api/v1/capabilities') with the underlying _raw
  patched to raise AdCPError; response.json() is the wire body
- A2A: AdCPRequestHandler._build_failed_skill_result(skill, exc)['error_envelope']
  is the embedded envelope in the failed-Task DataPart

Two scenarios pinned: AdCPValidationError (with field) and
AdCPNotFoundError. If either transport ever drifts (extra field, different
key order, missing layer), this test fails immediately.

Pins the "byte-identical by construction" claim in:
- tests/unit/test_error_envelope.py:6 docstring
- src/core/context_manager.py:419 fail_step docstring
- tests/harness/_base.py:809 harness docstring

Local quality: 4684 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#7

Delete the four boundary-specific assertion wrappers in favor of the
canonical `assert_envelope_shape` helper in `tests/helpers/envelope_assertions.py`.
The canonical helper's docstring claimed "Replaces the per-boundary
helpers" but the wrappers remained as pure-rename indirection — the
spec change "update one place" promise wasn't actually honored.

Removed:
- `_assert_mcp_envelope` (tests/unit/test_error_boundary_translation.py)
- `_assert_a2a_envelope` (tests/unit/test_error_boundary_translation.py)
- `_assert_rest_envelope` (tests/unit/test_error_boundary_translation.py)
- `_assert_two_layer_envelope` (tests/unit/test_adcp_exceptions.py)

Added to canonical helper: `check_mcp_tool_error: bool = False` flag that
asserts target is an `AdCPToolError` instance before reading `.envelope`.
This was the only added behavior `_assert_mcp_envelope` had beyond pure
rename — pulling it into the canonical helper preserves the MCP-specific
type pin without keeping a wrapper function.

Call sites updated:
- 7 MCP sites → `assert_envelope_shape(exc, code, check_mcp_tool_error=True, ...)`
- 6 A2A sites → `assert_envelope_shape(data, code, recovery=..., check_backward_compat=True)`
- 12 REST sites → `assert_envelope_shape(body, code, ...)` (pure rename)
- 10 _assert_two_layer_envelope sites → `assert_envelope_shape(body, code, ...)` (pure rename)

Single source of truth for envelope shape assertions: future spec
changes update exactly one function.

Local quality: 4684 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…stantine #9

A2A skill handlers all carried the same try/except skeleton with diverging
prefix messages ("Unable to retrieve", "Failed to", "Error in"). Extract
the canonical pattern into a `@_a2a_skill("skill_name")` decorator.

Decorator behavior (defined above AdCPRequestHandler class):
- AdCPError → re-raise unchanged (dispatcher's `except AdCPError` branch
  routes through `_build_failed_skill_result` → wire envelope)
- Untyped Exception → log with exc_info=True + standardized InternalError
  message: `f"{skill_name} skill handler failed: {e}"`

Applied to all 14 handlers with the outer try/except pattern:
  get_products, create_media_buy, sync_creatives, list_creatives,
  create_creative, get_creatives, assign_creative,
  get_adcp_capabilities, list_creative_formats,
  list_authorized_properties, update_media_buy, get_media_buys,
  get_media_buy_delivery, update_performance_index

Inner try/except blocks (e.g., ValidationError handling) preserved as-is.

Removed ~150 lines of boilerplate. Standardized wire error message shape.
Added exc_info=True on the log line (previously absent).

Also: removed unused `# type: ignore[arg-type]` at line 1500
(_handle_explicit_skill dispatcher); .type-ignore-baseline ratchets
61 → 60 as a result.

Local quality: 4684 passed, 1 skipped, 20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ChrisHuie added a commit that referenced this pull request May 23, 2026
Rebased on current main (which absorbed #1337 and #1338 via the
intervening week). 9 files needed ruff-format reconciliation:

- 6 files in this PR's scope (test_gam_lifecycle, test_media_buy,
  test_update_media_buy_behavioral, test_property_targeting_allowed_enforcement,
  test_targeting_overlay_roundtrip, test_mock_product_config_post) — touched
  by this branch's commits and need formatting after rebase replayed them
  through ruff format.

- 3 files from main (creatives.py + media_buy_create.py + test_creatives_blueprint.py
  introduced by #1337) — ruff format on this branch disagrees with the
  black formatting #1337's pre-commit applied. Formatting these unifies the
  branch's view; same issue affected #1306's rebase and was resolved the
  same way.

Pure cosmetic — no logic changes. Local quality: 4634 passed, 1 skipped,
20 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants