feat(decisioning): MockAdServer Protocol + /_debug/traffic counters (#383)#405
Merged
…383)

Adopter platforms returning spec-valid envelopes without calling any upstream (sync_creatives -> [], get_media_buy_delivery -> []) are textbook facade adapters that AdCP's anti-facade contract is designed to catch. This adds the platform-side outbound traffic recorder and a debug endpoint so storyboard runners can assert against per-method call counts.

* MockAdServer Protocol + InMemoryMockAdServer default impl (threading.Lock, which works from both async paths and the sync ThreadPoolExecutor that the framework dispatches sync platform methods on).
* DebugTrafficMiddleware mounting GET /_debug/traffic, gated behind serve(enable_debug_endpoints=True) so production deployments stay closed by default.
* Wire-up in v3 reference seller: platform records on every Sales method; app.py flips on the endpoint for storyboard runners.
* Composes with the framework-side inbound traffic recorder (issue #347): outbound counts here, inbound counts there, both expose JSON over the same surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
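As a rough illustration of the recorder described in this commit, here is a minimal sketch of a runtime-checkable Protocol plus an in-memory implementation guarded by `threading.Lock`. The method names `record_call` and `get_traffic` appear elsewhere in this PR, but their signatures here are assumptions, and `TrafficRecorder`, `InMemoryRecorder`, and `hammer` are hypothetical names, not the SDK's classes.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Protocol, runtime_checkable


@runtime_checkable
class TrafficRecorder(Protocol):
    """Hypothetical stand-in for the MockAdServer Protocol surface."""

    def record_call(self, method: str) -> None: ...

    def get_traffic(self) -> dict[str, int]: ...


class InMemoryRecorder:
    """Counts outbound calls per method name behind a threading.Lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Counter[str] = Counter()

    def record_call(self, method: str) -> None:
        # A threading.Lock can be taken from the event-loop thread and from
        # executor worker threads alike; the critical section never awaits.
        with self._lock:
            self._counts[method] += 1

    def get_traffic(self) -> dict[str, int]:
        # Return a copy so callers cannot mutate the live counters.
        with self._lock:
            return dict(self._counts)


def hammer(recorder: TrafficRecorder, workers: int = 16, calls: int = 1000) -> dict[str, int]:
    """Record `calls` hits from each of `workers` threads, then snapshot."""

    def work() -> None:
        for _ in range(calls):
            recorder.record_call("media_buy.create")

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for future in [pool.submit(work) for _ in range(workers)]:
            future.result()  # re-raise any worker exception
    return recorder.get_traffic()
```

Because the lock is held only around a dictionary increment, briefly blocking the event-loop thread on it should be negligible in this sketch; a recorder that did I/O under the lock would need a different design.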
bokelley added a commit that referenced this pull request May 3, 2026
… typed responses (PR #408 fix-pack)

Rebase onto main (53d8b8c) restored MockAdServer wiring on the 5 existing sales methods that PR #405 added. Apply _record(...) calls on the 6 new methods this PR introduces (media_buys.list, performance.feedback, creatives.formats, creatives.list, accounts.sync, accounts.list) so storyboard runners polling GET /_debug/traffic see real upstream activity.

Should-fix items:
- list_creatives count query: replaced full-table scan + len() with select(func.count()).select_from(CreativeRow). Test mock updated to expose .scalar() instead of .scalars().
- sync_creatives: typed SyncCreativeResult items instead of list[dict] with type:ignore.
- get_media_buys: typed MediaBuyWire items, dropping list[Any] dicts; re-validation through GetMediaBuysResponse keeps the response-shape guarantee.
- list_creative_formats: typed Format items.
- sync_accounts: dropped # type: ignore[union-attr] on brand.domain. Spec requires both brand and brand.domain (no None guard needed).

Test improvements (nits → should-fixes):
- test_list_accounts_runs_projection_on_every_row: converted manual setattr/try-finally patching to monkeypatch fixture; strengthened assertion to require billing_entity present then bank absent.
- Inline `from adcp.server import current_tenant` lifted to the module top of platform.py (one source of truth for the contextvar reader).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request May 3, 2026
…recipient (#408)

* feat(v3-ref-seller): broaden surface — 4 missing sales methods + sync/list_accounts + invoice_recipient (#376, #377, #378)

Lifts the v3 reference seller from the five required sales methods up to the full v6.0 rc.1 surface for a sales-non-guaranteed seller, plus the account ops needed to demonstrate the 3.1-readiness projection guard.

* #376: Adds the four optional Sales Protocol methods every sales-* specialism is required to expose in v6.0 rc.1+: get_media_buys (with limit/offset paging), provide_performance_feedback (persisted to a new performance_feedback table FK'd to media_buys), list_creative_formats (static catalog), and list_creatives (sourced from the new creatives table). sync_creatives now actually persists rows, idempotency-keyed on (tenant_id, creative_id), instead of returning an empty [].
* #377: Implements sync_accounts (upsert with full BusinessEntity payload, including bank details persisted on storage) and list_accounts (every row run through adcp.decisioning.project_account_for_response before serialization). The list_accounts call site is the headline 3.1-readiness claim: bank details cannot leak on response.
* #378: Adds MediaBuy.invoice_recipient as a first-class JSON column. create_media_buy extracts CreateMediaBuyRequest.invoice_recipient and persists it; update_media_buy patches it for per-buy invoice override.

Models added: Creative (account-scoped, manifest_json blob) and PerformanceFeedback (FK'd to media_buys). seed.py seeds two example creatives so list_creatives returns something on first boot.

Tests cover the Protocol surface (all 9 sales methods + sync/list_accounts callable on the platform), the projection guard (bank stripped on every list_accounts response, both via direct projection assertion and via end-to-end mocked-session call), population of the MediaBuy.invoice_recipient column, and creative round-trip through sync → list.

Note: pre-commit mypy hook skipped. There are 96 preexisting errors in src/adcp/{client,webhooks,protocols/a2a,server/a2a_server,server/translate}.py from a2a-sdk protobuf typing in the uv environment, unrelated to this PR. mypy src/adcp/ passes clean in the project's .venv (Python 3.12 without uv's extra resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(v3-ref-seller): rebase + restore mock_ad_server instrumentation + typed responses (PR #408 fix-pack)

Rebase onto main (53d8b8c) restored MockAdServer wiring on the 5 existing sales methods that PR #405 added. Apply _record(...) calls on the 6 new methods this PR introduces (media_buys.list, performance.feedback, creatives.formats, creatives.list, accounts.sync, accounts.list) so storyboard runners polling GET /_debug/traffic see real upstream activity.

Should-fix items:
- list_creatives count query: replaced full-table scan + len() with select(func.count()).select_from(CreativeRow). Test mock updated to expose .scalar() instead of .scalars().
- sync_creatives: typed SyncCreativeResult items instead of list[dict] with type:ignore.
- get_media_buys: typed MediaBuyWire items, dropping list[Any] dicts; re-validation through GetMediaBuysResponse keeps the response-shape guarantee.
- list_creative_formats: typed Format items.
- sync_accounts: dropped # type: ignore[union-attr] on brand.domain. Spec requires both brand and brand.domain (no None guard needed).

Test improvements (nits → should-fixes):
- test_list_accounts_runs_projection_on_every_row: converted manual setattr/try-finally patching to monkeypatch fixture; strengthened assertion to require billing_entity present then bank absent.
- Inline `from adcp.server import current_tenant` lifted to the module top of platform.py (one source of truth for the contextvar reader).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
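To make the projection-guard idea concrete, the sketch below shows one way a "strip bank details before serialization" step could look. It is not `adcp.decisioning.project_account_for_response`; the function name `strip_bank_details` and the dict layout (a `billing_entity` mapping with a `bank` key) are assumptions inferred from the test description above.

```python
import copy
from typing import Any


def strip_bank_details(account: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical projection: return a copy of `account` without bank data.

    Assumes bank details live at account["billing_entity"]["bank"]; the real
    SDK model may be shaped differently.
    """
    projected = copy.deepcopy(account)  # never mutate the stored row
    entity = projected.get("billing_entity")
    if isinstance(entity, dict):
        entity.pop("bank", None)
    return projected


def project_all(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Run the projection on every row, mirroring the list_accounts call site."""
    return [strip_bank_details(row) for row in rows]
```

Working on a deep copy keeps the persisted payload intact, which matches the description that bank details stay on storage and are only removed from responses.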
This was referenced May 3, 2026
Closed
bokelley added a commit that referenced this pull request May 3, 2026
…#410) (#426)

The v3 reference seller (examples/v3_reference_seller/src/app.py) ships every Tier 2 / v3-supporting component the SDK provides (nine sales-non-guaranteed methods, projection guard, schema-driven validation in strict mode, anti-façade traffic counters) but has never been exercised by the canonical AdCP storyboard runner. The existing CI storyboard job only targets examples/seller_agent.py.

Add a new ``storyboard-v3-reference-seller`` job that:

* Spins up an ephemeral Postgres 16 service (same trust-auth pattern as ``pg-conformance``).
* Installs SQLAlchemy + asyncpg (example-local deps, not part of the SDK's own [dev] extra).
* Pins ``acme.localhost → 127.0.0.1`` in /etc/hosts so the storyboard runner reaches the seller via the seeded tenant host ``SubdomainTenantMiddleware`` resolves to ``t_acme``.
* Runs ``python -m seed`` to plant the two-tenant / three-buyer-agent / two-account fixture set.
* Boots the seller via ``python -m src.app`` against the CI Postgres.
* Runs ``adcp storyboard run … sales-non-guaranteed``, the specialism bundle matching ``V3ReferenceSeller.capabilities.specialisms``.
* Asserts the run reports ``overall_status: passing``.
* Polls ``GET /_debug/traffic`` and asserts both ``media_buy.create`` and ``creative.upload`` counters are non-zero (anti-façade contract from #405).

The job ships ``continue-on-error: true`` on first land: the v3 reference seller has never been exercised by the runner, so any contract gap (auth shape, fixture mismatch, unimplemented sub-skill) surfaces here first. Promote to required once the sales-non-guaranteed bundle reports passing.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
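The final polling step of the job could be checked with something like the helper below. This is only a sketch: it assumes the `/_debug/traffic` response body is a flat JSON object mapping counter names to integers, which this PR thread does not spell out, and `missing_traffic` is a hypothetical name.

```python
import json


def missing_traffic(body: str, required: list[str]) -> list[str]:
    """Return the required counters that are absent or zero in `body`.

    Assumption: `body` is a flat JSON object such as
    {"media_buy.create": 2, "creative.upload": 1}. Adjust the lookup if the
    real endpoint nests its counters.
    """
    counts = json.loads(body)
    return [name for name in required if int(counts.get(name, 0)) <= 0]


def assert_anti_facade(body: str) -> None:
    """Fail when either counter named in the CI job description is zero."""
    missing = missing_traffic(body, ["media_buy.create", "creative.upload"])
    if missing:
        raise AssertionError(f"no upstream traffic recorded for: {missing}")
```

Returning the list of missing counters, rather than a bare boolean, lets the CI log say which method looked like a facade.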
bokelley added a commit that referenced this pull request May 3, 2026
…readiness flake (#435)

* perf(server): lazy-load Pydantic outputSchema generation to fix storyboard readiness flake

_generate_pydantic_schemas(), _generate_pydantic_output_schemas(), and _apply_pydantic_schemas() previously ran at module import time, causing heavy Pydantic type imports to race with the storyboard readiness probe and producing "Agent unreachable" failures across PRs #391, #405, #406, #407.

Generation is now deferred to the first get_tools_for_handler() call (which fires during create_mcp_tools() at server construction, not at import time). _PYDANTIC_SCHEMAS and _PYDANTIC_OUTPUT_SCHEMAS start as empty dicts and are populated via .update() so external references stay valid. The _schemas_applied sentinel makes subsequent calls no-ops (~0ms overhead on the hot path).

Import-time delta: ~4.5s of schema generation is moved from `import adcp.server` to the first `create_mcp_tools()` call.

Tests updated: conftest.py gains a session-scoped autouse fixture that triggers lazy init before any test reads ADCP_TOOL_DEFINITIONS schema fields; stale "at import time" references in docstrings and error messages are updated.

Closes #412

https://claude.ai/code/session_01NnoQN3c6Wi5LY5DEUBp8W2

* fixup: update stale 'at import time' docstrings and error messages

Addresses pre-PR review findings: test_spec_coverage.py assertion message still referenced 'at import time', and _ensure_pydantic_schemas_applied docstring understated the in-place mutation and misdirected to get_tools_for_handler instead of create_mcp_tools.

https://claude.ai/code/session_01NnoQN3c6Wi5LY5DEUBp8W2

---------

Co-authored-by: Claude <noreply@anthropic.com>
Summary
Closes #383. Adds the platform-side outbound traffic recorder so storyboard runners can assert against AdCP's anti-facade contract (PR #3816 in the spec).
- `MockAdServer` Protocol (runtime-checkable) + `InMemoryMockAdServer` default impl. Uses `threading.Lock` rather than `asyncio.Lock` because the recorder is called from both async platform methods and sync platform methods that the framework dispatches on a `ThreadPoolExecutor`. A threading lock is correct in both contexts; an asyncio lock is not thread-safe and cannot be awaited from a sync method on a worker thread with no running event loop.
- `DebugTrafficMiddleware` mounting `GET /_debug/traffic`, gated behind `serve(enable_debug_endpoints=True, debug_traffic_source=...)`. Default off: production deployments stay closed; reference / dev sellers flip it on. The middleware composes outermost-first so a runner's poll short-circuits before any seller-supplied auth / tenant resolution runs.
- `examples/v3_reference_seller`: platform records on every Sales method (`get_products`, `create_media_buy`, `update_media_buy`, `sync_creatives`, `get_media_buy_delivery`); `app.py` instantiates `InMemoryMockAdServer()` and passes `enable_debug_endpoints=True`.

This is the platform-side outbound counterpart to issue #347 (framework-side upstream-traffic recorder middleware, inbound). Both compose: outbound counts come from `MockAdServer.get_traffic()`, inbound counts come from the framework middleware, and both expose JSON over the same `/_debug/traffic` surface.

Test plan

- `ruff check` clean on all touched files
- `mypy src/adcp/`: `Success: no issues found in 747 source files`
- `pytest tests/test_mock_ad_server.py -v`: 13/13 pass (Protocol compliance, counter semantics, thread safety with 16×1000 concurrent record_calls, endpoint GET/HEAD/POST/passthrough behavior, serve() helper composition)
- `pytest tests/test_decisioning_serve.py tests/test_serve_asgi_middleware.py tests/test_serve_dx_polish.py`: 40/40 pass (no regressions in serve() wiring)

🤖 Generated with Claude Code
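For readers unfamiliar with the outermost-first short-circuit described in the summary, here is a minimal plain-ASGI sketch of a gated debug endpoint. It is not the SDK's `DebugTrafficMiddleware`: the class name, constructor parameters, and the choice to pass non-GET requests through to the wrapped app are all assumptions made for illustration.

```python
import asyncio
import json


class DebugTrafficSketch:
    """Hypothetical ASGI wrapper answering GET /_debug/traffic itself."""

    def __init__(self, app, traffic_source, enabled=False):
        self.app = app                        # wrapped app (auth, tenant, ...)
        self.traffic_source = traffic_source  # callable returning a dict
        self.enabled = enabled                # closed unless opted in

    async def __call__(self, scope, receive, send):
        hit = (
            self.enabled
            and scope["type"] == "http"
            and scope["method"] == "GET"
            and scope["path"] == "/_debug/traffic"
        )
        if not hit:
            # Everything else falls through to the inner stack untouched.
            await self.app(scope, receive, send)
            return
        body = json.dumps(self.traffic_source()).encode()
        headers = [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode()),
        ]
        await send({"type": "http.response.start", "status": 200, "headers": headers})
        await send({"type": "http.response.body", "body": body})


async def inner_app(scope, receive, send):
    """Stand-in for the seller's app; answers 404 to everything."""
    await send({"type": "http.response.start", "status": 404, "headers": []})
    await send({"type": "http.response.body", "body": b""})


def call(app, method, path):
    """Drive one request through `app` and return (status, body)."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": method, "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"], sent[1]["body"]
```

Because the wrapper sits outside the inner app, a poll of the debug path never reaches whatever auth or tenant resolution `inner_app` would perform, and with `enabled=False` the path behaves like any other unknown route.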