feat(decisioning): MockAdServer Protocol + /_debug/traffic counters (#383) #405

Merged
bokelley merged 1 commit into main from bokelley/issue-383-mock-ad-server
May 3, 2026

@bokelley bokelley commented May 3, 2026

Summary

Closes #383. Adds the platform-side outbound traffic recorder so storyboard runners can assert against AdCP's anti-facade contract (PR #3816 in the spec).

  • MockAdServer Protocol (runtime-checkable) + InMemoryMockAdServer default impl. Uses threading.Lock rather than asyncio.Lock because the recorder is called both from async platform methods and from sync platform methods that the framework dispatches on a ThreadPoolExecutor. A threading lock is correct in both contexts; an asyncio.Lock is not thread-safe and its acquire() is a coroutine, so a sync method on a worker thread with no running event loop has no safe way to take it.
  • DebugTrafficMiddleware mounting GET /_debug/traffic, gated behind serve(enable_debug_endpoints=True, debug_traffic_source=...). Default off — production deployments stay closed; reference / dev sellers flip it on. The middleware composes outermost-first so a runner's poll short-circuits before any seller-supplied auth / tenant resolution runs.
  • Wire-up in examples/v3_reference_seller: platform records on every Sales method (get_products, create_media_buy, update_media_buy, sync_creatives, get_media_buy_delivery); app.py instantiates InMemoryMockAdServer() and passes enable_debug_endpoints=True.
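
A minimal sketch of the recorder shape described in the first bullet. The method names `record_call` and the exact signatures are assumptions for illustration (only `MockAdServer`, `InMemoryMockAdServer`, and `get_traffic()` are named in this PR), and `hammer` is a hypothetical helper that mirrors the 16×1000 thread-safety test.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Protocol, runtime_checkable


@runtime_checkable
class MockAdServer(Protocol):
    """Structural interface; signatures are assumed, not copied from the SDK."""

    def record_call(self, method: str) -> None: ...

    def get_traffic(self) -> dict[str, int]: ...


class InMemoryMockAdServer:
    """Per-method call counter guarded by a threading.Lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Counter[str] = Counter()

    def record_call(self, method: str) -> None:
        # Usable from the event-loop thread and from executor worker threads:
        # the critical section is tiny and never awaits.
        with self._lock:
            self._counts[method] += 1

    def get_traffic(self) -> dict[str, int]:
        # Return a copy so callers cannot mutate the internal counter.
        with self._lock:
            return dict(self._counts)


def hammer(server: MockAdServer, threads: int = 16, calls: int = 1000) -> dict[str, int]:
    """Hypothetical helper: record `calls` hits from each worker, then snapshot."""

    def worker() -> None:
        for _ in range(calls):
            server.record_call("create_media_buy")

    # Leaving the `with` block waits for every submitted worker to finish.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for _ in range(threads):
            pool.submit(worker)
    return server.get_traffic()
```

Because the Protocol is runtime-checkable, `isinstance(InMemoryMockAdServer(), MockAdServer)` holds without inheritance, which is one way a Protocol-compliance test could be written.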

This is the platform-side outbound counterpart to issue #347 (framework-side upstream-traffic recorder middleware, inbound). Both compose: outbound counts come from MockAdServer.get_traffic(), inbound counts come from the framework middleware, and both expose JSON over the same /_debug/traffic surface.
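
The short-circuit behavior can be sketched as a plain ASGI wrapper. This is an illustration under stated assumptions, not the SDK's `DebugTrafficMiddleware`: the class name `DebugTrafficSketch`, the `traffic_source` callable, the flat JSON payload, and the choice to pass every non-GET request through are all hypothetical. `inner_app` stands in for a seller's auth / tenant stack, and `request` is a small driver for trying it out.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

Scope = dict[str, Any]
Receive = Callable[[], Awaitable[dict[str, Any]]]
Send = Callable[[dict[str, Any]], Awaitable[None]]
ASGIApp = Callable[[Scope, Receive, Send], Awaitable[None]]


class DebugTrafficSketch:
    """Answers GET /_debug/traffic itself; everything else reaches the wrapped app."""

    def __init__(self, app: ASGIApp, traffic_source: Callable[[], dict[str, int]]) -> None:
        self.app = app
        self.traffic_source = traffic_source

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        # Check the scope type first: lifespan scopes carry no path or method.
        is_poll = (
            scope["type"] == "http"
            and scope["method"] == "GET"
            and scope["path"] == "/_debug/traffic"
        )
        if not is_poll:
            await self.app(scope, receive, send)
            return
        body = json.dumps(self.traffic_source()).encode()
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(body)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": body})


async def inner_app(scope: Scope, receive: Receive, send: Send) -> None:
    # Stand-in for seller-supplied auth: rejects every request it sees.
    await send({"type": "http.response.start", "status": 401, "headers": []})
    await send({"type": "http.response.body", "body": b""})


def request(app: ASGIApp, method: str, path: str) -> list[dict[str, Any]]:
    """Drive one HTTP request through `app` and collect the messages it sends."""
    sent: list[dict[str, Any]] = []

    async def receive() -> dict[str, Any]:
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message: dict[str, Any]) -> None:
        sent.append(message)

    asyncio.run(app({"type": "http", "method": method, "path": path}, receive, send))
    return sent
```

Wrapping `inner_app` as the outermost layer means a poll of `/_debug/traffic` gets a 200 even though the inner stack would have answered 401, which is the ordering property the summary relies on.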

Test plan

  • ruff check clean on all touched files
  • mypy src/adcp/ (Success: no issues found in 747 source files)
  • pytest tests/test_mock_ad_server.py -v — 13/13 pass (Protocol compliance, counter semantics, thread safety with 16×1000 concurrent record_calls, endpoint GET/HEAD/POST/passthrough behavior, serve() helper composition)
  • pytest tests/test_decisioning_serve.py tests/test_serve_asgi_middleware.py tests/test_serve_dx_polish.py — 40/40 pass (no regressions in serve() wiring)

🤖 Generated with Claude Code

…383)

Adopter platforms returning spec-valid envelopes without calling any
upstream (sync_creatives -> [], get_media_buy_delivery -> []) are
textbook facade adapters that AdCP's anti-facade contract is designed
to catch. This adds the platform-side outbound traffic recorder and
a debug endpoint so storyboard runners can assert against per-method
call counts.

* MockAdServer Protocol + InMemoryMockAdServer default impl
  (threading.Lock, which works from both async paths and the
  ThreadPoolExecutor the framework dispatches sync platform methods on).
* DebugTrafficMiddleware mounting GET /_debug/traffic, gated behind
  serve(enable_debug_endpoints=True) so production deployments stay
  closed by default.
* Wire-up in v3 reference seller: platform records on every Sales
  method; app.py flips on the endpoint for storyboard runners.
* Composes with the framework-side inbound traffic recorder
  (issue #347) — outbound counts here, inbound counts there, both
  expose JSON over the same surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 53d8b8c into main May 3, 2026
11 of 12 checks passed
bokelley added a commit that referenced this pull request May 3, 2026
… typed responses (PR #408 fix-pack)

Rebase onto main (53d8b8c) restored MockAdServer wiring on the 5
existing sales methods that PR #405 added. Apply _record(...) calls on
the 6 new methods this PR introduces (media_buys.list,
performance.feedback, creatives.formats, creatives.list,
accounts.sync, accounts.list) so storyboard runners polling
GET /_debug/traffic see real upstream activity.

Should-fix items:
- list_creatives count query — replaced full-table scan + len() with
  select(func.count()).select_from(CreativeRow). Test mock updated to
  expose .scalar() instead of .scalars().
- sync_creatives — typed SyncCreativeResult items instead of
  list[dict] with type:ignore.
- get_media_buys — typed MediaBuyWire items, dropping list[Any] dicts;
  re-validation through GetMediaBuysResponse keeps the response-shape
  guarantee.
- list_creative_formats — typed Format items.
- sync_accounts — dropped # type: ignore[union-attr] on
  brand.domain. Spec requires both brand and brand.domain (no None
  guard needed).

Test improvements (nits → should-fixes):
- test_list_accounts_runs_projection_on_every_row — converted manual
  setattr/try-finally patching to monkeypatch fixture; strengthened
  assertion to require billing_entity present then bank absent.
- Inline `from adcp.server import current_tenant` lifted to the
  module top of platform.py (one source of truth for the contextvar
  reader).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
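
The count-query fix above can be illustrated without the project's models. This sketch swaps SQLAlchemy for the standard-library `sqlite3` module so it runs standalone; the table and column names are hypothetical. It contrasts fetching every row and calling `len()` with asking the database for `COUNT(*)`, which is the SQL that a `select(func.count()).select_from(...)` construct corresponds to.

```python
import sqlite3


def make_db(n: int) -> sqlite3.Connection:
    """Build an in-memory table with `n` rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE creatives (creative_id TEXT PRIMARY KEY, manifest_json TEXT)")
    conn.executemany(
        "INSERT INTO creatives VALUES (?, ?)",
        [(f"c{i}", "{}") for i in range(n)],
    )
    return conn


def count_by_scan(conn: sqlite3.Connection) -> int:
    # Old shape: materialize every row in Python just to count them.
    return len(conn.execute("SELECT * FROM creatives").fetchall())


def count_by_aggregate(conn: sqlite3.Connection) -> int:
    # New shape: the database returns a single scalar.
    return conn.execute("SELECT COUNT(*) FROM creatives").fetchone()[0]
```

Both functions return the same number; the difference is that the aggregate form transfers one value instead of the whole table, which is also why a test double for it needs to expose a scalar result rather than a row iterator.
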
bokelley added a commit that referenced this pull request May 3, 2026
…recipient (#408)

* feat(v3-ref-seller): broaden surface — 4 missing sales methods + sync/list_accounts + invoice_recipient (#376, #377, #378)

Lifts the v3 reference seller from the five required sales methods up to
the full v6.0 rc.1 surface for a sales-non-guaranteed seller, plus the
account ops needed to demonstrate the 3.1-readiness projection guard.

* #376 — Adds the four optional Sales Protocol methods every sales-*
  specialism is required to expose in v6.0 rc.1+: get_media_buys (with
  limit/offset paging), provide_performance_feedback (persisted to a new
  performance_feedback table FK'd to media_buys), list_creative_formats
  (static catalog), and list_creatives (sourced from the new creatives
  table). sync_creatives now actually persists rows, idempotency-keyed
  on (tenant_id, creative_id), instead of returning an empty list.

* #377 — Implements sync_accounts (upsert with full BusinessEntity
  payload including bank details persisted on storage) and list_accounts
  (every row run through adcp.decisioning.project_account_for_response
  before serialization). The list_accounts call site is the headline
  3.1-readiness claim — bank details cannot leak on response.

* #378 — Adds MediaBuy.invoice_recipient as a first-class JSON column.
  create_media_buy extracts CreateMediaBuyRequest.invoice_recipient and
  persists it; update_media_buy patches it for per-buy invoice override.
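
The projection guard in #377 can be sketched roughly as follows. This is not `adcp.decisioning.project_account_for_response`; the function names and the dict layout (a `billing_entity` mapping with a nested `bank` key) are assumptions chosen to match the test description in this PR.

```python
import copy
from typing import Any


def project_account_sketch(row: dict[str, Any]) -> dict[str, Any]:
    """Return a response-safe copy of a stored account row, minus bank details."""
    projected = copy.deepcopy(row)  # never mutate what is persisted
    entity = projected.get("billing_entity")
    if isinstance(entity, dict):
        entity.pop("bank", None)
    return projected


def list_accounts_sketch(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Every row goes through the projection; there is no unprojected path.
    return [project_account_sketch(row) for row in rows]
```

Keeping the projection at the single call site that builds the response is what lets a test assert "billing_entity present, bank absent" for every row rather than for a sample.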

Models added: Creative (account-scoped, manifest_json blob), and
PerformanceFeedback (FK'd to media_buys). seed.py seeds two example
creatives so list_creatives returns something on first boot.

Tests cover the Protocol surface (all 9 sales methods + sync/list_accounts
callable on the platform), the projection guard (bank stripped on every
list_accounts response — both via direct projection assertion and via
end-to-end mocked-session call), population of the
MediaBuy.invoice_recipient column, and creative round-trip through sync → list.

Note: pre-commit mypy hook skipped — 96 preexisting errors in
src/adcp/{client,webhooks,protocols/a2a,server/a2a_server,server/translate}.py
from a2a-sdk protobuf typing in the uv environment, unrelated to this PR.
mypy src/adcp/ passes clean in the project's .venv (Python 3.12 without
uv's extra resolution).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(v3-ref-seller): rebase + restore mock_ad_server instrumentation + typed responses (PR #408 fix-pack)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request May 3, 2026
…#410) (#426)

The v3 reference seller (examples/v3_reference_seller/src/app.py)
ships every Tier 2 / v3-supporting component the SDK provides —
nine sales-non-guaranteed methods, projection guard, schema-driven
validation in strict mode, anti-façade traffic counters — but has
never been exercised by the canonical AdCP storyboard runner. The
existing CI storyboard job only targets examples/seller_agent.py.

Add a new ``storyboard-v3-reference-seller`` job that:

* Spins up an ephemeral Postgres 16 service (same trust-auth
  pattern as ``pg-conformance``).
* Installs SQLAlchemy + asyncpg (example-local deps, not part of
  the SDK's own [dev] extra).
* Pins ``acme.localhost → 127.0.0.1`` in /etc/hosts so the
  storyboard runner reaches the seller via the seeded tenant host
  that ``SubdomainTenantMiddleware`` resolves to ``t_acme``.
* Runs ``python -m seed`` to plant the two-tenant /
  three-buyer-agent / two-account fixture set.
* Boots the seller via ``python -m src.app`` against the CI
  Postgres.
* Runs ``adcp storyboard run … sales-non-guaranteed`` — the
  specialism bundle matching ``V3ReferenceSeller.capabilities.specialisms``.
* Asserts the run reports ``overall_status: passing``.
* Polls ``GET /_debug/traffic`` and asserts both
  ``media_buy.create`` and ``creative.upload`` counters are
  non-zero (anti-façade contract from #405).
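
The final step in the list above could look roughly like this. The payload shape is an assumption (a flat JSON object mapping counter names to integers), and both function names are hypothetical; the counter names are the two cited in this commit message.

```python
import json
from urllib.request import urlopen


def fetch_traffic(base_url: str) -> dict[str, int]:
    """GET <base_url>/_debug/traffic and decode the JSON body."""
    with urlopen(f"{base_url}/_debug/traffic", timeout=5) as resp:
        return json.load(resp)


def missing_counters(traffic: dict[str, int], required: tuple[str, ...]) -> list[str]:
    """Names of required counters that are absent or zero."""
    return [name for name in required if traffic.get(name, 0) <= 0]


REQUIRED = ("media_buy.create", "creative.upload")
```

A CI step would then fail when `missing_counters(fetch_traffic(url), REQUIRED)` is non-empty, reporting exactly which upstream calls never happened.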

The job ships ``continue-on-error: true`` on first land — the
v3 reference seller has never been exercised by the runner, so
any contract gap (auth shape, fixture mismatch, unimplemented
sub-skill) surfaces here first. Promote to required once the
sales-non-guaranteed bundle reports passing.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bokelley added a commit that referenced this pull request May 3, 2026
…readiness flake (#435)

* perf(server): lazy-load Pydantic outputSchema generation to fix storyboard readiness flake

_generate_pydantic_schemas(), _generate_pydantic_output_schemas(), and
_apply_pydantic_schemas() previously ran at module import time, causing
heavy Pydantic type imports to race with the storyboard readiness probe
and producing "Agent unreachable" failures across PRs #391, #405, #406, #407.

Generation is now deferred to the first get_tools_for_handler() call (which
fires during create_mcp_tools() at server construction, not at import time).
_PYDANTIC_SCHEMAS and _PYDANTIC_OUTPUT_SCHEMAS start as empty dicts and are
populated via .update() so external references stay valid. The _schemas_applied
sentinel makes subsequent calls no-ops (~0ms overhead on the hot path).

Import-time delta: ~4.5s of schema generation is moved from `import adcp.server`
to the first `create_mcp_tools()` call.
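
The deferral pattern described above, reduced to a sketch. All names here (`SCHEMAS`, `ensure_schemas_applied`, `get_tools_for_handler_sketch`, `stats`) are hypothetical stand-ins for the module's private ones, and the generated schema is a placeholder. The point is the combination of a sentinel flag with an in-place `.update()` so references captured before initialization still see the populated dict.

```python
SCHEMAS: dict[str, dict] = {}   # starts empty; filled in place on first use
stats = {"runs": 0}             # counts how often generation actually executes
_schemas_applied = False        # sentinel: True once generation has run


def _generate_schemas() -> dict[str, dict]:
    # Placeholder for the expensive work that used to run at import time.
    stats["runs"] += 1
    return {"get_products": {"type": "object"}}


def ensure_schemas_applied() -> None:
    global _schemas_applied
    if _schemas_applied:
        return  # later calls are a single boolean check
    # Mutate rather than rebind, so earlier `from module import SCHEMAS`
    # references point at the same, now populated, dict.
    SCHEMAS.update(_generate_schemas())
    _schemas_applied = True


def get_tools_for_handler_sketch() -> dict[str, dict]:
    ensure_schemas_applied()
    return SCHEMAS
```

Rebinding `SCHEMAS = _generate_schemas()` instead would leave any earlier importer holding the old empty dict, which is the failure the `.update()` approach avoids.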

Tests updated: conftest.py gains a session-scoped autouse fixture that triggers
lazy init before any test reads ADCP_TOOL_DEFINITIONS schema fields; stale
"at import time" references in docstrings and error messages are updated.

Closes #412

https://claude.ai/code/session_01NnoQN3c6Wi5LY5DEUBp8W2

* fixup: update stale 'at import time' docstrings and error messages

Addresses pre-PR review findings: test_spec_coverage.py assertion message
still referenced 'at import time', and _ensure_pydantic_schemas_applied
docstring understated the in-place mutation and misdirected to
get_tools_for_handler instead of create_mcp_tools.

https://claude.ai/code/session_01NnoQN3c6Wi5LY5DEUBp8W2

---------

Co-authored-by: Claude <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

feat(decisioning): MockAdServer Protocol + /_debug/traffic counters for anti-façade compliance
