feat(buyer-agent-registry): caching + rate-limit + audit emission (#380) #407

Merged

bokelley merged 3 commits into main from bokelley/issue-380-registry-cache-audit on May 3, 2026
Conversation

Contributor

@bokelley bokelley commented May 3, 2026

Summary

The v3 seller's Tier 2 commercial-identity gate (BuyerAgentRegistry) hits the DB on every dispatch with no cache, no rate limit, and no audit emission. Production sellers absorb unnecessary DB load, and the lookup endpoint doubles as a credential-stuffing oracle. This PR adds three composable wrappers and wires them into the v3 reference seller.

  • CachingBuyerAgentRegistry — TTL + LRU (default 60s / 4096 entries). Caches positive AND negative resolutions; negative caching closes the enumeration probe path so a probe walking arbitrary agent_url strings hits the DB at most once per (tenant, key) per TTL window. Hit-callback hook for Prometheus / OTel counters.
  • RateLimitedBuyerAgentRegistry — per-(tenant, lookup_key) token bucket (default 100 RPS). On exhaustion raises PERMISSION_DENIED with NO details and a generic message — wire-uniform with the registry-miss path from PR #393 ("fix(decisioning): Tier 2 codes → spec-conformant PERMISSION_DENIED (#375)"). A distinct RATE_LIMITED code would itself be an enumeration oracle.
  • AuditingBuyerAgentRegistry — terminal wrapper emitting one AuditEvent per DB outcome (resolved / miss). The cache and rate-limit layers accept the same audit_sink kwarg so cached_hit / cached_miss / rate_limited outcomes land in the same trail.
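A minimal sketch of the TTL + LRU mechanics described for CachingBuyerAgentRegistry (the internals below are assumptions — the real wrapper is async and keyed per (tenant, lookup_key); only the 60s / 4096 defaults and the cached-negative behavior come from this PR):

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Sketch of TTL + LRU caching with negative-resolution caching."""

    _MISS = object()  # sentinel so negative resolutions are cached too

    def __init__(self, ttl_seconds=60.0, max_entries=4096):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._cache = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._cache.get(key)
        if entry is None:
            return None  # cold: caller must resolve against the DB
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._cache[key]  # TTL expired: force a fresh resolve
            return None
        self._cache.move_to_end(key)  # refresh LRU position on hit
        return value  # may be _MISS: a cached negative resolution

    def put(self, key, value):
        self._cache[key] = (time.monotonic() + self._ttl, value)
        self._cache.move_to_end(key)
        if len(self._cache) > self._max:
            self._cache.popitem(last=False)  # evict least-recently-used
```

Caching `_MISS` is what closes the enumeration path: a probe walking arbitrary keys costs one DB query per (tenant, key) per TTL window, not one per request.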

Composition

The wrappers stack outside-in. Adopters build Caching(RateLimited(Auditing(SQL-backed))) — cache shortcuts repeated lookups before the rate limiter or DB sees them; rate limiter stops probe traffic before the DB; audit layer captures every actual lookup. The v3 reference seller's make_registry factory now composes this stack with the same audit sink at every layer.
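The outside-in stacking can be illustrated with toy stand-ins (the class bodies below are invented for illustration; only the layering order and the short-circuit behavior come from the PR — the rate-limit layer is omitted for brevity):

```python
from dataclasses import dataclass, field


@dataclass
class SQLBackedRegistry:
    """Innermost layer: every resolve here is a real DB query."""
    db_calls: int = 0

    def resolve(self, key):
        self.db_calls += 1
        return f"record:{key}"


@dataclass
class AuditingRegistry:
    """Terminal wrapper: records one audit entry per actual DB outcome."""
    inner: object
    trail: list = field(default_factory=list)

    def resolve(self, key):
        result = self.inner.resolve(key)
        self.trail.append(("resolved", key))
        return result


@dataclass
class CachingRegistry:
    """Outermost wrapper: repeated lookups never reach the inner layers."""
    inner: object
    cache: dict = field(default_factory=dict)

    def resolve(self, key):
        if key not in self.cache:  # only cold keys reach inner layers
            self.cache[key] = self.inner.resolve(key)
        return self.cache[key]


db = SQLBackedRegistry()
registry = CachingRegistry(AuditingRegistry(db))  # Caching(Auditing(DB))
registry.resolve("t1/agent-a")
registry.resolve("t1/agent-a")  # cache hit: DB and audit trail untouched
```

The ordering is the point: with the cache outermost, a repeated lookup costs a dict hit; the audit layer, sitting innermost before the DB, still sees every query that actually executes.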

Wire-shape contract

The rate-limit denial path matches PR #393's spec-conformant PERMISSION_DENIED shape: same code, same _denied_message text, same recovery="correctable", NO details. Rate-limited and not-recognized are wire-indistinguishable, preserving the spec's omit-on-unestablished-identity rule.
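The limiter itself is a standard token bucket; a sketch under the documented 100 RPS default (the refill arithmetic and the injectable clock are assumptions, not this PR's implementation):

```python
import time


class TokenBucket:
    """Per-(tenant, lookup_key) token bucket sketch.

    When allow() returns False, the wrapper raises PERMISSION_DENIED
    with exactly the same wire shape as a registry miss, so callers
    cannot distinguish "throttled" from "not recognized".
    """

    def __init__(self, rps=100.0, now=time.monotonic):
        self._rate = rps        # tokens refilled per second
        self._capacity = rps    # burst ceiling
        self._tokens = rps      # start full
        self._now = now         # injectable clock for testing
        self._last = now()

    def allow(self):
        t = self._now()
        elapsed = t - self._last
        self._tokens = min(self._capacity, self._tokens + elapsed * self._rate)
        self._last = t
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```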

Test plan

  • pytest tests/ -k "registry_cache or buyer_agent_registry" — 42 passed (23 new + 19 existing)
  • ruff check src/ examples/ — no new errors (6 pre-existing in unrelated example files)
  • mypy src/adcp/ — Success: no issues found in 746 source files
  • v3 reference seller smoke tests pass
  • Tier 2 dispatch + spec-conformance tests pass (44 / 44)

🤖 Generated with Claude Code

bokelley added a commit that referenced this pull request May 3, 2026
…-pack)

Code reviewer flagged that CachingBuyerAgentRegistry.invalidate() and
clear() mutated self._cache without holding self._lock. _store()'s
move_to_end / popitem(last=False) eviction races with concurrent
admin invalidate calls, risking RuntimeError or LRU-order corruption.

Convert both to async + acquire the lock before mutating. Test
updated to await the new coroutine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
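The described fix amounts to the following pattern (a sketch — the method names match the commit, the bodies are assumptions):

```python
import asyncio
from collections import OrderedDict


class CachingSketch:
    """invalidate()/clear() become coroutines that mutate the OrderedDict
    only while holding the same asyncio.Lock that _store()'s eviction
    path uses, so admin invalidation cannot race LRU reordering."""

    def __init__(self):
        self._cache = OrderedDict()
        self._lock = asyncio.Lock()

    async def _store(self, key, value):
        async with self._lock:
            self._cache[key] = value
            self._cache.move_to_end(key)

    async def invalidate(self, key):
        async with self._lock:  # was an unlocked sync mutation before the fix
            self._cache.pop(key, None)

    async def clear(self):
        async with self._lock:
            self._cache.clear()
```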
@bokelley bokelley closed this May 3, 2026
@bokelley bokelley reopened this May 3, 2026
bokelley and others added 3 commits May 2, 2026 22:26
Three composable wrappers around the BuyerAgentRegistry Protocol that
the v3 seller's Tier 2 commercial-identity gate hits on every dispatch.
Without them, every resolve runs a SQL query — including negative paths
an enumeration probe will spam, making the lookup endpoint a
credential-stuffing oracle.

* CachingBuyerAgentRegistry — TTL + LRU cache, default 60s / 4096
  entries. Caches BOTH positive and negative resolutions; the negative
  cache closes the enumeration probe path so a probe walking arbitrary
  agent_url strings hits the DB at most once per (tenant, key) per TTL
  window. Hit-callback hook for Prometheus / OpenTelemetry counters.
* RateLimitedBuyerAgentRegistry — per-(tenant, lookup-key) token
  bucket, default 100 RPS. On exhaustion raises PERMISSION_DENIED with
  NO details and a generic message — wire-uniform with the registry-
  miss path from PR #393. A distinct RATE_LIMITED code or populated
  details would itself be an enumeration oracle.
* AuditingBuyerAgentRegistry — terminal wrapper emitting one
  AuditEvent per DB outcome (resolved / miss). The cache and rate-
  limit layers also accept the same audit_sink kwarg so cached_hit /
  cached_miss / rate_limited outcomes land in the same trail.

The wrappers stack outside-in. Adopters compose Caching(RateLimited(
Auditing(SQL-backed))) — the cache shortcuts repeated lookups before
the rate limiter or DB sees them; the rate limiter stops probe traffic
before the DB; the audit layer captures every actual lookup.

Wires the v3 reference seller's TenantScopedBuyerAgentRegistry into
the production stack with the same audit sink at every layer so SecOps
can reconstruct every resolve attempt. The make_registry factory
accepts ttl_seconds / rps_per_tenant / max_entries overrides for
adopters with different SLA / volume requirements.

Tests: tests/test_buyer_agent_registry_cache.py (23 tests covering
cache hit / miss / TTL expiry / LRU eviction / rate-limit threshold
+ refill / audit emission per outcome / sink-failure isolation /
end-to-end composition).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley force-pushed the bokelley/issue-380-registry-cache-audit branch from 288da96 to dc1b39c Compare May 3, 2026 02:26
@bokelley bokelley merged commit 5a1e6b6 into main May 3, 2026
11 of 12 checks passed
bokelley added a commit that referenced this pull request May 3, 2026
… worktrees (#433)

* fix(testing): make a2a_compat_shim resilient to wrong a2a-sdk in /tmp worktrees

Attribute assignments like `pb.Role.user = pb.Role.ROLE_USER` at module
import time would raise AttributeError if a2a-sdk isn't at the pinned
version (>=1.0.1,<1.0.2), propagating through conftest.py's top-level
import and breaking pytest collection entirely. Agents running in fresh
/tmp worktrees with uninitialized environments hit this on PRs #391, #406, #407.

Two changes:
- `a2a_compat_shim.py`: introduce `_proto_alias()` helper that guards each
  attribute alias independently with hasattr + a per-alias RuntimeWarning
  (includes install command) rather than letting AttributeError propagate.
- `conftest.py`: wrap the shim import in try/except (ImportError|AttributeError)
  with a fallback to None; update the autouse fixture to no-op when the shim
  is unavailable, so collection always succeeds and only A2A tests fail.

https://claude.ai/code/session_01AnL37fUet4e3yXt9YBxd7a

* fix(testing): stacklevel=2 in _proto_alias + document _STATE_STRING_MAP asymmetry

stacklevel=2 makes the per-alias warning point at the _proto_alias() call
site in the module body (the useful diagnostic location) rather than at
the warnings.warn() line inside the helper.

Add a comment at _STATE_STRING_MAP explaining that any AttributeError from
the dict literal is caught by conftest.py's import guard, so the different
guard pattern is intentional and collection still succeeds.

https://claude.ai/code/session_01AnL37fUet4e3yXt9YBxd7a

---------

Co-authored-by: Claude <noreply@anthropic.com>
bokelley added a commit that referenced this pull request May 3, 2026
…readiness flake (#435)

* perf(server): lazy-load Pydantic outputSchema generation to fix storyboard readiness flake

_generate_pydantic_schemas(), _generate_pydantic_output_schemas(), and
_apply_pydantic_schemas() previously ran at module import time, causing
heavy Pydantic type imports to race with the storyboard readiness probe
and producing "Agent unreachable" failures across PRs #391, #405, #406, #407.

Generation is now deferred to the first get_tools_for_handler() call (which
fires during create_mcp_tools() at server construction, not at import time).
_PYDANTIC_SCHEMAS and _PYDANTIC_OUTPUT_SCHEMAS start as empty dicts and are
populated via .update() so external references stay valid. The _schemas_applied
sentinel makes subsequent calls no-ops (~0ms overhead on the hot path).

Import-time delta: ~4.5s of schema generation is moved from `import adcp.server`
to the first `create_mcp_tools()` call.

Tests updated: conftest.py gains a session-scoped autouse fixture that triggers
lazy init before any test reads ADCP_TOOL_DEFINITIONS schema fields; stale
"at import time" references in docstrings and error messages are updated.

Closes #412

https://claude.ai/code/session_01NnoQN3c6Wi5LY5DEUBp8W2

* fixup: update stale 'at import time' docstrings and error messages

Addresses pre-PR review findings: test_spec_coverage.py assertion message
still referenced 'at import time', and _ensure_pydantic_schemas_applied
docstring understated the in-place mutation and misdirected to
get_tools_for_handler instead of create_mcp_tools.

https://claude.ai/code/session_01NnoQN3c6Wi5LY5DEUBp8W2

---------

Co-authored-by: Claude <noreply@anthropic.com>