Skip to content

feat(server)!: pluggable TaskStore on A2A (#224)#230

Merged
bokelley merged 3 commits intomainfrom
bokelley/a2a-pluggable-taskstore
Apr 20, 2026
Merged

feat(server)!: pluggable TaskStore on A2A (#224)#230
bokelley merged 3 commits intomainfrom
bokelley/a2a-pluggable-taskstore

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Summary

Closes #224 (Phase 2 PR-N from the SDK adoption roadmap). First of three A2A hooks salesagent needs to drop their 2,288-LOC custom A2A server.

Test plan

  • `pytest tests/ --ignore=tests/conformance --ignore=tests/integration` — 1500 passed, 15 skipped (up from 1497; +3 task_store tests)
  • `ruff check src/ tests/ examples/` — clean
  • `mypy src/adcp/` — 0 errors
  • `examples/a2a_db_tasks.py` round-trip + durability verified manually (save→close conn→reopen→get→delete)
  • Integration test `test_task_store_survives_server_recreation` proves a shared store outlives the Starlette app (the "restart" property downstream actually needs)

Follow-ups

Once those land, A2A adoption reaches parity with MCP for production agents.

🤖 Generated with Claude Code

bokelley and others added 3 commits April 20, 2026 02:27
…option

Closes #224 (Phase 2 PR-N from .context/sdk-adoption-roadmap.md).

`create_a2a_server()` now accepts `task_store: TaskStore | None`, and
threads it through to `DefaultRequestHandler(task_store=...)`. `serve()`
surfaces the same kwarg so `serve(..., transport="a2a", task_store=...)`
works end-to-end. When omitted, behavior is unchanged — default
`InMemoryTaskStore` keeps existing adopters green.

This is the first of three A2A hooks salesagent's 2,288-LOC custom A2A
server replaces. With #225 (push-notif config store) and #226 (per-skill
middleware) landing next, their ~13-day A2A migration becomes ~5-day
and net-positive.

### Reference impl — examples/a2a_db_tasks.py

SQLite-backed `TaskStore` subclass showing the full contract
(save / get / delete) and durability across process restart. SQLite
chosen as the reference target: stdlib-only, no infra, schema sketch
translates directly to Postgres/MySQL. Docstring calls out the gaps
between reference and production (async driver, transaction pattern
for same-transaction commits with handler writes, TTL/GC).

### Tests

tests/test_a2a_server.py:
- test_create_a2a_server_defaults_to_in_memory_task_store — preserves
  pre-#224 behavior (omitting the kwarg is a no-op).
- test_create_a2a_server_accepts_custom_task_store — asserts the
  store threads through to DefaultRequestHandler.task_store via a
  `_extract_default_request_handler()` helper that walks the
  a2a-sdk Starlette graph (localises the blast radius when a2a-sdk
  internals move).
- test_task_store_survives_server_recreation — end-to-end property:
  create two A2A apps sharing a store, prove tasks written by app A
  are readable by app B.

### Docs

docs/handler-authoring.md — A2A section's "InMemoryTaskStore caveat"
bullet upgraded to a "Durable task storage" subsection with the
`SqliteTaskStore` recipe. Known-gaps bullets link to #225/#226.
serve.py + create_a2a_server docstrings document the new kwarg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + hardening

Addresses security and code reviewer findings on PR #230.

SECURITY — HIGH: reference SqliteTaskStore taught cross-tenant leakage
The TaskStore ABC passes a ServerCallContext with the authenticated
user on every call, but the reference impl ignored it. Durability
turned a process-local ambiguity into a permanent, enumerable
disclosure: any principal that learns another tenant's task id
retrieves that tenant's full task — history, artifacts, PII.

Fix in examples/a2a_db_tasks.py:
- New `scope` column on the a2a_tasks table, composited into the PK
  `(scope, task_id)`.
- `_scope_from_context()` derives the scope from
  `context.user.user_name` (with `is_authenticated` check), falling
  back to `__anonymous__` when unauth'd — so anonymous and
  authenticated tasks never collide.
- Every `save` / `get` / `delete` passes the scope into a `WHERE scope
  = ?` clause. Cross-scope reads return None; cross-scope deletes
  affect zero rows.

HARDENING — SqliteTaskStore additional followups
- `_conn()` now rolls back on exception (port-to-psycopg safety).
- SQLite file is `chmod 0o600` on first creation so a co-tenant process
  can't read buyer-supplied task history on a shared host.
- `INSERT OR REPLACE` last-writer-wins caveat documented at the SQL
  site and in the "Not production-ready" block.
- Buyer-supplied history in `Task.model_dump_json` now explicit: the
  docstring calls out that persisted history = plaintext conversation
  content on disk.

DOCS — docs/handler-authoring.md "Four things a durable TaskStore MUST do"
Surfaces the four most common production mistakes (principal scoping,
file permissions, concurrent-save correctness, terminal-task GC) so
readers who skip the example file don't ship a leak.

TESTS — tests/test_a2a_server.py
- `test_custom_task_store_receives_saves_from_skill_dispatch` replaces
  the previous test's misleading claim. The old version only proved
  Python GC; this one calls `DefaultRequestHandler.on_get_task` and
  asserts the custom store's recording set receives the lookup. If
  a2a-sdk ever bypasses `.task_store`, this fails even when the kwarg
  still gets set.
- `test_task_store_persists_across_app_recreation`: retained and
  scoped to the property it actually proves (store reusable across
  multiple app instances).
- `test_sqlite_task_store_isolates_scopes_by_context`: loads the
  reference example via importlib and asserts tenant A's task is not
  returned to tenant B — pins the example's security claim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_server

PR #230 CI failed on Python 3.10 with ``TypeError: Cannot subclass
typing.Any`` at ``a2a.server.apps.jsonrpc.__init__``. Root cause: the
a2a-sdk Starlette integration uses 3.11+ typing patterns. The existing
``test_create_a2a_server_creates_starlette_app`` already guards with
``@pytest.mark.skipif(sys.version_info < (3, 11))``; the four new
tests added in #230 didn't, so they collected and blew up.

Adds the same guard to:

- test_create_a2a_server_defaults_to_in_memory_task_store
- test_create_a2a_server_accepts_custom_task_store
- test_custom_task_store_receives_saves_from_skill_dispatch
- test_task_store_persists_across_app_recreation

test_sqlite_task_store_isolates_scopes_by_context doesn't call
create_a2a_server (it exercises the SqliteTaskStore directly via
importlib), so no skip needed — confirmed passing on 3.10 in the
failing CI run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit b2e22a5 into main Apr 20, 2026
10 checks passed
bokelley added a commit that referenced this pull request Apr 20, 2026
…on, loud anonymous fallback, SSRF/secret-storage docs

Addresses code-reviewer + security-reviewer findings on PR #232.

CODE — delete misleading inline comment + branching
- src/adcp/server/a2a_server.py: the "only forward when explicitly
  provided" branch was premised on claims that DefaultRequestHandler
  defaults push_config_store to an in-memory store and that passing
  None disables push-notifs in some a2a-sdk versions. Neither claim
  is true — DefaultRequestHandler stores push_config_store verbatim
  and None == omitted. Pass unconditionally, delete the dead branch
  and the wrong comment.

EXAMPLE — SqlitePushNotificationConfigStore hardening
- scope_provider injection: takes Callable[[], str | None] in __init__
  so sellers wire their own auth integration without reaching into
  the module-level `_current_push_config_scope` ContextVar. Default
  provider reads the existing ContextVar — current adopters unbroken.
- Anonymous-scope UserWarning: emitted once per store instance the
  first time scope_provider returns None. Silent fall-through was
  the multi-tenant footgun security review flagged — operators
  shipping the example to prod without wiring auth got one giant
  shared webhook pool with no signal.
- config_id collision fix: two clients registering on the same task
  without an explicit id used to collide on the PK via INSERT OR
  REPLACE. Now synthesises a UUID per call.

DOCS — threats the reference example teaches
- examples/a2a_db_tasks.py module docstring: new "Security model —
  push-notification config store adds two threats tenant-scoping
  alone does NOT address" covering SSRF on webhook URLs and
  plaintext storage of auth secrets.
- SqlitePushNotificationConfigStore docstring: explicit callout for
  the ContextVar-across-background-asyncio.Task fragility on the
  push-notif sender path.
- docs/handler-authoring.md "Durable push-notification config
  storage" section expands from one paragraph to three numbered
  MUSTs: (1) URL allowlist, (2) secret-at-rest handling, (3)
  per-principal scoping. Parallel structure to the TaskStore MUSTs
  above. Operator-facing failure modes subsection explains why the
  UserWarning is a P0 signal.

TESTS — tests/test_a2a_server.py (27 passing, +3 this round)
- test_custom_push_config_store_receives_sets_from_handler: behavioral
  test mirroring #230's TaskStore analog. Drives
  on_set_task_push_notification_config through the extracted handler
  and asserts the recording store's .sets grew. Defends against
  a2a-sdk renaming `_push_config_store`.
- test_sqlite_push_config_store_warns_once_on_anonymous_scope: locks
  the warning contract — fires exactly once per store instance.
- test_sqlite_push_config_store_synthesises_config_id_when_omitted:
  two sets on same task without explicit id produce two rows.

Follow-ups (not blocking this PR): URL allowlist helper shipped from
adcp.server; field-level envelope encryption for webhook secrets;
BasePushNotificationSender subclass that re-sets the ContextVar on
the background sender path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 2: Pluggable TaskStore on create_a2a_server (PR-N)

1 participant