Skip to content

feat(server): request-body size cap middleware (closes #239)#241

Merged
bokelley merged 2 commits intomainfrom
bokelley/request-size-cap
Apr 20, 2026
Merged

feat(server): request-body size cap middleware (closes #239)#241
bokelley merged 2 commits intomainfrom
bokelley/request-size-cap

Conversation

@bokelley
Copy link
Copy Markdown
Contributor

Summary

Closes #239 — the Medium security finding deferred from PR #238. Every tool call runs Pydantic `model_validate` unconditionally now that typed-dispatch is live; without an input-size bound, adversarial callers can exhaust validation CPU/memory with arbitrarily large JSON.

Adds an ASGI middleware installed at the `serve()` bind point — caps HTTP request bodies at 10 MB by default, rejects oversized bodies with HTTP 413 before FastMCP or a2a-sdk parse a byte.

Two enforcement layers

  • Content-Length fast-fail. Advertised body over the cap → reject without reading.
  • Streaming accounting. No Content-Length (chunked transfers) → middleware buffers + counts, 413s the moment the total crosses the cap.

GET / HEAD / OPTIONS bypass (no request body). Lifespan + WebSocket scopes pass through untouched.

API

```python

Default: 10 MB

serve(MyAgent())

Sellers with legitimately large creative asset payloads:

serve(MyAgent(), max_request_size=50 * 1024 * 1024)

Public-facing deployments that only accept small payloads:

serve(MyAgent(), max_request_size=256 * 1024)

Escape hatch — enforce elsewhere (reverse proxy / WAF):

serve(MyAgent(), max_request_size=0)
```

Covers both transports (MCP streamable-http + SSE, A2A). stdio skips since there's no HTTP body to police.

Test plan

  • 11 ASGI-level tests, all passing
  • Content-Length reject (over cap), accept (at cap), accept (under cap)
  • Malformed Content-Length falls through to streaming check
  • Chunked body exceeding cap mid-stream rejects; inner app never invoked
  • Chunked body under cap passes through verbatim
  • GET/HEAD/OPTIONS bypass
  • Lifespan scope passes through untouched
  • `max_request_size=0` opt-out returns app unwrapped
  • Explicit int overrides default
  • Full suite: 1876/1876 pass
  • mypy clean (673 source files)

🤖 Generated with Claude Code

bokelley and others added 2 commits April 20, 2026 11:58
PR #238 security review flagged that typed-handler dispatch runs
Pydantic model_validate on every tool call with no input-size bound.
Adversarial callers could submit arbitrarily large JSON and exhaust
validation CPU/memory. Transport layer (FastMCP streamable-HTTP,
a2a-sdk Starlette) imposes no uniform cap, so this was pre-existing —
PR #238 meaningfully increased the blast radius.

Fix: ASGI middleware at bind time, before FastMCP / a2a-sdk see the
request. Two enforcement layers:

- Content-Length fast-fail: advertised body over cap is rejected with
  HTTP 413 before a byte is read.
- Streaming accounting: chunked transfers (no Content-Length) are
  counted as they arrive and 413'd the moment total crosses cap.

GET/HEAD/OPTIONS bypass the check — no request body in any protocol
we serve. Lifespan + WebSocket scopes pass through untouched.

Default: 10 MB. Big enough for realistic AdCP payloads (multi-package
create_media_buy with embedded creatives runs ~1-2 MB). Small enough
that adversarial traffic can't trivially exhaust a single worker.

Tune via `serve(..., max_request_size=N)`. Set to 0 to opt out —
documented as "not recommended, enforce at reverse proxy/WAF
instead."

Both transports covered (MCP streamable-http + SSE, A2A). stdio
skips since there's no HTTP body to police.

- 11 ASGI-level tests: content-length reject at/over/under cap,
  malformed header, chunked mid-stream reject, GET/HEAD/OPTIONS
  bypass, lifespan passthrough, wrapper opt-out, explicit override.
- Docs section in handler-authoring.md with tuning examples.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code reviewer:
- Drop the silent-truncate branch for unexpected ASGI message types.
  http scope only defines http.request / http.disconnect; the previous
  `chunks.append(b"")` + break fallback masqueraded a truncated body
  as complete. Now skips defensively with a debug log.
- max_request_size < 0 now raises ValueError at configure time.
  Negative values have no meaningful interpretation and are almost
  certainly a typo; failing loud beats silent opt-out.
- 413 body now includes the cap value: "Payload too large. Maximum
  request body size is N bytes." The cap isn't a secret; forcing
  legitimate clients to grep docs to find the limit is bad DX.
- http.disconnect path documented — inner app hasn't been invoked
  yet, so dropping the request on the floor is safe.

Security reviewer (Medium):
- M2: max_request_size=0 now emits a WARNING log. The only Pydantic-
  validation DoS guard getting silently disabled by a typo is the
  failure mode we're explicitly preventing.
- M1: docs section expanded with a "What this cap does NOT bound"
  subsection covering slow-loris attacks and the uvicorn / reverse-
  proxy timeout levers that actually bound duration. Plus memory-
  budgeting guidance (workers × concurrency × max_request_size).
- L1: regression test for single-chunk oversized body (catches naive
  "read at least one chunk" refactors).
- L3: debug log when Content-Length is malformed (operators probing
  WAF behavior get a breadcrumb).

Test suite: 15 size-cap tests, 1880 full-suite tests, all passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@bokelley bokelley merged commit 21249a5 into main Apr 20, 2026
10 checks passed
bokelley added a commit that referenced this pull request Apr 20, 2026
Closes 5 items from salesagent's feedback on adopting adcp.server in
one cohesive server/transport surface change.

SkillMiddleware parity across transports (#7)
---------------------------------------------
The A2A executor's per-skill middleware (PR #233) is now available on
MCP too. Same SkillMiddleware type alias, same composition semantics
(outermost-first, _step recursion), same call_next contract — a
middleware list written against one transport works unchanged on the
other.

- src/adcp/server/serve.py: new module-level _dispatch_with_middleware
  that A2A's _dispatch_with_middleware delegates to.
- create_mcp_server, _register_handler_tools, _register_tool accept
  middleware=[SkillMiddleware]; _register_tool wraps caller in the
  chain between context build and handler invocation.
- serve() already exposed the kwarg for A2A; now forwards to MCP too.

BearerTokenAuthMiddleware in adcp.server.auth (#1)
--------------------------------------------------
The pattern in examples/mcp_with_auth_middleware.py was four
security-critical concerns (ContextVar carrier, constant-time compare,
discovery bypass, reset-in-finally); every downstream copy-pasted it.
Now shipped as a class.

- src/adcp/server/auth.py: BearerTokenAuthMiddleware, Principal
  (frozen dataclass), TokenValidator, auth_context_factory,
  constant_time_token_match. Seller supplies validate_token; framework
  owns the ContextVar plumbing, RFC 7235 scheme parsing (case-
  insensitive + whitespace-folded), discovery bypass, peek_jsonrpc
  with explicit request._body cache, fail-closed validator exception
  handling, principal metadata that can't shadow SDK audit keys.
- examples/mcp_with_auth_middleware.py shrunk 243 → 89 lines.

A2A message_parser hook (#3)
----------------------------
ADCPAgentExecutor._parse_request was hardcoded to
DataPart({'skill': ..., 'parameters': ...}). Sellers fronting JSON-RPC
or vendor-specific shapes had to subclass privately.

- src/adcp/server/a2a_server.py: new MessageParser type alias,
  message_parser= kwarg on ADCPAgentExecutor, create_a2a_server,
  _serve_a2a, serve(). Default = _default_parse_request (was inline).

Startup advertised-tools log (#9)
---------------------------------
- src/adcp/server/serve.py: _log_advertised_tools() runs from
  _register_handler_tools (MCP) and create_a2a_server (A2A).
  INFO: 'X of Y tools advertised'; DEBUG: list of unadvertised.

Custom tools doc (#8)
---------------------
docs/handler-authoring.md: new section covering the @mcp.tool()
passthrough on create_mcp_server's return value.

Expert-review followups (security + code review)
-------------------------------------------------
- _parse_bearer_header: case-insensitive scheme, folded whitespace.
- validator exceptions → 401 (no stack-trace leak).
- principal metadata can't shadow SDK-owned keys (tool_name,
  transport).
- explicit request._body = body after peek.
- tests use regex to match log messages (not positional tokens).
- Python 3.10 skipif on two new A2A create_a2a_server tests (a2a-sdk
  starlette integration requires 3.11+; matches pre-existing skip).

Tests
-----
+53 tests across three new/modified test files. 1990 tests passing,
mypy clean.

Closes #224, #225, #226, #240, #241 salesagent feedback items #1,
#3, #7, #8, #9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sdk(server): bound request input size before Pydantic model_validate

1 participant