feat(server): default MCP streamable-http to stateful with idle eviction#636
Merged
Stateless was the production-broken default: upstream MCP holds GET-SSE streams open with no idle eviction, causing connection accumulation under load. Stateful mode + session_idle_timeout=1800s (the knob added in mcp 1.27.0) is the production-safe shape.

To make stateful safe by default, plumb the originating Starlette Request into RequestMetadata.request_context (sourced from upstream's mcp.server.lowlevel.server.request_ctx, set reliably in both dispatch paths). BearerTokenAuthMiddleware mirrors principal/tenant onto request.state in addition to ContextVars; auth_context_factory reads request.state first and falls back to ContextVars.

Adopters using the bundled middleware + factory get the fix for free. Adopters with custom factories using ContextVars need to migrate to meta.request_context.state for stateful (documented in the docstrings). Adopters who genuinely need stateless (multi-replica without sticky LB on Mcp-Session-Id) opt back in via stateless_http=True; the SDK suppresses session_idle_timeout in that combination to honor the upstream constructor's stateless+timeout RuntimeError contract.

Bumps mcp pin to >=1.27.0,<2.0. The upper bound is required because create_mcp_server pre-creates StreamableHTTPSessionManager (FastMCP doesn't expose session_idle_timeout in its settings) which reads four FastMCP private attrs whose contract is not preserved across majors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
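The stateless opt-out described in the commit message can be sketched as a small option resolver. This is a rough illustration only, not the SDK's code: `resolve_session_options` and the returned dict keys are hypothetical names, and the only behavior taken from the text is that stateful is the default with a 1800 s idle timeout and that the timeout is dropped when `stateless_http=True`.

```python
from typing import Any, Dict, Optional

# Default idle timeout in seconds, as named in this PR.
DEFAULT_SESSION_IDLE_TIMEOUT = 1800.0


def resolve_session_options(
    stateless_http: bool = False,
    session_idle_timeout: Optional[float] = DEFAULT_SESSION_IDLE_TIMEOUT,
) -> Dict[str, Any]:
    """Hypothetical helper: pick the options handed to the session manager."""
    if stateless_http:
        # The PR states that upstream rejects stateless + timeout with a
        # RuntimeError, so the timeout is suppressed rather than forwarded.
        return {"stateless": True, "session_idle_timeout": None}
    return {"stateless": False, "session_idle_timeout": session_idle_timeout}


default_options = resolve_session_options()
stateless_options = resolve_session_options(stateless_http=True)
```

Suppressing the timeout in the resolver keeps the opt-out a single flag: callers who pass `stateless_http=True` do not also have to remember to clear the timeout.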
Summary
- Default MCP streamable-http to stateful with session_idle_timeout=1800.0. Stateless was the production-broken default: upstream MCP holds GET-SSE streams open without idle eviction, causing connection accumulation under load (the issue this branch was opened against). Stateful + the session_idle_timeout knob added in mcp 1.27.0 is the production-safe shape.
- Plumb the originating Starlette Request into RequestMetadata.request_context so context_factory can read auth state through the stateful session-task boundary that breaks ContextVar propagation. The bundled BearerTokenAuthMiddleware + auth_context_factory are updated to use request.state (working in both modes) with a ContextVar fallback for backward compat.
- Bump the mcp pin to >=1.27.0,<2.0.

Why
Two architectural failures landed at the same time when this branch was opened:
- _handle_get_request which establishes an EventSourceResponse with no idle timer, no max-lifetime, no heartbeat-driven reap. In stateless mode this is also semantically useless: every request gets a fresh transport, so the standalone SSE writer can never deliver server-initiated messages. Pure resource sink.
- app.run() setup + session task spawn for every tools/call. Stateful keeps a session task alive and just dispatches into it.

Stateless became the default in PR #296 because it was bundled with json_response=True to dodge an upstream FastMCP SSE-streaming bug, but those flags are orthogonal. This PR decouples them: streaming_responses (the public knob) only controls json_response now; stateless_http is a separate kwarg with False as the new default.

What unblocks the default flip
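The flip turns on how ContextVars behave across task boundaries. A minimal sketch in plain asyncio (not the SDK's code; `principal_var`, `session_task` and `handle_request` are illustrative names) shows a long-lived task that never sees a ContextVar set by a later request task, while state carried on the message itself does arrive:

```python
import asyncio
import contextvars
from typing import List, Optional, Tuple

# Stand-in for a ContextVar that auth middleware would set per request.
principal_var: contextvars.ContextVar = contextvars.ContextVar(
    "principal", default=None
)


async def session_task(inbox: asyncio.Queue, seen: List[Tuple]) -> None:
    """Long-lived task: its context was copied when the task was created."""
    while True:
        message = await inbox.get()
        if message is None:  # shutdown sentinel
            break
        # Record what the ContextVar shows vs. what travelled with the message.
        seen.append((principal_var.get(), message["state"]["principal"]))


async def handle_request(inbox: asyncio.Queue, principal: str) -> None:
    """Per-request task: sets the ContextVar, then hands off to the session."""
    principal_var.set(principal)
    await inbox.put({"state": {"principal": principal}})


async def main() -> List[Tuple[Optional[str], str]]:
    inbox: asyncio.Queue = asyncio.Queue()
    seen: List[Tuple] = []
    worker = asyncio.create_task(session_task(inbox, seen))
    # Each request runs in its own task, as it would under an HTTP server.
    await asyncio.create_task(handle_request(inbox, "alice"))
    await asyncio.create_task(handle_request(inbox, "bob"))
    await inbox.put(None)
    await worker
    return seen


seen = asyncio.run(main())
print(seen)  # [(None, 'alice'), (None, 'bob')]
```

The first element of each pair stays `None` because asyncio tasks run in a copy of the context taken at task creation, so values set later in a different task are invisible to the session task.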
The first attempt at this flip uncovered a real architectural issue: the stateful session task is a separate async task from the HTTP request task, so middleware-set ContextVars don't propagate to dispatch. That broke the documented auth pattern in examples/mcp_with_auth_middleware.py.

This PR solves it by:

- Adding RequestMetadata.request_context: Any (the originating Starlette Request).
- Sourcing it from mcp.server.lowlevel.server.request_ctx.get().request, the upstream MCP contextvar that's reliably set in both dispatch paths inside the dispatch sub-task.
- Updating BearerTokenAuthMiddleware and A2ABearerAuthMiddleware to mirror principal/tenant onto request.state alongside the ContextVars.
- Updating auth_context_factory to prefer request.state (works in both stateless and stateful) and fall back to ContextVars (works in stateless and A2A only).

End-to-end test (tests/test_mcp_stateful_session.py::test_stateful_auth_propagates_via_request_state) wires the real middleware, real session manager, real factory, real handler and asserts caller_identity/tenant_id arrive correctly through stateful streamable-http.

Migration
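For adopters with a custom context_factory, the move from ContextVars to request state might look roughly like the sketch below. Everything here is an assumption for illustration: `custom_context_factory`, the two ContextVars, the `principal`/`tenant_id` state keys and the dict return shape are hypothetical, and `SimpleNamespace` objects stand in for the real `RequestMetadata` and Starlette `Request`. The fallback is a simplified, two-field version of the prefer-request.state, fall-back-to-ContextVars idea.

```python
import contextvars
from types import SimpleNamespace
from typing import Any, Dict, Optional

# Hypothetical legacy ContextVars that a ContextVar-only middleware would set.
current_principal: contextvars.ContextVar = contextvars.ContextVar(
    "current_principal", default=None
)
current_tenant: contextvars.ContextVar = contextvars.ContextVar(
    "current_tenant", default=None
)


def custom_context_factory(meta: Any) -> Dict[str, Optional[str]]:
    """Prefer request.state; fall back to ContextVars when nothing is there."""
    request = getattr(meta, "request_context", None)
    state = getattr(request, "state", None)
    principal = getattr(state, "principal", None)
    tenant = getattr(state, "tenant_id", None)
    if principal is None and tenant is None:
        # Nothing mirrored onto request.state: use the legacy path, which
        # only works where ContextVars propagate (stateless mode, A2A).
        principal = current_principal.get()
        tenant = current_tenant.get()
    return {"caller_identity": principal, "tenant_id": tenant}


# Stateful path: values arrive on the request's state object.
stateful_meta = SimpleNamespace(
    request_context=SimpleNamespace(
        state=SimpleNamespace(principal="alice", tenant_id="tenant-1")
    )
)
from_state = custom_context_factory(stateful_meta)

# Legacy path: no request attached, ContextVars set in the same task.
current_principal.set("bob")
current_tenant.set("tenant-2")
from_contextvars = custom_context_factory(SimpleNamespace(request_context=None))
```

Reading through `getattr` with defaults keeps the factory tolerant of metadata that carries no request at all, so the same function can serve both modes during a migration.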
Adopters using the bundled BearerTokenAuthMiddleware + auth_context_factory: no action needed; the fix is wired in.

Adopters with custom context_factory using ContextVars: works on stateless, breaks on stateful. Migrate to read meta.request_context.state.<your_key>. The SDK still threads ContextVars on stateless mode and A2A for backward compat.

Adopters who genuinely need stateless (multi-replica without sticky LB on Mcp-Session-Id): opt in with stateless_http=True. The SDK suppresses session_idle_timeout in that combo to honor the upstream RuntimeError contract.

Test plan
- pytest tests/ --ignore=tests/integration --ignore=tests/conformance: 3807 passed, 17 skipped, 1 xfailed (one wall-clock-flake test deselected; pre-existing, unrelated to this branch)
- ruff check src/ tests/: clean
- mypy src/adcp/server/serve.py src/adcp/server/auth.py: clean
- BearerTokenAuthMiddleware + auth_context_factory + real StreamableHTTPSessionManager
- tests/test_mcp_middleware_composition.py from ContextVar-only to request.state pattern (proves new path works under default stateful)

Expert review
Two passes by code-reviewer, one each by ad-tech-protocol-expert and python-expert:

- Mcp-Session-Id and AdCP context_id are orthogonal; idempotency / ctx_metadata echo unaffected because RequestContext is built per-request inside fn().
- FastMCP doesn't expose session_idle_timeout in mcp 1.27.x, so pre-creating _session_manager is the only viable extension point. Required the <2.0 pin upper bound (which is included).
- request_ctx is reliably set across both dispatch paths, request.state survives Starlette's BaseHTTPMiddleware (it's a live view onto scope["state"]), and the all-three-None fallback heuristic is safe (each request has a fresh scope["state"]).

🤖 Generated with Claude Code