Summary
When a client sends notifications/cancelled for an in-flight request handled over stdio, the server's receive loop task group dies. The process stays alive but stops reading stdin, so every subsequent request hangs and the client eventually reports MCP error -32000: Connection closed. The bug is racy — it reproduces in roughly 40–80% of attempts depending on platform timing.
Affects: mcp 1.26.0 and 1.27.1 (latest at time of writing). Confirmed with FastMCP 3.1.x and the in-tree stdio server.
Reproduction
A standalone subprocess test is in mlorentedev/hive at tests/test_transport_recovery.py. The relevant flow (a condensed driver sketch follows the list):
1. initialize → ack
2. notifications/initialized
3. tools/call id=2
4. notifications/cancelled for requestId=2 (within a few ms of step 3)
5. Receive {"id": 2, "error": {"code": 0, "message": "Request cancelled"}} ← OK
6. tools/call id=3
7. No response. The server is alive, but proc.stdout.readline() hangs.
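For orientation, here is a condensed, hypothetical driver for that flow. The server module ("my_mcp_server") and tool name ("slow_tool") are placeholders; the real test in tests/test_transport_recovery.py is the authoritative version.

```python
import json, subprocess, sys

# Launch a placeholder stdio MCP server; messages are newline-delimited JSON-RPC.
proc = subprocess.Popen(
    [sys.executable, "-m", "my_mcp_server"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    text=True, encoding="utf-8",
)

def send(msg):
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()

send({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {
    "protocolVersion": "2024-11-05", "capabilities": {},
    "clientInfo": {"name": "repro", "version": "0"}}})
print(proc.stdout.readline())                      # initialize result
send({"jsonrpc": "2.0", "method": "notifications/initialized"})
send({"jsonrpc": "2.0", "id": 2, "method": "tools/call",
      "params": {"name": "slow_tool", "arguments": {}}})
send({"jsonrpc": "2.0", "method": "notifications/cancelled",
      "params": {"requestId": 2, "reason": "user cancelled"}})
print(proc.stdout.readline())                      # error response for id=2
send({"jsonrpc": "2.0", "id": 3, "method": "tools/call",
      "params": {"name": "slow_tool", "arguments": {}}})
print(proc.stdout.readline())                      # hangs here when the bug triggers
```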
Root cause
mcp.shared.session.RequestResponder.__exit__:
```python
def __exit__(self, exc_type, exc_val, exc_tb):
    try:
        if self._completed:
            self._on_complete(self)
    finally:
        self._entered = False
        ...
        self._cancel_scope.__exit__(exc_type, exc_val, exc_tb)  # ← (A)
```
When notifications/cancelled arrives, RequestResponder.cancel() calls self._cancel_scope.cancel() and sends the error response. The handler task catches the CancelledError in mcp/server/lowlevel/server.py (around line 766) and returns. The with responder: block then exits with exc_type=None while the cancel scope is still in cancelled state — at line (A), anyio's CancelScope.__exit__ re-raises CancelledError.
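The interaction can be poked at outside mcp with a few lines of anyio. This is a hypothetical probe, not mcp code: it cancels a scope, swallows the delivered cancellation inside the scope body (as the request handler does), and then lets the scope exit with exc_type=None. The script reports whether the installed anyio version re-raises the cancellation at the scope exit.

```python
import anyio

async def main():
    with anyio.CancelScope() as scope:   # stands in for the responder's cancel scope
        scope.cancel()                   # what RequestResponder.cancel() triggers
        try:
            await anyio.sleep(1)         # checkpoint: the cancellation is delivered here
        except anyio.get_cancelled_exc_class():
            pass                         # swallowed, like the handler in server.py
    # exc_type is None here; the question is whether the scope exit re-raises.

try:
    anyio.run(main)
    print("scope exit absorbed the cancellation on this anyio version")
except BaseException as exc:
    print(f"cancellation resurfaced at scope exit: {exc!r}")
```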
That exception bubbles up to:
```python
async with anyio.create_task_group() as tg:
    async for message in session.incoming_messages:
        tg.start_soon(self._handle_message, ...)
```
…in Server._run. anyio task groups cancel all sibling tasks and propagate the cancellation. The receive loop is one of those sibling tasks, so it dies.
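To see why a single escaping exception is fatal to the whole loop, here is a small, self-contained anyio sketch (hypothetical code, not from mcp): one child task stands in for the receive loop, another for the handler, and a plain RuntimeError is used in place of the leaked cancellation to show the group-wide effect.

```python
import anyio

async def receive_loop():
    # Stands in for "async for message in session.incoming_messages".
    try:
        while True:
            await anyio.sleep(0.1)
    except anyio.get_cancelled_exc_class():
        print("receive loop cancelled -> stdin is no longer read")
        raise

async def handler():
    # Stands in for a request handler whose exception escapes the task.
    await anyio.sleep(0.05)
    raise RuntimeError("escaped the handler task")

async def main():
    try:
        async with anyio.create_task_group() as tg:
            tg.start_soon(receive_loop)
            tg.start_soon(handler)
    except Exception as exc:   # anyio 4 wraps this in an ExceptionGroup
        print(f"task group failed: {exc!r}")

anyio.run(main)
```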
Suggested fix
Swallow the spurious cancellation when the responder has already sent its response:
```python
def __exit__(self, exc_type, exc_val, exc_tb):
    try:
        if self._completed:
            self._on_complete(self)
    finally:
        self._entered = False
        if not self._cancel_scope:
            raise RuntimeError("No active cancel scope")
        try:
            self._cancel_scope.__exit__(exc_type, exc_val, exc_tb)
        except BaseException as exc:
            if self._completed and isinstance(exc, anyio.get_cancelled_exc_class()):
                # cancel() already sent the error response — the scope's
                # re-raised cancellation is spurious.
                return
            raise
```
Mirrors what we ship in hive src/hive/_compat.py.
Workaround used downstream
We monkey-patch RequestResponder.__exit__ at import time. The patch is self-gated (only fires when _completed=True AND the leaking exception is anyio.get_cancelled_exc_class()), so it stays inert once a fix lands upstream.
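For reference, a minimal sketch of that patch; the names in hive's src/hive/_compat.py may differ, and the gating logic is the point:

```python
import anyio
from mcp.shared.session import RequestResponder

_original_exit = RequestResponder.__exit__

def _patched_exit(self, exc_type, exc_val, exc_tb):
    try:
        return _original_exit(self, exc_type, exc_val, exc_tb)
    except BaseException as exc:
        # Only swallow the specific leak described above: the responder already
        # sent its response and the escaping exception is a cancellation.
        if self._completed and isinstance(exc, anyio.get_cancelled_exc_class()):
            return None
        raise

RequestResponder.__exit__ = _patched_exit
```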
Environment
- Reproduced on Windows 11 (mcp 1.26.0, 1.27.1) and via Claude Code as the host.
- Tracked downstream in mlorentedev/hive#75.
- Regression test passes 5/5 with the patch applied on Python 3.12; without it, 2 to 4 of the 5 runs fail.
- Python 3.13 appears to have additional uncancel semantics that the patch does not yet fully cover — verification in progress.