[v1.x] fix(stdio): handle BrokenResourceError in stdout_reader race (#1960)#2449
Conversation
…contextprotocol#1960) Fix the race condition where `read_stream_writer.aclose()` in the `finally` block of `stdio_client` can close the stream while `stdout_reader` is mid-`send`, producing an unhandled `anyio.BrokenResourceError` that propagates through the task group and surfaces as an `ExceptionGroup` to the caller. Root cause: the two `send` sites in `stdout_reader` did not catch `BrokenResourceError`. The outer `except` only caught `ClosedResourceError`, which is a distinct anyio exception class raised on already-closed streams. `BrokenResourceError` is raised when the receiver is closed while a `send` is in flight, which is the exact shape of this shutdown race. Fix: wrap each `read_stream_writer.send(...)` call in a `try`/`except` that catches both `ClosedResourceError` and `BrokenResourceError`, and return cleanly. Also widen the outer `except` to the same union for defense in depth. Verified by driving a stdio MCP server (jules-mcp-server) through mcp2cli, which previously always surfaced the ExceptionGroup traceback; after the patch, tool calls return cleanly. Github-Issue:modelcontextprotocol#1960
There was a problem hiding this comment.
Pull request overview
Fixes a shutdown race in the stdio transport where stdout_reader could raise anyio.BrokenResourceError during an in-flight send() when the stdio client context exits quickly, causing an ExceptionGroup to escape to callers.
Changes:
- Catch
anyio.BrokenResourceError(in addition toClosedResourceError) aroundread_stream_writer.send(...)calls insidestdout_reader. - Widen the outer
stdout_readerexception handler to also includeBrokenResourceErrorfor defense-in-depth.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| try: | ||
| await read_stream_writer.send(exc) | ||
| except (anyio.ClosedResourceError, anyio.BrokenResourceError): | ||
| # Context is closing; exit gracefully (issue #1960). | ||
| return | ||
| continue | ||
|
|
||
| session_message = SessionMessage(message) | ||
| await read_stream_writer.send(session_message) | ||
| except anyio.ClosedResourceError: # pragma: no cover | ||
| try: | ||
| await read_stream_writer.send(session_message) | ||
| except (anyio.ClosedResourceError, anyio.BrokenResourceError): | ||
| # Context is closing; exit gracefully (issue #1960). | ||
| # Happens when the caller exits the stdio_client context | ||
| # while the subprocess is still writing to stdout. | ||
| return | ||
| except (anyio.ClosedResourceError, anyio.BrokenResourceError): # pragma: no cover |
There was a problem hiding this comment.
Add a regression test that exercises the shutdown race by having the subprocess emit at least one complete JSON-RPC line to stdout while the stdio_client context exits quickly, and assert that no ExceptionGroup/BrokenResourceError escapes. Current stdio_client tests cover cleanup timing but don’t appear to trigger the in-flight send failure this change handles.
…kenResourceError race Covers the fix in the previous commit. Spawns a subprocess that emits a burst of valid JSONRPC notifications, exits the stdio_client context immediately, asserts no exception propagates. Fails before the fix (ExceptionGroup / BrokenResourceError), passes after. Wrapped in anyio.fail_after(5.0) per AGENTS.md. Github-Issue:modelcontextprotocol#1960
Summary
Fixes #1960. The
stdio_clientinmcp/client/stdio/__init__.pyhas a racecondition:
read_stream_writer.aclose()in thefinallyblock can close thestream while the background
stdout_readertask is mid-send, which raisesanyio.BrokenResourceErrorand propagates through the task group as anExceptionGroup, failing the caller.Root cause
The two
read_stream_writer.send(...)sites insidestdout_readerwere notguarded against
BrokenResourceError. The outerexceptonly caughtClosedResourceError, which is the sibling class raised on already-closedstreams;
BrokenResourceErroris raised when the receiver is closed duringan in-flight
send, which is the exact shape of this shutdown race.Fix
read_stream_writer.send(...)intry/exceptcatching bothClosedResourceErrorandBrokenResourceError, thenreturncleanly.exceptto the same union for defense in depth.Alternative considered
Issue #1960 suggests
tg.cancel_scope.cancel()before the stream closes infinally. That works but changes shutdown ordering (cancels user-definednotification handlers, etc.). The chosen fix is narrower: it treats the
shutdown race as a graceful exit from
stdout_readerspecifically, withoutaltering the task group lifecycle.
Reproduction
Any stdio MCP server that emits log-channel notifications alongside tool
responses will exhibit this when driven by a downstream wrapper (e.g.,
mcp2cli) that opens a freshstdio_clientper invocation. The task groupexits quickly, the subprocess is still flushing output, and the race is
triggered.
Verification
Drove
jules-mcp-server(an MCP server that emitsnotifications/messagelog frames during init) through
mcp2cliwith alist-schedulestool call.Before this patch:
After this patch:
Closes #1960.