fix: prevent stack overflow from re-entrant close() in StreamableHTTP transport#1705
Conversation
|
@modelcontextprotocol/client
@modelcontextprotocol/server
@modelcontextprotocol/express
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
… transport Add re-entrancy guard to WebStandardStreamableHTTPServerTransport.close() to prevent RangeError when multiple transports close simultaneously. Clear stream mapping before calling cleanup functions to avoid infinite recursion from cancel callbacks that delete from the mapping or chain to other close operations. Fixes modelcontextprotocol#1699
e631b98 to
5fcb221
Compare
travisbreaks
left a comment
There was a problem hiding this comment.
This is the most comprehensive of the three related PRs (#1700, #1704, #1705). Good to see the snapshot-before-clear pattern and individual try/catch on cleanup calls.
A few findings:
1. onclose not protected by finally
In close(), onclose?.() is called after the try/finally block:
try {
const entries = [...this._streamMapping.values()];
this._streamMapping.clear();
for (const { cleanup } of entries) { ... }
this._requestResponseMap.clear();
} finally {
// Don't reset _isClosing
}
this.onclose?.();If _requestResponseMap.clear() throws (or any future code added to the try block), onclose never fires. This should be inside the finally:
} finally {
this.onclose?.();
}2. Overlap with #1700 and #1704
This PR subsumes both:
If this PR lands, the other two should be closed or rebased. Worth noting in the PR description to help maintainers triage.
3. onerror before reject is good
The pattern of calling this.onerror?.(err) before reject(err) in send() is correct: it lets error handlers run before the promise rejection propagates. Consistent across all three transports.
4. Suppressed cleanup errors
The empty catch {} blocks in cleanup are fine for shutdown, but a debug-level log would help operators diagnose issues in production. Not a blocker.
Solid PR. The main actionable item is moving onclose into the finally block.
ctonneslan
left a comment
There was a problem hiding this comment.
Thanks for the thorough review!
Re: point 1 — onclose is already inside the finally block in the current diff. Might have been a misread of the code structure.
Re: point 2 — closed #1700 since this PR fully covers it. Note that #1704 (onerror callbacks on send methods) is actually complementary, not redundant — this PR doesn't touch the send() paths. I'll keep #1704 open and address your defensive wrapping suggestion there.
Re: point 4 — agreed, suppressed cleanup errors are fine for now. Happy to add debug logging in a follow-up if maintainers want it.
travisbreaks
left a comment
There was a problem hiding this comment.
Solid fix for the re-entrant close() problem. The _isClosing guard and snapshot-before-clear pattern are both correct approaches for this class of bug. A few observations:
-
Guard placement is good. The early return on
_isClosingprevents the stack overflow, and snapshotting_streamMapping.values()before calling.clear()prevents thecancelcallbacks from mutating the map mid-iteration. Clean solution. -
Consider resetting
_isClosingon error. Right now, if one of thecleanup()calls throws something unexpected that somehow bypasses the inner catch (e.g., a non-Error thrown value in some edge runtime), the_isClosingflag staystruepermanently. Sinceonclose?.()is in thefinallyblock, the transport signals closure either way, so this may be acceptable. But if there is any scenario whereclose()could be retried after a partial failure, the flag would block it. Might be worth a brief comment noting that_isClosingis intentionally permanent (one-shot). -
The inner
catch {}silently swallows errors. This is fine for "already-closed controllers" as the comment says, but consider whether logging at debug level would help during development. Silent catches in transport code can make debugging painful when something unexpected slips through. -
Missing the same fix on the Node transport? This patch only touches
WebStandardStreamableHTTPServerTransport. Does the Node-specificStreamableHTTPServerTransport(if it exists in this codebase) have the same re-entrant close vulnerability? Worth checking for consistency.
Overall this is a clean, well-scoped fix. Nice work identifying the root cause.
|
Thanks for the thorough review! Re: resetting _isClosing on error — good point. The flag is intentionally permanent (one-shot) since close() should only succeed once. If cleanup partially fails, the transport is in an indeterminate state anyway, and retrying close() on a half-torn-down transport would likely cause more issues. I'll add a comment noting this is intentional. Re: silent catch — agreed, a debug log would help. The catch is specifically for already-closed ReadableStream controllers, but I can see how silent swallowing would be frustrating to debug in other cases. Re: Node transport — checked this. The Node-specific |
Summary
_isClosingre-entrancy guard toWebStandardStreamableHTTPServerTransport.close()to preventRangeError: Maximum call stack size exceeded_streamMappingbefore iterating cleanup functions, preventing re-entrant deletes fromcancelcallbacksNodeStreamableHTTPServerTransportdelegates to the sameclose()method, so both transports are coveredFixes #1699
Related PRs:
Test plan
pnpm --filter @modelcontextprotocol/server test)close()multiple times concurrently does not throw RangeErrorclose()after already closed is a no-op