fix(http-stream): prevent session destruction on transient errors#136

Merged
QuantGeekDev merged 1 commit into main from fix/session-resilience-streamable-http
Feb 5, 2026
@QuantGeekDev
Owner

Problem

Clients using the Streamable HTTP transport (such as Cline with the vscode:// protocol) hit "Session not found" (-32001) errors shortly after a session is established and working. Other IDEs are unaffected because they don't trigger the conditions that cause session destruction.

{"jsonrpc":"2.0","error":{"code":-32001,"message":"Session not found"},"id":null}

Root Cause

Three bugs in HttpStreamTransport cause premature session destruction:

1. onerror callback destroys sessions on transient errors (Primary)

The SDK fires onerror for non-fatal issues (a parse error on a single malformed request, a failed SSE write during event replay). The framework's handler unconditionally deleted the session from _transports, permanently killing it. Every subsequent request from that client then failed with -32001.

2. Re-initialization with a stale session ID is rejected (Secondary)

After a session is lost, if the client sends a new initialize request while still including the old mcp-session-id header, the framework rejects it with 404 instead of creating a new session. The client is stuck.

3. Broadcast send() failures remove sessions

If transport.send() throws during a broadcast (e.g., no open SSE stream for a request ID), the session is permanently removed from the map.

Fix

  1. onerror: Log the error but preserve the session. Only onclose (explicit DELETE or server shutdown) removes sessions.

  2. Re-initialization: Detect initialize requests with stale/invalid session IDs and create a new session instead of rejecting with 404.

  3. Broadcast: Log failures but preserve sessions for future requests.
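A minimal sketch of fixes 1 and 3, not the framework's actual code. `SessionTransport` and `SessionRegistry` are hypothetical stand-ins introduced for illustration (the real transport type comes from the SDK and `_transports` lives inside HttpStreamTransport); only the `onerror`, `onclose`, and `send()` names are taken from the description above.

```typescript
// Hypothetical stand-in for an SDK transport; not the real SDK type.
interface SessionTransport {
  onerror?: (error: Error) => void;
  onclose?: () => void;
  send(message: unknown): Promise<void>;
}

// Hypothetical registry playing the role of the _transports map.
class SessionRegistry {
  readonly transports = new Map<string, SessionTransport>();
  readonly log: string[] = [];

  register(sessionId: string, transport: SessionTransport): void {
    this.transports.set(sessionId, transport);

    // Fix 1: transient errors are logged; the session stays in the map.
    transport.onerror = (error) => {
      this.log.push(`session ${sessionId} error: ${error.message}`);
    };

    // Only an explicit close removes the session.
    transport.onclose = () => {
      this.transports.delete(sessionId);
    };
  }

  // Fix 3: a failed send during a broadcast is logged, and the session is kept.
  // Returns how many sessions accepted the message.
  async broadcast(message: unknown): Promise<number> {
    let delivered = 0;
    for (const [sessionId, transport] of Array.from(this.transports.entries())) {
      try {
        await transport.send(message);
        delivered++;
      } catch (error) {
        const reason = error instanceof Error ? error.message : String(error);
        this.log.push(`broadcast to ${sessionId} failed: ${reason}`);
      }
    }
    return delivered;
  }
}
```

The point of the design is that the map has a single removal path (`onclose`), so a noisy but recoverable transport can never take its session down with it.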

Tests

6 new regression tests in server-session-resilience.test.ts:

  • ✅ Session survives single onerror event
  • ✅ Session survives multiple onerror events
  • ✅ Re-initialization with stale session ID creates new session
  • ✅ Non-initialize requests with unknown session IDs still get 404
  • ✅ onclose still correctly removes sessions (DELETE behavior preserved)
  • ✅ Broadcast send failures preserve sessions
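The routing behavior that the re-initialization and unknown-session tests check can be sketched as a pure decision function. This is an illustration under assumptions, not the framework's implementation: `Route`, `isInitialize`, and `routeRequest` are hypothetical names, and the case of a non-initialize request with no session ID header is not covered by this PR's description, so the sketch only flags it.

```typescript
// Hypothetical result type describing what the server should do with a request.
type Route =
  | { action: "reuse"; sessionId: string }
  | { action: "create" }
  | { action: "reject"; status: 404; code: -32001 }
  | { action: "missing-session-id" }; // not described in the PR; left to the caller

// True when the JSON-RPC body is an initialize request.
function isInitialize(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { method?: unknown }).method === "initialize"
  );
}

// Decide how to handle a request given its mcp-session-id header (if any)
// and the set of currently known session IDs.
function routeRequest(
  sessionId: string | undefined,
  body: unknown,
  known: ReadonlySet<string>,
): Route {
  // A known ID is forwarded to its existing transport.
  if (sessionId !== undefined && known.has(sessionId)) {
    return { action: "reuse", sessionId };
  }
  // initialize with a stale or absent ID starts a fresh session.
  if (isInitialize(body)) {
    return { action: "create" };
  }
  // Any other request with an unknown ID still gets "Session not found".
  if (sessionId !== undefined) {
    return { action: "reject", status: 404, code: -32001 };
  }
  return { action: "missing-session-id" };
}
```

Checking for `initialize` before rejecting is what lets a client recover from a lost session without first clearing its stale header.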

Full suite: 179 tests passing (14 suites)

Two bugs in HttpStreamTransport caused 'Session not found' (-32001) errors
for clients like Cline after a session was established and working:

1. onerror callback destroyed sessions on transient SDK errors
   The SDK fires onerror for non-fatal issues (parse errors on malformed
   requests, failed SSE writes during event replay). The framework's
   handler unconditionally deleted the session from the transport map,
   making all subsequent requests from that client fail with -32001.
   Fix: onerror now logs but preserves the session. Only onclose (explicit
   DELETE or server shutdown) removes sessions.

2. Re-initialization with a stale session ID was rejected
   After a session was lost (server restart, error, etc.), if the client
   sent a new initialize request while still including the old session ID
   header, the framework rejected it with 'Session not found' instead of
   creating a new session.
   Fix: detect initialize requests with stale session IDs and create a
   new session instead of rejecting.

Also fixed: broadcast send() failures no longer remove sessions from the
transport map.

Includes 6 regression tests covering all three scenarios.
@QuantGeekDev QuantGeekDev merged commit b0ff78b into main Feb 5, 2026