Streamable HTTP client: slow reception of large SSE tool responses (fromLineSubscriber line-assembly bottleneck) #1042

Description

@TaJoal

Summary

When a tool returns a large response over the Streamable HTTP client transport, the client takes ~5 s to receive a ~4 MB body that curl, or HttpClient with BodyHandlers.ofString(), reads in ~0.4 s. The bottleneck is the client-side SSE body reading in ResponseSubscribers.sseToBodySubscriber, not the server, the network, or JSON parsing.

Environment

  • io.modelcontextprotocol.sdk:mcp-core / mcp 2.0.0 (latest)
  • JDK 25, Reactor (via SDK)
  • Server: Spring AI 2.0.0 mcp-spring-webmvc (HttpServletStreamableServerTransportProvider) returning a single large SSE message event — one compact-JSON data: line (~4.17 MB)
  • Client: HttpClientStreamableHttpTransport (McpSyncClient.callTool)

Measurements (steady-state, 3 runs)

| Path | Time (~4.17 MB payload) |
| --- | --- |
| McpSyncClient.callTool (this SDK), receive | ~5,300 ms |
| Same payload via curl / HttpClient BodyHandlers.ofString(), receive | ~400–750 ms |
| Jackson parse of the received JSON | < 70 ms |

→ roughly 7–13× slower than a plain one-shot read of the identical bytes (~5,300 ms vs. ~400–750 ms).

Root cause analysis

The response body is effectively a single huge data: line (compact JSON has no newlines). ResponseSubscribers.sseToBodySubscriber uses HttpResponse.BodySubscribers.fromLineSubscriber(...); assembling that one ~4 MB line through the JDK line subscriber is the cost.

Things I tried:

  • Changing SseLineSubscriber demand from upstream().request(1) to request(Long.MAX_VALUE) → no improvement (so it isn't per-line backpressure).
  • Replacing the body subscriber with a streaming byte-level SSE parser (BodySubscribers.fromSubscriber, unbounded demand, accumulate ByteBuffers, split on \n\n event boundaries, decode each event once) → ~0.4 s (≈13× faster).
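For reference, the byte-level splitting described in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code I benchmarked and not an SDK class: SseEventSplitter, feed and dataOf are hypothetical names, and it assumes the server terminates lines with a bare \n (a complete parser would also need to accept \r\n and \r). It remembers how far it has already scanned, so a single multi-megabyte data: line is not rescanned on every chunk.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not an SDK class): splits a raw SSE byte stream on blank lines.
final class SseEventSplitter {
    private byte[] buf = new byte[8192];
    private int len = 0;      // number of valid bytes currently buffered
    private int scanFrom = 0; // bytes before this index were already checked for "\n\n"

    // Appends one network chunk and returns the events it completed (often none).
    List<String> feed(ByteBuffer chunk) {
        int n = chunk.remaining();
        if (len + n > buf.length) {
            // Grow geometrically so appending stays amortized O(1) per byte.
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + n));
        }
        chunk.get(buf, len, n);
        len += n;

        List<String> events = new ArrayList<>();
        int start = 0; // first byte of the event currently being assembled
        for (int i = scanFrom; i < len; i++) {
            // Assumption: LF-only line endings, so an event ends at "\n\n".
            if (i > start && buf[i] == '\n' && buf[i - 1] == '\n') {
                if (i - 1 > start) {
                    // Decode the whole event exactly once, excluding the trailing "\n\n".
                    events.add(new String(buf, start, i - 1 - start, StandardCharsets.UTF_8));
                }
                start = i + 1;
            }
        }
        // Keep only the unfinished tail and resume scanning at the next new byte.
        System.arraycopy(buf, start, buf, 0, len - start);
        len -= start;
        scanFrom = len;
        return events;
    }

    // Joins the "data:" lines of one event, dropping a single leading space per line.
    static String dataOf(String event) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String line : event.split("\n")) {
            if (!line.startsWith("data:")) continue;
            if (!first) sb.append('\n');
            first = false;
            String value = line.substring(5);
            sb.append(value.startsWith(" ") ? value.substring(1) : value);
        }
        return sb.toString();
    }
}
```

Because decoding happens only once per complete event, a multi-byte UTF-8 character that straddles two network chunks is handled without any extra bookkeeping.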

Important constraint (must stay streaming)

A whole-body ofString read fixes the speed but breaks progress: the server interleaves notifications/progress on the same POST response SSE stream before the final result. So the fix must remain a streaming parser that emits each SSE event as its boundary arrives (a byte-level parser does this while still avoiding the line-assembly cost).
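To illustrate the streaming requirement, here is a rough sketch of a body subscriber that hands each event to a callback as soon as the chunk containing its boundary arrives, rather than when the response completes. EventStreamSubscriber, done and asBodyHandler are hypothetical names, and the splitter function (chunk in, completed events out) is an assumed plug-in point for whatever byte-level parser is used; only the java.net.http and java.util.concurrent.Flow calls are real JDK API.

```java
import java.net.http.HttpResponse;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical adapter (not an SDK class): streams SSE events out of an HTTP response body.
final class EventStreamSubscriber implements Flow.Subscriber<List<ByteBuffer>> {
    private final Function<ByteBuffer, List<String>> splitter; // assumed stateful event splitter
    private final Consumer<String> onEvent;                    // receives each event immediately
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    EventStreamSubscriber(Function<ByteBuffer, List<String>> splitter, Consumer<String> onEvent) {
        this.splitter = splitter;
        this.onEvent = onEvent;
    }

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        // Unbounded demand: take bytes as fast as the HTTP client delivers them.
        subscription.request(Long.MAX_VALUE);
    }

    @Override
    public void onNext(List<ByteBuffer> chunks) {
        for (ByteBuffer chunk : chunks) {
            // Events completed by this chunk are emitted now, so progress
            // notifications reach the caller before the final result arrives.
            for (String event : splitter.apply(chunk)) {
                onEvent.accept(event);
            }
        }
    }

    @Override
    public void onError(Throwable error) {
        done.completeExceptionally(error);
    }

    @Override
    public void onComplete() {
        done.complete(null);
    }

    // Completes when the response body ends, normally or with an error.
    CompletableFuture<Void> done() {
        return done;
    }

    // Wraps this subscriber for HttpClient.send / sendAsync.
    HttpResponse.BodyHandler<Void> asBodyHandler() {
        return HttpResponse.BodyHandlers.fromSubscriber(this);
    }
}
```

With this shape, a notifications/progress event followed by a large result is delivered as two separate callbacks, the first one well before the result's last byte has been received.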

Questions

  1. Is there a recommended approach/workaround for large tool responses on the client that we're missing?
  2. Would you accept a PR replacing fromLineSubscriber with a streaming byte-level SSE parser in ResponseSubscribers.sseToBodySubscriber (preserving incremental event emission)?
  3. Is this related to #844, "Support application/json responses in Streamable HTTP transport (opt-in JSON response mode)"? That mode would avoid SSE framing on the server side, but clients receiving SSE responses would still benefit from this fix.

Happy to open a PR with the streaming parser + a benchmark if that's welcome.
