Streamable HTTP client: slow reception of large SSE tool responses (fromLineSubscriber line-assembly bottleneck) #1042

## Summary

When a tool returns a large response over the Streamable HTTP **client** transport, the client takes ~5.3 s to receive a ~4.17 MB body that `curl` or `HttpClient` with `BodyHandlers.ofString()` reads in ~0.4–0.75 s. The bottleneck is the client-side SSE body reading in `ResponseSubscribers.sseToBodySubscriber`, not the server, the network, or JSON parsing.

## Environment

- `io.modelcontextprotocol.sdk:mcp-core` / `mcp` **2.0.0** (latest)
- JDK 25, Reactor (via SDK)
- Server: Spring AI 2.0.0 `mcp-spring-webmvc` (`HttpServletStreamableServerTransportProvider`) returning a single large SSE `message` event — one compact-JSON `data:` line (~4.17 MB)
- Client: `HttpClientStreamableHttpTransport` (`McpSyncClient.callTool`)

## Measurements (steady-state, 3 runs)

| Path | Time for the ~4.17 MB payload |
|---|---|
| `McpSyncClient.callTool` (this SDK) | **~5,300 ms** |
| Same payload via `curl` / `HttpClient` `BodyHandlers.ofString()` | ~400–750 ms |
| Jackson parse of the received JSON | < 70 ms |

→ roughly 7–13× slower than a plain one-shot read of the identical bytes (~5,300 ms vs ~400–750 ms).

## Root cause analysis

The response body is effectively a single huge `data:` line, because compact JSON contains no newlines. `ResponseSubscribers.sseToBodySubscriber` uses `HttpResponse.BodySubscribers.fromLineSubscriber(...)`, and assembling that one ~4 MB line through the JDK line subscriber is where the time goes.
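For anyone who wants to check this outside the SDK, here is a minimal repro sketch. It assumes the slowdown is also visible with the plain JDK `BodyHandlers.fromLineSubscriber`, which is not confirmed here. `LineSubscriberRepro`, `CountingLineSubscriber`, and the `/sse` path are hypothetical names. The server is the JDK's built-in `com.sun.net.httpserver.HttpServer`, not the Spring AI transport, so absolute timings will differ from the table above.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse.BodyHandlers;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Flow;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical repro harness: none of these names come from the MCP SDK.
final class LineSubscriberRepro {

    /** Counts the characters delivered, line by line, by the JDK line subscriber. */
    static final class CountingLineSubscriber implements Flow.Subscriber<String> {
        final AtomicLong chars = new AtomicLong();
        @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
        @Override public void onNext(String line) { chars.addAndGet(line.length()); }
        @Override public void onError(Throwable t) { }
        @Override public void onComplete() { }
    }

    /**
     * Serves one SSE event whose single data line holds payloadChars characters,
     * then reads it twice. Returns
     * { ofString length, line-subscriber char count, ofString ms, line-subscriber ms }.
     */
    static long[] run(int payloadChars) {
        byte[] body = ("event: message\ndata: " + "x".repeat(payloadChars) + "\n\n")
                .getBytes(StandardCharsets.UTF_8);
        HttpServer server = null;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/sse", exchange -> {
                exchange.getResponseHeaders().add("Content-Type", "text/event-stream");
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
                exchange.close();
            });
            server.start();

            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/sse");
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            client.send(request, BodyHandlers.discarding()); // warm-up: connection and class loading

            long t0 = System.nanoTime();
            long oneShot = client.send(request, BodyHandlers.ofString()).body().length();
            long t1 = System.nanoTime();
            CountingLineSubscriber lines = new CountingLineSubscriber();
            client.send(request, BodyHandlers.fromLineSubscriber(lines));
            long t2 = System.nanoTime();

            return new long[] { oneShot, lines.chars.get(),
                                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000 };
        } catch (Exception e) {
            throw new IllegalStateException("repro failed", e);
        } finally {
            if (server != null) server.stop(0);
        }
    }

    public static void main(String[] args) {
        long[] r = run(4_000_000); // roughly the payload size from this issue
        System.out.println("ofString:           " + r[0] + " chars in " + r[2] + " ms");
        System.out.println("fromLineSubscriber: " + r[1] + " chars in " + r[3] + " ms");
    }
}
```

A single run like this is not a proper benchmark (no JIT warm-up beyond one request, no repetition), so the numbers are only a first indication.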

Things I tried:
- Changing `SseLineSubscriber` demand from `upstream().request(1)` to `request(Long.MAX_VALUE)` → **no improvement** (so it isn't per-line backpressure).
- Replacing the body subscriber with a **streaming byte-level SSE parser** (`BodySubscribers.fromSubscriber`, unbounded demand, accumulate `ByteBuffer`s, split on `\n\n` event boundaries, decode each event once) → **~0.4 s (≈13× faster)**.
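The byte-level approach in the second bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch and not SDK code: `SseEventSplitter` and `SseEvent` are hypothetical names, CRLF is handled by simply dropping `\r` bytes, and only the `event:` and `data:` fields are parsed.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not the SDK's code.
final class SseEventSplitter {

    /** One parsed event: the "event:" name (null if absent) and the joined "data:" lines. */
    record SseEvent(String name, String data) { }

    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final Consumer<SseEvent> onEvent;
    private boolean atLineStart = true; // true when the previous byte ended a line

    SseEventSplitter(Consumer<SseEvent> onEvent) { this.onEvent = onEvent; }

    /** Feeds one network chunk and emits every event whose blank-line boundary it completes. */
    void accept(ByteBuffer chunk) {
        while (chunk.hasRemaining()) {
            byte b = chunk.get();
            if (b == '\r') continue;          // simplification: treat CRLF as LF
            if (b == '\n' && atLineStart) {   // empty line: event boundary
                emit();
                continue;
            }
            pending.write(b);
            atLineStart = (b == '\n');
        }
    }

    private void emit() {
        if (pending.size() == 0) return;      // stray blank line, nothing buffered
        // Decode once per event, so multi-byte UTF-8 split across chunks is safe.
        String raw = pending.toString(StandardCharsets.UTF_8);
        pending.reset();
        String name = null;
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        for (String line : raw.split("\n")) {
            if (line.startsWith("event:")) {
                name = line.substring(6).strip();
            } else if (line.startsWith("data:")) {
                String value = line.substring(5);
                if (hasData) data.append('\n');
                data.append(value.startsWith(" ") ? value.substring(1) : value);
                hasData = true;
            }
        }
        if (hasData) onEvent.accept(new SseEvent(name, data.toString()));
    }

    public static void main(String[] args) {
        List<SseEvent> seen = new ArrayList<>();
        SseEventSplitter splitter = new SseEventSplitter(seen::add);
        byte[] bytes = ("event: message\ndata: {\"method\":\"notifications/progress\"}\n\n"
                + "event: message\ndata: {\"result\":\"big\"}\n\n").getBytes(StandardCharsets.UTF_8);
        int cut = bytes.length - 10;          // chunk boundary inside the second event
        splitter.accept(ByteBuffer.wrap(bytes, 0, cut));
        System.out.println("after chunk 1: " + seen.size() + " event(s)"); // 1: progress only
        splitter.accept(ByteBuffer.wrap(bytes, cut, bytes.length - cut));
        System.out.println("after chunk 2: " + seen.size() + " event(s)"); // 2: progress + result
    }
}
```

One way to wire this in would be a `Flow.Subscriber<List<ByteBuffer>>` that requests `Long.MAX_VALUE` and calls `accept` for each buffer, passed to `HttpResponse.BodySubscribers.fromSubscriber`. A production version would probably copy buffers in bulk instead of byte by byte and would also handle a lone `\r` as a line terminator.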

## Important constraint (must stay streaming)

A whole-body `ofString` read fixes the speed but **breaks progress reporting**: the server interleaves `notifications/progress` messages on the *same* POST response SSE stream before the final result. So the fix must remain a **streaming** parser that emits each SSE event as soon as its boundary arrives. A byte-level parser does this while still avoiding the line-assembly cost.

## Questions

1. Is there a recommended approach/workaround for large tool responses on the client that we're missing?
2. Would you accept a PR replacing `fromLineSubscriber` with a streaming byte-level SSE parser in `ResponseSubscribers.sseToBodySubscriber` (preserving incremental event emission)?
3. Is this related to #844 (opt-in `application/json` response mode)? That would avoid SSE framing on the server side, but clients receiving SSE responses would still benefit from this fix.

Happy to open a PR with the streaming parser + a benchmark if that's welcome.
