Validate upstream JSON-RPC responses in transparent proxy#5288
Validate upstream JSON-RPC responses in transparent proxy#5288bishnubista wants to merge 2 commits into
Conversation
The transparent proxy forwarded malformed upstream MCP frames to clients with HTTP 200 even when the response violated JSON-RPC 2.0 structure. This adds a boundary check in NoOpResponseProcessor that rejects structurally invalid upstream frames and returns a synthetic 502 carrying a JSON-RPC error to the client, so the proxy stops being a silent amplifier for malformed (or adversarial) upstream servers. Validation runs only for streamable-http POST/200 responses that carry an MCP request signal (MCP-Protocol-Version or Mcp-Session-Id) and an application/json content type, with non-identity Content-Encoding traffic passed through untouched. Body reads are bounded to 100 MiB to match existing streamable-HTTP limits in pkg/vmcp. Rewritten error responses replace headers wholesale so upstream session/cookie/cache metadata is not smuggled into the proxy-generated error. SSE traffic is unaffected. Closes stacklok#5247 Signed-off-by: bishnubista <bista.developer@gmail.com>
4fcdb4b to
a7badd9
Compare
| if dec.More() { | ||
| return fmt.Errorf("JSON-RPC response must contain a single JSON value") | ||
| } | ||
| if err := dec.Decode(&struct{}{}); err != io.EOF { |
There was a problem hiding this comment.
The second Decode here is unreachable. When dec.More() returns false there is nothing left in the stream, so the subsequent Decode will always return io.EOF and the condition never fires. The More() check above already covers the trailing-value case completely. Nit, but worth removing the dead branch.
|
|
||
| // NoOpResponseProcessor is a processor that does nothing. | ||
| // Used for transports that don't require response processing (e.g., streamable-http). | ||
| // NoOpResponseProcessor is the default processor for non-SSE transports. |
There was a problem hiding this comment.
Scoping to non-SSE makes sense for now — SSE validation would need a fundamentally different approach. By the time ProcessResponse runs on an SSE response, the 200 OK and Content-Type: text/event-stream headers are already committed downstream, so there is no way to rewrite to 502 for a malformed mid-stream event. A proper fix would need a per-event streaming interceptor inside the SSE processor that can either synthesize an error event or close the stream — a meaningfully larger scope than this PR.
Worth opening a follow-up issue to track it explicitly, so it is clear the gap is deferred rather than forgotten.
Summary
The transparent proxy (
thv run) currently forwards upstream MCP server responses to clients with HTTP 200 even when the JSON-RPC frame is structurally invalid — missingjsonrpc:"2.0", invalididtype, non-object body, etc. The proxy sits between potentially untrusted upstream MCP servers and trusted clients, so a compromised or misconfigured upstream can push malformed frames straight through, where they may crash strict JSON-RPC parsers, be misinterpreted as legitimate responses, or trigger undefined behaviour in client-side parsers.This PR validates upstream JSON-RPC frames at the proxy boundary and rewrites invalid responses into a structured JSON-RPC error so the proxy stops being a silent amplifier for malformed (or adversarial) upstream servers.
NoOpResponseProcessor.ProcessResponse(streamable-http/ default transport). SSE is unaffected.jsonrpc=="2.0",id∈ {string, number, null}, exactly one ofresult/error;error.codemust be an integer,error.messagemust be a string.POST+200,application/json(-rpc) media type, non-identityContent-Encodingpassed through untouched, request must carry an MCP streamable-HTTP signal (MCP-Protocol-VersionorMcp-Session-Id) so non-MCP JSON traffic flowing through the catch-all is not rewritten.{"jsonrpc":"2.0","id":null,"error":{"code":-32000,"message":"Invalid upstream JSON-RPC response","data":"<detail>"}}.-32000is the JSON-RPC implementation-defined server-error code;-32603is reserved for internal JSON-RPC errors. Response headers are replaced wholesale so upstreamMcp-Session-Id,Set-Cookie,ETag,Cache-Control, etc. are not smuggled into the proxy-generated error.io.LimitReaderatmaxJSONRPCResponseBytes = 100 << 20(100 MiB), matching streamable-HTTP precedent inpkg/vmcp/clientandpkg/vmcp/session/internal/backend. Oversized responses are rejected with the same 502 shape so the proxy cannot be amplified into a memory DoS.Closes #5247
Type of change
Test plan
task test)task test-e2e)task lint-fix)Local verification:
go test ./pkg/transport/proxy/transparent/... -count=1— pass.golangci-lint run --timeout=5m ./pkg/transport/proxy/transparent/...— 0 issues.New test cases in
pkg/transport/proxy/transparent/response_processor_test.go:result: null,application/json; charset=utf-8.jsonrpc, invalididtype, non-object body,result+errorboth present, trailing JSON, fractionalerror.code.text/event-stream,application/jsonsomethingelse.Content-Encoding: gzip) passed through unchanged for both valid and malformed payloads; explicitContent-Encoding: identitystill validates.MCP-Protocol-VersionorMcp-Session-Id→ validate.Mcp-Session-Id,Set-Cookie,Etag,Cache-Controlfrom the response.maxJSONRPCResponseBytes) is rejected with a size-limit error.API Compatibility
v1beta1API, OR theapi-break-allowedlabel is applied and the migration guidance is described above.No CRD or operator API surface is touched.
Does this introduce a user-facing change?
Yes. Clients of the transparent proxy that previously received malformed upstream responses verbatim with HTTP 200 will now receive HTTP 502 with a synthetic JSON-RPC error body. Conformant upstream MCP servers are unaffected — only structurally invalid responses are rewritten.
Special notes for reviewers
Two scope decisions called out explicitly so reviewers can push back if either is wrong for the project:
Validation runs in
ModifyResponse, aftertracingTransport.RoundTripmay have observed an upstreamMcp-Session-Idand registered proxy-side session state. This PR strips upstream session headers from the proxy-generated 502 so clients never receive a session id derived from a malformed initialize response, but it does not roll back server-side proxy session state created before validation. Moving validation earlier (or adding a rollback path for invalid initialize responses) touchestracingTransportand felt outside the scope of "structurally validate the upstream frame." Happy to open a follow-up — or fold it into this PR if maintainers prefer.Backward-compat clients that POST without
MCP-Protocol-Versionon the very first initialize will not trigger validation — same as today's behaviour, so no regression. If broader coverage is preferred,tracingTransport.RoundTripalready buffers and parses the request body to detectinitialize; a small context-propagated marker would letProcessResponsevalidate those frames too. Easy follow-up.Open to feedback on:
pkg/vmcpprecedent; can tighten to 10 MiB if maintainers prefer an explicit security default).