x/net/http2: Race in handler execution results in zero-byte data frame, causing incompatibility with gRPC #56317
What version of Go are you using (`go version`)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (`go env`)?
What did you do?
In a proprietary HTTP + gRPC reverse proxy, when issuing unary gRPC calls, I observe intermittent occurrences of the error:
The occurrence of this error is not deterministically reproducible, and affects only a small percentage of requests. Usually, a client retry of the RPC alleviates the problem.
See the investigation notes after the survey questions in this issue.
What did you expect to see?
I expect to see no occurrences of this error under regular operation.
What did you see instead?
This error affects 1-5% of requests.
Context
I'm working with a proprietary HTTP reverse proxy with built-in support for gRPC over HTTP/2.
Example
The proxy is proprietary, but its core logic is demonstrated below.
Symptom
When a gRPC call made through the proxy returns a gRPC application-level error, the client intermittently (non-deterministically) observes the following error from the grpc-go library: `server closed the stream without sending trailers`.
`GODEBUG=http2debug=2` reveals that the issue manifests only when Go's `http2.Server` writes a `HEADERS` frame with flag `END_HEADERS` followed by a zero-byte `DATA` frame with flag `END_STREAM`.
The issue does not manifest (i.e. the application-level error is propagated correctly) when Go's `http2.Server` writes a `HEADERS` frame with flags `END_HEADERS | END_STREAM`.
Note that there are no trailers included in this message.
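To make the two frame shapes concrete, here is a minimal sketch that decodes the fixed 9-byte HTTP/2 frame header and flags the failing pattern. The frame type and flag values come from the HTTP/2 specification (RFC 9113); the names `frameHeader`, `parseFrameHeader`, and `endsWithEmptyData` are hypothetical helpers written for illustration and are not part of `x/net/http2`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Frame types and flags as defined by the HTTP/2 specification (RFC 9113).
const (
	frameData    = 0x0
	frameHeaders = 0x1

	flagEndStream  = 0x1
	flagEndHeaders = 0x4
)

// frameHeader is the fixed 9-byte header that precedes every HTTP/2 frame.
type frameHeader struct {
	Length   uint32 // 24-bit payload length
	Type     uint8
	Flags    uint8
	StreamID uint32 // 31 bits; the top bit is reserved
}

// parseFrameHeader decodes a 9-byte HTTP/2 frame header.
func parseFrameHeader(b [9]byte) frameHeader {
	return frameHeader{
		Length:   uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]),
		Type:     b[3],
		Flags:    b[4],
		StreamID: binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff,
	}
}

// endsWithEmptyData reports whether a stream's response frames end with a
// zero-byte DATA frame carrying END_STREAM (the shape that fails in grpc-go).
func endsWithEmptyData(frames []frameHeader) bool {
	if len(frames) == 0 {
		return false
	}
	last := frames[len(frames)-1]
	return last.Type == frameData && last.Length == 0 && last.Flags&flagEndStream != 0
}

func main() {
	// Failing shape: HEADERS(END_HEADERS), then DATA(len=0, END_STREAM).
	// The HEADERS payload length of 42 is an arbitrary illustrative value.
	bad := []frameHeader{
		parseFrameHeader([9]byte{0, 0, 42, frameHeaders, flagEndHeaders, 0, 0, 0, 1}),
		parseFrameHeader([9]byte{0, 0, 0, frameData, flagEndStream, 0, 0, 0, 1}),
	}
	// Working shape: a single HEADERS(END_HEADERS | END_STREAM).
	good := []frameHeader{
		parseFrameHeader([9]byte{0, 0, 42, frameHeaders, flagEndHeaders | flagEndStream, 0, 0, 0, 1}),
	}
	fmt.Println(endsWithEmptyData(bad), endsWithEmptyData(good)) // true false
}
```

A check like this could be applied to captured frame headers to count how often the failing shape occurs, assuming the frames for a single stream have already been grouped together.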
Example trace with no errors (RPC returns successfully)
Example trace with error (internal error raised by grpc-go)
RCA
I believe this is due to a race caused by concurrent `http.Handler` execution in `http2/server.go`.
If handler execution completes before the headers are written, `rws.handlerDone` is true and Go includes `END_STREAM` in the initial `HEADERS` frame. If handler execution is still in progress when the first write occurs, the `HEADERS` frame is written without `END_STREAM`, and a subsequent write sends a zero-byte data frame with `END_STREAM`, acting purely as a control message.
This makes the behavior non-deterministic, and unary gRPC methods that return errors quickly are disproportionately affected.
According to the gRPC specification, `END_STREAM` should be included in the last `HEADERS` frame to indicate termination of the response. In grpc-go, encountering `END_STREAM` in a data frame is an explicit error case. However, the HTTP/2 protocol specification itself doesn't prohibit this.
Proposal
A similar (identical?) issue was identified in nghttp2 (see: nghttp2/nghttp2#588). The submitted fix was to include `END_STREAM` in the `HEADERS` payload if the body is empty and no trailers exist. I'm not sure if a similar approach is feasible in `http2.Server`.