Support gRPC proxy phase1#37
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces comprehensive gRPC support to the proxy, including a request classifier, status mapping, deadline enforcement, and support for Trailers-Only responses. Key changes involve adding gRPC-specific configuration options, observability tracking for gRPC identity and status, and logic for handling gRPC-specific error synthesis. Additionally, a new unit test suite has been added to verify the gRPC helper functionality. I have no feedback to provide.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ade512ea5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
LGTM |
gRPC Proxying — Phase 1 (gRPC↔gRPC passthrough)
Ships the minimum-credible "gRPC proxying" layer on top of the existing HTTP/2 transport: inbound classification,
grpc-timeout-driven local deadline enforcement, end-to-end Trailers-Only synthesis, and gRPC-aware error mapping at every middleware-reject + proxy-failure emission site.Almost no new transport code — HTTP/2 inbound, HTTP/2 outbound (ALPN, multiplexed sessions, donated leases), end-to-end trailer forwarding, streaming request/response bodies, and the proxy engine are already shipped. This PR is policy plus a thin per-request classifier that flips a flag at HEADERS-complete.
Summary
proxy.protocol) flipsreq.is_grpc_at HEADERS-complete with sub-microsecond cost.grpc-timeoutparser distinguishes Valid / Invalid / SubMs / Absent; sub-ms grammar emits Trailers-OnlyDEADLINE_EXCEEDEDpre-dispatch without ever touching the upstream.ProxyTransaction::ArmGrpcDeadlineenforcesmin(client grpc-timeout, route.response_timeout_ms). Deadline expiry routes through the newLocalAbortAndDeliverpath (gateway-driven termination, distinct fromCancel()for client disconnect).MakeGrpcErrorResponsewired intoDeliverTerminalError— every proxy failure on a gRPC route (502 / 503 / 504 / 413) is delivered as Trailers-Onlygrpc-status: UNAVAILABLE/RESOURCE_EXHAUSTED/ etc., not a raw HTTP response.MaybeSynthesizeGrpcRejectFromHttpStatus+HandleClassifierRejecthelpers wrap every middleware-reject emission site (sync H1, async-resume, H2 streaming dispatch, H2 buffered dispatch) — auth / rate-limit / body-limit / 404 / 405 / 5xx all become Trailers-Only on gRPC routes.OnHeaders/OnTrailersscan peer-emittedgrpc-statusand publish it through the observability snapshot for happy-path metrics.RESULT_GRPC_DEADLINE_EXCEEDED = -19is a new result code distinct fromRESULT_RESPONSE_TIMEOUTso the circuit breaker classifies deadline expiry correctly.te: trailersforced viaforce_te_trailers_ = client_te_trailers_ \|\| is_grpc_cached at construction. Some gRPC backends use the header as a proxy-compatibility marker.HeaderRewriterstrips client-suppliedgrpc-status/grpc-message/grpc-status-details-binso a forged request can't surface trailer-equivalent semantics.ObservabilitySnapshotowns gRPC identity (is_grpc_,grpc_service_,grpc_method_,grpc_fully_qualified_method_) and response status (grpc_response_status_). Populated byObservabilityMiddleware::SetGrpcIdentityat construction and by every synthesis path before delivery.proxy.protocolconfig (auto/grpc/rest) is loaded, validated, round-tripped, and threaded throughRouteProtocolFromConfigStringat route registration.restart-required field; surfaces a warning on reload mismatch.What's NOT in this PR (deferred)
grpc-timeoutdecrement:EncodeGrpcTimeoutis shipped but not yet called at HEADERS-build time. The local-sideArmGrpcDeadlineenforces the deadline correctly today; only the upstream-visible deadline is unaffected. Reserved for a Phase 1.5 follow-up.ParseGrpcRetryPushback(gRFC A6 tri-state) andIsTrailersOnlyResponseShapeare reserved hook points.MaybeTriggerGrpcTrailerStatusRetryandRetryPolicy::RetryCondition::GRPC_UNAVAILABLEare not wired.rpc.*emission: the snapshot owns the data, but finalize-site emission ofrpc.method/rpc.response.status_code/rpc.server.call.durationis deferred.is_grpc_web_is reserved onHttpRequest; the bridge module is not built.http_major != 2per spec — gRPC requires HTTP/2.grpc_proxyend-to-end integration suite (~30 wire-level tests): deferred. The unit suite + existing http2/proxy regression suites exercise the integration sites indirectly.Architecture
Layer placement: new code lives at Layer 6 (proxy/transport-adjacent) for
ProxyTransactionextensions and Layer 7 (middleware) forSynthesizeMiddlewareReject+ classifier. No new transport-layer changes.Files changed
New (8 files, ~1300 LoC):
include/grpc/grpc_reject_kind.hinclude/grpc/grpc_status.h+server/grpc_status.ccinclude/grpc/grpc_timeout.h+server/grpc_timeout.ccinclude/grpc/grpc_synthesis.h+server/grpc_synthesis.cctest/grpc_test.hModified (15 files, +612 / -23):
include/http/http_request.his_grpc_,is_grpc_web_,grpc_timeout_ms,grpc_reject_kind_,grpc_service_,grpc_method_+ symmetricReset()include/http/http_response.hMarkTrailersOnly()/IsTrailersOnly()marker;Trailer(k,v)/GetTrailers()for post-commit trailer attachinclude/http/route_options.hRouteProtocolenum (Auto/Grpc/Rest) +protocolfieldinclude/config/server_config.hProxyConfig::protocolstring field threaded throughoperator==include/observability/observability_snapshot.hSetGrpcIdentity(req)template +set_grpc_response_statussetterinclude/upstream/proxy_transaction.hRESULT_GRPC_DEADLINE_EXCEEDED = -19;MakeGrpcErrorResponse,OnGrpcTimeoutExpired,ArmGrpcDeadline,ClearGrpcDeadline,LocalAbortAndDeliver,AbortUpstreamForLocalReason,IsTrailersOnlyResponseShape; fieldsis_grpc_,grpc_deadline_ms_,grpc_deadline_generation_,body_bytes_seen_,force_te_trailers_server/proxy_transaction.ccStart()computes effective deadline + armsArmGrpcDeadline;Cleanup()bumps deadline generation;OnBodyChunkincrementsbody_bytes_seen_;OnHeaders/OnTrailerspublish peer grpc-status viaExtractGrpcStatusToSnapshothelper; both H2 submit sites use cachedforce_te_trailers_;DeliverTerminalErrorforks on gRPC for pre-commit (Trailers-Only) and post-commit (trailers-with-END_STREAM); new method impls at file tail (MakeGrpcErrorResponse,OnGrpcTimeoutExpired,LocalAbortAndDeliver,AbortUpstreamForLocalReason,ArmGrpcDeadline,ClearGrpcDeadline,IsTrailersOnlyResponseShape)server/header_rewriter.ccgrpc-status/grpc-message/grpc-status-details-binserver/http2_session.ccClassifyRequestruns at both HEADERS-complete sites; both dispatch sites callHandleClassifierReject(sentinel short-circuit) +MaybeSynthesizeGrpcRejectFromHttpStatus(post-handler wrap)server/http_server.cctweak_response; proxy-registration call sites passRouteProtocolFromConfigString(proxy.protocol)server/observability_middleware.ccsnap->SetGrpcIdentity(request)at snapshot constructionserver/config_loader.ccproxy.protocol(allowlistauto/grpc/rest)MakefileGRPC_SRCSadded toLIB_SRCStest/run_test.ccgrpcsuite + CLI flag +PrintUsageentry.github/workflows/ci.ymlgrpcadded tobuild-linux-tsan-restsuite enumerationCorrectness contracts (load-bearing)
obs_snapshot_->set_grpc_response_status(grpc_status)BEFORE returning the response or emitting trailers. Missed sites surface as__missing__inrpc.response.status_code.LocalAbortAndDelivercallsReportBreakerOutcome(result_code)BEFOREAbortUpstreamForLocalReasonand the delivery branch.Cleanupneutral-releases breaker admission, so reporting after cleanup would lose the failure signal.ArmGrpcDeadline,ClearGrpcDeadline,Cleanup, andAbortUpstreamForLocalReasonbumpgrpc_deadline_generation_to invalidate in-flight closures. Closure body gates on generation match,cancelled_,IsKilledForShutdown(), andcomplete_cb_invoked_.Cancel()is the client-disconnect path (nullscomplete_cb_, marks invoked).LocalAbortAndDeliveris the gateway-driven termination path with a synthesized response — it leavescomplete_cb_alive for the pre-commitDeliverResponseto fire.AbortUpstreamForLocalReasondeliberately does NOT callCleanup. The pre-commit branch reaches Cleanup viaDeliverResponse; the post-commit branch calls Cleanup directly afterstream_sender_.End.Test plan
./test_runner grpc— 37/37 new unit tests covering:RESULT_*→ grpc-status family classification (UNAVAILABLE / INTERNAL / DEADLINE_EXCEEDED)ParseGrpcTimeoutMs(Valid / Invalid / SubMs / Absent + overflow + missing-unit + 9+ digits)EncodeGrpcTimeoutround-trip across all units (m/S/M/H)PercentEncodeGrpcMessage%-escape per PROTOCOL-HTTP2 (100%→100%25) + non-ASCII / control-byte escapeMakeTrailersOnlyResponseshape verification (:status 200+grpc-status+grpc-message+ Trailers-Only marker)MiddlewareRejectKind→ grpc-status branchSynthesizeMiddlewareRejectidempotency + non-gRPC pass-throughMaybeSynthesizeGrpcRejectFromHttpStatusHTTP→Trailers-Only mapping + 2xx pass-throughClassifyRequest: happy path, H1 suppression,application/grpc-webexclusion,+xxxcontent-type variants,RouteProtocol::Restsuppression,RouteProtocol::Grpcforce, non-POST + invalid + sub-ms + validgrpc-timeoutParseGrpcRetryPushbacktri-state (Absent / Delay / negative-Terminal / unparseable-Terminal)MakeGrpcErrorResponseforDEADLINE_EXCEEDEDandCHECKOUT_FAILEDHttpResponse::Trailer/GetTrailersaccessor contracttest_runner(34 MB) andserver_runner(20 MB) link without errors.grpcsuite picks up.grpc_proxyend-to-end integration suite (~30 wire-level tests includingAsyncJwksFailureEmitsTrailersOnlyand+xxxcontent-type variants on real gRPC clients).