fix(api): restore HTTP GET /health endpoint #25234
Convert tonic's Server to an axum Router via into_router(), then serve over the same TcpListener via hyper::Server. Enables HTTP/1.1 acceptance so additional HTTP routes can be added alongside gRPC on the same port. Behavior-preserving. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-expose the HTTP health endpoint that was removed as part of the GraphQL-to-gRPC migration (#24364). The endpoint matches the pre-0.55 response shape: 200 with body {"ok": true} while serving and 503 with body {"ok": false} after set_not_serving() is called during drain. HEAD is also handled. gRPC clients continue to use grpc.health.v1.Health/Check; both probes now share the same serving state so they agree during shutdown. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
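The shared serving state described in this commit can be pictured with a small std-only sketch. This is an illustration under assumptions, not the code in the PR: `ServingState`, `http_health`, and `grpc_health` are hypothetical names, and the real implementation goes through tonic's health reporter and an axum handler rather than plain functions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the serving state shared by both probes.
#[derive(Clone)]
struct ServingState(Arc<AtomicBool>);

impl ServingState {
    /// Starts in the serving state.
    fn new() -> Self {
        Self(Arc::new(AtomicBool::new(true)))
    }

    /// Models the effect of set_not_serving(): one write, visible to every clone.
    fn set_not_serving(&self) {
        self.0.store(false, Ordering::SeqCst);
    }

    fn is_serving(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// HTTP view: status code and JSON body for GET /health.
fn http_health(state: &ServingState) -> (u16, &'static str) {
    if state.is_serving() {
        (200, r#"{"ok":true}"#)
    } else {
        (503, r#"{"ok":false}"#)
    }
}

/// gRPC view: the status name a grpc.health.v1.Health/Check call would report.
fn grpc_health(state: &ServingState) -> &'static str {
    if state.is_serving() {
        "SERVING"
    } else {
        "NOT_SERVING"
    }
}

fn main() {
    let state = ServingState::new();
    // Each probe holds its own clone, but both read the same flag.
    let http_view = state.clone();
    let grpc_view = state.clone();

    println!("{:?} / {}", http_health(&http_view), grpc_health(&grpc_view));
    state.set_not_serving();
    println!("{:?} / {}", http_health(&http_view), grpc_health(&grpc_view));
}
```

Because both views read one flag, there is no window during drain in which the HTTP probe reports healthy while the gRPC probe reports NOT_SERVING.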
Adds two integration tests hitting the restored HTTP health endpoint
via reqwest:
- GET returns 200 with body {"ok":true}
- HEAD returns 200
Exposes harness.api_port() so tests can reach the API port directly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
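The integration tests in this commit use reqwest against the real API port. As a rough, dependency-free illustration of what such a probe checks, the sketch below sends GET and HEAD requests over a raw TcpStream. Everything here is an assumption made for the example: `probe`, `parse_status`, and `spawn_stub` are hypothetical helpers, and the stub listener only imitates the documented 200 response; it is not Vector.

```rust
use std::io::{BufRead, BufReader, Read, Write};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::thread;

/// Extracts the numeric status code from an HTTP/1.x status line.
fn parse_status(response: &str) -> Option<u16> {
    let status_line = response.lines().next()?;
    let mut parts = status_line.split_whitespace();
    if !parts.next()?.starts_with("HTTP/1.") {
        return None;
    }
    parts.next()?.parse().ok()
}

/// Sends `<method> /health` and returns the status code and the body.
fn probe(addr: SocketAddr, method: &str) -> std::io::Result<(Option<u16>, String)> {
    let mut stream = TcpStream::connect(addr)?;
    write!(
        stream,
        "{} /health HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        method, addr
    )?;
    let mut raw = String::new();
    stream.read_to_string(&mut raw)?;
    let body = raw
        .split_once("\r\n\r\n")
        .map(|(_, b)| b.to_string())
        .unwrap_or_default();
    Ok((parse_status(&raw), body))
}

/// Stub handler (not Vector): always answers 200 and omits the body for HEAD.
fn handle(mut stream: TcpStream) -> std::io::Result<()> {
    let mut reader = BufReader::new(stream.try_clone()?);
    let mut request_line = String::new();
    reader.read_line(&mut request_line)?;
    // Drain the remaining header lines up to the blank line.
    let mut line = String::new();
    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 || line == "\r\n" {
            break;
        }
    }
    let body = r#"{"ok":true}"#;
    write!(
        stream,
        "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n",
        body.len()
    )?;
    if !request_line.starts_with("HEAD ") {
        stream.write_all(body.as_bytes())?;
    }
    Ok(())
}

/// Binds an ephemeral local port and serves the stub on a background thread.
fn spawn_stub() -> std::io::Result<SocketAddr> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    thread::spawn(move || {
        for stream in listener.incoming().flatten() {
            let _ = handle(stream);
        }
    });
    Ok(addr)
}

fn main() -> std::io::Result<()> {
    let addr = spawn_stub()?;
    for method in ["GET", "HEAD"] {
        let (status, body) = probe(addr, method)?;
        println!("{} /health -> {:?} {:?}", method, status, body);
    }
    Ok(())
}
```

Pointing `probe` at a running Vector instance's API address instead of the stub would exercise the real endpoint.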
Document the HTTP GET/HEAD /health endpoint served alongside the gRPC API, framed as compatibility with Vector 0.54.0 and earlier. Updates the reference endpoints schema to allow HEAD, adds HEAD/GET entries for /health in api.cue with 200/503 responses, and adds a curl example to the API reference page. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
thomasqueirozb left a comment
Nice! This will make upgrading simpler 🙂. We can look into removal at a later date.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 281bbf9d3c
    .into_router()
    .merge(http_router(router_serving));
Keep honoring gRPC deadlines after router merge
For gRPC clients that set request deadlines via grpc-timeout metadata, converting the tonic server into an axum router and serving it directly through hyper bypasses tonic's transport MakeSvc, which is where the previous serve_with_incoming_shutdown path wrapped routes in GrpcTimeout. Those deadline-bounded API calls/streams will no longer be cancelled with a deadline-exceeded status by the server, and can keep consuming server work until they otherwise finish or the client disconnects; please preserve tonic's timeout handling when serving the merged router.
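For context on what the deadline metadata carries: the gRPC over HTTP/2 protocol encodes grpc-timeout as at most 8 ASCII digits followed by one unit character (H, M, S, m, u, n). The std-only sketch below parses that value into a Duration. It is an illustration only, with `parse_grpc_timeout` as a hypothetical helper; it is not tonic's GrpcTimeout implementation and does not by itself restore server-side cancellation.

```rust
use std::time::Duration;

/// Parses a grpc-timeout header value such as "5S" or "250m".
/// Returns None for anything outside the wire grammar.
fn parse_grpc_timeout(value: &str) -> Option<Duration> {
    // Need at least one digit plus the unit; ASCII-only keeps split_at safe.
    if value.len() < 2 || !value.is_ascii() {
        return None;
    }
    let (digits, unit) = value.split_at(value.len() - 1);
    // The protocol allows at most 8 digits.
    if digits.len() > 8 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let amount: u64 = digits.parse().ok()?;
    let timeout = match unit {
        "H" => Duration::from_secs(amount * 3600),
        "M" => Duration::from_secs(amount * 60),
        "S" => Duration::from_secs(amount),
        "m" => Duration::from_millis(amount),
        "u" => Duration::from_micros(amount),
        "n" => Duration::from_nanos(amount),
        _ => return None,
    };
    Some(timeout)
}

fn main() {
    for raw in ["5S", "250m", "1H", "10x"] {
        println!("{:>5} -> {:?}", raw, parse_grpc_timeout(raw));
    }
}
```

A server that honors deadlines would bound each request's work by the parsed duration and answer with a deadline-exceeded status once it elapses, which is the behavior the review asks to preserve.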
Summary
Restores the HTTP GET /health endpoint on the Vector API to avoid user disruption.
- The endpoint is served on the same port as the gRPC API, so existing HTTP probes (AWS ALB health checks, Kubernetes HTTP liveness/readiness probes, etc.) keep working unchanged.
- Returns 200 {"ok":true} while serving, 503 {"ok":false} during drain.
- HEAD /health is also accepted for load balancers that prefer it.
- The gRPC health check (grpc.health.v1.Health/Check) is unchanged. Both probes share the same serving flag so they agree during shutdown.
Flagged in #25210.
No changelog fragment: no release has shipped without /health, so there is no user-visible change relative to 0.54.0. The no-changelog label applies.
Implementation
- src/api/grpc_server.rs now builds the tonic Server into an axum::Router via into_router(), merges an axum router exposing GET/HEAD /health, and serves the combined router via hyper::Server over the existing TcpListener.
- accept_http1(true) lets plain HTTP/1.1 requests reach the axum routes while gRPC continues to use HTTP/2.
- set_not_serving() now flips both the gRPC HealthReporter and the shared HTTP serving flag.
Vector configuration
Any config with the API enabled works.
How did you test this PR?
- tests/vector_api/health.rs covering GET /health (200 + body) and HEAD /health (200). Existing gRPC grpc.health.v1.Health/Check test still passes.
- curl -i http://127.0.0.1:8686/health returns 200 {"ok":true}; grpcurl -plaintext 127.0.0.1:8686 grpc.health.v1.Health/Check still returns SERVING.
- make check-fmt, make check-clippy, make check-markdown, and website/make cue-build all pass.
Change Type
Is this a breaking change?
Does this PR include user facing changes?
no-changelog label to this PR.
References