[serve] hap grpc by akyang-anyscale · Pull Request #63247 · ray-project/ray

akyang-anyscale · 2026-05-09T00:42:40Z

support for haproxy grpc

known limitations:

streaming (to be implemented)
metrics (to be implemented)
healthz response is changed such that the response object is None. The message is in the grpc call details instead of the response now (because of haproxy limitations). The status code is unchanged (still OK on healthy and UNAVAILABLE otherwise).
ListApplications request is forwarded to the serve proxy, so it does not work if serve proxy is down.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

gemini-code-assist

Code Review

This pull request implements gRPC support for the HAProxy ingress in Ray Serve. Key changes include adding a protocol field to backend configurations, implementing TCP-level health checks for gRPC, and updating HAProxy templates to handle HTTP/2 and header-based routing via the application metadata. Feedback focuses on refactoring duplicated health check logic, ensuring gRPC backends are correctly represented in the /-/routes and /-/healthz management endpoints, and fixing a potential infinite loop in the serving() method when scale-to-zero applications are present. Additionally, the reviewer suggested a more robust selection for the default gRPC backend.

I am having trouble creating individual review comments. Click here to see my feedback.

python/ray/serve/_private/haproxy.py (481-527)

The logic for resolving health check parameters (fall, rise, inter, etc.) and constructing the default-server directive is duplicated from build_health_check_config. This should be refactored into a shared helper method to improve maintainability and reduce code duplication.

python/ray/serve/_private/haproxy.py (940)

The health_route_info is currently built using only http_backends. This means the /-/routes endpoint will not include gRPC applications, even though they have valid route prefixes. It is better to use the full backends list to provide a complete view of all applications.

            health_route_info = self.cfg.build_health_route_info(backends)

python/ray/serve/_private/haproxy.py (947)

The healthz_rules are rendered using only http_backends. In a deployment with only gRPC applications, http_backends will be empty, causing the /-/healthz endpoint to not render any health check rules and potentially return a 404 or 503 error. This would lead external load balancers to incorrectly mark the node as unhealthy. Passing the full backends list ensures that gRPC backends are also considered in the global health check.

                    "backends": backends,

python/ray/serve/_private/haproxy.py (1257-1260)

Removing the filter for gRPC backends in serving() introduces a potential hang for applications configured with min_replicas=0 (scale-to-zero). The current logic requires every backend in stats (which includes all configured backends) to have at least one UP server. For scale-to-zero apps, no replicas will be started until a request is received, meaning ready_backends will never match all_backends, causing this loop to run indefinitely. The logic should be updated to only wait for backends that are expected to have at least one replica.

python/ray/serve/_private/haproxy_templates.py (201)

Using grpc_backends[0] as the default_backend for gRPC Healthz and ListApplications requests is fragile. If the first gRPC application in the list is unhealthy or has no replicas, these global management requests will fail even if other gRPC applications are healthy. Consider using a dedicated backend that aggregates all gRPC replicas or implementing a more robust fallback mechanism. Additionally, the or 'unknown' check is redundant as the name field is mandatory in BackendConfig.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Reviewed by Cursor Bugbot for commit 137c94c. Configure here.}

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

akyang-anyscale added 2 commits May 8, 2026 20:55

grpc support

46c79d0

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

grpc test

a2ed441

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

akyang-anyscale added the go add ONLY when ready to merge, run all tests label May 9, 2026

gemini-code-assist Bot reviewed May 9, 2026

View reviewed changes

akyang-anyscale added 6 commits May 11, 2026 18:42

revert test

4ae8703

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

revert test

75c4155

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

simplify hc config

bcec1ac

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

lint

5e25264

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

healthz + list apps

91657fa

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

lint

15baf91

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

akyang-anyscale marked this pull request as ready for review May 11, 2026 21:29

akyang-anyscale requested a review from a team as a code owner May 11, 2026 21:29

tests

137c94c

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

abrarsheikh approved these changes May 11, 2026

View reviewed changes

cursor Bot reviewed May 11, 2026

View reviewed changes

Comment thread python/ray/serve/_private/haproxy.py

healthz

43d1d2d

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

akyang-anyscale merged commit 6ca5f8a into ray-project:hap-ga-1 May 11, 2026
4 of 5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[serve] hap grpc#63247

[serve] hap grpc#63247
akyang-anyscale merged 10 commits into
ray-project:hap-ga-1from
akyang-anyscale:alexyang/hap-grpc

akyang-anyscale commented May 9, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

cursor Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

akyang-anyscale commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

python/ray/serve/_private/haproxy.py (481-527)

python/ray/serve/_private/haproxy.py (940)

python/ray/serve/_private/haproxy.py (947)

python/ray/serve/_private/haproxy.py (1257-1260)

python/ray/serve/_private/haproxy_templates.py (201)

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

akyang-anyscale commented May 9, 2026 •

edited

Loading