Skip to content

[serve] hap grpc#63247

Merged
akyang-anyscale merged 10 commits into
ray-project:hap-ga-1from
akyang-anyscale:alexyang/hap-grpc
May 11, 2026
Merged

[serve] hap grpc#63247
akyang-anyscale merged 10 commits into
ray-project:hap-ga-1from
akyang-anyscale:alexyang/hap-grpc

Conversation

@akyang-anyscale
Copy link
Copy Markdown
Contributor

@akyang-anyscale akyang-anyscale commented May 9, 2026

support for haproxy grpc

known limitations:

  • streaming (to be implemented)
  • metrics (to be implemented)
  • healthz response is changed such that the response object is None. The message is in the grpc call details instead of the response now (because of haproxy limitations). The status code is unchanged (still OK on healthy and UNAVAILABLE otherwise).
  • ListApplications request is forwarded to the serve proxy, so it does not work if serve proxy is down.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale added the go add ONLY when ready to merge, run all tests label May 9, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements gRPC support for the HAProxy ingress in Ray Serve. Key changes include adding a protocol field to backend configurations, implementing TCP-level health checks for gRPC, and updating HAProxy templates to handle HTTP/2 and header-based routing via the application metadata. Feedback focuses on refactoring duplicated health check logic, ensuring gRPC backends are correctly represented in the /-/routes and /-/healthz management endpoints, and fixing a potential infinite loop in the serving() method when scale-to-zero applications are present. Additionally, the reviewer suggested a more robust selection for the default gRPC backend.

I am having trouble creating individual review comments. Click here to see my feedback.

python/ray/serve/_private/haproxy.py (481-527)

medium

The logic for resolving health check parameters (fall, rise, inter, etc.) and constructing the default-server directive is duplicated from build_health_check_config. This should be refactored into a shared helper method to improve maintainability and reduce code duplication.

python/ray/serve/_private/haproxy.py (940)

medium

The health_route_info is currently built using only http_backends. This means the /-/routes endpoint will not include gRPC applications, even though they have valid route prefixes. It is better to use the full backends list to provide a complete view of all applications.

            health_route_info = self.cfg.build_health_route_info(backends)

python/ray/serve/_private/haproxy.py (947)

medium

The healthz_rules are rendered using only http_backends. In a deployment with only gRPC applications, http_backends will be empty, causing the /-/healthz endpoint to not render any health check rules and potentially return a 404 or 503 error. This would lead external load balancers to incorrectly mark the node as unhealthy. Passing the full backends list ensures that gRPC backends are also considered in the global health check.

                    "backends": backends,

python/ray/serve/_private/haproxy.py (1257-1260)

medium

Removing the filter for gRPC backends in serving() introduces a potential hang for applications configured with min_replicas=0 (scale-to-zero). The current logic requires every backend in stats (which includes all configured backends) to have at least one UP server. For scale-to-zero apps, no replicas will be started until a request is received, meaning ready_backends will never match all_backends, causing this loop to run indefinitely. The logic should be updated to only wait for backends that are expected to have at least one replica.

python/ray/serve/_private/haproxy_templates.py (201)

medium

Using grpc_backends[0] as the default_backend for gRPC Healthz and ListApplications requests is fragile. If the first gRPC application in the list is unhealthy or has no replicas, these global management requests will fail even if other gRPC applications are healthy. Consider using a dedicated backend that aggregates all gRPC replicas or implementing a more robust fallback mechanism. Additionally, the or 'unknown' check is redundant as the name field is mandatory in BackendConfig.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale marked this pull request as ready for review May 11, 2026 21:29
@akyang-anyscale akyang-anyscale requested a review from a team as a code owner May 11, 2026 21:29
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 137c94c. Configure here.

Comment thread python/ray/serve/_private/haproxy.py
Signed-off-by: akyang-anyscale <alexyang@anyscale.com>
@akyang-anyscale akyang-anyscale merged commit 6ca5f8a into ray-project:hap-ga-1 May 11, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

go add ONLY when ready to merge, run all tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants