[serve] Raise error when multiplex on ingress deployment used with direct ingress by akyang-anyscale · Pull Request #64045 · ray-project/ray

akyang-anyscale · 2026-06-11T22:52:35Z

Serve currently does not support model multiplexing on the ingress deployment when direct ingress or HAProxy is enabled. Raise an error instead of silently serving the app without proper multiplexing support.

We do this in two ways:

- When the multiplexing decorator is used statically, we detect multiplexing early and raise an error while the controller is building the DeploymentInfos.
- When multiplexing is enabled dynamically, we check during the replica's initialization whether multiplexing is used, and raise an error if so.
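
The two detection paths above can be illustrated with a small, self-contained sketch. It does not use Ray: the `multiplexed` decorator, the marker attribute, and both helper functions are hypothetical stand-ins, not Serve's actual implementation. The point is only that a scan of the class finds a decorator applied at definition time, while a wrapper created inside `__init__` becomes visible only once an instance exists, which is why a second check at replica initialization is needed.

```python
import inspect

# Hypothetical marker attribute; the real decorator may record this differently.
_MARKER = "_sketch_is_multiplexed"


def multiplexed(func):
    """Toy stand-in for a multiplexing decorator: it only tags the function."""
    setattr(func, _MARKER, True)
    return func


def class_uses_multiplexing(cls) -> bool:
    """Static check: is any function defined on the class tagged?"""
    return any(
        getattr(member, _MARKER, False)
        for _, member in inspect.getmembers(cls, inspect.isfunction)
    )


def instance_uses_multiplexing(obj) -> bool:
    """Init-time check: also look at attributes assigned in __init__."""
    if class_uses_multiplexing(type(obj)):
        return True
    return any(getattr(value, _MARKER, False) for value in vars(obj).values())


class StaticIngress:
    @multiplexed
    async def get_model(self, model_id: str) -> str:
        return model_id


class DynamicIngress:
    def __init__(self):
        async def load(model_id: str) -> str:
            return model_id

        # Wrapped at construction time, so the class itself carries no tag.
        self.get_model = multiplexed(load)
```

Here `class_uses_multiplexing(DynamicIngress)` is false even though every instance ends up with a tagged `get_model`, so only the instance-level check catches the dynamic case.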

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

gemini-code-assist

Code Review

This pull request introduces validation to reject model multiplexing on the ingress deployment when direct ingress is enabled. It adds static detection of @serve.multiplexed decorators and implements the validation check at application build time, accompanied by unit and integration tests. The review feedback points out that the validation check should be updated to also verify if HAProxy is enabled (RAY_SERVE_ENABLE_HA_PROXY), ensuring consistency with the raised error message and preventing unsupported deployments under HAProxy.
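
A minimal sketch of the check the review asks for, covering both flags, is shown below. It is illustrative only: `RAY_SERVE_ENABLE_HA_PROXY` is the variable named in the review, while the direct-ingress variable name, the `"1"` convention for an enabled flag, and the function names are assumptions made for this example rather than Serve's actual code.

```python
import os
from typing import Mapping, Optional

HA_PROXY_ENV = "RAY_SERVE_ENABLE_HA_PROXY"  # named in the review feedback
DIRECT_INGRESS_ENV = "RAY_SERVE_ENABLE_DIRECT_INGRESS"  # assumed name


def _flag_enabled(env: Mapping[str, str], name: str) -> bool:
    # Assumption: a flag counts as enabled when it is set to "1".
    return env.get(name, "0") == "1"


def validate_ingress_multiplexing(
    is_ingress: bool,
    uses_multiplexing: bool,
    env: Optional[Mapping[str, str]] = None,
) -> None:
    """Raise if a multiplexed ingress deployment is combined with either flag."""
    env = os.environ if env is None else env
    unsupported_mode = _flag_enabled(env, DIRECT_INGRESS_ENV) or _flag_enabled(
        env, HA_PROXY_ENV
    )
    if is_ingress and uses_multiplexing and unsupported_mode:
        raise ValueError(
            "Model multiplexing on the ingress deployment is not supported "
            "when direct ingress or HAProxy is enabled."
        )
```

Passing the environment as a parameter keeps the check easy to test without mutating `os.environ`.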

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

cursor

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0c1a3a9.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

eicherseiji · 2026-06-12T06:42:26Z

        built_app.validate_single_fastapi_ingress()
+        # This task runs on the cluster, so its view of the direct-ingress flag
+        # mirrors the replicas' (they inherit this task's runtime_env).
+        built_app.validate_multiplexing_with_direct_ingress(


What if we added a uses_multiplexing bit to the deploy args proto? Then the controller can validate in deploy_applications just once instead of duplicating the check.

Will we be able to catch dynamically initialized multiplexing? See https://github.com/ray-project/ray/blob/master/python/ray/llm/_internal/serve/core/server/llm_server.py#L242-L244

@eicherseiji I'm pretty sure deploy_applications isn't covered in the declarative path, but I could add the check when creating the deployment info, which happens in both cases.

@abrarsheikh this method would not catch that. I think the only way to do that would be at replica initialization time, wdyt?

@akyang-anyscale Ah, DeploymentInfo makes sense then.

I would also support a check at replica initialization, for correctness in the dynamic multiplexing case.
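
The idea discussed in this thread, recording a uses_multiplexing bit once and validating it wherever the deployment info is created, might look roughly like the sketch below. `DeploymentInfoSketch` and `find_unsupported_ingress` are hypothetical names for illustration; they do not reflect Serve's actual DeploymentInfo class or proto schema.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DeploymentInfoSketch:
    """Hypothetical stand-in carrying only the fields this check needs."""

    name: str
    is_ingress: bool
    uses_multiplexing: bool


def find_unsupported_ingress(
    infos: Iterable[DeploymentInfoSketch], direct_ingress_enabled: bool
) -> List[str]:
    """Return names of ingress deployments that would need to be rejected."""
    if not direct_ingress_enabled:
        return []
    return [
        info.name for info in infos if info.is_ingress and info.uses_multiplexing
    ]
```

Because the flag travels with the deployment info, the same validation would run on any path that constructs it. As noted in the thread, it still would not see multiplexing that is only set up dynamically inside the replica.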

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

abrarsheikh · 2026-07-02T18:58:24Z

+        # Imported lazily to avoid a circular import at module load time
+        # (multiplex -> metrics -> context -> client -> application_state).
+        from ray.serve.multiplex import _callable_uses_multiplexing


Let's move _callable_uses_multiplexing into a utility file to break the circular dependency.

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

on build

a0897be

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

gemini-code-assist Bot reviewed Jun 11, 2026

Comment thread python/ray/serve/_private/build_app.py Outdated

akyang-anyscale added the "go" label (add ONLY when ready to merge, run all tests) Jun 11, 2026

akyang-anyscale marked this pull request as ready for review June 11, 2026 22:59

akyang-anyscale requested a review from a team as a code owner June 11, 2026 22:59

cursor Bot reviewed Jun 11, 2026

Comment thread python/ray/serve/_private/build_app.py Outdated

akyang-anyscale added 3 commits June 11, 2026 23:05

comment

6471881

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

use controller's view

7f51c11

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

Merge branch 'master' into alexyang/di-multiplex-raise

0c1a3a9

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

cursor Bot reviewed Jun 11, 2026

Comment thread python/ray/serve/_private/application_state.py Outdated

make private

13ac9f1

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

ray-gardener Bot added the "serve" label (Ray Serve Related Issue) Jun 12, 2026

eicherseiji reviewed Jun 12, 2026

akyang-anyscale added 5 commits June 26, 2026 06:51

merge

f5484c5

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

use deploy args

13f06c7

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

Merge branch 'master' into alexyang/di-multiplex-raise

a3b5cdd

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

pass byte in proto

079e8b0

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

lint

61c88ff

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

eicherseiji mentioned this pull request Jun 29, 2026

[serve][ci] Run all serve tests with HAProxy, drop the test whitelist #64210

Draft

marker

0106643

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

abrarsheikh reviewed Jul 2, 2026

akyang-anyscale added 2 commits July 2, 2026 19:02

Merge branch 'master' into alexyang/di-multiplex-raise

c3bd685

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

move to utils

21eea6b

Signed-off-by: akyang-anyscale <alexyang@anyscale.com>

akyang-anyscale wants to merge 13 commits into ray-project:master from akyang-anyscale:alexyang/di-multiplex-raise

eicherseiji Jun 12, 2026

abrarsheikh Jun 12, 2026

akyang-anyscale Jun 15, 2026

eicherseiji Jun 24, 2026

akyang-anyscale Jun 29, 2026

abrarsheikh Jul 2, 2026

akyang-anyscale Jul 2, 2026

3 participants
