[Serve] Add RouterConfig field to DeploymentConfig to configure RequestRouter #53870

eicherseiji · 2025-06-17T00:54:41Z

Why are these changes needed?

Follow-up to #52725.

Allow custom router kwargs in LLMConfig/DeploymentConfig

Design doc. We also discussed passing kwargs directly to the derived class's __init__, but were concerned that typos could get swallowed by **kwargs in RequestRouter.__init__. Instead, an initialize_state override that does not accept **kwargs raises a TypeError on an unknown key, e.g.:

class RequestRouter:

	def __init__(self, ..., request_initializer_config: dict):

		self.initialize_state(**request_initializer_config)

	def initialize_state(self, **kwargs):
		pass


class MyRouter(RequestRouter):

	def initialize_state(self, threshold: float = 0.1):
		...

	def choose_replica(self, ...):
		...

# Result is TypeError: initialize_state() got an unexpected keyword argument 'thresh'
@serve.deployment(cls=MyRouter, cls_kwargs={"thresh": 0.2})
class Deploy:
	pass
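The typo-catching behavior described above can be reproduced without Ray. The following is a minimal sketch, assuming a stand-in `RequestRouter` base class whose constructor takes only a `request_router_kwargs` dict (a hypothetical simplification; the real constructor takes further arguments that are elided above):

```python
from typing import Any, Dict, Optional


class RequestRouter:
    """Stand-in base class for illustration; not the Ray Serve class."""

    def __init__(self, request_router_kwargs: Optional[Dict[str, Any]] = None):
        # The base constructor forwards only the user-supplied dict to the hook.
        self.initialize_state(**(request_router_kwargs or {}))

    def initialize_state(self, **kwargs):
        # Default hook: accepts anything and does nothing.
        pass


class MyRouter(RequestRouter):
    def initialize_state(self, threshold: float = 0.1):
        # No **kwargs here, so Python itself rejects unknown keys.
        self.threshold = threshold


ok = MyRouter({"threshold": 0.2})

try:
    MyRouter({"thresh": 0.2})  # misspelled key
    typo_error = None
except TypeError as exc:
    typo_error = str(exc)
```

Because the override declares its parameters explicitly, the misspelled key fails at construction time instead of being silently ignored.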

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

eicherseiji · 2025-06-18T18:46:41Z

kwargs can be passed to a custom router class like so:

from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app
from ray.serve._private.request_router.prefix_aware_router import PrefixAwarePow2ReplicaRouter

llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="deepseek",
        model_source="qwen/Qwen2.5-7B-Instruct",
    ),
    runtime_env=dict(
        env_vars={"VLLM_USE_V1": "1"}
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=1),
        request_router_class=PrefixAwarePow2ReplicaRouter,
        request_router_kwargs=dict(
            imbalanced_threshold=9,
        )
    ),
    engine_kwargs=dict(
        tensor_parallel_size=2,
        pipeline_parallel_size=2,
        gpu_memory_utilization=0.92,
        dtype="auto",
        max_num_seqs=40,
        max_model_len=16384,
        enable_chunked_prefill=True,
        enable_prefix_caching=True,
        trust_remote_code=True,
    ),
)

app = build_openai_app({"llm_configs": [llm_config]})
serve.run(app, blocking=True)

Copilot

Pull Request Overview

This PR enables passing custom keyword arguments (request_router_kwargs) through the Serve configuration to user‐provided request router classes. Additionally, it refactors the prefix‐aware router’s eviction loop from asyncio tasks to a background thread and updates related tests.

- Add request_router_kwargs field in protobuf, config model, deployment API, and router
- Wire serialization/deserialization of request_router_kwargs
- Refactor eviction loop in prefix_tree.py from asyncio to threading
- Update tests to call the router’s private selection method

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
src/ray/protobuf/serve.proto	Add new `bytes request_router_kwargs` field to `DeploymentConfig`
python/ray/serve/deployment.py	Expose `request_router_kwargs` in `options()` and apply to config
python/ray/serve/_private/router.py	Forward `request_router_kwargs` to router constructor and store it
python/ray/serve/_private/config.py	Define `request_router_kwargs`, validate JSON, and handle proto I/O
python/ray/llm/tests/serve/cpu/deployments/test_prefix_aware_request_router.py	Replace public call with private `_choose_replica_for_request`
python/ray/llm/_internal/serve/request_router/prefix_aware/prefix_tree.py	Convert eviction loop from `asyncio` to a background `threading.Thread`

Comments suppressed due to low confidence (2)

python/ray/serve/deployment.py:241

[nitpick] Inserting a new parameter into the middle of the options() signature can break callers using positional arguments. Consider making it keyword-only or placing it at the end with a default.

        request_router_kwargs: Default[Union[Dict, None]] = DEFAULT.VALUE,
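The keyword-only suggestion can be illustrated with a short sketch. The `options` function below is a hypothetical stand-in with made-up parameters and defaults, not the Ray Serve signature:

```python
from typing import Any, Dict, Optional


def options(
    num_replicas: int = 1,
    max_ongoing_requests: int = 5,
    *,  # everything after this marker must be passed by keyword
    request_router_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in for an options() method."""
    return {
        "num_replicas": num_replicas,
        "max_ongoing_requests": max_ongoing_requests,
        "request_router_kwargs": request_router_kwargs or {},
    }


# Existing positional callers keep their meaning.
positional = options(2, 10)

# The new parameter is reachable only by name.
by_keyword = options(2, request_router_kwargs={"imbalanced_threshold": 9})

try:
    options(2, 10, {"imbalanced_threshold": 9})
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)
```

With the bare `*` in place, a new parameter cannot shift the position of any existing one.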

python/ray/llm/tests/serve/cpu/deployments/test_prefix_aware_request_router.py:127

[nitpick] The test now calls a private method (_choose_replica_for_request) instead of the public API (choose_replica_for_request). It's better to test via the public interface to avoid coupling to internal implementation.

            chosen = await prefix_request_router._choose_replica_for_request(req)

python/ray/serve/_private/config.py

python/ray/llm/_internal/serve/request_router/prefix_aware/prefix_tree.py

python/ray/serve/_private/router.py

eicherseiji · 2025-06-18T23:28:14Z

Hi @kouroshHakha! This is ready for your review.

src/ray/protobuf/serve.proto

kouroshHakha

Two points:

1. Let's separate out the Serve-only changes from the eviction thread changes and review the Serve changes with the Serve team.
2. Let's talk about the request router kwargs. The original intention of the design was to not expose the complexity of the RequestRouter constructor to the user. Right now the request_router_kwargs are passed through to the constructor, which mixes them with the other kwargs that were supposed to stay hidden. Here is my proposal:

Modify the RequestRouter's constructor and interface this way:

class RequestRouter:
    def __init__(self, ..., custom_init_kwargs=...):
        ...
        self.init(**custom_init_kwargs)

    def init(self, **kwargs):
        # Custom initialization for the RequestRouter. Called after the base constructor __init__ is done.
        pass

This way when I inherit this class I can simply do:

class MyRouter(RequestRouter):

    def init(self, param1=None):
        self.param1 = param1

    def choose_replica(self, ...):
        # Create a policy based on self.param1.
        ...


@serve.deployment(request_router_class=MyRouter, request_router_init_kwargs={"param1": 10})
class MyDeployment:
    ...
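One way to picture the proposal is a small config object that bundles the router class with its user-facing kwargs, while the framework supplies the hidden constructor arguments. This is only a sketch: `RouterConfigSketch`, its `build` method, and the `deployment_id` argument are hypothetical names chosen for illustration, not the API added by this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Type


class RequestRouter:
    """Stand-in base class for illustration; not the Ray Serve class."""

    def __init__(self, deployment_id: str, custom_init_kwargs: Dict[str, Any]):
        self.deployment_id = deployment_id  # framework-owned argument (hypothetical)
        self.init(**custom_init_kwargs)     # user-owned arguments

    def init(self, **kwargs) -> None:
        # Hook for subclasses; runs after the base constructor has set its state.
        pass


class MyRouter(RequestRouter):
    def init(self, param1=None) -> None:
        self.param1 = param1


@dataclass
class RouterConfigSketch:
    """Carries only what the user chooses: the class and its init kwargs."""

    router_class: Type[RequestRouter] = RequestRouter
    init_kwargs: Dict[str, Any] = field(default_factory=dict)

    def build(self, deployment_id: str) -> RequestRouter:
        # The framework, not the user, passes the constructor's own arguments.
        return self.router_class(deployment_id, custom_init_kwargs=self.init_kwargs)


router = RouterConfigSketch(MyRouter, {"param1": 10}).build("my_deployment")
```

The user never sees `deployment_id`, and the constructor never has to accept arbitrary user kwargs.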

python/ray/serve/_private/request_router/request_router.py

python/ray/serve/config.py

src/ray/protobuf/serve.proto

eicherseiji · 2025-06-30T20:06:45Z

Implementation as aligned in design doc

angelinalg

Some formatting issues and style nits.

java/serve/src/main/java/io/ray/serve/config/RouterConfig.java

python/ray/serve/_private/request_router/request_router.py

python/ray/serve/config.py

src/ray/protobuf/serve.proto

python/ray/serve/config.py

python/ray/serve/_private/request_router/request_router.py

kouroshHakha

cool. I left some comments with a 5 min review with more focus around the llm specific changes. I think @abrarsheikh / @zcin should take a closer look at the serve changes.

doc/source/serve/api/index.md

python/ray/serve/config.py

python/ray/serve/_private/request_router/prefix_aware_router.py

python/ray/serve/_private/router.py

abrarsheikh

lg2m

kouroshHakha

A few comments are left. The most important one is that the API's stability is still Alpha, so don't mark it as stable.

python/ray/llm/_internal/serve/request_router/prefix_aware/prefix_aware_router.py

python/ray/serve/config.py


…stRouter (ray-project#53870) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Signed-off-by: joshlee <joshlee@anyscale.com>

eicherseiji added the go label Jun 18, 2025

eicherseiji self-assigned this Jun 18, 2025

eicherseiji marked this pull request as ready for review June 18, 2025 18:48

Copilot AI review requested due to automatic review settings June 18, 2025 18:48

eicherseiji requested review from a team as code owners June 18, 2025 18:48

Copilot AI reviewed Jun 18, 2025

eicherseiji force-pushed the prefix-router branch from 9e660d4 to 58174ac Compare June 18, 2025 23:26

kouroshHakha reviewed Jun 19, 2025

src/ray/protobuf/serve.proto (outdated, resolved)

kouroshHakha reviewed Jun 19, 2025

eicherseiji force-pushed the prefix-router branch from 768de7c to 98e11ef Compare June 19, 2025 20:44

eicherseiji commented Jun 25, 2025

python/ray/serve/_private/request_router/request_router.py (outdated, resolved)

abrarsheikh reviewed Jun 27, 2025

python/ray/serve/_private/request_router/request_router.py (outdated, resolved)

python/ray/serve/config.py (outdated, resolved)

src/ray/protobuf/serve.proto (outdated, resolved)

eicherseiji requested a review from a team as a code owner June 30, 2025 19:05

eicherseiji requested review from abrarsheikh and kouroshHakha June 30, 2025 19:59

eicherseiji force-pushed the prefix-router branch 2 times, most recently from 3cde81f to 41676a8 Compare July 1, 2025 22:27

eicherseiji changed the title ~~Pass parameters to custom routers through LLMConfig~~ Add RouterConfig field to DeploymentConfig to configure RequestRouter Jul 1, 2025

angelinalg approved these changes Jul 2, 2025

eicherseiji commented Jul 3, 2025

python/ray/serve/_private/request_router/request_router.py (outdated, resolved)

eicherseiji mentioned this pull request Jul 3, 2025

Enable field documentation with Pydantic #54306

Draft

9 tasks

kouroshHakha reviewed Jul 3, 2025

eicherseiji requested a review from zcin July 3, 2025 22:01

abrarsheikh approved these changes Jul 7, 2025

eicherseiji requested a review from kouroshHakha July 7, 2025 18:49

kouroshHakha reviewed Jul 8, 2025

python/ray/llm/_internal/serve/request_router/prefix_aware/prefix_aware_router.py (outdated, resolved)

python/ray/serve/config.py (outdated, resolved)

eicherseiji and others added 24 commits July 9, 2025 14:06

Fix java files

3b01a12

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Pickle/unpickle request_router_kwargs

f5444c8

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Add to API .rst, don't serialize bytes, update tests

3c403e2

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Update comments

13db8dd

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Fix tests to use RouterConfig, document attributes

b7b3c1d

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Fix bad rebase

07462aa

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Fix test to use RouterConfig

e408b13

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Set router_kwargs to empty bytes in Java

f73df36

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Fix test to use RouterConfig

b415a2f

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Fix ThroughputAwareRequestRouterApp

6f77624

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Only support request_router_kwargs in Python

8c03deb

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Lint

67a8453

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Add test

2776012

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Apply suggestions from code review

7e1850e

Co-authored-by: angelinalg <122562471+angelinalg@users.noreply.github.com>
Signed-off-by: Seiji Eicher <58963096+eicherseiji@users.noreply.github.com>

Improve RouterConfig documentation

dcf843a

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Sphinx format

42d08c3

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Rename RouterConfig -> RequestRouterConfig

1c5b556

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Add docstring to initialize_state and move to ray.llm

f867e09

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Rename serialized_request_router_cls -> _serialized_request_router_cls, make it private

0d3938c

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Rename RouterConfig.java -> RequestRouterConfig.java

ffa0ef1

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Update Protobuf field name

e33402e

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Complete renaming

34aab98

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Correct API stability

b4d42e0

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

Remove PrefixAwarePow2ReplicaRouter.__init__

c3b0d44

Signed-off-by: Seiji Eicher <seiji@anyscale.com>

eicherseiji force-pushed the prefix-router branch from 9cca537 to c3b0d44 Compare July 9, 2025 21:08

kouroshHakha approved these changes Jul 9, 2025

kouroshHakha changed the title ~~Add RouterConfig field to DeploymentConfig to configure RequestRouter~~ [Serve] Add RouterConfig field to DeploymentConfig to configure RequestRouter Jul 9, 2025

kouroshHakha merged commit 5322950 into ray-project:master Jul 10, 2025
5 checks passed

kouroshHakha mentioned this pull request Jul 10, 2025

[Serve] DeepSeek-R1 mode load stuck in H20 #50975

Closed
