[serve] Add default configuration for autoscaling that works out of the box #42613

Closed
zcin opened this issue Jan 23, 2024 · 3 comments · Fixed by #42850
Labels: enhancement, P1, ray 2.10, serve

Comments

zcin (Contributor) commented Jan 23, 2024

Provide a default configuration for autoscaling that works out of the box.

zcin added the enhancement, serve, ray-team-created, and ray 2.10 labels Jan 23, 2024
zcin self-assigned this Jan 23, 2024
edoakes added the P1 label and removed the ray-team-created label Jan 25, 2024

edoakes (Contributor) commented Jan 25, 2024

Specifically, setting `num_replicas="auto"` should be sufficient for basic POCs.

edoakes (Contributor) commented Jan 29, 2024

One issue here is that the default for `max_concurrent_queries` (100) is currently too high. We should change this in an upcoming release.

As a temporary fix, when `num_replicas="auto"` is set and `max_concurrent_queries` is unset, we'll override the default with a lower value.

Open question: what should the defaults be? To start, let's go with `max_concurrent_queries=5` and `target_num_ongoing_requests_per_replica = 0.4 * max_concurrent_queries = 2`.

edoakes (Contributor) commented Jan 30, 2024

Plan is:

  • `target_num_ongoing_requests_per_replica` will default to `0.4 * max_concurrent_queries` when not set.
  • For 2.10, when `num_replicas="auto"` is set AND `max_concurrent_queries` is not set AND `target_num_ongoing_requests_per_replica` is not set, we’ll set `max_concurrent_queries=5` (value is up for discussion).
  • We’ll also add a warning in 2.10 that the default will change, and then change the default entirely in 2.11.
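
The plan above can be read as a small defaulting rule. The following is only a sketch of that rule in plain Python, not Ray Serve's actual implementation: the function name `resolve_concurrency_defaults` and the constant names are hypothetical, and the value 5 is the proposal from this thread, which is still up for discussion.

```python
from typing import Optional, Tuple, Union

# Values taken from the discussion above; the names are hypothetical.
TARGET_RATIO = 0.4                    # target = 0.4 * max_concurrent_queries
AUTO_MAX_CONCURRENT_QUERIES = 5       # proposed default when num_replicas="auto"
CURRENT_MAX_CONCURRENT_QUERIES = 100  # existing default described as too high


def resolve_concurrency_defaults(
    num_replicas: Union[int, str, None],
    max_concurrent_queries: Optional[int] = None,
    target_num_ongoing_requests_per_replica: Optional[float] = None,
) -> Tuple[int, float]:
    """Return (max_concurrent_queries, target) after applying the planned defaults."""
    if max_concurrent_queries is None:
        # The lower default applies only when "auto" is set AND both
        # concurrency-related options were left unset by the user.
        if num_replicas == "auto" and target_num_ongoing_requests_per_replica is None:
            max_concurrent_queries = AUTO_MAX_CONCURRENT_QUERIES
        else:
            max_concurrent_queries = CURRENT_MAX_CONCURRENT_QUERIES

    if target_num_ongoing_requests_per_replica is None:
        # The target is derived from whatever max_concurrent_queries ended up being.
        target_num_ongoing_requests_per_replica = TARGET_RATIO * max_concurrent_queries

    return max_concurrent_queries, target_num_ongoing_requests_per_replica


if __name__ == "__main__":
    print(resolve_concurrency_defaults("auto"))
    print(resolve_concurrency_defaults("auto", max_concurrent_queries=10))
    print(resolve_concurrency_defaults(3))
```

Under these assumptions, a user who sets only `num_replicas="auto"` gets the lower concurrency limit and a target derived from it, while any explicitly set value is left alone.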

zcin added a commit to zcin/ray that referenced this issue Jan 31, 2024
Add default configuration for autoscaling that works out of the box. This can be used by setting `num_replicas="auto"`.
Closes ray-project#42613

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
zcin added a commit to zcin/ray that referenced this issue Feb 1, 2024
Add default configuration for autoscaling that works out of the box. This can be used by setting `num_replicas="auto"`.

Relationship between `num_replicas` and `autoscaling_config`:
- If `num_replicas="auto"` is set without setting `autoscaling_config`, a default autoscaling configuration that works out of the box will be used.
- If `num_replicas="auto"` and `autoscaling_config` are both set, then the fields in `autoscaling_config` will override those of the default autoscaling configuration used by `num_replicas="auto"`.
- If `num_replicas` is not set and `autoscaling_config` is set, the behavior doesn't change.

Relationship between `num_replicas` and `max_concurrent_queries`:
- If `num_replicas="auto"` and `max_concurrent_queries` is unset, `max_concurrent_queries` will be overridden with a new default (5).
- If `num_replicas="auto"` and `max_concurrent_queries` is manually configured, nothing is modified.

Since `num_replicas="auto"` is a new API, there is no migration plan.

Closes ray-project#42613

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
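
The override rules in this commit message can be illustrated with a dictionary merge. This is a sketch only, not the code from the commit: the helper `resolve_autoscaling` is hypothetical, and the values in `PLACEHOLDER_AUTO_DEFAULTS` are placeholders chosen for illustration, since the thread does not list the actual default autoscaling configuration.

```python
from typing import Any, Dict, Optional, Tuple, Union

# Placeholder values for illustration; the real defaults are not given in this thread.
PLACEHOLDER_AUTO_DEFAULTS: Dict[str, Any] = {
    "min_replicas": 1,
    "max_replicas": 100,
    "target_num_ongoing_requests_per_replica": 2,
}
AUTO_MAX_CONCURRENT_QUERIES = 5  # the new default named in the commit message


def resolve_autoscaling(
    num_replicas: Union[int, str, None] = None,
    autoscaling_config: Optional[Dict[str, Any]] = None,
    max_concurrent_queries: Optional[int] = None,
) -> Tuple[Optional[Dict[str, Any]], Optional[int]]:
    """Apply the rules above; None means 'left for existing behavior to decide'."""
    if num_replicas == "auto":
        # User-supplied fields win over the defaults, field by field.
        config = {**PLACEHOLDER_AUTO_DEFAULTS, **(autoscaling_config or {})}
        if max_concurrent_queries is None:
            max_concurrent_queries = AUTO_MAX_CONCURRENT_QUERIES
        return config, max_concurrent_queries

    # Without "auto", nothing is modified.
    return autoscaling_config, max_concurrent_queries


if __name__ == "__main__":
    print(resolve_autoscaling("auto"))
    print(resolve_autoscaling("auto", {"max_replicas": 10}))
    print(resolve_autoscaling(None, {"min_replicas": 2, "max_replicas": 4}))
```

The merge is per field, so a user who passes only `max_replicas` alongside `num_replicas="auto"` still inherits the remaining defaults.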
edoakes pushed a commit that referenced this issue Feb 2, 2024
Closes #42613

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
tterrysun pushed a commit to tterrysun/ray that referenced this issue Feb 14, 2024
Closes ray-project#42613

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: tterrysun <terry@anyscale.com>