[RayService] Add e2e tests #1167

Merged: 11 commits, Jun 17, 2023

Contributor

@zcin zcin commented Jun 14, 2023

Why are these changes needed?

This PR adds four e2e tests for RayService:

  • Deploy two applications
  • Deploy, then perform an in-place update
  • Deploy, then perform a zero-downtime rollout
  • Autoscaling

Run the e2e tests with the following command:

RAY_IMAGE=rayproject/ray:2.5.0 OPERATOR_IMAGE=controller:latest pytest -vs tests/test_sample_rayservice_yamls.py --log-cli-level=INFO

To run a specific test in test_sample_rayservice_yamls.py, specify the test name with -k, e.g.:

RAY_IMAGE=rayproject/ray:2.5.0 OPERATOR_IMAGE=controller:latest pytest -vs tests/test_sample_rayservice_yamls.py -k test_service_autoscaling --log-cli-level=INFO
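For anyone scripting these runs, the two commands above can be assembled from one place. This is only a sketch: `build_e2e_command` and `render` are hypothetical helper names that are not part of this PR, and the image tags and test path are copied from the commands quoted above.

```python
import shlex

# Values copied from the commands quoted in the PR description.
ENV = {
    "RAY_IMAGE": "rayproject/ray:2.5.0",
    "OPERATOR_IMAGE": "controller:latest",
}
TEST_FILE = "tests/test_sample_rayservice_yamls.py"


def build_e2e_command(test_name=None):
    """Hypothetical helper: return the pytest argv, optionally narrowed with -k."""
    argv = ["pytest", "-vs", TEST_FILE]
    if test_name:
        # -k selects tests whose names match the given expression.
        argv += ["-k", test_name]
    argv.append("--log-cli-level=INFO")
    return argv


def render(test_name=None):
    """Render the full shell line, with the environment variables as a prefix."""
    env_prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in ENV.items())
    return f"{env_prefix} {shlex.join(build_e2e_command(test_name))}"


print(render())
print(render("test_service_autoscaling"))
```

To execute rather than print, the argv list could be passed to `subprocess.run` with `env={**os.environ, **ENV}`.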

Related issue number

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

wip
Signed-off-by: cindyz <cindyz@anyscale.com>
@zcin zcin requested a review from kevin85421 June 14, 2023 21:19
cindyz added 3 commits June 14, 2023 23:34
wip
Signed-off-by: cindyz <cindyz@anyscale.com>
wip
Signed-off-by: cindyz <cindyz@anyscale.com>
wip
Signed-off-by: cindyz <cindyz@anyscale.com>
@zcin zcin marked this pull request as ready for review June 15, 2023 16:19
@zcin zcin changed the title [WIP][RayService] Add e2e tests [RayService] Add e2e tests Jun 15, 2023
@zcin zcin requested a review from sihanwang41 June 15, 2023 17:13
Signed-off-by: cindyz <cindyz@anyscale.com>
cindyz added 2 commits June 16, 2023 18:27
Signed-off-by: cindyz <cindyz@anyscale.com>
Signed-off-by: cindyz <cindyz@anyscale.com>
Contributor

@shrekris-anyscale shrekris-anyscale left a comment

Thanks for addressing all my comments! The change looks good to me.

Member

@kevin85421 kevin85421 left a comment

I left some comments; the rest looks good to me! Having these tests will be very helpful!

ray-operator/config/samples/ray-service.autoscaler.yaml (outdated)
routePrefix: "/"
rayActorOptions:
numCpus: 0.1
serveConfigV2: |
Member

How many CPUs will these applications utilize?

Contributor Author

Each replica uses 0.5 CPUs. The test should scale to 14 replicas, which requires 7 CPUs.
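The arithmetic behind that figure, as a minimal sketch. `required_cpus` is a hypothetical helper name; the replica count and per-replica CPU share are the numbers quoted in this thread.

```python
def required_cpus(replicas: int, cpus_per_replica: float) -> float:
    """Total CPUs needed when every replica reserves the same CPU share."""
    return replicas * cpus_per_replica


# Figures quoted in the review thread: 14 replicas at 0.5 CPUs each.
total = required_cpus(replicas=14, cpus_per_replica=0.5)
print(total)  # 7.0
```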

######################headGroupSpecs#################################
headGroupSpec:
# The `rayStartParams` are used to configure the `ray start` command.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayStartParams.md for the default settings of `rayStartParams` in KubeRay.
# See https://docs.ray.io/en/latest/cluster/cli.html#ray-start for all available options in `rayStartParams`.
rayStartParams: {"num-cpus": "0"}
rayStartParams:
Member

What is the reason for this change?

Contributor Author

I didn't base these changes on the original autoscaler file. Instead, I added autoscaling configurations to the original config/samples/ray_v1alpha1_rayservice.yaml file.

Signed-off-by: cindyz <cindyz@anyscale.com>
cindyz added 2 commits June 16, 2023 20:41
Signed-off-by: cindyz <cindyz@anyscale.com>
Signed-off-by: cindyz <cindyz@anyscale.com>

scale_up_rule = AutoscaleRule(
    query={"path": "/", "json_args": {}},
    num_repeat=20,
Member

Does it mean sending 20 requests? How do we know it will scale up to 14 replicas (7 CPUs) instead of another number, such as 5 replicas?

Member

I do not understand when the Serve autoscaling will be triggered.

Contributor Author

@zcin zcin Jun 16, 2023

I use the deployment from https://github.com/ray-project/serve_workloads/blob/main/autoscaling_test/blocked.py. Requests sent to this deployment block indefinitely until https://github.com/ray-project/serve_workloads/blob/main/autoscaling_test/signaling.py releases the "lock". The target number of ongoing requests per replica is 1, so Serve will try to add enough replicas to handle all 20 requests, since every request stays blocked.
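The mechanism described here can be illustrated with a small simulation in plain asyncio. This is a sketch under stated assumptions, not the code in blocked.py or signaling.py: the `Signal` and `BlockedDeployment` classes and the `desired_replicas` rule are hypothetical stand-ins, and the rule is a simplification rather than Ray Serve's actual autoscaling policy. It does not model replica limits or cluster capacity, so it says nothing about why the test expects 14 replicas.

```python
import asyncio
import math


class Signal:
    """Hypothetical stand-in for the signaling component: holds callers until released."""

    def __init__(self):
        self._event = asyncio.Event()

    async def wait(self):
        await self._event.wait()

    def release(self):
        self._event.set()


class BlockedDeployment:
    """Hypothetical stand-in for the blocked deployment: each request waits on the signal."""

    def __init__(self, signal):
        self.signal = signal
        self.ongoing = 0  # requests that have started but not finished

    async def handle(self):
        self.ongoing += 1
        try:
            await self.signal.wait()
            return "ok"
        finally:
            self.ongoing -= 1


def desired_replicas(ongoing_requests, target_per_replica):
    """Simplified rule (an assumption): enough replicas to meet the per-replica target."""
    return math.ceil(ongoing_requests / target_per_replica)


async def simulate(num_requests=20):
    signal = Signal()
    deployment = BlockedDeployment(signal)
    tasks = [asyncio.create_task(deployment.handle()) for _ in range(num_requests)]

    # Yield to the event loop until every request has started and is blocked.
    while deployment.ongoing < num_requests:
        await asyncio.sleep(0)

    blocked = deployment.ongoing
    wanted = desired_replicas(blocked, target_per_replica=1)

    # Releasing the signal lets every blocked request finish.
    signal.release()
    results = await asyncio.gather(*tasks)
    return blocked, wanted, len(results), deployment.ongoing


blocked, wanted, finished, remaining = asyncio.run(simulate())
print(blocked, wanted, finished, remaining)
```

Because no request completes until the signal is released, the ongoing count stays at 20 for as long as the test needs, which is what keeps the scale-up pressure stable.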

Member

Thank you for the explanation! We may need to add comments to both this test and the YAML file.

Contributor Author

Good point. I've added comments to both the test and the YAML file.

Member

@kevin85421 kevin85421 left a comment

LGTM


Signed-off-by: cindyz <cindyz@anyscale.com>
@kevin85421 kevin85421 merged commit 8db4f6d into ray-project:master Jun 17, 2023
20 checks passed
@zcin zcin deleted the services-e2e-tests branch August 25, 2023 17:35
lowang-bh pushed a commit to lowang-bh/kuberay that referenced this pull request Sep 24, 2023

4 participants