Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[serve] reduce remote calls for get handle APIs #44812

Merged
merged 1 commit into from
Apr 22, 2024

Conversation

zcin
Copy link
Contributor

@zcin zcin commented Apr 17, 2024

[serve] reduce remote calls for get handle APIs

This PR makes two changes to the get handle APIs:

  1. First, it optimizes serve.get_app_handle, which made two separate calls to the controller: one to get the ingress deployment name for the specified application, and one to check if the deployment exists. This is unnecessary and can be squashed into one call; in other words the second call doesn't need to be made.
  2. serve.get_deployment_handle makes one call to the controller to check if the specified deployment exists. This remains the default behavior, but this PR adds a private flag _check_exists which can be set to False if the user wants to bypass that check for performance reasons.

Signed-off-by: Cindy Zhang cindyzyx9@gmail.com

This PR makes two changes to the get handle APIs:
1. First, it optimizes `serve.get_app_handle`, which made two separate calls to the controller: one to get the ingress deployment name for the specified application, and one to check if the deployment exists. This is unnecessary and can be squashed into one call; in other words the second call doesn't need to be made.
2. `serve.get_deployment_handle` makes one call to the controller to check if the specified deployment exists. This remains the default behavior, but this PR adds a private flag `_check_exists` which can be set to False if the user wants to bypass that check for performance reasons.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
@zcin zcin marked this pull request as ready for review April 17, 2024 20:42
@zcin zcin requested a review from edoakes April 17, 2024 20:42
@JoshKarpel
Copy link
Contributor

Thanks for jumping on this so quickly @zcin , hugely appreciated!

While we're continuing to debug our performance issues, I think we spotted another place that could set check_exists=False, in the LongestPrefixRouter

handle = self._get_handle(endpoint.name, endpoint.app_name).options(

There's some layers of indirection here, but ultimately this update does come from the Serve Controller directly, if I understood the code correctly, so it probably doesn't need to check for deployment existence again.

@zcin
Copy link
Contributor Author

zcin commented Apr 22, 2024

Thanks for jumping on this so quickly @zcin , hugely appreciated!

While we're continuing to debug our performance issues, I think we spotted another place that could set check_exists=False, in the LongestPrefixRouter

handle = self._get_handle(endpoint.name, endpoint.app_name).options(

There's some layers of indirection here, but ultimately this update does come from the Serve Controller directly, if I understood the code correctly, so it probably doesn't need to check for deployment existence again.

@JoshKarpel Good catch, I believe that is an optimization we can make. I can address this in a separate PR!

@zcin
Copy link
Contributor Author

zcin commented Apr 22, 2024

@edoakes Is this ready to merge?

@JoshKarpel
Copy link
Contributor

Thanks for jumping on this so quickly @zcin , hugely appreciated!
While we're continuing to debug our performance issues, I think we spotted another place that could set check_exists=False, in the LongestPrefixRouter

handle = self._get_handle(endpoint.name, endpoint.app_name).options(

There's some layers of indirection here, but ultimately this update does come from the Serve Controller directly, if I understood the code correctly, so it probably doesn't need to check for deployment existence again.

@JoshKarpel Good catch, I believe that is an optimization we can make. I can address this in a separate PR!

Awesome, sounds good!

@edoakes edoakes merged commit 3de13e9 into ray-project:master Apr 22, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Serve] Remove unnecessary checks in serve.get_app_handle/ServeControllerClient.get_handle()
3 participants