[serve] Cherry-pick fix for router queue length metric #38020

Merged 1 commit into ray-project:releases/2.6.3 on Aug 8, 2023

@edoakes (Contributor) commented Aug 2, 2023

Why are these changes needed?

Cherry-pick #37965

Related issue number

#37943

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

…isconnects (ray-project#37965)

The `ray_serve_deployment_queued_queries` metric tracks the number of queries that have not yet been assigned a replica. If a client disconnects after the metric has counted its query but before the query has been assigned a replica, the query terminates but the metric is never decremented.

This change decrements `ray_serve_deployment_queued_queries` when the client of a queued request disconnects.

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
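
The failure mode described in the commit message can be illustrated with a minimal sketch. This is not the Ray Serve implementation: `QueueLengthGauge`, `Router`, and `assign_replica` are hypothetical names, and the sketch assumes a client disconnect surfaces as cancellation of the asyncio task that is waiting for a replica. The point it shows is that decrementing in a `finally` block covers both the normal assignment path and the disconnect path.

```python
import asyncio


class QueueLengthGauge:
    """Hypothetical stand-in for a queue-length metric gauge."""

    def __init__(self):
        self.value = 0

    def inc(self):
        self.value += 1

    def dec(self):
        self.value -= 1


class Router:
    """Hypothetical router that counts queries waiting for a replica."""

    def __init__(self):
        self.queued = QueueLengthGauge()
        # Never set in this sketch, so every query stays queued.
        self._replica_available = asyncio.Event()

    async def assign_replica(self, query):
        self.queued.inc()
        try:
            await self._replica_available.wait()
            return f"replica-for-{query}"
        finally:
            # Runs on successful assignment and on cancellation
            # (assumed here to model a client disconnect).
            self.queued.dec()


async def demo():
    router = Router()
    task = asyncio.create_task(router.assign_replica("q1"))
    await asyncio.sleep(0)  # let the task start and increment the gauge
    while_queued = router.queued.value
    task.cancel()  # simulate the client disconnecting
    try:
        await task
    except asyncio.CancelledError:
        pass
    return while_queued, router.queued.value


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If the decrement were placed after the `await` instead of in `finally`, the cancelled task would skip it and the gauge would stay at 1, which is the leak this PR addresses.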
@aslonnie aslonnie removed their request for review August 2, 2023 19:32
@aslonnie (Collaborator) commented Aug 2, 2023

(I am not qualified to review this.)

@zhe-thoughts zhe-thoughts self-assigned this Aug 2, 2023
@zhe-thoughts (Collaborator) commented

@edoakes This is a P1 fix. Is it OK to put it in 2.7?

@edoakes (Contributor, Author) commented Aug 2, 2023

@zhe-thoughts we discovered today that the bug affects not only observability but also our autoscaling behavior, preventing downscaling in some circumstances. This bumps it to more like a P0.5. We would like to get it in if possible, but if it delays the release significantly then I think we can stomach 2.7.

@rickyyx rickyyx changed the base branch from releases/2.6.2 to releases/2.6.3 August 7, 2023 21:36
@rickyyx rickyyx merged commit 1945d8f into ray-project:releases/2.6.3 Aug 8, 2023
30 of 35 checks passed
5 participants