
[ML] Ensure inference queue is cleared after shutdown #96738

Merged
merged 4 commits into from Jun 9, 2023

Conversation

@davidkyle (Member) commented Jun 9, 2023

If an inference request is inserted into the work queue after the queue has shut down, the request will never be processed and the caller will hang. When adding to the work queue there is a small window after the shutdown check in which the work item is added but the worker thread may already have stopped. Additionally, any unprocessed requests are now notified when the deployment is stopped.
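The check-then-act window described above, and the drain-and-notify fix, can be sketched with a minimal hypothetical queue (names like `InferenceWorkQueue` are stand-ins for illustration, not the Elasticsearch classes):

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the bug and fix: an item offered after the
// shutdown check can be stranded in the queue, so shutdown() must drain
// leftovers and notify their callers instead of letting them hang.
class InferenceWorkQueue {
    private final ConcurrentLinkedQueue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean isShutdown = new AtomicBoolean(false);
    final AtomicInteger notifiedOfRejection = new AtomicInteger();

    boolean submit(Runnable work) {
        if (isShutdown.get()) {
            return false; // fast path: reject immediately
        }
        // Window: shutdown() may run between the check above and this offer,
        // leaving `work` queued with no worker thread to process it.
        queue.offer(work);
        return true;
    }

    void shutdown() {
        isShutdown.set(true);
        // The fix: drain anything still queued (including items that slipped
        // through the window) so no request waits forever.
        Runnable leftover;
        while ((leftover = queue.poll()) != null) {
            notifiedOfRejection.incrementAndGet(); // stand-in for onFailure(...)
        }
    }
}
```

A real implementation would invoke the request's failure callback rather than count rejections; the counter just makes the drain observable.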

@davidkyle davidkyle added >bug :ml Machine learning auto-backport-and-merge Automatically create backport pull requests and merge when ready v8.9.0 v8.8.2 labels Jun 9, 2023
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Jun 9, 2023
@elasticsearchmachine (Collaborator)

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine (Collaborator)

Hi @davidkyle, I've created a changelog YAML for you.

@droberts195 (Contributor) left a comment

LGTM

I had a question, but feel free to merge without changing anything if it doesn't make sense.

@davidkyle (Member, Author)

@droberts195

I had a question, but feel free to merge without changing anything if it doesn't make sense.

What's the question?

docs/changelog/96738.yaml (outdated; resolved)
    String msg = "unable to process as " + processName + " worker service has shutdown";
    Exception ex = error.get();
    for (Runnable runnable : notExecuted) {
        if (runnable instanceof AbstractRunnable ar) {
            // notify the caller so the request does not hang (remainder elided)
            ...
        }
    }
@droberts195 (Contributor)
Do we expect every runnable to be an AbstractRunnable?

If so then there can be an else branch to assert that it's always the case.

If not, should we log something for other types of Runnable?

@davidkyle (Member, Author)

I would prefer a solution that uses typing to guarantee we have an AbstractRunnable, by making the class generic over <T extends AbstractRunnable> instead of <T extends Runnable>, but as far as I can tell that is not possible due to the inheritance hierarchy.

There are multiple uses of this class, and this code only runs when there is work left after shutdown. I think using Runnable is reasonable, and I want to keep this change as small as possible.
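The reviewer's else-branch suggestion above can be sketched with hypothetical stand-ins for the Elasticsearch types (`FailureAware` here plays the role of AbstractRunnable's failure callback; it is not the real API):

```java
import java.util.List;

// Hypothetical stand-in for a runnable that can be told its work failed.
interface FailureAware extends Runnable {
    void onFailure(Exception e);
}

class QueueDrainer {
    // Drain unexecuted work, notifying callers; assert if a plain Runnable
    // sneaks in, making the "every item is failure-aware" expectation explicit.
    static int drain(List<Runnable> notExecuted, Exception ex) {
        int notified = 0;
        for (Runnable runnable : notExecuted) {
            if (runnable instanceof FailureAware fa) {
                fa.onFailure(ex); // tell the caller the work was rejected
                notified++;
            } else {
                assert false : "expected a failure-aware runnable: " + runnable.getClass();
            }
        }
        return notified;
    }
}
```

The assert costs nothing in production (assertions are disabled) but catches the unexpected Runnable type in tests, which is the trade-off discussed above.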

@droberts195 (Contributor)

What's the question?

I think you force-pushed partway through my review, so it got lost. I added it again. That's one reason why merging latest main is better than force-pushing.

@davidkyle davidkyle merged commit 307425f into elastic:main Jun 9, 2023
11 of 12 checks passed
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Jun 9, 2023
HiDAl pushed a commit to HiDAl/elasticsearch that referenced this pull request Jun 14, 2023