[ML] Ensure inference queue is cleared after shutdown #96738
Conversation
Pinging @elastic/ml-core (Team:ML)
Hi @davidkyle, I've created a changelog YAML for you.
Force-pushed from 2472471 to fa8c604
LGTM
I had a question, but feel free to merge without changing anything if it doesn't make sense.
What's the question?
String msg = "unable to process as " + processName + " worker service has shutdown";
Exception ex = error.get();
for (Runnable runnable : notExecuted) {
    if (runnable instanceof AbstractRunnable ar) {
Do we expect every `runnable` to be an `AbstractRunnable`? If so, then there can be an `else` branch to `assert` that it's always the case. If not, should we log something for other types of `Runnable`?
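The reviewer's first suggestion could be sketched as below, using a stand-in for Elasticsearch's `AbstractRunnable` (class and method names here are illustrative; the actual merged code differs):

```java
import java.util.List;

public class DrainSketch {
    // Stand-in for Elasticsearch's AbstractRunnable: a Runnable with a failure hook.
    abstract static class AbstractRunnable implements Runnable {
        public abstract void onFailure(Exception e);
        @Override public void run() {}
    }

    static int notified = 0;

    // Drain leftover work after shutdown, failing each task. The else branch
    // makes the "everything here is an AbstractRunnable" assumption explicit
    // instead of silently dropping plain Runnables.
    static void failAll(List<Runnable> notExecuted, String msg) {
        for (Runnable runnable : notExecuted) {
            if (runnable instanceof AbstractRunnable ar) {
                ar.onFailure(new IllegalStateException(msg));
                notified++;
            } else {
                // Trips in tests (-ea) if an unexpected task type ever lands here.
                assert false : "unexpected task type " + runnable.getClass().getName();
            }
        }
    }
}
```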
I would prefer a solution that uses typing to guarantee we have `AbstractRunnable`, by making the class generic with the bound `<T extends AbstractRunnable>` instead of `<T extends Runnable>`, but as far as I can tell that is not possible due to the inheritance hierarchy.

There are multiple uses of this class, and this code only runs when there is work left after the shutdown. I think using `Runnable` is reasonable, and I want to keep this change as small as possible.
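The generic-bound alternative discussed above, which the PR did not adopt, would look roughly like this hypothetical queue (all names here are invented for illustration): the type parameter guarantees every task has a failure hook, so no `instanceof` check is needed when draining.

```java
import java.util.ArrayDeque;

// Hypothetical typed work queue; not the actual Elasticsearch executor.
public class TypedQueueSketch<T extends TypedQueueSketch.FailableTask> {
    // A task that can be told it will never run.
    public interface FailableTask extends Runnable {
        void onFailure(Exception e);
    }

    private final ArrayDeque<T> queue = new ArrayDeque<>();

    public void submit(T task) {
        queue.add(task);
    }

    // Every drained task statically has onFailure: no runtime type check.
    public int failAll(Exception cause) {
        int n = 0;
        for (T task : queue) {
            task.onFailure(cause);
            n++;
        }
        queue.clear();
        return n;
    }
}
```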
I think you force-pushed partway through my review, so it got lost. I added it again. It's a reason why merging latest
If an inference request is inserted into the work queue after the queue has shut down, the request will never be processed, causing it to hang. When adding to the work queue there is a small window after the `isShutdown` check where the work item is added but the worker thread may have stopped. In addition, any unprocessed requests are notified when the deployment is stopped.
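The race described above can be sketched as follows. The class and its fields are illustrative, not the actual Elasticsearch executor: re-checking the shutdown flag after enqueueing closes the window in which the worker thread exits between the caller's check and its `offer`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownRaceSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final AtomicBoolean running = new AtomicBoolean(true);

    // Returns false if the task was rejected; the caller must then fail the
    // request instead of leaving it to hang forever.
    boolean execute(Runnable task) {
        if (running.get() == false) {
            return false; // fast-path reject: already shut down
        }
        queue.offer(task);
        // Window: the worker may have observed shutdown and exited between the
        // check above and the offer. Re-check and drain so nothing is stranded.
        if (running.get() == false) {
            List<Runnable> notExecuted = new ArrayList<>();
            queue.drainTo(notExecuted);
            // Here each drained request would be notified of the failure.
            return notExecuted.contains(task) == false;
        }
        return true;
    }

    void shutdown() {
        running.set(false);
    }
}
```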