Machine learning: avoid thread pool deadlocks #109134
Comments
Pinging @elastic/ml-core (Team:ML)
InferenceRunner does not appear to wait for itself; it only appears to fork to another thread in TrainedModelProvider, which I am assuming is the transport thread? Same for TrainedModel... in DeploymentManager. This doesn't appear to be an immediate risk, but we can still remove the use of PlainActionFuture.
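For a sense of what that removal could look like, here is a minimal sketch of swapping a PlainActionFuture block-and-wait for a plain ActionListener. The `getTrainedModel` signature, `doInference`, and `handleFailure` are illustrative placeholders, not the actual TrainedModelProvider API:

```java
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.support.PlainActionFuture;

// Before: park the current thread until the provider responds. If the
// response is delivered on the same executor this code runs on, all of
// that executor's workers can end up parked here and the pool deadlocks.
PlainActionFuture<TrainedModelConfig> future = new PlainActionFuture<>();
trainedModelProvider.getTrainedModel(modelId, future); // hypothetical signature
TrainedModelConfig config = future.actionGet();
doInference(config);

// After: hand the provider a listener instead of waiting, so no thread
// is parked while the model is being fetched.
trainedModelProvider.getTrainedModel(modelId, ActionListener.wrap(
    config -> doInference(config),
    e -> handleFailure(e)
));
```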
You may be right about it having a similar underlying cause, but I do have some details for
A future change will refactor how InferenceRunner's `run` method functions, and I want to reuse the existing unit tests to verify the behavior does not change. Relates to #109134
InferenceStep now passes a listener to InferenceRunner. InferenceRunner chains the listener in the `run` method so that the nested model loading and inference happen asynchronously without blocking threads. Relates to #109134. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
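A hedged sketch of what that chaining could look like; the method and field names here are assumptions for illustration, not the actual InferenceRunner code:

```java
import org.elasticsearch.action.ActionListener;

// The caller's listener is threaded through both asynchronous steps, so
// neither model loading nor inference ever parks a thread waiting on the
// other. Any failure is propagated straight to the original listener.
public void run(String modelId, ActionListener<Void> listener) {
    modelLoader.loadModel(modelId, ActionListener.wrap(    // hypothetical loader
        localModel -> inferAgainstAllDocs(localModel, listener), // second async step
        listener::onFailure
    ));
}
```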
The model loading scheduled thread iterates through the model queue and deploys each model. Rather than block and wait on each deployment, the thread attaches a listener that either iterates to the next model (if one is in the queue) or reschedules the thread. This change should not impact:
1. The iterative nature of the model deployment process: each model is still deployed one at a time, and no additional threads are consumed per model.
2. The 1s delay between model deployment retries: if a deployment fails but can be retried, the retry is added to the next batch of models that are consumed after the 1s scheduled delay.
Relates to #109134
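A simplified sketch of that loop, assuming hypothetical `deploy`, `isRetryable`, and queue fields, and using the Executor-based `threadPool.schedule` signature; the real code tracks considerably more state:

```java
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.core.TimeValue;

// Deploy the head of the queue and let the completion listener decide what
// happens next: continue with the next queued model, or hand control back
// to the scheduler. No thread ever blocks on an individual deployment.
private void loadQueuedModels() {
    ModelToLoad next = loadingQueue.poll();
    if (next == null) {
        // Batch drained; run again after the existing 1s delay.
        threadPool.schedule(this::loadQueuedModels, TimeValue.timeValueSeconds(1), executor);
        return;
    }
    deploy(next, ActionListener.wrap(
        ignored -> loadQueuedModels(), // still strictly one model at a time
        e -> {
            if (isRetryable(e)) {
                loadingQueue.add(next); // retried with the next 1s batch
                threadPool.schedule(this::loadQueuedModels, TimeValue.timeValueSeconds(1), executor);
            } else {
                loadQueuedModels();     // skip this model and continue
            }
        }
    ));
}
```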
Elasticsearch Version
8.15
Installed Plugins
No response
Java Version
bundled
OS Version
Linux
Problem Description
In #108934 we added assertions to ensure we do not complete a future on the same executor that waits for it, since this can lead to deadlocks. Two ML usages were identified that need to be fixed: the blocking wait in InferenceRunner's `run` method, and the model-loading thread that blocks on each model deployment (see the commits referenced above).
Ideally those would be converted to asynchronous waits instead.
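To make the failure mode concrete, here is a minimal, self-contained JDK sketch (deliberately not Elasticsearch code) of waiting on a future that can only be completed by the same, already-saturated executor:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// A single-thread pool stands in for a saturated fixed-size executor: the
// outer task waits on a future that can only be completed by a task queued
// behind it on the same pool, so the wait can never finish.
public class SameExecutorDeadlock {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> outer = pool.submit(() -> {
            Future<String> inner = pool.submit(() -> "done"); // queued behind us
            // Times out instead of hanging forever: the pool's only thread is
            // busy right here, so `inner` can never start, let alone complete.
            return inner.get(5, TimeUnit.SECONDS);
        });
        try {
            outer.get(); // fails: ExecutionException caused by the timeout
        } finally {
            pool.shutdownNow();
        }
    }
}
```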
Steps to Reproduce
NA
Logs (if relevant)
No response