Datafeeds that have an end time auto-close their job when they stop. This happens both when they stop because they reach their end time and, usually, when they are stopped by an API call.
However, there is an inconsistency in the "stopped by an API call" case. If a datafeed is stopped by an API call while it is not assigned to a node (for example, because the node it was running on has left the cluster and the datafeed has not yet been reassigned), stopping it simply cancels its persistent task, so the associated job remains open.
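For illustration, a minimal sketch of how the inconsistency can be observed, assuming a local unsecured cluster and the standard ML REST endpoints; the datafeed and job names are hypothetical:

```python
import requests

ES = "http://localhost:9200"  # assumption: local cluster, no auth

# Stop a datafeed that currently has no node assignment
# (e.g. its node has left the cluster and it awaits reassignment).
requests.post(f"{ES}/_ml/datafeeds/my-datafeed/_stop")

# Check the state of the associated job. Because the datafeed was
# unassigned when stopped, the stop only cancelled the persistent
# task, so the job state is still "opened" rather than "closed".
stats = requests.get(f"{ES}/_ml/anomaly_detectors/my-job/_stats").json()
print(stats["jobs"][0]["state"])  # prints "opened", not "closed"
```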
This sort of inconsistency is more evidence that we should move towards a world where the job and datafeed are one single thing.
The problem described in this issue will be a rare occurrence, and it is not particularly hard to recover from manually if it is noticed. The real issue is that manual intervention is required: in situations where nobody is watching the state of the ML jobs, the job could unnecessarily remain open for a very long time, wasting resources.
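Until the behaviour is made consistent, the manual recovery is simply to close the job explicitly once it is noticed to be still open (again, the job name is hypothetical):

```python
import requests

ES = "http://localhost:9200"  # assumption: local cluster, no auth

# Manual recovery: explicitly close the job that the stopped
# datafeed failed to auto-close.
requests.post(f"{ES}/_ml/anomaly_detectors/my-job/_close")
```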