[ML] Stopping a datafeed while it switches nodes does not auto-close the associated job #74354

Open
droberts195 opened this issue Jun 21, 2021 · 1 comment
Labels
>bug :ml Machine learning Team:ML Meta label for the ML team

Comments

@droberts195
Contributor

Datafeeds that have an end time auto-close their job when they stop. This happens when they stop because they have reached their end time, and usually also when they are stopped by an API call.

However, there is an inconsistency in the "stopped by an API call" part. If the datafeed is stopped by an API call while it is not assigned to a node (for example, when the node it was originally running on has left the cluster and it hasn't yet been reassigned), stopping the datafeed simply cancels its persistent task, so the associated job remains open.
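To make the two code paths concrete, here is a minimal toy model of the behaviour described above. It is not the Elasticsearch implementation: the `Job` and `Datafeed` classes, their fields, and `stop_datafeed` are all hypothetical names invented for illustration, and the model ignores every detail other than whether the datafeed is assigned to a node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    """Hypothetical stand-in for an anomaly detection job."""
    job_id: str
    state: str = "opened"


@dataclass
class Datafeed:
    """Hypothetical stand-in for a datafeed and its persistent task."""
    datafeed_id: str
    job: Job
    has_end_time: bool
    assigned_node: Optional[str]  # None models "not yet reassigned"
    state: str = "started"


def stop_datafeed(datafeed: Datafeed) -> None:
    """Model a stop request arriving via the API (illustrative only)."""
    if datafeed.assigned_node is not None:
        # Assigned path: the node running the datafeed handles the stop,
        # and a datafeed with an end time auto-closes its job.
        datafeed.state = "stopped"
        if datafeed.has_end_time:
            datafeed.job.state = "closed"
    else:
        # Unassigned path: only the persistent task is cancelled, so
        # nothing ever closes the job.
        datafeed.state = "stopped"


assigned = Datafeed("feed-a", Job("job-a"), has_end_time=True, assigned_node="node-1")
unassigned = Datafeed("feed-b", Job("job-b"), has_end_time=True, assigned_node=None)
stop_datafeed(assigned)
stop_datafeed(unassigned)
```

In this sketch both datafeeds end up stopped, but only `job-a` is closed; `job-b` stays opened, which is the inconsistency this issue describes.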

This sort of inconsistency is more evidence that we should move towards a world where the job and datafeed are one single thing.

The problem described in this issue will be a rare occurrence, and is not particularly hard to recover from manually if it is noticed. However, manual intervention is required, so in situations where nobody is looking at the state of the ML jobs, the job could unnecessarily remain open for a very long time, wasting resources.
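As a rough illustration of what manual recovery could look like, the sketch below picks out jobs that are still opened although their datafeed is stopped, and builds the path of the close job API for each one. The helper names and the plain-dictionary inputs are assumptions made for this example; gathering the states from a real cluster is left out. A job in this condition is only a candidate for review, since a job can legitimately be open with a stopped datafeed.

```python
from typing import Dict, List


def jobs_left_open(
    job_states: Dict[str, str],
    datafeed_states: Dict[str, str],
    datafeed_to_job: Dict[str, str],
) -> List[str]:
    """Return IDs of jobs that are opened while their datafeed is stopped.

    All three mappings are hypothetical inputs for this sketch:
    job ID -> state, datafeed ID -> state, datafeed ID -> job ID.
    """
    return sorted(
        job_id
        for datafeed_id, job_id in datafeed_to_job.items()
        if datafeed_states.get(datafeed_id) == "stopped"
        and job_states.get(job_id) == "opened"
    )


def close_job_path(job_id: str) -> str:
    """Build the path for a POST request to the close job API."""
    return f"_ml/anomaly_detectors/{job_id}/_close"


candidates = jobs_left_open(
    job_states={"job-a": "closed", "job-b": "opened", "job-c": "opened"},
    datafeed_states={"feed-a": "stopped", "feed-b": "stopped", "feed-c": "started"},
    datafeed_to_job={"feed-a": "job-a", "feed-b": "job-b", "feed-c": "job-c"},
)
paths = [close_job_path(job_id) for job_id in candidates]
```

Here only `job-b` is flagged: `job-a` is already closed and `job-c` still has a running datafeed.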

@droberts195 droberts195 added >bug :ml Machine learning labels Jun 21, 2021
@elasticmachine elasticmachine added the Team:ML Meta label for the ML team label Jun 21, 2021
@elasticmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)
