[TEST] Ignore zen2 discovery task in waitForPendingTasks #36381
Pinging @elastic/ml-core
Could the same error occur in the tests that extend `MlNativeAutodetectIntegTestCase`?
Maybe, although that's a much bigger change to fix, so I'd rather do it as a separate PR if it's required. The change in this PR fixes a problem that has caused a few CI builds to fail, so it would be good to get it merged ASAP. It's not completely clear whether the discovery task is supposed to run for 10+ minutes. If it is, then it calls into question the functionality of the list tasks request's
We're working on a fix so that the discovery task completes as soon as a cluster is formed. As this fix will require some non-trivial changes, it might not be merged today. So it would be good to proceed with this quick fix, which can then be reverted once we have the proper fix.
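As a rough illustration of the quick fix discussed above, the sketch below filters a list of running task action names so that a long-running discovery task does not keep a test waiting. It is not the code from this PR: the class name, the method name, and the action name string are all hypothetical placeholders, and the prefix match is an assumption about how such task names might be suffixed.

```java
import java.util.List;
import java.util.stream.Collectors;

final class PendingTaskFilter {
    // Hypothetical placeholder: the real discovery action name is not given in this thread.
    static final String DISCOVERY_ACTION = "example:discovery/get_discovered_nodes";

    /** Returns the action names of the tasks a test should still wait for. */
    static List<String> tasksToWaitFor(List<String> runningActions) {
        return runningActions.stream()
            // Prefix match, so suffixed variants of the discovery action are ignored too (assumption).
            .filter(action -> !action.startsWith(DISCOVERY_ACTION))
            .collect(Collectors.toList());
    }
}
```

A test helper built this way would treat the cluster as idle once the filtered list is empty, even if the discovery task is still running.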
Today the `GetDiscoveredNodesAction` waits, possibly indefinitely, to discover enough nodes to bootstrap the cluster. However, it is possible that the cluster forms before a node has discovered the expected collection of nodes, in which case the action will wait indefinitely even though it is no longer required. This commit changes the behaviour so that the action fails once a node receives a cluster state with a nonempty configuration, which indicates that the cluster has been successfully bootstrapped and therefore the `GetDiscoveredNodesAction` need wait no longer. Relates elastic#36380 and elastic#36381; reverts 558f4ec.
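The behaviour described in the commit message can be sketched with a plain `CompletableFuture`. This is only a minimal model under stated assumptions, not the Elasticsearch implementation: `DiscoveryWaiter`, its methods, and the representation of a cluster state configuration as a set of node IDs are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical model of a "wait for enough discovered nodes" action.
final class DiscoveryWaiter {
    private final int requiredNodes;
    private final Set<String> discovered = new HashSet<>();
    private final CompletableFuture<Set<String>> result = new CompletableFuture<>();

    DiscoveryWaiter(int requiredNodes) {
        this.requiredNodes = requiredNodes;
    }

    /** Succeeds once enough distinct nodes have been discovered. */
    synchronized void onNodeDiscovered(String nodeId) {
        discovered.add(nodeId);
        if (discovered.size() >= requiredNodes) {
            result.complete(new HashSet<>(discovered));
        }
    }

    /** Fails the wait as soon as a cluster state with a nonempty configuration arrives. */
    synchronized void onClusterState(Set<String> configuration) {
        if (!configuration.isEmpty()) {
            // The cluster is already bootstrapped, so there is nothing left to wait for.
            result.completeExceptionally(new IllegalStateException("cluster already bootstrapped"));
        }
    }

    CompletableFuture<Set<String>> result() {
        return result;
    }
}
```

Because a `CompletableFuture` can only be completed once, whichever event happens first (enough nodes discovered, or a bootstrapped cluster state received) decides the outcome, and the wait never outlives cluster formation.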
Fixes #36380