Fix scheduling of ClusterInfoService#refresh #59880

Conversation

DaveCTurner (Contributor):


Today the `InternalClusterInfoService` uses the
`LocalNodeMasterListener` interface to start/stop its operations. Since
the `onMaster` and `offMaster` methods are called on the `MANAGEMENT`
threadpool, there's no guarantee that they run in the correct sequence,
which could result in an elected master failing to regularly update the
cluster info.
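
As an illustration of the hazard, here is a minimal, hypothetical sketch (not Elasticsearch code): two mastership notifications handed to a multi-threaded pool can execute in either order.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MasterNotificationRace {
    public static void main(String[] args) {
        ExecutorService management = Executors.newFixedThreadPool(2);
        // The applier thread observes: lost mastership, then regained it,
        // and submits one notification for each transition:
        management.execute(() -> System.out.println("offMaster: cancel the refresh job"));
        management.execute(() -> System.out.println("onMaster: schedule a new refresh job"));
        // With two workers nothing orders these tasks: if onMaster happens to
        // run first, the late offMaster cancels the job that was just
        // scheduled, and the newly elected master stops refreshing.
        management.shutdown();
    }
}
```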

Since this service is also a `ClusterStateListener` we may as well drop
the usage of the `LocalNodeMasterListener` interface and simply update
the status of the local node on the applier thread in `clusterChanged`
to ensure consistency.

Additionally, today the `InternalClusterInfoService` uses a simple flag
to track whether the local node is the elected master or not. If the
node stops being the master and then starts again within a few seconds
then the scheduled updates from the old mastership might carry on
running in addition to the ones for the new mastership.

This commit addresses that by tracking the identity of the scheduled
update job and creating a new job for each mastership.
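
The shape of that fix can be sketched in isolation (class and member names below are invented for illustration, not the PR's actual code): remember which job belongs to the current mastership, and have each scheduled run check that it is still the current job before refreshing and rescheduling.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class RefreshScheduler {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Identity of the job belonging to the current mastership; null when not master.
    private final AtomicReference<Runnable> currentJob = new AtomicReference<>();

    // Called from a single thread (the cluster applier thread in the PR), so
    // mastership transitions are observed in order.
    public void onMasterChanged(boolean isMaster) {
        if (isMaster && currentJob.get() == null) {
            Runnable job = new Runnable() {
                @Override
                public void run() {
                    // A job from a previous mastership sees a different (or
                    // null) reference here and silently drops out instead of
                    // running alongside the new mastership's job.
                    if (currentJob.get() != this) {
                        return;
                    }
                    refresh();
                    scheduler.schedule(this, 30, TimeUnit.SECONDS);
                }
            };
            currentJob.set(job);
            scheduler.execute(job);
        } else if (isMaster == false) {
            currentJob.set(null); // orphan any job from the old mastership
        }
    }

    private void refresh() {
        System.out.println("refreshing cluster info");
    }
}
```
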
DaveCTurner added the >bug, :Distributed/Allocation, v8.0.0 and v7.10.0 labels Jul 20, 2020
DaveCTurner requested a review from ywelsch July 20, 2020 13:48
elasticmachine (Collaborator):

Pinging @elastic/es-distributed (:Distributed/Allocation)

elasticmachine added the Team:Distributed label Jul 20, 2020
ywelsch (Contributor) left a comment:

It's annoying that this service is blocking a management thread (and there is no reason for it to do so). Anyway, removing that would be a larger change. Getting the data race fixed is a good step.

The inline comments below refer to this excerpt:

```java
public void clusterChanged(ClusterChangedEvent event) {
    if (event.localNodeMaster() && refreshAndRescheduleRunnable.get() == null) {
        logger.trace("elected as master, scheduling cluster info update tasks");
        executeRefresh(clusterService.state(), "became master");
```
ywelsch (Contributor):

Maybe just event.state() here?
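
That is, a sketch of what the suggestion amounts to against the excerpt above (the follow-up commit itself is not reproduced here):

```diff
-        executeRefresh(clusterService.state(), "became master");
+        executeRefresh(event.state(), "became master");
```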

DaveCTurner (Contributor, Author):

++ 6209182

DaveCTurner (Contributor, Author):

Yeah, the blocking nature of `refresh()` is indeed unnecessary, but fixing that is orthogonal to this change.

DaveCTurner merged commit fefb31b into elastic:master Jul 21, 2020
DaveCTurner deleted the 2020-07-20-cluster-info-service-scheduling-bug branch July 21, 2020 15:06
DaveCTurner added a commit that referenced this pull request Jul 21, 2020
jaymode added a commit to jaymode/elasticsearch that referenced this pull request Jul 21, 2020