Fix scheduling of ClusterInfoService#refresh #59880

Conversation

DaveCTurner (Contributor):


Today the `InternalClusterInfoService` uses the
`LocalNodeMasterListener` interface to start/stop its operations. Since
the `onMaster` and `offMaster` methods are called on the `MANAGEMENT`
threadpool, there's no guarantee that they run in the correct sequence,
which could result in an elected master failing to regularly update the
cluster info.
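
As an illustration of the hazard, here is a minimal, hypothetical sketch (not Elasticsearch code): two mastership notifications handed to a multi-threaded pool can execute in either order.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MasterNotificationRace {
    public static void main(String[] args) {
        ExecutorService management = Executors.newFixedThreadPool(2);
        // The applier thread observes: lost mastership, then regained it,
        // and submits one notification for each transition:
        management.execute(() -> System.out.println("offMaster: cancel the refresh job"));
        management.execute(() -> System.out.println("onMaster: schedule a new refresh job"));
        // With two workers nothing orders these tasks: if onMaster happens to
        // run first, the late offMaster cancels the job that was just
        // scheduled, and the newly elected master stops refreshing.
        management.shutdown();
    }
}
```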

Since this service is also a `ClusterStateListener` we may as well drop
the usage of the `LocalNodeMasterListener` interface and simply update
the status of the local node on the applier thread in `clusterChanged`
to ensure consistency.

Additionally, today the `InternalClusterInfoService` uses a simple flag
to track whether the local node is the elected master or not. If the
node stops being the master and then starts again within a few seconds
then the scheduled updates from the old mastership might carry on
running in addition to the ones for the new mastership.

This commit addresses that by tracking the identity of the scheduled
update job and creating a new job for each mastership.
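
The shape of that fix can be sketched in isolation (class and member names below are invented for illustration, not the PR's actual code): remember which job belongs to the current mastership, and have each scheduled run check that it is still the current job before refreshing and rescheduling.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class RefreshScheduler {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Identity of the job belonging to the current mastership; null when not master.
    private final AtomicReference<Runnable> currentJob = new AtomicReference<>();

    // Called from a single thread (the cluster applier thread in the PR), so
    // mastership transitions are observed in order.
    public void onMasterChanged(boolean isMaster) {
        if (isMaster && currentJob.get() == null) {
            Runnable job = new Runnable() {
                @Override
                public void run() {
                    // A job from a previous mastership sees a different (or
                    // null) reference here and silently drops out instead of
                    // running alongside the new mastership's job.
                    if (currentJob.get() != this) {
                        return;
                    }
                    refresh();
                    scheduler.schedule(this, 30, TimeUnit.SECONDS);
                }
            };
            currentJob.set(job);
            scheduler.execute(job);
        } else if (isMaster == false) {
            currentJob.set(null); // orphan any job from the old mastership
        }
    }

    private void refresh() {
        System.out.println("refreshing cluster info");
    }
}
```
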
DaveCTurner added the >bug, :Distributed/Allocation, v8.0.0 and v7.10.0 labels Jul 20, 2020
DaveCTurner requested a review from ywelsch July 20, 2020 13:48
elasticmachine (Collaborator):

Pinging @elastic/es-distributed (:Distributed/Allocation)

elasticmachine added the Team:Distributed label Jul 20, 2020
ywelsch (Contributor) left a comment:

It's annoying that this service is blocking a management thread (and there is no reason for it to do so). Anyway, removing that would be a larger change. Getting the data race fixed is a good step.

The inline comments below refer to this excerpt:

```java
public void clusterChanged(ClusterChangedEvent event) {
    if (event.localNodeMaster() && refreshAndRescheduleRunnable.get() == null) {
        logger.trace("elected as master, scheduling cluster info update tasks");
        executeRefresh(clusterService.state(), "became master");
```
ywelsch (Contributor):

Maybe just event.state() here?
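
That is, a sketch of what the suggestion amounts to against the excerpt above (the follow-up commit itself is not reproduced here):

```diff
-        executeRefresh(clusterService.state(), "became master");
+        executeRefresh(event.state(), "became master");
```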

DaveCTurner (Contributor, Author):

++ 6209182

DaveCTurner (Contributor, Author):

Yeah, the blocking nature of `refresh()` is indeed unnecessary, but fixing that is orthogonal to this change.

DaveCTurner merged commit fefb31b into elastic:master Jul 21, 2020
DaveCTurner deleted the 2020-07-20-cluster-info-service-scheduling-bug branch July 21, 2020 15:06
DaveCTurner added a commit that referenced this pull request Jul 21, 2020
jaymode added a commit to jaymode/elasticsearch that referenced this pull request Jul 21, 2020