Update ZenDiscovery fields via the cluster service update task. #7790
… detection via a cluster state update task
if (!masterNode.equals(currentState.nodes().masterNode())) {
    logger.debug("Master node has switched on us, rejoining...");
    return rejoin(currentState, "rejoin_due_to_master_switch");
}
I think the test above effectively duplicates the one below: by definition, currentState has to be updated so that the joining node receives the published state with itself in it and the master node set. I think we can remove the check below but keep the comment mentioned above, and note that the master might have switched on us, or that we haven't completed a full circle, in which case we retry.
I updated the PR: the check has been removed and the comment changed.
38a4d04 to 3e40fd3
// We need to be sure that the master in the new cluster state is the same
// as the one we picked before joining it, so we retry by doing a retry
logger.debug("Master node has switched on us, rejoining...");
return rejoin(currentState, "rejoin_due_to_master_switch");
This might fail because the currentJoinThread is not null. I wonder if we can keep this simple and have an atomic success boolean that this task sets to true, plus a count-down latch for the outer thread to wait on.
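A minimal sketch of that suggestion, not the code from this PR: a task running on a single update thread sets an `AtomicBoolean` on success and always counts down a `CountDownLatch`, while the outer thread waits on the latch. The class `JoinLatchSketch`, the method `joinAndWait`, the `masterStillMatches` flag and the 5-second timeout are all hypothetical stand-ins for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; none of these names come from ZenDiscovery.
class JoinLatchSketch {

    // Submits a stand-in "update task" to the given executor and blocks until
    // the task has run. Returns true only if the task reported success.
    static boolean joinAndWait(ExecutorService updateThread, final boolean masterStillMatches) {
        final AtomicBoolean success = new AtomicBoolean(false);
        final CountDownLatch done = new CountDownLatch(1);

        updateThread.execute(() -> {
            try {
                // Stand-in for the real check against the new cluster state.
                if (masterStillMatches) {
                    success.set(true);
                }
            } finally {
                // Always release the waiter, even if the check fails or throws.
                done.countDown();
            }
        });

        try {
            // Bounded wait (assumed timeout) so the outer thread cannot hang forever.
            boolean finished = done.await(5, TimeUnit.SECONDS);
            return finished && success.get();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and treat the join as failed.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService updateThread = Executors.newSingleThreadExecutor();
        try {
            System.out.println(joinAndWait(updateThread, true));
            System.out.println(joinAndWait(updateThread, false));
        } finally {
            updateThread.shutdown();
        }
    }
}
```

With this shape the outer thread only needs the boolean result to decide whether to retry the join; it does not have to inspect any thread reference from inside the task.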
I like the direction - left some comments
Closed in favour of #7834
On join, update the latestDiscoNodes, master flag and fault detection via a cluster state update task.
This should prevent rare concurrency issues, where during a join the cluster state master node can't be seen from the join thread.
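The idea of routing every field update through one update task can be illustrated with a small, self-contained sketch. This is not Elasticsearch code: `SerializedStateSketch`, its `State` snapshot and `applyAndWait` are hypothetical names, and a plain single-threaded executor stands in for the cluster service. The point it tries to show is that when all writes run on one thread and publish an immutable snapshot, a reader never observes the master changed without the matching node set.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical sketch; a single-threaded executor stands in for the cluster service.
class SerializedStateSketch {

    // Immutable snapshot: master and node count are replaced together.
    static final class State {
        final String master;
        final int nodeCount;

        State(String master, int nodeCount) {
            this.master = master;
            this.nodeCount = nodeCount;
        }
    }

    private final ExecutorService updateThread = Executors.newSingleThreadExecutor();
    private volatile State current = new State(null, 0);

    // All writes are queued on the single update thread, so tasks never interleave.
    Future<State> submitUpdate(final UnaryOperator<State> task) {
        return updateThread.submit(() -> {
            State next = task.apply(current);
            current = next; // publish the whole snapshot in one volatile write
            return next;
        });
    }

    // Convenience wrapper: submit a task and block until it has been applied.
    State applyAndWait(UnaryOperator<State> task) {
        try {
            return submitUpdate(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for update", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("update task failed", e.getCause());
        }
    }

    // Readers on any thread see the latest fully applied snapshot.
    State snapshot() {
        return current;
    }

    void shutdown() {
        updateThread.shutdown();
    }
}
```

A join in this sketch would be one call such as `applyAndWait(old -> new State("node-1", old.nodeCount + 1))`, after which `snapshot()` reflects both fields together.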