Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ZenDiscovery fields via the cluster service update task. #7790

Conversation

Projects
None yet
5 participants
@martijnvg
Copy link
Member

commented Sep 18, 2014

On join, update the latestDiscoNodes, master flag and fault detection via a cluster state update task.

This should prevent rare concurrency issues, where during a join the cluster state master node can't be seen from the join thread.

Discovery: On join update the latestDiscoNodes, master flag and fault…
… detection via a cluster state update task
if (!masterNode.equals(currentState.nodes().masterNode())) {
logger.debug("Master node has switched on us, rejoining...");
return rejoin(currentState, "rejoin_due_to_master_switch");
}

This comment has been minimized.

Copy link
@kimchy

kimchy Sep 18, 2014

Member

I think the above test effectively has the same test as below, the currentState will by definition need to be updated to reflect the fact that the joining node will get the published state with it "in it" and the master node set. I think that we can remove the below check, yet still have the mentioned comment above, and mention that the master might get switched on us or that we haven't completed a full circle and we will retry again

This comment has been minimized.

Copy link
@martijnvg

martijnvg Sep 18, 2014

Author Member

I updated the pr, the check has been removed and changed the comment

@martijnvg martijnvg force-pushed the martijnvg:improvements/zen-update-via-clusterservice-update-task branch to 3e40fd3 Sep 18, 2014

// We need to be sure that the master in the new cluster state is the same
// as the one we picked before joining it, so we retry by doing a retry
logger.debug("Master node has switched on us, rejoining...");
return rejoin(currentState, "rejoin_due_to_master_switch");

This comment has been minimized.

Copy link
@bleskes

bleskes Sep 19, 2014

Member

This might fail because the currentJoinThread is not null. I wonder if we can keep this simple and have an atomic success boolean this task sets to true + a count down latch for the outer thread to wait on.

} else {
// TODO in theory, there is no need to even start the masterFD, since it will be started in
// handleNewClusterStateFromMaster
masterFD.start(masterNode, "initial_join");

This comment has been minimized.

Copy link
@bleskes

bleskes Sep 19, 2014

Member

+1 on removing this - it's not the job of this thread

clusterService.submitStateUpdateTask("rejoin_on_join_master", Priority.IMMEDIATE, new ClusterStateUpdateTask() {
@Override
public ClusterState execute(ClusterState currentState) throws Exception {
logger.debug("Rejoining rejoin failed", t1);

This comment has been minimized.

Copy link
@bleskes

bleskes Sep 19, 2014

Member

AtomicBoolean for success will make this simpler too.

@bleskes

This comment has been minimized.

Copy link
Member

commented Sep 19, 2014

I like the direction - left some comments

@martijnvg

This comment has been minimized.

Copy link
Member Author

commented Sep 26, 2014

Closed in favour for #7834

@martijnvg martijnvg closed this Sep 26, 2014

@clintongormley clintongormley changed the title Discovery: Update ZenDiscovery fields via the cluster service update task. Resiliency: Update ZenDiscovery fields via the cluster service update task. Sep 26, 2014

@jpountz jpountz removed the review label Oct 21, 2014

@martijnvg martijnvg deleted the martijnvg:improvements/zen-update-via-clusterservice-update-task branch May 18, 2015

@clintongormley clintongormley changed the title Resiliency: Update ZenDiscovery fields via the cluster service update task. Update ZenDiscovery fields via the cluster service update task. Jun 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.