Node receiving a cluster state with a wrong master node should reject and throw an error #9963

Merged

martijnvg
Member

Previously, a cluster state received from the wrong master was silently ignored, and the sender only found out once the publish cluster state timeout kicked in. A stale master node would therefore just wait for the inevitable and waste valuable time.

This issue was discovered by the DiscoveryWithServiceDisruptionsTests#testStaleMasterNotHijackingMajority test.

Note: this issue doesn't occur in any released version (only on the 1.x and master branches).
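
The difference between ignoring and rejecting can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `SimpleClusterState`, `isFromWrongMaster`, `applyIgnoring` and `applyRejecting` are hypothetical names, the master comparison is an assumed simplification of the real check, and the exception message is only modeled on the wording suggested in review.

```java
import java.util.Objects;

final class WrongMasterCheckSketch {

    // True when the local node follows a master and the incoming state names a different one.
    // Assumption: comparing master node ids is enough for this illustration.
    static boolean isFromWrongMaster(SimpleClusterState current, SimpleClusterState incoming) {
        return current.masterNodeId != null
                && !Objects.equals(current.masterNodeId, incoming.masterNodeId);
    }

    // Old behavior (sketch): keep the current state silently; the sender gets no signal.
    static SimpleClusterState applyIgnoring(SimpleClusterState current, SimpleClusterState incoming) {
        return isFromWrongMaster(current, incoming) ? current : incoming;
    }

    // New behavior (sketch): fail fast so the problem surfaces immediately.
    static SimpleClusterState applyRejecting(SimpleClusterState current, SimpleClusterState incoming) {
        if (isFromWrongMaster(current, incoming)) {
            throw new IllegalStateException(
                    "cluster state from a different master than the current one, rejecting (received "
                            + incoming.masterNodeId + ", current " + current.masterNodeId + ")");
        }
        return incoming;
    }

    public static void main(String[] args) {
        SimpleClusterState current = new SimpleClusterState("node_A");
        SimpleClusterState fromStaleMaster = new SimpleClusterState("node_B");

        // Ignoring returns the unchanged current state.
        System.out.println(applyIgnoring(current, fromStaleMaster) == current);

        // Rejecting raises an exception that the caller can report.
        try {
            applyRejecting(current, fromStaleMaster);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}

// Hypothetical stand-in for a cluster state: only the master node id matters here.
final class SimpleClusterState {
    final String masterNodeId; // may be null when no master is known

    SimpleClusterState(String masterNodeId) {
        this.masterNodeId = masterNodeId;
    }
}
```

With the rejecting variant, the caller gets an exception at once instead of having to wait for a timeout, which is the time saving the description refers to.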

@martijnvg martijnvg added the >bug, v2.0.0-beta1, v1.5.0, :Distributed/Discovery-Plugins, and review labels Mar 3, 2015
@@ -766,7 +761,10 @@ public ClusterState execute(ClusterState currentState) {
if (updatedState == null) {
updatedState = currentState;
}
if (shouldIgnoreNewClusterState(logger, currentState, updatedState)) {
if (checkWrongMaster(logger, currentState, updatedState)) {
throw new ElasticsearchIllegalStateException("not following master " + updatedState.nodes().masterNode() + " but following master " + currentState.nodes().masterNode());
Contributor

Can we use a message similar to the log below: "cluster state from a different master then the current one, rejecting (received {}, current {})"?

@bleskes
Contributor

bleskes commented Mar 3, 2015

Left some small comments

@martijnvg
Member Author

@bleskes I've updated the PR.

@bleskes
Contributor

bleskes commented Mar 5, 2015

LGTM. Thx @martijnvg

@martijnvg martijnvg force-pushed the zen/reject_cs_with_wrong_master branch 2 times, most recently from e27e5c5 to 6ef67f3 Compare March 5, 2015 22:33
…eject and throw an error.

Previously it was ignored and the publish cluster state timeout would kick in. In that case a stale master node would just wait for the inevitable and waste valuable time.
This issue was discovered by the DiscoveryWithServiceDisruptionsTests#testStaleMasterNotHijackingMajority test.

Also, only perform the cluster state version check and the wrong master node check inside the cluster state update task.
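
One plausible reading of that last point is that both checks compare the incoming state against the current state, so they are only reliable where the current state cannot change underneath them. The sketch below illustrates that idea with a single-threaded executor standing in for the update task queue. It is a hypothetical illustration, not the actual implementation: `UpdateTaskCheckSketch`, `submitIncoming` and the exact version rule (ignore anything older than the current version) are assumptions.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class UpdateTaskCheckSketch {
    // Only ever read or written on the single update thread after construction.
    private String currentMaster;
    private long currentVersion;

    // Stand-in for a serialized queue of state update tasks.
    private final ExecutorService updateThread = Executors.newSingleThreadExecutor();

    UpdateTaskCheckSketch(String master, long version) {
        this.currentMaster = master;
        this.currentVersion = version;
    }

    // Queues an incoming state; the future yields the version in effect after the task ran.
    Future<Long> submitIncoming(String incomingMaster, long incomingVersion) {
        return updateThread.submit(() -> {
            // Both checks run inside the task, against the state as it is when the task executes.
            if (currentMaster != null && !currentMaster.equals(incomingMaster)) {
                throw new IllegalStateException("rejecting state from " + incomingMaster
                        + ", current master is " + currentMaster);
            }
            if (incomingVersion < currentVersion) {
                return currentVersion; // assumed rule: older states are ignored
            }
            currentMaster = incomingMaster;
            currentVersion = incomingVersion;
            return currentVersion;
        });
    }

    void shutdown() {
        updateThread.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        UpdateTaskCheckSketch sketch = new UpdateTaskCheckSketch("node_A", 5);
        try {
            System.out.println(sketch.submitIncoming("node_A", 6).get());
            sketch.submitIncoming("node_B", 7).get();
        } catch (ExecutionException e) {
            // The rejection thrown inside the task arrives here as the cause.
            System.out.println(e.getCause().getMessage());
        } finally {
            sketch.shutdown();
        }
    }
}
```

Running the checks before queueing the task would compare against a state that may already be outdated by the time the task executes, which is why this sketch keeps them inside the task.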
@martijnvg martijnvg force-pushed the zen/reject_cs_with_wrong_master branch from 6ef67f3 to 0c254e9 Compare March 6, 2015 07:47
@martijnvg martijnvg merged commit 0c254e9 into elastic:master Mar 6, 2015
@martijnvg martijnvg deleted the zen/reject_cs_with_wrong_master branch May 18, 2015 23:26
@clintongormley clintongormley changed the title Zen: Node receiving a cluster state with a wrong master node should reject and throw an error Node receiving a cluster state with a wrong master node should reject and throw an error Jun 7, 2015