Node appears to be in multiple clusters #247
import static org.junit.Assert.assertEquals;

import java.util.Arrays;

import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
import org.junit.runner.RunWith;

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Member;
import com.hazelcast.impl.GroupProperties;

@RunWith(com.hazelcast.util.RandomBlockJUnit4ClassRunner.class)
public class JoinMultipleMasters {

    // Runs both before and after the test class so no instances leak in or out.
    @BeforeClass
    @AfterClass
    public static void init() throws Exception {
        Hazelcast.shutdownAll();
    }

    /*
     * This test illustrates that Hazelcast can get into a state where a node
     * appears to be in more than one cluster.
     */
    @Test
    public void testMultiJoins() throws Exception {
        Config c1 = buildConfig();
        Config c2 = buildConfig();
        Config c3 = buildConfig();
        Config c4 = buildConfig();

        c1.getNetworkConfig().setPort(15701);
        c2.getNetworkConfig().setPort(15702);
        c3.getNetworkConfig().setPort(15703);
        c4.getNetworkConfig().setPort(15704);

        // Nodes 1-3 list only themselves, so each becomes its own master.
        // Node 4 lists all four addresses.
        c1.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15701"));
        c2.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15702"));
        c3.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15703"));
        c4.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15701, 127.0.0.1:15702, 127.0.0.1:15703, 127.0.0.1:15704"));

        HazelcastInstance h1 = Hazelcast.newHazelcastInstance(c1);
        HazelcastInstance h2 = Hazelcast.newHazelcastInstance(c2);
        HazelcastInstance h3 = Hazelcast.newHazelcastInstance(c3);

        // First three nodes are up. All should be in separate clusters.
        assertEquals(1, h1.getCluster().getMembers().size());
        assertEquals(1, h2.getCluster().getMembers().size());
        assertEquals(1, h3.getCluster().getMembers().size());

        HazelcastInstance h4 = Hazelcast.newHazelcastInstance(c4);

        // Fourth node is up. Should join one of the other three clusters.
        // Count the nodes that report a member list of size two.
        int numNodesWithTwoMembers = 0;
        if (h1.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }
        if (h2.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }
        if (h3.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }
        if (h4.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }

        // Count the nodes (including h4 itself) that list h4 as a member.
        Member h4Member = h4.getCluster().getLocalMember();
        int numNodesThatKnowAboutH4 = 0;
        if (h1.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }
        if (h2.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }
        if (h3.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }
        if (h4.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }

        /*
         * At this point h4 should have joined a single node out of the other
         * three. There should be two clusters of one and one cluster of two. h4
         * should only be in one cluster.
         *
         * This is not what is happening. Instead, h4 thinks it joined in a
         * cluster of two with one of the other three nodes. And each of the
         * other three nodes (h1, h2, and h3) thinks that h4 is joined with
         * them.
         */
        assertEquals(2, h4.getCluster().getMembers().size());
        assertEquals(2, numNodesWithTwoMembers);
        assertEquals(2, numNodesThatKnowAboutH4);
    }

    private static Config buildConfig() {
        Config c = new Config();
        c.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
        c.getNetworkConfig().getJoin().getTcpIpConfig().setEnabled(true);
        c.getNetworkConfig().setPortAutoIncrement(false);
        c.setProperty(GroupProperties.PROP_WAIT_SECONDS_BEFORE_JOIN, "0");
        return c;
    }
}
Thanks for the findings and the test case.
No problem. Thanks for getting a fix in so quickly.
In a production environment I ran into a situation where a Hazelcast node seemed to be in multiple clusters at once.
The cluster was split-brained, so there were 3 different masters. When the new node started up, it connected to all 3 masters and appeared to successfully join each one. There were no errors or warnings in the logs. Over a period of several minutes, the node repeatedly logged that it was in each of the different clusters. Each of the 3 masters logged that the node was in its member list.
I'll attach a test case that consistently reproduces the situation. The test case causes a node to start up successfully and join three masters at the same time. All three masters then think that the new node is one of their members.
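For anyone who wants to reason about the failure without starting Hazelcast, here is a rough sketch of the invariant the test case is checking. It uses plain Java collections only. The class name `MembershipViews`, the method `nodesInMultipleClusters`, and the node names `h1` to `h4` are made up for illustration and are not part of Hazelcast. The assumption is that each node's member list is modelled as a set of names, and a node is flagged when some other node lists it as a member while its own member list differs from that node's list.

```java
import java.util.*;

// Illustrative model only: not Hazelcast code, all names are hypothetical.
class MembershipViews {

    /**
     * Returns the nodes that appear in another node's member view
     * while holding a different member view themselves.
     *
     * @param views maps each node name to the member names it currently reports
     */
    static Set<String> nodesInMultipleClusters(Map<String, Set<String>> views) {
        Set<String> offenders = new TreeSet<>();
        for (Set<String> observerView : views.values()) {
            for (String member : observerView) {
                Set<String> memberView = views.get(member);
                // A member is consistent only if it reports the same view
                // as every node that lists it.
                if (memberView != null && !memberView.equals(observerView)) {
                    offenders.add(member);
                }
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        // The state described in this issue: every master lists h4,
        // but h4 itself only reports a two-node cluster with h1.
        Map<String, Set<String>> views = new HashMap<>();
        views.put("h1", new HashSet<>(Arrays.asList("h1", "h4")));
        views.put("h2", new HashSet<>(Arrays.asList("h2", "h4")));
        views.put("h3", new HashSet<>(Arrays.asList("h3", "h4")));
        views.put("h4", new HashSet<>(Arrays.asList("h1", "h4")));
        System.out.println(nodesInMultipleClusters(views)); // prints [h4]
    }
}
```

In the expected state, where only one master lists h4 and the other two report clusters of one, the same check returns an empty set.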