Node appears to be in multiple clusters #247

Closed
marshalium opened this issue Aug 16, 2012 · 3 comments

@marshalium

In a production environment I ran into a situation where a Hazelcast node seemed to be in multiple clusters at once.

The cluster had split-brained, so there were three separate masters. When the new node started up, it connected to all three masters and appeared to join each one successfully. There were no errors or warnings in the logs. Over a period of several minutes, the node repeatedly logged that it was in each of the different clusters, and each of the three masters logged the node in its member list.
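To make the symptom concrete, here is a minimal sketch in plain Java (no Hazelcast dependency) of how the overlap could be spotted from the masters' member lists. `MultiMasterCheck` and `claimedByMultipleMasters` are hypothetical names used for illustration, not part of Hazelcast, and the input map is assumed to be collected by hand from each master's logged member list.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper for illustration; not part of Hazelcast. */
final class MultiMasterCheck {

    /**
     * Takes each master's member list (master address -> addresses it lists,
     * itself included) and returns every address that is listed by more than
     * one master, mapped to the masters that list it.
     */
    static Map<String, Set<String>> claimedByMultipleMasters(Map<String, List<String>> memberLists) {
        Map<String, Set<String>> claimedBy = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : memberLists.entrySet()) {
            for (String member : entry.getValue()) {
                claimedBy.computeIfAbsent(member, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        // Keep only addresses that two or more masters claim as a member.
        claimedBy.values().removeIf(masters -> masters.size() < 2);
        return claimedBy;
    }
}
```

Given three masters that each list themselves plus the same new node, the result holds a single entry for the new node with all three masters attached. When every address belongs to exactly one master's list, the result is empty.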

I'll attach a test case that consistently reproduces the situation. The test case causes a node to start up successfully and join three masters at the same time. All three masters then think that the new node is one of their members.

@marshalium
Author

import static org.junit.Assert.assertEquals;

import java.util.Arrays;

import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;
import org.junit.runner.RunWith;

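import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.Member;
import com.hazelcast.impl.GroupProperties;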

@RunWith(com.hazelcast.util.RandomBlockJUnit4ClassRunner.class)
public class JoinMultipleMasters {

    @BeforeClass
    @AfterClass
    public static void init() throws Exception {
        Hazelcast.shutdownAll();
    }

    /*
     * This test illustrates that Hazelcast can get into a state where a node
     * appears to be in more than one cluster.
     */
    @Test
    public void testMultiJoins() throws Exception {
        Config c1 = buildConfig();
        Config c2 = buildConfig();
        Config c3 = buildConfig();
        Config c4 = buildConfig();

        c1.getNetworkConfig().setPort(15701);
        c2.getNetworkConfig().setPort(15702);
        c3.getNetworkConfig().setPort(15703);
        c4.getNetworkConfig().setPort(15704);

        c1.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15701"));
        c2.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15702"));
        c3.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList("127.0.0.1:15703"));
        // h4 is configured with all four addresses, one list element per member.
        c4.getNetworkConfig().getJoin().getTcpIpConfig().setMembers(Arrays.asList(
                "127.0.0.1:15701", "127.0.0.1:15702", "127.0.0.1:15703", "127.0.0.1:15704"));

        HazelcastInstance h1 = Hazelcast.newHazelcastInstance(c1);
        HazelcastInstance h2 = Hazelcast.newHazelcastInstance(c2);
        HazelcastInstance h3 = Hazelcast.newHazelcastInstance(c3);

        // First three nodes are up. All should be in separate clusters.
        assertEquals(1, h1.getCluster().getMembers().size());
        assertEquals(1, h2.getCluster().getMembers().size());
        assertEquals(1, h3.getCluster().getMembers().size());

        HazelcastInstance h4 = Hazelcast.newHazelcastInstance(c4);

        // Fourth node is up. Should join one of the other three clusters.
        int numNodesWithTwoMembers = 0;
        if (h1.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }
        if (h2.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }
        if (h3.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }
        if (h4.getCluster().getMembers().size() == 2) {
            numNodesWithTwoMembers++;
        }

        Member h4Member = h4.getCluster().getLocalMember();

        int numNodesThatKnowAboutH4 = 0;
        if (h1.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }
        if (h2.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }
        if (h3.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }
        if (h4.getCluster().getMembers().contains(h4Member)) {
            numNodesThatKnowAboutH4++;
        }

        /*
         * At this point h4 should have joined a single node out of the other
         * three. There should be two clusters of one and one cluster of two. h4
         * should only be in one cluster.
         * 
         * This is not what is happening. Instead, h4 thinks it joined in a
         * cluster of two with one of the other three nodes. And each of the
         * other three nodes (h1, h2, and h3) thinks that h4 is joined with
         * them.
         */
        assertEquals(2, h4.getCluster().getMembers().size());
        assertEquals(2, numNodesWithTwoMembers);
        assertEquals(2, numNodesThatKnowAboutH4);
    }

    private static Config buildConfig() {
        Config c = new Config();
        c.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
        c.getNetworkConfig().getJoin().getTcpIpConfig().setEnabled(true);
        c.getNetworkConfig().setPortAutoIncrement(false);
        c.setProperty(GroupProperties.PROP_WAIT_SECONDS_BEFORE_JOIN, "0");
        return c;
    }
}
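
The invariant this test checks can also be stated as view symmetry: if node A lists B as a member, then B should list A. Below is a minimal sketch of that check over plain sets. `ViewSymmetryCheck` and `asymmetricPairs` are hypothetical names, not Hazelcast API, and the views are assumed to be built by the caller, for example from each instance's `getCluster().getMembers()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of Hazelcast. */
final class ViewSymmetryCheck {

    /**
     * views maps each node to the set of members it believes are in its
     * cluster (itself included). Returns one message per pair where a node
     * lists a member whose own view does not list that node back.
     */
    static List<String> asymmetricPairs(Map<String, Set<String>> views) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : views.entrySet()) {
            String node = entry.getKey();
            for (String member : entry.getValue()) {
                Set<String> memberView = views.get(member);
                // Members whose view is unknown are skipped rather than flagged.
                if (memberView != null && !memberView.contains(node)) {
                    problems.add(node + " lists " + member + ", but " + member + " does not list " + node);
                }
            }
        }
        return problems;
    }
}
```

With the views described in the test comment (h4 paired with one node, while all three of h1, h2, and h3 list h4), the check reports the two nodes that h4 does not list back. With two clusters of one and one cluster of two, it reports nothing.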

ghost assigned mdogan Aug 17, 2012
mdogan closed this as completed in d089c7b Aug 17, 2012
@mdogan
Contributor

mdogan commented Aug 17, 2012

Thanks for the findings and the test case.

@marshalium
Author

No problem. Thanks for getting a fix in so quickly.

SeriyBg pushed a commit to SeriyBg/hazelcast that referenced this issue Jul 9, 2021