Two members of a cluster each become master and ignore each other! #7016

Closed
victornoel opened this issue Dec 7, 2015 · 7 comments

4 participants
@victornoel

commented Dec 7, 2015

Hi,

I wrote the following simple example to show the problem (it must be run twice, so that two instances are up at the same time):

import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class GettingStarted {
    public static void main(String[] args) throws Exception {

        final Config config = new Config();
        // Bind only to the configured interface instead of all interfaces.
        config.setProperty("hazelcast.socket.bind.any", "false");
        // Start at port 7900; with auto-increment the second run takes 7901.
        config.getNetworkConfig().setReuseAddress(true).setPort(7900).setPortAutoIncrement(true).getInterfaces()
                .addInterface("127.0.0.1")
                .setEnabled(true);
        // Disable multicast and AWS discovery so that only the TCP/IP member list is used.
        config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
        config.getNetworkConfig().getJoin().getAwsConfig().setEnabled(false);
        // Members are listed by hostname ("localhost") while the interface above is "127.0.0.1".
        config.getNetworkConfig().getJoin().getTcpIpConfig().addMember("localhost:7900").addMember("localhost:7901")
                .setEnabled(true);
        config.getGroupConfig().setName("test-cluster").setPassword("testPassword");

        HazelcastInstance hazelcastInstance = Hazelcast.newHazelcastInstance(config);

        // Give the other instance time to start and join.
        Thread.sleep(10000);

        // Expected: both members listed. Observed: each instance lists only itself.
        System.out.println(hazelcastInstance.getCluster().getMembers());
    }
}

Basically, both instances start but do not consider each other members of the same cluster, even though they successfully connect to each other.
When running in debug, it seems that each one considers the other absent and therefore assumes the master role.

Maybe the problem is that the first node blacklists the second one (because the second one is not yet up when the first one starts) and then decides to ignore it later?

The first one says:

Dec 07, 2015 2:24:28 PM com.hazelcast.nio.tcp.SocketConnector
INFO: [localhost]:7900 [test-cluster] [3.5.4] Could not connect to: /127.0.0.1:7901. Reason: SocketException[Connection refused to address /127.0.0.1:7901]
Dec 07, 2015 2:24:28 PM com.hazelcast.cluster.impl.TcpIpJoiner
INFO: [localhost]:7900 [test-cluster] [3.5.4] Address[127.0.0.1]:7901 is added to the blacklist.

The second one says:

Dec 07, 2015 2:24:32 PM com.hazelcast.nio.tcp.SocketConnector
INFO: [localhost]:7901 [test-cluster] [3.5.4] Connecting to /127.0.0.1:7900, timeout: 0, bind-any: false
Dec 07, 2015 2:24:32 PM com.hazelcast.nio.tcp.TcpIpConnectionManager
INFO: [localhost]:7901 [test-cluster] [3.5.4] Established socket connection between /127.0.0.1:54269
@victornoel

Author

commented Dec 7, 2015

Note: I'm using hazelcast 3.5.4

@victornoel

Author

commented Dec 7, 2015

Apparently if I replace "localhost" with "127.0.0.1" in the addMember methods, it works as desired.

I think this is a bug: the member entries are only meant to hold the hostnames or addresses used to contact other nodes. Once a node has been contacted, it shouldn't matter that it is listening on an interface whose name is not string-equal to the member name used by the node contacting it. Or am I wrong?!
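
To illustrate the point above, here is a minimal sketch in plain Java (no Hazelcast dependency). It is not Hazelcast's actual join logic; the class `MemberAddressMatch` and both methods are hypothetical names made up for this example. It only shows how a textual comparison of a configured member ("localhost:7901") against the address a peer reports ("127.0.0.1:7901") would fail, while a comparison of resolved addresses would normally succeed, assuming `localhost` resolves to `127.0.0.1` on the machine.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: not taken from the Hazelcast code base.
public class MemberAddressMatch {

    // Textual comparison: "localhost:7901" and "127.0.0.1:7901" are different strings.
    static boolean sameByString(String configured, String reported) {
        return configured.equals(reported);
    }

    // Comparison after name resolution: same port, and at least one resolved IP in common.
    static boolean sameByResolution(String configured, String reported) {
        int ci = configured.lastIndexOf(':');
        int ri = reported.lastIndexOf(':');
        if (ci < 0 || ri < 0) {
            return false; // this sketch expects the "host:port" form
        }
        if (!configured.substring(ci + 1).equals(reported.substring(ri + 1))) {
            return false; // different ports can never be the same member
        }
        try {
            Set<InetAddress> resolved = new HashSet<>(
                    Arrays.asList(InetAddress.getAllByName(configured.substring(0, ci))));
            for (InetAddress candidate : InetAddress.getAllByName(reported.substring(0, ri))) {
                if (resolved.contains(candidate)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false; // an unresolvable host is treated as "no match"
        }
    }

    public static void main(String[] args) {
        String configured = "localhost:7901"; // as passed to addMember(...)
        String reported = "127.0.0.1:7901";   // as reported by the node bound to 127.0.0.1
        System.out.println("string match:   " + sameByString(configured, reported));
        System.out.println("resolved match: " + sameByResolution(configured, reported));
    }
}
```

The result of the second check depends on how the local resolver maps `localhost`, which is why no fixed output is shown for it here.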

@jerrinot

Contributor

commented Dec 7, 2015

Hello @victornoel,

thanks for a very nice bug report. Your reasoning makes sense to me. @mdogan: WDYT?

@metanet metanet self-assigned this Dec 8, 2015

@metanet metanet added this to the 3.6 milestone Dec 8, 2015

metanet added a commit to metanet/hazelcast that referenced this issue Dec 8, 2015

@mdogan mdogan modified the milestones: 3.5.5, 3.6 Dec 8, 2015

@metanet

Contributor

commented Dec 8, 2015

hi @victornoel

Thank you for the bug report. We fixed the issue in both master and maintenance branches. You can follow the PRs: #7023 and #7024

@metanet metanet modified the milestones: 3.6, 3.5.5 Dec 8, 2015

@victornoel

Author

commented Dec 8, 2015

Great, thanks :)

What is "hazelcast.local.localAddress" for? I couldn't find it anywhere in the documentation…

@metanet

Contributor

commented Dec 8, 2015

afaik, it is mostly used for testing

@victornoel

Author

commented Dec 8, 2015

ok, thank you, so I don't need to bother with it :)
