multicast/trusted-interfaces not handled properly at start #9953

Closed
dsukhoroslov opened this issue Feb 21, 2017 · 1 comment

@dsukhoroslov dsukhoroslov commented Feb 21, 2017

Hazelcast 3.7.x and 3.8. Suppose I have the following Hazelcast server network config (Spring):

<hz:network port="${bdb.cluster.port:3331}" port-auto-increment="true">
    <hz:join>
        <hz:multicast enabled="true" multicast-timeout-seconds="10">
            <!--hz:trusted-interfaces>
                <hz:interface>localhost</hz:interface>
            </hz:trusted-interfaces-->
        </hz:multicast>
        <hz:tcp-ip enabled="false" />
        <hz:aws enabled="false" />
    </hz:join>
</hz:network>

I start two server nodes on my local PC and they connect to each other properly:

2017-02-21 08:57:42.416 [hz.hzInstance.priority-generic-operation.thread-0] INFO  com.hazelcast.internal.cluster.ClusterService - [192.168.1.87]:3331 [system] [3.8] 

Members [2] {
	Member [192.168.1.87]:3331 - be781745-37e9-4375-a296-578427bea8a6 this lite
	Member [192.168.1.87]:3332 - 0d6fc616-a705-4498-8c2c-c5d3c3fb02de lite
}

If I enable the trusted-interfaces section (uncomment it in the config above), the nodes cannot connect initially:

2017-02-21 08:52:26.871 [hz.hzInstance.MulticastThread] DEBUG com.hazelcast.internal.cluster.impl.NodeMulticastListener - [192.168.1.87]:3332 [system] [3.8] JoinMessage from 192.168.1.87 is dropped because its sender is not a trusted interface
2017-02-21 08:52:27.004 [hz.hzInstance.MulticastThread] DEBUG com.hazelcast.internal.cluster.impl.NodeMulticastListener - [192.168.1.87]:3332 [system] [3.8] Dropped: JoinRequest{packetVersion=4, buildNumber=20170217, memberVersion=3.8.0, address=[192.168.1.87]:3332, uuid='c0c583e3-0af3-41e8-804d-b56048ff3607', liteMember=true, credentials=null, memberCount=0, tryCount=5}

But after about 5 minutes the nodes were able to connect:

2017-02-21 08:57:42.418 [hz.hzInstance.priority-generic-operation.thread-0] DEBUG com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager - [192.168.1.87]:3332 [system] [3.8] MasterConfirmation has been received from Member [192.168.1.87]:3331 - be781745-37e9-4375-a296-578427bea8a6 lite
2017-02-21 08:57:42.418 [hz.hzInstance.priority-generic-operation.thread-0] DEBUG com.hazelcast.internal.cluster.ClusterService - [192.168.1.87]:3332 [system] [3.8] Updating members [Member [192.168.1.87]:3331 - be781745-37e9-4375-a296-578427bea8a6 lite, Member [192.168.1.87]:3332 - 0d6fc616-a705-4498-8c2c-c5d3c3fb02de this lite]
2017-02-21 08:57:42.419 [hz.hzInstance.priority-generic-operation.thread-0] DEBUG com.hazelcast.internal.partition.InternalPartitionService - [192.168.1.87]:3332 [system] [3.8] Adding Member [192.168.1.87]:3331 - be781745-37e9-4375-a296-578427bea8a6 lite
2017-02-21 08:57:42.419 [hz.hzInstance.priority-generic-operation.thread-0] INFO  com.hazelcast.internal.cluster.ClusterService - [192.168.1.87]:3332 [system] [3.8] 

Members [2] {
	Member [192.168.1.87]:3331 - be781745-37e9-4375-a296-578427bea8a6 lite
	Member [192.168.1.87]:3332 - 0d6fc616-a705-4498-8c2c-c5d3c3fb02de this lite
}

Is it possible to enable this behavior right from the beginning?

Thanks, Denis

@jerrinot jerrinot commented Feb 21, 2017

Hi @dsukhoroslov,
I think the intended behavior here is NOT to join these two members, because your trusted interface (localhost) does not match the address the members are listening on. If you change the interface element inside trusted-interfaces to 192.168.1.87, your members will join quickly. Also, I believe the trusted-interfaces section does not work with hostnames; it works with IPs only.
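
A sketch of that change, assuming the rest of the config from the report stays as posted and that 192.168.1.87 (the address shown in the member list and logs) is the interface both members bind to:

```xml
<hz:network port="${bdb.cluster.port:3331}" port-auto-increment="true">
    <hz:join>
        <hz:multicast enabled="true" multicast-timeout-seconds="10">
            <hz:trusted-interfaces>
                <!-- Assumption: the IP the members listen on, taken from the logs; not a hostname -->
                <hz:interface>192.168.1.87</hz:interface>
            </hz:trusted-interfaces>
        </hz:multicast>
        <hz:tcp-ip enabled="false" />
        <hz:aws enabled="false" />
    </hz:join>
</hz:network>
```

With this in place, the sender address in the "JoinMessage from 192.168.1.87 is dropped" log line would match the trusted interface, so the multicast join request should no longer be rejected.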

Now the question is why your members join eventually. I reckon it's related to split-brain handling: it probably does not check trusted interfaces when it receives a split-brain message.
