[cluster] SplitBrainMergeValidationOperation Ignoring join check from [NODEB]:5701, because this node is not master... #10587

wellprado · 2017-05-16T02:01:19Z

Hi guys, how are you?

Something strange has happened about twice in two months of using Hazelcast.
Today was the second time.

I stopped and restarted the Tomcat server on nodeA, and nodeB became the "master" node. When nodeA came back online and rejoined the Hazelcast cluster, the following messages were written to catalina.out:

...
2017-05-15 14:48:59.754  INFO 2567 --- [thread-Acceptor] c.h.nio.tcp.SocketAcceptorThread : [NODEA]:5710 [hazelepp] [3.8] Accepting socket connection from /NODEB:46454
2017-05-15 14:48:59.754  INFO 2567 --- [cached.thread-6] c.h.nio.tcp.TcpIpConnectionManager : [NODEA]:5710 [hazelepp] [3.8] Established socket connection between /NODEA:5710 and /NODEB:46454
2017-05-15 14:48:59.763  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:56:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:50:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:52:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:54:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
...
2017-05-15 15:16:59.754  INFO 11893 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...

Despite the messages above, everything seemed to work normally. To make the messages stop, I stopped NODEB so that NODEA would become "master" and then started NODEB again; I did this only once, today. After that I stopped and started each node many times in different sequences, and the message did not appear again. It is intermittent and occurs rarely.

Do you have any guess about what causes this?

My environment (64-bit): Linux RedHat 7.3, Tomcat 8.5, hazelcast-tomcat85-sessionmanager-1.1.jar, hazelcast-all-3.8.jar

Thank you very much!!!


metanet · 2017-05-18T21:10:35Z

Hi @wellprado

This message should be safe and should not create any problem. Did you see the log message printed repeatedly without terminating?

Regards,

wellprado · 2017-05-19T03:28:06Z

Hi @metanet

Yes, the message repeated, and I didn't notice it stopping on its own. So, are we good?

Thank you very much

Regards,

metanet · 2017-05-19T10:17:31Z

OK, as I understand it, when NodeA is restarted it successfully joins the master NodeB, but you still see that message printed repeatedly. Can you provide any log files so that we can investigate the issue more efficiently? The logs of the NodeA restart that led to these messages would be especially useful.

Regards,

mmedenjak · 2017-07-11T13:18:20Z

@wellprado are you able to provide some more logs:

- from nodeB when nodeA has shut down
- from nodeA and nodeB when nodeA is started
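
When collecting those logs, it may help to pull out just the relevant lines and check how regularly the message repeats. The following is a minimal sketch, not part of Hazelcast: the function names are hypothetical, and it assumes the Spring Boot timestamp layout and the color codes seen in the excerpt above (either a real ESC byte or the `[[` form they take when pasted).

```python
import re
import sys
from datetime import datetime

# ANSI color sequences: a real ESC byte, or the "[[" form seen in the pasted excerpt.
ANSI = re.compile(r"(?:\x1b|\[)\[[0-9;]*m")
# Timestamp layout assumed from the excerpt, e.g. "2017-05-15 14:48:59.763".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}")
MARKER = "Ignoring join check"


def join_check_times(lines):
    """Return the sorted timestamps of all lines that contain MARKER."""
    times = []
    for line in lines:
        clean = ANSI.sub("", line)
        if MARKER not in clean:
            continue
        match = TIMESTAMP.search(clean)
        if match:
            times.append(datetime.strptime(match.group(), "%Y-%m-%d %H:%M:%S.%f"))
    return sorted(times)


def intervals_seconds(times):
    """Return the gaps, in seconds, between consecutive timestamps."""
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    # Usage: python join_checks.py /path/to/catalina.out
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        found = join_check_times(log)
    print(len(found), "join-check messages")
    print("intervals (s):", intervals_seconds(found))
```

Sorting the timestamps keeps the intervals meaningful even if lines were pasted out of order.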

mdogan · 2017-08-18T08:01:30Z

@wellprado: Without more logs, we can't identify the exact problem. The cause might be a short network interruption or a long pause in the system or application.

For now closing this issue, we can reopen when you have more logs.

tombujok added the Team: Core label May 16, 2017

mmedenjak added the Type: Defect label Jul 11, 2017

mmedenjak added this to the 3.9 milestone Jul 11, 2017

mmedenjak changed the title ~~[ration.thread-0] c.i.o.SplitBrainMergeValidationOperation Ignoring join check from [NODEB]:5701, because this node is not master...~~ [cluster] SplitBrainMergeValidationOperation Ignoring join check from [NODEB]:5701, because this node is not master... Jul 16, 2017

mdogan closed this as completed Aug 18, 2017

mmedenjak added the Source: Community PR or issue was opened by a community user label Apr 9, 2020
