[cluster] SplitBrainMergeValidationOperation Ignoring join check from [NODEB]:5701, because this node is not master... #10587

Closed
wellprado opened this issue May 16, 2017 · 5 comments
Labels
Source: Community PR or issue was opened by a community user Team: Core Type: Defect

@wellprado

wellprado commented May 16, 2017

Hi guys, how are you?

Something strange has happened about twice in two months of Hazelcast usage.
Today was the second time.

I stopped and started the Tomcat server on NodeA, so NodeB became the "master" node. When NodeA came back online and joined the Hazelcast cluster, the following messages were written to catalina.out:

...
2017-05-15 14:48:59.754  INFO 2567 --- [thread-Acceptor] c.h.nio.tcp.SocketAcceptorThread         : [NODEA]:5710 [hazelepp] [3.8] Accepting socket connection from /NODEB:46454
2017-05-15 14:48:59.754  INFO 2567 --- [cached.thread-6] c.h.nio.tcp.TcpIpConnectionManager       : [NODEA]:5710 [hazelepp] [3.8] Established socket connection between /NODEA:5710 and /NODEB:46454
2017-05-15 14:48:59.763  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:56:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:50:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:52:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
2017-05-15 14:54:59.754  INFO 2567 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...
...
2017-05-15 15:16:59.754  INFO 11893 --- [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation : [NODEA]:5710 [hazelepp] [3.8] Ignoring join check from [NODEB]:5701, because this node is not master...

Even with the messages above, everything seemed to work normally. To make the messages stop, I stopped NODEB so that NODEA would become "master" and then started NODEB again; I only had to do this once today. After that I stopped and started each node many times in different sequences, and the message did not appear again. It is intermittent and occurs rarely.

Do you have any guess about what causes this?

My environment (64-bit): Linux RedHat 7.3, Tomcat 8.5, hazelcast-tomcat85-sessionmanager-1.1.jar, hazelcast-all-3.8.jar

Thank you very much!!!

@metanet
Contributor

metanet commented May 18, 2017

Hi @wellprado

This message should be safe and should not create any problem. Did you see the log message printed repeatedly without terminating?

Regards,

@wellprado
Author

Hi @metanet

Yes, the message kept repeating, and I did not see it stop on its own. So, are we good?

Thank you very much

Regards,

@metanet
Contributor

metanet commented May 19, 2017

Ok, as I understand it, when NodeA is restarted, it successfully joins the master NodeB, but you still see that message printed repeatedly. Can you provide any log files so that we can investigate the issue more efficiently? The logs from NodeA's restart that led to these log messages would be especially useful.

Regards,

@mmedenjak
Contributor

@wellprado are you able to provide some more logs:

  • from nodeB when nodeA has shut down
  • from nodeA and nodeB when nodeA is started
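
For anyone gathering the logs requested above, here is a minimal sketch (not part of the original thread) of how the relevant lines might be pulled out of a `catalina.out` that contains colour-code residue like the excerpt in the report. The function names are hypothetical, and the timestamp format and the `[[2m` / `[[0;39m` residue pattern are assumptions taken only from the lines quoted in this issue.

```python
import re
from datetime import datetime

# Colour codes as they appear in the pasted log: the ESC byte shows up as "["
# (e.g. "[[2m", "[[0;39m"). Real "\x1b[...m" sequences are matched as well.
ANSI_RESIDUE = re.compile(r"(?:\x1b|\[)\[[0-9;]*m")
# Timestamp layout assumed from the excerpt: 2017-05-15 14:48:59.763
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")
MARKER = "Ignoring join check"


def clean(line):
    """Strip colour-code residue and surrounding whitespace from one log line."""
    return ANSI_RESIDUE.sub("", line).strip()


def join_check_times(lines):
    """Return the sorted timestamps of all 'Ignoring join check' messages."""
    times = []
    for raw in lines:
        line = clean(raw)
        match = TIMESTAMP.match(line)
        if match and MARKER in line:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return sorted(times)


def gaps_in_seconds(times):
    """Return the gaps between consecutive timestamps, rounded to whole seconds."""
    return [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]
```

Passing the lines of a log file to `join_check_times` and the result to `gaps_in_seconds` shows how often the message repeats and whether it ever stops, which is the question asked earlier in the thread.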

@mmedenjak mmedenjak added this to the 3.9 milestone Jul 11, 2017
@mmedenjak mmedenjak changed the title [ration.thread-0] c.i.o.SplitBrainMergeValidationOperation Ignoring join check from [NODEB]:5701, because this node is not master... [cluster] SplitBrainMergeValidationOperation Ignoring join check from [NODEB]:5701, because this node is not master... Jul 16, 2017
@mdogan
Contributor

mdogan commented Aug 18, 2017

@wellprado: Without more logs, we can't identify the exact problem. The reason might be a short network interruption or a long pause in the system or application.

I'm closing this issue for now; we can reopen it when you have more logs.

@mdogan mdogan closed this as completed Aug 18, 2017
@mmedenjak mmedenjak added the Source: Community PR or issue was opened by a community user label Apr 9, 2020