Excessive logging of 3.6 and 3.7 cluster multicast discovery join failure #8867

Closed
shakuzen opened this issue Sep 10, 2016 · 4 comments

4 participants
@shakuzen

commented Sep 10, 2016

As mentioned on Gitter:

I have an existing Hazelcast 3.6 cluster running. Then I start up a cluster with a different group name using Hazelcast 3.7 on the same network. They are both using the same multicast IP and port.

Expected
They do not join each other's clusters because the group names differ, and there is no log output (at INFO level or above).

Actual
They do not join each other's clusters, but the failed join attempts produce a large volume of log output, which is confusing and floods the logs. For example, there are logs like the following:

2016-09-09 16:56:56.728  WARN 12687 --- [hz._hzInstance_1_test-dev.MulticastThread] c.h.i.cluster.impl.MulticastService      : [1xx.xx.xx.xxx]:xxxxx [test-dev] [3.7] Received data format is invalid. (An old version of Hazelcast may be running here.)

com.hazelcast.nio.serialization.HazelcastSerializationException: Problem while reading DataSerializable, namespace: 0, id: 0, class: 'com.hazelcast.cluster.impl.JoinRequest', exception: com.hazelcast.cluster.impl.JoinRequest
    at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.rethrowReadException(DataSerializableSerializer.java:141) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:130) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:52) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:46) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:216) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:600) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.cluster.impl.MulticastService.receive(MulticastService.java:207) [hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.cluster.impl.MulticastService.run(MulticastService.java:165) [hazelcast-3.7.jar:3.7]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
Caused by: java.lang.ClassNotFoundException: com.hazelcast.cluster.impl.JoinRequest
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_72]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_72]
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331) ~[na:1.8.0_72]
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_72]
    at com.hazelcast.nio.ClassLoaderUtil.tryLoadClass(ClassLoaderUtil.java:151) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.nio.ClassLoaderUtil.loadClass(ClassLoaderUtil.java:120) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.nio.ClassLoaderUtil.newInstance(ClassLoaderUtil.java:73) ~[hazelcast-3.7.jar:3.7]
    at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:125) ~[hazelcast-3.7.jar:3.7]
    ... 7 common frames omitted

Workaround
Use a different multicast IP or port.
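
For reference, a minimal sketch of that workaround using the programmatic configuration of Hazelcast 3.x. The group name `test-dev` is taken from the log output above; the multicast group `224.2.2.4` and port `54328` are assumed example values, chosen only because they differ from the defaults (224.2.2.3 and 54327). The class name is hypothetical. Any address/port pair not used by the other cluster should have the same effect.

```java
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

// Hypothetical example class; requires the Hazelcast 3.x jar on the classpath.
public class SeparateMulticastExample {
    public static void main(String[] args) {
        Config config = new Config();

        // Group name as seen in the log output ("test-dev").
        config.getGroupConfig().setName("test-dev");

        // Assumed example values: pick any multicast group/port
        // that the other (3.6) cluster is not using.
        config.getNetworkConfig().getJoin().getMulticastConfig()
              .setEnabled(true)
              .setMulticastGroup("224.2.2.4")
              .setMulticastPort(54328);

        HazelcastInstance instance = Hazelcast.newHazelcastInstance(config);
    }
}
```

With this in place the 3.7 members should no longer receive the 3.6 join packets at all, so the deserialization failure is never triggered.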

Other than the logging output, I think there is no actual effect, but it certainly caused concern in our DEV environment when we started the new 3.7 cluster. For now we can work around it by changing the multicast IP or port: having our logs flooded with those stack traces is not acceptable, and I'm not sure it would be wise to silence logging from MulticastService. However, this means we have to keep track of the multicast IP and port of every other Hazelcast instance on the network (which may also differ in minor version), and the problem may recur in the future. We'd rather not manage the IP/port and instead rely on a policy of unique group names.

Note that I tested and this is not an issue if both clusters are Hazelcast 3.6. If the info so far is not enough, I can put together a repository on GitHub to reproduce the issue.

@jerrinot jerrinot added this to the 3.8 milestone Sep 10, 2016

@jerrinot

Contributor

commented Sep 10, 2016

Hi @shakuzen,

Once more, thanks for reporting this!

@jerrinot jerrinot self-assigned this Dec 6, 2016

@arikanorh

commented Dec 8, 2016

Adding more info to help resolve the issue. I'm having the same problem with the following use case.

  1. I started a Hazelcast instance from my IDE using the Java API with the default config.
  2. I started another node from console.bat.

The same log output then always appears on the second node. Hope it helps.

@jerrinot

Contributor

commented Dec 8, 2016

@arikanorh are both Hazelcast instances on the same version?

@arikanorh

commented Dec 8, 2016

@jerrinot I just noticed that they are not, and the issue went away once I changed both to the same version.

@jerrinot jerrinot removed their assignment Jan 17, 2017

@tombujok tombujok self-assigned this Jan 17, 2017

tombujok added a commit to tombujok/hazelcast that referenced this issue Jan 18, 2017

Suppressing excessive logging on serialization failure during a cluster multicast discovery. Fixes hazelcast#8867 (or at least makes it less verbose)

tombujok added a commit to tombujok/hazelcast that referenced this issue Jan 19, 2017

Suppressing excessive logging on serialization failure during a cluster multicast discovery. Fixes hazelcast#8867 (or at least makes it less verbose)

tombujok added a commit that referenced this issue Jan 19, 2017

Suppressing excessive logging on serialization failure during a cluster multicast discovery. Fixes #8867 (or at least makes it less verbose) (#9686)
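
To illustrate what "suppressing excessive logging" can look like in general, here is a small sketch using only `java.util.logging`. It is not the code from the commit above and not part of Hazelcast; the class `ThrottledWarning` and its methods are hypothetical. The assumed policy is: log the full WARNING with stack trace at most once per interval, and record repeats in between at FINE level without the stack trace.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not Hazelcast's implementation.
final class ThrottledWarning {
    private final Logger logger;
    private final long intervalNanos;
    private boolean loggedOnce;
    private long lastFullLogNanos;
    private long suppressedCount;

    ThrottledWarning(Logger logger, long intervalMillis) {
        this.logger = logger;
        this.intervalNanos = intervalMillis * 1_000_000L;
    }

    // Convenience overload that reads the monotonic clock.
    boolean warn(String message, Throwable cause) {
        return warn(message, cause, System.nanoTime());
    }

    // Returns true when the full WARNING (with stack trace) was emitted,
    // false when the event was only recorded at FINE level.
    synchronized boolean warn(String message, Throwable cause, long nowNanos) {
        if (!loggedOnce || nowNanos - lastFullLogNanos >= intervalNanos) {
            String suffix = suppressedCount > 0
                    ? " (" + suppressedCount + " similar warnings suppressed)"
                    : "";
            logger.log(Level.WARNING, message + suffix, cause);
            loggedOnce = true;
            lastFullLogNanos = nowNanos;
            suppressedCount = 0;
            return true;
        }
        suppressedCount++;
        // Repeat within the interval: short message, no stack trace.
        logger.log(Level.FINE, message);
        return false;
    }

    // Number of warnings suppressed since the last full WARNING.
    synchronized long suppressedCount() {
        return suppressedCount;
    }
}
```

A receive loop would call `warn("Received data format is invalid.", e)` from its catch block. Passing the timestamp explicitly, as in the three-argument overload, keeps the throttling logic easy to test.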