Multicast socket exception if no network present. #9081
If networking in the hazelcast.xml is set as,
and there is only 127.0.0.1 available, on Hazelcast 3.7.2 the following exception appears
although the server does actually start.
Multicast without network isn't a real scenario, but it's the kind of thing you might use when doing development while travelling, and the exception is misleading.
It may be IPv6 related, as using
Apparently it's caused by the fact that multicast is disabled(?) on the local loopback on Mac OS X? I don't know why the
I tried to have a quick check: jerrinot@002bae7 and when the loopback device is used then it's throwing an exception directly on this check.
@neilstevenson I managed to reproduce this as well. Apparently it does not happen with earlier OS X versions; I had to upgrade to macOS Sierra. At first glance it seems to be related to the default-interface selection mechanism. I will try to identify the exact root cause.
Would you be able to share the output of the following commands in your environment?
After some more investigation, I strongly believe that this is not related to macOS specifically but rather to the layout of the network interfaces. Apparently, when you sign up for iCloud services a new interface gets created, usually named utun. This is not unique to iCloud, however: in my case I was also able to reproduce it by having a VPN configured on the machine, which adds another tun device.
When joining a multicast group, some OSes need to know which interface the join should use. If you don't specify an interface, the algorithm falls back to the default interface. In Java, the NetworkInterface.defaultInterface() is selected in the native lib, and has the following rules
Therefore, when your main internet connection is offline (WiFi in my case), the next interface in this order is the utun device. The utun device, in my case, is configured with IPv6 addresses, so the join uses this configuration by default.
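To check which interfaces are candidates on a given machine, a small diagnostic along these lines may help. It is only a sketch using the standard `java.net.NetworkInterface` API; the class name `InterfaceReport` and the output format are made up for illustration and are not part of Hazelcast.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Hazelcast.
public class InterfaceReport {

    // Builds one summary line per interface: name, state flags and bound addresses.
    static List<String> describeInterfaces() throws SocketException {
        List<String> lines = new ArrayList<>();
        Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
        if (all == null) {
            // May be null on some JDK versions when no interface is found.
            return lines;
        }
        for (NetworkInterface nif : Collections.list(all)) {
            StringBuilder sb = new StringBuilder(nif.getName())
                    .append(" up=").append(nif.isUp())
                    .append(" loopback=").append(nif.isLoopback())
                    .append(" multicast=").append(nif.supportsMulticast())
                    .append(" addresses=");
            for (InetAddress addr : Collections.list(nif.getInetAddresses())) {
                sb.append(addr.getHostAddress()).append(' ');
            }
            lines.add(sb.toString().trim());
        }
        return lines;
    }

    public static void main(String[] args) throws SocketException {
        for (String line : describeInterfaces()) {
            System.out.println(line);
        }
    }
}
```

With WiFi switched off, a utun or tun entry that is up and carries only IPv6 addresses would be consistent with the behaviour described above.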
In an attempt to make this more deterministic, I tried to specify the NetworkInterface explicitly instead of letting the non-deterministic default selection take place.
forcing the default interface to be the loopback one. This also looks like a good idea to introduce in the configuration: if the config specifies a multicast interface, use that; otherwise fall back to the default.
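For readers who want to try the idea, here is a minimal sketch of "use the configured interface if there is one, otherwise leave the JDK default alone". It is not the Hazelcast implementation: the class `ExplicitMulticastInterface`, the `resolve` helper, and the optional interface-name argument are hypothetical, and the group and port are assumed values (224.2.2.3:54327).

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.net.SocketException;

// Hypothetical sketch, not Hazelcast code.
public class ExplicitMulticastInterface {

    // Returns the named interface, or null to signal "use the JDK default".
    static NetworkInterface resolve(String configuredName) throws SocketException {
        if (configuredName == null || configuredName.isEmpty()) {
            return null;
        }
        return NetworkInterface.getByName(configuredName);
    }

    // Opens a multicast socket and joins the group, pinning the interface when one is given.
    static MulticastSocket join(InetAddress group, int port, NetworkInterface nif) throws IOException {
        MulticastSocket socket = new MulticastSocket(port);
        try {
            if (nif != null) {
                socket.setNetworkInterface(nif); // interface used for outgoing packets
            }
            // With a null interface the JDK falls back to its default selection.
            socket.joinGroup(new InetSocketAddress(group, port), nif);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed values for illustration: group 224.2.2.3, port 54327.
        InetAddress group = InetAddress.getByName("224.2.2.3");
        NetworkInterface nif = resolve(args.length > 0 ? args[0] : null);
        try (MulticastSocket socket = join(group, 54327, nif)) {
            System.out.println("Joined " + group + " via " + (nif == null ? "default interface" : nif.getName()));
        }
    }
}
```

Passing the loopback interface name (for example `lo0` on macOS) as the first argument corresponds to forcing the loopback interface.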
Although the join succeeds when doing the above, every subsequent send() operation fails with
As @neilstevenson correctly pointed out, using
I am looking for a way to hand-wire this, potentially using some configuration settings, but I personally don't see much value in doing so, since this will mostly affect development environments and can be bypassed by disabling cluster join. @jerrinot WDYT?
@jerrinot noted that we already do set the interface in lines https://github.com/hazelcast/hazelcast/blob/v3.7.1/hazelcast/src/main/java/com/hazelcast/internal/cluster/impl/MulticastService.java#L94-L98 which was not executed in my use-case.
Also, as seen in these lines of code, there is a reference to a bug that describes almost the same scenario for multicasting on loopback, where a custom route needs to be added statically.
Regarding the difference between
However, with IPv4, even though the join succeeds and the send() operations don't complain either, the route table shows that there is NO default gateway for IPv4. To further confirm this, I tried to
Note: The multicast socket based on the above observations should always be IPv6 unless
TL;DR In conclusion, I will try to patch this with an exception, to at least give the user some useful feedback. This works as expected, but it's quite confusing to the end user without any context.
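As a rough illustration of what "useful feedback" could look like, the sketch below wraps the low-level failure in an exception that names the interface and suggests a workaround. This is not the actual patch: the class `MulticastDiagnostics`, the `explain` method, and the message wording are all hypothetical.

```java
import java.io.IOException;

// Hypothetical sketch of a friendlier error, not the actual Hazelcast patch.
public class MulticastDiagnostics {

    // Wraps the original failure, keeping it as the cause so the stack trace is preserved.
    static IOException explain(IOException cause, String interfaceName) {
        String message = "Multicast failed on interface '" + interfaceName + "': "
                + cause.getMessage()
                + ". The interface may not support multicast or may lack a usable route;"
                + " consider disabling multicast join when no network is available.";
        return new IOException(message, cause);
    }

    public static void main(String[] args) {
        IOException wrapped = explain(new IOException("simulated failure"), "lo0");
        System.out.println(wrapped.getMessage());
    }
}
```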
For those curious minds out there, here is the native code making the decision of using the IPv6 approach: