Akka.Cluster / Akka.Remote uses ipv6 instead of ipv4 in ipv4 environment #2194

Closed
kantora opened this issue Jul 16, 2016 · 16 comments · Fixed by #2311

Comments

@kantora
Contributor

kantora commented Jul 16, 2016

I first reported this in #2161, but I suspect it is a separate bug.

I have a Docker / Ubuntu / Mono environment, updated to the latest Akka.NET nightly build (1.1.1.248-beta) and Helios 2.1.2.
There is one seed node and several worker nodes.

Remote is configured with a fixed port, a hostname of 0.0.0.0, and a public-hostname matching the container name. All names resolve correctly to the right IPv4 addresses.

Unfortunately, the cluster does not form. I see nothing in the worker node logs, and occasional errors in the seed node logs:

seed_1 | 2016-07-14 16:02:48 [Error] Error caught channel ["::ffff:172.18.0.5:3090"->"::ffff:172.18.0.2:50152"] (Id="ChannelId(-1914377680)")
seed_1 | System.Net.Sockets.SocketException: Operation aborted
seed_1 |   at Helios.Channels.Sockets.SocketChannelAsyncOperation.Validate () <0x42187b20 + 0x00057> in <filename unknown>:0
seed_1 |   at Helios.Channels.Sockets.AbstractSocketByteChannel+SocketByteChannelUnsafe.FinishRead (Helios.Channels.Sockets.SocketChannelAsyncOperation operation) <0x421926a0 + 0x0013b> in <filename unknown>:0
seed_1 | 2016-07-14 16:02:48 [Information] "Message Disassociated from NoSender to akka://ClusterKit/system/transports/akkaprotocolmanager.tcp.0/akkaProtocol-tcp%3A%2F%2FClusterKit%40%5B%3A%3Affff%3A172.18.0.2%5D%3A50152-3 was not delivered. 3 dead letters encountered."

On Windows, ::ffff:172.18.0.2 and 172.18.0.2 are treated as the same address, but on Linux (at least in my configuration) they are not.

root@52e368960bf4:/opt/clusterkit# ping 172.18.0.2
PING 172.18.0.2 (172.18.0.2) 56(84) bytes of data.
64 bytes from 172.18.0.2: icmp_seq=1 ttl=64 time=0.060 ms
64 bytes from 172.18.0.2: icmp_seq=2 ttl=64 time=0.086 ms
64 bytes from 172.18.0.2: icmp_seq=3 ttl=64 time=0.049 ms
64 bytes from 172.18.0.2: icmp_seq=4 ttl=64 time=0.106 ms
^C
--- 172.18.0.2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.049/0.075/0.106/0.023 ms

root@52e368960bf4:/opt/clusterkit# ping6 ::ffff:172.18.0.2
PING ::ffff:172.18.0.2(::ffff:172.18.0.2) 56 data bytes
ping: sendmsg: Network is unreachable
ping: sendmsg: Network is unreachable
ping: sendmsg: Network is unreachable
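The remote address in the dead-letter message above is URL-encoded, which makes it easy to misread. As a minimal sketch (Python standard library only, not Akka.NET or Helios code), decoding it shows that the peer was recorded as the IPv4-mapped IPv6 address ::ffff:172.18.0.2, which embeds the same IPv4 address that answers ping. The helper name `remote_host` is hypothetical.

```python
import ipaddress
import re
from urllib.parse import unquote

# Actor-path fragment copied from the seed node's dead-letter log line.
ENCODED = "akkaProtocol-tcp%3A%2F%2FClusterKit%40%5B%3A%3Affff%3A172.18.0.2%5D%3A50152-3"


def remote_host(encoded_path):
    """Return the bracketed host of a URL-encoded actor path (hypothetical helper)."""
    decoded = unquote(encoded_path)
    match = re.search(r"@\[([^\]]+)\]", decoded)
    if match is None:
        raise ValueError("no bracketed IPv6 host in " + decoded)
    return ipaddress.ip_address(match.group(1))


host = remote_host(ENCODED)
print(unquote(ENCODED))   # akkaProtocol-tcp://ClusterKit@[::ffff:172.18.0.2]:50152-3
print(host.version)       # 6
print(host.ipv4_mapped)   # 172.18.0.2
```

So the transport is reporting an IPv6-family endpoint even though every configured name resolves to IPv4.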
@kantora
Contributor Author

kantora commented Jul 20, 2016

I thought the problem was in the Docker configuration: the system reports IPv6 support but has no IPv6 network connectivity. So I removed IPv6 from the system configuration entirely.
Still the same errors:

seed_1 | 2016-07-20 08:44:16 [Error] Error caught channel ["::ffff:172.18.0.3:3090"->"::ffff:172.18.0.5:60660"](Id="ChannelId(392055264)")
seed_1 | System.Net.Sockets.SocketException: Operation aborted
seed_1 |   at Helios.Channels.Sockets.SocketChannelAsyncOperation.Validate () <0x41aee8d0 + 0x00057> in <filename unknown>:0
seed_1 |   at Helios.Channels.Sockets.AbstractSocketByteChannel+SocketByteChannelUnsafe.FinishRead (Helios.Channels.Sockets.SocketChannelAsyncOperation operation) <0x41af9c90 + 0x0013b> in <filename unknown>:0
seed_1 | 2016-07-20 08:44:16 [Information] "Message Disassociated from NoSender to akka://ClusterKit/system/transports/akkaprotocolmanager.tcp.0/akkaProtocol-tcp%3A%2F%2FClusterKit%40%5B%3A%3Affff%3A172.18.0.5%5D%3A60660-2 was not delivered. 1 dead letters encountered."

@kantora
Contributor Author

kantora commented Jul 20, 2016

As far as I could trace the problem in Helios, DNS resolution is fine: the hostname resolves to the correct IPv4 address. It seems to me that the problem is in the server, which accepts connections and corrupts the addresses. I'll try to catch it. So it is an internal Helios problem.

@kantora
Contributor Author

kantora commented Jul 20, 2016

@Aaronontheweb I've finally found the cause of this problem, but, to be honest, I don't know what to do about it.
It is located in the Helios package.

Helios.Channels.Sockets.TcpServerSocketChannel and Helios.Channels.Sockets.TcpSocketChannel create their Socket objects in their constructors. By default, the AddressFamily is not specified and, in my case, it was set to AddressFamily.InterNetworkV6. So even after binding the correct IPv4 address to the socket, the address was converted to an IPv6 one.

As a hack, I set AddressFamily.InterNetwork as the default AddressFamily for the socket in both classes, and the cluster started fine. Of course, this is just a proof of concept.

Unfortunately, the AddressFamily can only be provided in the Socket constructor and cannot be changed when binding the address.
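The constraint described above, that a socket's address family is fixed when the socket is created, also holds for Python's `socket` module, so it can be illustrated outside .NET. The following is a minimal sketch, not the Helios code: `family_for` and `bind_listener` are hypothetical helpers that choose the family from the bind address before the socket is constructed, which is the direction the hack points in.

```python
import ipaddress
import socket


def family_for(host):
    """Pick the address family that matches a literal IP address."""
    version = ipaddress.ip_address(host).version
    return socket.AF_INET if version == 4 else socket.AF_INET6


def bind_listener(host, port=0):
    """Create a TCP socket whose family is derived from the bind address."""
    sock = socket.socket(family_for(host), socket.SOCK_STREAM)
    sock.bind((host, port))
    return sock


listener = bind_listener("127.0.0.1")
family_name = listener.family.name
bound_host, bound_port = listener.getsockname()
print(family_name, bound_host)  # AF_INET 127.0.0.1

# The family cannot be changed once the socket exists.
family_is_read_only = False
try:
    listener.family = socket.AF_INET6
except AttributeError:
    family_is_read_only = True
finally:
    listener.close()
print(family_is_read_only)  # True
```

Because the family has to be known up front, the choice must come either from inspecting the address to bind or from explicit configuration.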

@Aaronontheweb
Member

@kantora I had looked at this with the DNS stuff in 1.1.1; had not tested any of this on Mono, so thanks for looking into it. Looks like socket treatment on Mono is a bit different than Windows. I'll see about patching this in a new Helios release.

@Aaronontheweb Aaronontheweb modified the milestone: Mono CI Support Jul 20, 2016
andreyleskov added a commit to andreyleskov/akka.net that referenced this issue Aug 11, 2016
In some environments (for example Azure WebApp) default AdressFamily for Socket is determined incorrectly.
Changed to get it explicitly from config
andreyleskov added a commit to andreyleskov/akka.net that referenced this issue Aug 14, 2016
In some environments (for example Azure WebApp) default AdressFamily for Socket is determined incorrectly.
Changed to get it explicitly from config
andreyleskov pushed a commit to andreyleskov/akka.net that referenced this issue Aug 14, 2016
@Aaronontheweb
Member

Looks like there are some issues in Mono 4.x related to DNS and IPV6 - not strictly related to this, but where there's smoke there's fire... https://bugzilla.xamarin.com/show_bug.cgi?id=35536

@stefansedich
Contributor

I am not sure but is this related to the fix I did in Mono a while ago now:
mono/mono#2420

@stefansedich
Contributor

For some context, sorry: this was during my efforts to get Akka running under Mono a long time ago, before I fell off the face of the earth, and IIRC this fix was to get the cluster tests passing. I could be wrong, so feel free to ignore me :)

Cheers

@Aaronontheweb
Member

@stefansedich yep, looks like the issue. I was just able to recreate the problem on Windows by making some tweaks to the way Helios boots up. Looks like not correctly supporting dual-mode sockets is the problem. Do you know if this has been included in a Mono release since?

@stefansedich
Contributor

It was merged into master about 6 months ago, I would think. I have not been following releases for some time, unfortunately, but I would assume it is out there by now. Another fix was the ToIPV4 extension that I saw you added last week; that should be part of Mono now too.

@Aaronontheweb
Member

Aaronontheweb commented Aug 16, 2016

Actually I stand corrected - the issue is that a Socket that gets allocated has different default behaviors under Windows and Mono.

Under Windows the socket defaults to the highest supported IP protocol. IPV6 in our case. And if you pass in an IPV4 address to an IPV6 socket, no big deal - the platform can handle that just fine.

Under Mono the socket address defaults to AddressFamily.Unspecified - which explodes when you attempt to bind to it (with IPV6 at least, so far.)
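For comparison, a dual-mode listener can be sketched in Python, whose standard library exposes the same idea through `socket.create_server(..., dualstack_ipv6=True)`. This is an illustration under assumptions, not how Helios or Akka.Remote open sockets: `open_listener` and `peer_as_ipv4` are hypothetical names. The sketch falls back to plain IPv4 when dual-stack is unavailable, and converts IPv4-mapped peer addresses like the ones in the logs back to IPv4 form.

```python
import ipaddress
import socket


def open_listener(port=0):
    """Prefer a dual-stack IPv6 listener; fall back to IPv4 if that fails."""
    if socket.has_dualstack_ipv6():
        try:
            return socket.create_server(
                ("", port), family=socket.AF_INET6, dualstack_ipv6=True
            )
        except OSError:
            pass  # IPv6 may be compiled in but unusable on this host
    return socket.create_server(("", port), family=socket.AF_INET)


def peer_as_ipv4(host):
    """Return the plain IPv4 form of an IPv4-mapped address, else the input."""
    addr = ipaddress.ip_address(host)
    mapped = getattr(addr, "ipv4_mapped", None)
    return str(mapped) if mapped is not None else host


listener = open_listener()
family = listener.family
listener.close()
print(family.name)  # AF_INET6 on dual-stack hosts, AF_INET otherwise
print(peer_as_ipv4("::ffff:172.18.0.5"))  # 172.18.0.5
```

On a dual-stack listener, IPv4 clients show up with `::ffff:`-prefixed peer addresses, which is the form seen in the seed node logs earlier in this thread.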

@stefansedich
Contributor

Ok no problems, https://bugzilla.xamarin.com/show_bug.cgi?id=36192 was the exact bug I found and it seems like it is fixed as part of 4.4, this will also be an issue for getting all tests running under Mono as this fix was something that helped me get further.

Anyway I saw mono and thought I would chime in, enjoy the pain :)!

@Aaronontheweb
Member

Aaronontheweb commented Aug 16, 2016

> Helios.Channels.Sockets.TcpServerSocketChannel and Helios.Channels.Sockets.TcpSocketChannel.
> They create the Socket objects in constructors. By default, the AddressFamily is not specified and, in my case, it was set as AddressFamily.InterNetworkV6. So even after binding the correct ipv4 address to the socket it was converted to ipv6 one.
> I tried to make a hack and set AddressFamily.InterNetwork as default AddressFamily for the socket in both classes and Cluster started well. Of course, this is just the concept proof.
> Unfortunately, the AddressFamily can only be provided in Socket constructor and cannot be modified on address bind.

@kantora Yep, you're right on.

I saw that @andreyleskov has some changes in his Akka.NET fork to try to resolve this issue. I'll try to tackle this from the Helios end as well since Helios' own test suite doesn't pass on Mono at the moment.

@stefansedich we need you back in .NET land buddy

@andreyleskov
Contributor

Hi @Aaronontheweb, @kantora, I've hit the same problem running Akka.NET in an Azure WebApp.
In my case the default AddressFamily (IPv6) used in the Socket constructor leads to errors on hostname/IP resolution. Simply changing the default address family leads to a broken test, so I've introduced an additional configuration value to set the address family explicitly. I'm already using the modified version in my project and plan to create a pull request next week.

@Aaronontheweb
Member

@andreyleskov correct me if I'm wrong, but won't 100% of Mono users need to have this setting turned on in order for Akka.Remote to work as expected?

@andreyleskov
Contributor

@Aaronontheweb it seems so; I'll add Mono runtime autodetection. My specific case is related to the Azure environment, but it looks identical to the Mono issue.

@Aaronontheweb
Member

ah, so there are some Windows environments that have this issue as well? @andreyleskov

andreyleskov added a commit to andreyleskov/akka.net that referenced this issue Aug 20, 2016
added new config setting - enforce-ip-family to overcome socket ipv4 \ ipv6 issue
in Mono and Azure WebApp environments
for mono setting will defaulted to true
added tests for new functionality
added setting description to Remote.conf (+2 squashed commit)
andreyleskov added a commit to andreyleskov/akka.net that referenced this issue Aug 20, 2016
added new config setting - enforce-ip-family to overcome socket ipv4 \ ipv6 issue
in Mono and Azure WebApp environments
for mono setting will defaulted to true
added tests for new functionality
added setting description to Remote.conf (+2 squashed commit)
andreyleskov added a commit to andreyleskov/akka.net that referenced this issue Aug 21, 2016
added new config setting - enforce-ip-family to overcome socket ipv4 \ ipv6 issue
in Mono and Azure WebApp environments
for mono setting will defaulted to true
added tests for new functionality
added setting description to Remote.conf (+2 squashed commit)
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Sep 16, 2016
@Aaronontheweb Aaronontheweb self-assigned this Sep 16, 2016
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Sep 16, 2016
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Sep 17, 2016
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Sep 19, 2016
Aaronontheweb added a commit to Aaronontheweb/akka.net that referenced this issue Sep 19, 2016