Issue connecting to dual-stack instances #3762

Closed
mherrb opened this Issue Jun 15, 2017 · 16 comments

8 participants

mherrb commented Jun 15, 2017

It looks like when a Mastodon instance such as mastodon.rocks publishes an IPv6 address (via a DNS AAAA record) but isn't reachable on it, other IPv6-enabled instances can't communicate with it, even though the legacy IPv4 protocol still works on both.

The IPv6 connection times out and doesn't fall back to IPv4.


  • I searched or browsed the repo’s other issues to ensure this is not a duplicate.
  • This bug happens on a tagged release and not on master (If you're a user, don't worry about this).

Shnoulle commented Jun 15, 2017

But: it's an instance issue, not a Mastodon issue.

Member

Gargron commented Jun 15, 2017

Mastodon doesn't have any IPv4- vs IPv6-specific logic in the code, so I don't know what we can do here.

mherrb commented Jun 15, 2017

I don't know how Rails handles this, so I can't really help, but it's either something to set up when configuring the Rails application or an issue common to all Rails applications. We need some advice from a developer with more experience with Rails than me.

mherrb commented Jun 15, 2017

But: it's an instance issue, not a Mastodon issue.

Well, yes and no.

Yes, it's an instance issue: it shouldn't advertise an IP address that isn't reachable.
No, not only that. In the dual-stack transition mechanism to IPv6, it's explicitly mentioned (for instance in RFC 6555) that an application should fall back to IPv4 if the IPv6 connection fails. So in this case Mastodon (or maybe Rails, see my previous comment) isn't compliant.

Shnoulle commented Jun 15, 2017

I have nearly the same issue with another Mastodon instance when connecting from GNU social. So does PHP have the same issue?

mherrb commented Jun 15, 2017

I'm a C language guy. In C, to connect to a server, you first call getaddrinfo(), which returns a list of addresses (from different address families) for the server. Then it's the responsibility of the application to try each address in the list until either one succeeds or all fail. This introduces quite large delays when one of the addresses isn't responding. RFC 6555 discusses strategies to avoid having to wait for the initial TCP handshake timeout before trying the next address, in order to improve the user experience.

I can't tell whether Rails or PHP are failing because they don't have code to try the IPv4 connection at all, or because a global timeout on the execution of the request makes it fail before it has had a chance to try it.
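
As a rough illustration of the loop described above, here is a minimal sketch in Python rather than C. It is not code from Mastodon, Rails, or http.rb; `connect_with_fallback`, its `per_attempt_timeout` parameter, and the injectable `resolve` argument are hypothetical names chosen for the example. It walks the getaddrinfo() results in order and gives each address its own timeout, so an unresponsive first address delays the connection but does not prevent a later one from being tried.

```python
import socket


def connect_with_fallback(host, port, per_attempt_timeout=3.0,
                          resolve=socket.getaddrinfo):
    """Try every address returned for host until one connects.

    `resolve` is injectable only so the example can be exercised
    without real DNS; by default it is socket.getaddrinfo.
    """
    last_error = None
    # Each entry is (family, type, proto, canonname, sockaddr); the list
    # may mix AF_INET6 and AF_INET addresses.
    for family, socktype, proto, _canon, sockaddr in resolve(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            # The timeout applies to this attempt only, not to the whole list.
            sock.settimeout(per_attempt_timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:  # includes timeouts and refused connections
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError(f"no addresses found for {host!r}")
```

Python's standard `socket.create_connection()` runs essentially the same loop. The worst case still costs one timeout per dead address, which is the delay RFC 6555 tries to reduce.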

zorun commented Jun 15, 2017

@mherrb which part of the federation protocol is failing?

If it's the subscription, it's handled by OStatus: https://github.com/tootsuite/mastodon/blob/master/app/models/account.rb#L135

It ends up calling this function in OStatus: https://github.com/tootsuite/ostatus2/blob/master/lib/ostatus2/subscription.rb#L52
Notice the call to HTTP.timeout; maybe this is why the HTTP lib has no time to try a second address?

Shnoulle commented Jun 15, 2017

Yep, maybe a timeout here. In GS:

FeedSubBadURLException: Unable to connect to ssl://mastodon.rocks:443. Error: Connexion terminée par expiration du délai d'attente

"expiration du délai d'attente" means a timeout ;)

mherrb commented Jun 15, 2017

@zorun yes, that's probably it. One of the original issues reported to me was that it's impossible to follow someone from mastodon.rocks, so this would be the subscription code, I guess. It's also impossible to find people using the search box or to mention them while writing a toot. That's probably another service.

Collaborator

nightpool commented Jun 28, 2017

Been having some pretty bad experiences with this today. Tooot.im is another server where this is an issue.

looking into whether http.rb gives us the ability to configure this.

Member

Gargron commented Jul 2, 2017

Please consider that a bigger timeout is not the greatest solution. An instance has a lot to do - delivering payloads, fetching data. If a single connection hogs a thread for a long time, that has an awful impact on overall throughput.

Collaborator

nightpool commented Jul 2, 2017

yeah, absolutely. the original title of my issue was "prefer ipv4 addresses to ipv6 addresses".

it looks like the only place to fix this is the system resolver though.

pbeyssac commented Jul 17, 2017

Hello,

My 2 bits.

I encounter the same problem on my instance (mast.eu.org) with some other instances that advertise an IPv6 address and are not reachable due to network issues between me and them (my provider, probably). It is not a problem that either of us can fix.

IPv4 still works in most cases so a v4 fallback would fix the issue.

It is not a resolver issue: the resolver resolves what the client software requests. The operating system may set some preferences, but the last-resort choice is in the client software (socket-calling code). See getaddrinfo(3) for example.
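
To illustrate where that choice sits, here is a small Python sketch (an example under stated assumptions, not Mastodon code) that reorders getaddrinfo() results in the calling code so that IPv4 entries are tried first. `prefer_ipv4` is a hypothetical helper name, and the addresses come from the documentation ranges.

```python
import socket


def prefer_ipv4(addrinfos):
    """Return getaddrinfo-style tuples with AF_INET entries first.

    sorted() is stable, so the resolver's order is kept within each family.
    """
    return sorted(addrinfos, key=lambda info: info[0] != socket.AF_INET)


example = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 443, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 443)),
]
print([info[4][0] for info in prefer_ipv4(example)])
# prints ['192.0.2.1', '2001:db8::1']
```

The client still has to try the remaining addresses if the preferred one fails; reordering only changes which one is attempted first.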

bortzmeyer commented Jul 17, 2017

And it is not an IPv6 issue. The same problem would occur when the resolver returns a set of IPv4 addresses, some working and some not. The Mastodon client must try all the addresses until one succeeds, either in sequence or, as @mherrb suggested, by using the more clever algorithm of RFC 6555.

All good HTTP clients do the same; try wget for an example.

bortzmeyer commented Jul 17, 2017

Side point: https://mastodon.rocks/ now works fine over IPv6. But, as I said, it is not an IPv6 issue, and Mastodon should still be fixed to try other IP addresses if the first one fails.

bortzmeyer commented Dec 21, 2017

Note that there is a standard algorithm for dealing with several IP addresses (IPv4 or IPv6) when some are unresponsive, in RFC 8305: https://www.rfc-editor.org/info/rfc8305
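
One piece of that algorithm, interleaving the address families so that a dead family cannot monopolize the first attempts, can be sketched in Python as follows. This is a simplified illustration and not Mastodon or http.rb code: `interleave_by_family` is a hypothetical helper, and it leaves out the other parts of RFC 8305, such as starting the next attempt after a short delay (the RFC recommends 250 ms) instead of waiting for a full timeout.

```python
import socket
from itertools import zip_longest


def interleave_by_family(addrinfos):
    """Reorder getaddrinfo-style tuples so address families alternate.

    The family of the first entry keeps priority, and the relative
    order inside each family is preserved.
    """
    by_family = {}
    for info in addrinfos:
        by_family.setdefault(info[0], []).append(info)
    # dicts preserve insertion order, so the first-seen family leads.
    ordered = []
    for group in zip_longest(*by_family.values()):
        ordered.extend(info for info in group if info is not None)
    return ordered


candidates = [
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::1", 443, 0, 0)),
    (socket.AF_INET6, socket.SOCK_STREAM, 6, "", ("2001:db8::2", 443, 0, 0)),
    (socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", 443)),
]
print([info[4][0] for info in interleave_by_family(candidates)])
# prints ['2001:db8::1', '192.0.2.1', '2001:db8::2']
```

With this ordering, the IPv4 address is the second candidate rather than the last, so a sequential or staggered client reaches it after a single failed IPv6 attempt.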

Gargron closed this Jul 14, 2018
