Optionally fallback to IPv4 when IPv6 is unreachable #2662

Merged (3 commits) on Feb 16, 2021


Member

@sonalkr132 sonalkr132 commented Feb 28, 2019

Description:

Multiple users have reported that the IPv6 address of rubygems.org is
unreachable[1][2]. We have verified, using online tools[3][4] as well
as all AWS regions with IPv6 support, that there is no misconfiguration
on our part.
A broken IPv6 route is somewhat expected, and hence RFC 8305[5]
recommends starting connections to both the IPv4 and IPv6 addresses in
parallel and using whichever returns first (this is a TL;DR explanation,
check the RFC for details). This patch of TCPSocket.initialize does
something similar: it starts connections to all resolved IPs in parallel
and uses the first successful connection's IP to initialize the
TCPSocket. We use Net::HTTP as our remote fetcher, and it internally
uses TCPSocket.open[6].

Even though support for IPv6 fallback is spotty, I have opened a Ruby
issue for it[7].
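
To make the parallel-connect idea above concrete, here is a minimal Ruby sketch. It is not the code from this patch: `first_reachable_address` is a hypothetical helper, and it assumes one thread per address and `Socket.tcp` with a connect timeout. It only reports which address answered first instead of patching TCPSocket.

```ruby
require "socket"

# Hypothetical helper for illustration only, not the implementation in this PR.
# Tries every address in parallel and returns the first one that accepts a TCP
# connection, or nil if none of them does.
def first_reachable_address(addresses, port, connect_timeout: 2)
  results = Queue.new

  addresses.each do |address|
    Thread.new do
      begin
        # Socket.tcp raises on refusal, unreachable routes, or timeout.
        Socket.tcp(address, port, connect_timeout: connect_timeout).close
        results << address
      rescue StandardError
        results << nil # report the failure so the caller never blocks forever
      end
    end
  end

  # Each thread pushes exactly one result; stop at the first success.
  addresses.size.times do
    winner = results.pop
    return winner if winner
  end
  nil
end
```

A caller could then open the real connection against the returned address. Note that attempts that lose the race keep running in the background until they fail or hit the timeout.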

About tests for this: we would need to simulate a broken IPv6 route. The only way I know of is using ip6tables, and that needs sudo.

Fixes #3641.
Closes #3463.

Tasks:

  • Describe the problem / feature
  • Write tests
  • Write code to solve the problem
  • Get code review from coworkers / friends

I will abide by the code of conduct.

@sonalkr132
Member Author

Bundler tests are failing. How do I run bundler tests in context of my rubygems changes?

@deivid-rodriguez
Member

I think if you switch to the bundler folder, run bin/rake spec:deps, and then bin/rspec <spec_thats_failing>, it should properly detect that bundler is a submodule of the rubygems repo and run the tests using the rubygems from the checked-out repo. If it doesn't, let me know!

@bronzdoc
Member

You need to set the RGV environment variable to the branch of rubygems you want to test against inside the bundler submodule @sonalkr132 and run bin/rspec ...

@deivid-rodriguez
Member

I think it should just work, because bundler's Rakefile already detects the rubygems checkout: https://github.com/bundler/bundler/blob/master/Rakefile#L9-L14. No need to set the RGV variable I think, because that will download rubygems, and in this case it's already checked out.

@bronzdoc
Member

@deivid-rodriguez it didn't for me, i just tried it since i'm testing a patch in bundler against a patch in RG

@sonalkr132 sonalkr132 force-pushed the ipv6-fallback branch 4 times, most recently from bc225e3 to 83f7fb3 Compare March 1, 2019 04:09
@sonalkr132 sonalkr132 force-pushed the ipv6-fallback branch 3 times, most recently from 0b915df to 3a89de5 Compare March 10, 2019 15:39
@sonalkr132
Member Author

Thanks for the help, guys. I have my tests passing now, but this implementation adds 4-6 minutes to the bundler test suite (it has redundant connection checking).
I am trying a slightly different implementation, and for that the tests are failing in the build but not locally. Did you ever face this (some -I... stuff?) or is this a secret feature discouraging people from patching ruby classes? 😅

We can probably also use resolv-replace here (start make_connection for each IP, with resolv-replace controlling DNS), but using that seems more fragile to me than this patch.

@deivid-rodriguez
Member

I'm so sorry I didn't follow up on the bundler tests troubleshooting, I forgot about this!

Happily, @sonalkr132 found his way through and also documented it ❤️.

@deivid-rodriguez
Member

> I am trying a slightly different implementation, and for that the tests are failing in the build but not locally. Did you ever face this (some -I... stuff?) or is this a secret feature discouraging people from patching ruby classes?

Sorry for not following up, @sonalkr132. I'll have a look and see if I understand what's going on 👍

@deivid-rodriguez
Member

@sonalkr132 The tests that are failing in your alternative patch seem to be failing only in CI because... they only run in CI, since they are slow.

@deivid-rodriguez
Member

> @deivid-rodriguez it didn't for me, i just tried it since i'm testing a patch in bundler against a patch in RG

@bronzdoc I looked into this and you're right. I noticed this when reopening #2696 and CI passed (it should've failed when being tested against bundler's master as per the job added in #2695).

The problem is that currently we only test against a remote branch or tag, not against the checked out code of rubygems. So basically when we open any PR to rubygems, bundler's tests are not running against that PR, but against master.

As per the lines of code in bundler's Rakefile that I pointed out earlier: https://github.com/bundler/bundler/blob/master/Rakefile#L9-L14, this seems unintentional, so I'll fix it to make our specs against bundler a proper regression run and not just a sanity check against master.

@deivid-rodriguez
Member

I created rubygems/bundler#7070 to fix this. If it's correct, we can start specifying RGV=.. in rubygems to really test bundler against the proper copy of rubygems.

@sonalkr132 sonalkr132 force-pushed the ipv6-fallback branch 2 times, most recently from e1ecb35 to 3a89de5 Compare August 11, 2019 12:21
@duckinator
Member

hey, what's the status of this? It'd be awesome to have IPv4 fallback in place.

@florrain

Hi @sonalkr132 👋 I second the previous comment, the fallback to ipv4 would be very helpful. Would you have an idea on the next steps here?

@florrain

I'm able to reproduce the IPv6 error using my company VPN that is for some reason not able to resolve rubygems.org. Rubygems.org using IPv4 works though. So I can assist with the testing if that can help getting this out of the door.

@sonalkr132
Member Author

sonalkr132 commented Jul 23, 2020

Hi, sorry about the delay in response here.
The issue with this as of now is that monkey patching TCPSocketExt in rubygems is far from what (or where) the fix should be. A significant number of bundler real-world tests (which are way too slow to run/debug) are failing with this implementation. I was also getting different failures on Travis and my local setup.

Now we are using GitHub CI, and I have been meaning to update this with a slightly less intrusive change (which could pass the test suite); however, I haven't been able to find time for it.

IPv6 not working for some users (particularly in Europe) is also a Fastly issue, where only sites hosted on Fastly are unreachable. We have filed a support ticket about this (so have other users), but there has been no progress so far.
@florrain it would be helpful if you could open a support ticket on help.rubygems.org. I can then send you the list of debugging details I need, which I will forward on our support ticket.

@deivid-rodriguez
Member

deivid-rodriguez commented Aug 31, 2020

Hello @sonalkr132!

Today I spent some time getting an IPv6 tunnel (IPv6 adoption in Spain is not really a thing) so that I can have IPv6, and then breaking it through ip6tables as you suggested. Timeouts on gem install started happening 👍.

Then I rebased & installed this PR and the timeouts were fixed 👍. I also tried some bundle installs and ran the bundler & rubygems specs, and all seems good.

Can you share the realworld bundler stuff that was not playing well with this?

@deivid-rodriguez
Member

deivid-rodriguez commented Feb 9, 2021

ruby/net-http#10 seems a very good idea to me, since it also fixes very noticeable inefficiencies in net/http usage, as reported in https://bugs.ruby-lang.org/issues/16381. The same fix is also proposed separately at ruby/ruby#1806.

Anyways, yeah, I saw the comment that TCPSocket.open will also be fixed. I was just thinking that if those proposals get addressed first, we won't need that. Also, if this gets completely fixed in Ruby land, that makes it easier for us to backport the fixes to rubygems.

@deivid-rodriguez
Member

ruby/net-http#10 was merged by @hsbt, so the currently proposed ruby-core implementation of happy eyeballs (which was also just updated to address feedback) might be enough now.

@denisdefreyne

denisdefreyne commented Feb 16, 2021

My IPv6 connectivity broke again today, which gave me the opportunity to test this out.

Initially:

% cat ~/.gemrc
gem: --no-document⏎

% gem update
ERROR:  While executing gem ... (Gem::RemoteFetcher::UnknownHostError)
    timed out (https://rubygems.org/specs.4.8.gz)

I added :ipv4_fallback_enabled: true:

% cat ~/.gemrc
gem: --no-document
:ipv4_fallback_enabled: true

% gem update
Updating installed gems
Updating aws-partitions
Fetching aws-partitions-1.427.0.gem
Successfully installed aws-partitions-1.427.0
[snip]
Gems updated: aws-partitions aws-sdk-core aws-record [snip]

I removed :ipv4_fallback_enabled: true:

% cat ~/.gemrc
gem: --no-document

% gem update
ERROR:  While executing gem ... (Gem::RemoteFetcher::UnknownHostError)
    timed out (https://rubygems.org/specs.4.8.gz)

So it looks like this works as intended!
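
For readers who want to check the setting programmatically, the sketch below reads the same key from gemrc-style YAML. It is only an illustration: `ipv4_fallback_enabled?` is a hypothetical helper, not RubyGems' own configuration loader, and it assumes the file content is passed in as a string.

```ruby
require "yaml"

# Hypothetical helper for illustration; RubyGems has its own config loader.
# Returns true only when the gemrc text sets :ipv4_fallback_enabled to true.
def ipv4_fallback_enabled?(gemrc_text)
  # Keys written with a leading colon are parsed as Ruby symbols, so Symbol
  # must be permitted explicitly when using safe_load.
  settings = YAML.safe_load(gemrc_text, permitted_classes: [Symbol]) || {}
  settings[:ipv4_fallback_enabled] == true
end
```

With the two `~/.gemrc` variants shown above, the helper returns true for the one containing `:ipv4_fallback_enabled: true` and false for the other.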

@deivid-rodriguez
Member

Nice!!!

@deivid-rodriguez
Member

Should we move this forward then @sonalkr132?

@sonalkr132
Member Author

Yeah, I think so. This is disabled by default, so it can hardly do any harm. We can document it as a workaround for now and eventually migrate to the fix in ruby/ruby#4038, which is much more comprehensive. I am not sure if we would ever enable the current flag by default.

@ddfreyne please do open a ticket on https://support.fastly.com and add support@rubygems.org or my email in cc. Please check whether your connection issue affects only Fastly IPs or any IPv6 address.

wget -6 rubygems.org # fails
wget -6 www.google.com # works
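
The same comparison can be scripted in Ruby when wget is not at hand. This is a rough sketch under stated assumptions: `reachability_by_family` is a hypothetical helper that resolves a host, tries one TCP connection per resolved address, and reports per address family whether any attempt succeeded.

```ruby
require "socket"

# Hypothetical diagnostic helper, not part of RubyGems.
# Returns a hash such as { ipv6: false, ipv4: true }, containing only the
# address families the host actually resolves to.
def reachability_by_family(host, port, connect_timeout: 3)
  Addrinfo.getaddrinfo(host, port, nil, :STREAM).each_with_object({}) do |addrinfo, report|
    family = addrinfo.ipv6? ? :ipv6 : :ipv4
    next if report[family] # one successful connection per family is enough

    begin
      # The socket is closed automatically when the block returns.
      addrinfo.connect(timeout: connect_timeout) { |_socket| true }
      report[family] = true
    rescue StandardError
      report[family] ||= false
    end
  end
end
```

Running it against rubygems.org and against another dual-stack host on port 443 would show whether IPv6 fails everywhere or only for one destination.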

@denisdefreyne

Sadly the IPv6 problem lies entirely on my ISP’s side, not Fastly’s. I get an IPv6 address, which works intermittently: occasionally all IPv6 traffic, whether to Google or Fastly or anything else, gets dropped and thus yields timeouts.

@deivid-rodriguez
Member

Can you rebase it @sonalkr132?

[1] https://help.rubygems.org/discussions/problems/33671-suggestion-for-easy-solution-for-the-ipv6-related-timeout-problem
[2] https://help.rubygems.org/discussions/problems/31074-timeout-error
[3] https://www.ssllabs.com/ssltest/analyze.html?d=rubygems.org
[4] https://ready.chair6.net/?url=rubygems.org
[5] https://tools.ietf.org/html/rfc8305
[6] https://github.com/ruby/ruby/blob/4444025d16ae1a586eee6a0ac9bdd09e33833f3c/lib/net/http.rb#L950
[7] https://bugs.ruby-lang.org/issues/15628
use `export IPV4_FALLBACK_ENABLED=true` to enable
@sonalkr132
Member Author

rebased.

Member

@deivid-rodriguez deivid-rodriguez left a comment

I'm merging this after CI if it looks ok to you @sonalkr132.

@sonalkr132
Member Author

Yes, please go ahead. Should we add a guide/blog post for this? It seems like it could use a page giving context around the issue in Net::HTTP and the error message.

@simi simi merged commit 533a242 into rubygems:master Feb 16, 2021
@sonalkr132 sonalkr132 deleted the ipv6-fallback branch February 16, 2021 16:06
@akostadinov

Which version will this fall into?

@simi
Member

simi commented Feb 16, 2021

@akostadinov most likely 3.2.11

@deivid-rodriguez
Member

Yes, I'll release it tomorrow hopefully. And yes, a guide/blog would be cool, an easy place to link to so we don't need to repeat the same instructions again and again when people run into this.

@deivid-rodriguez deivid-rodriguez changed the title Fallback to IPv4 when IPv6 is unreachable Optionally fallback to IPv4 when IPv6 is unreachable Feb 17, 2021
deivid-rodriguez pushed a commit that referenced this pull request Feb 17, 2021
Fallback to IPv4 when IPv6 is unreachable

(cherry picked from commit 533a242)

@benlangfeld benlangfeld left a comment

Would it not be better to fix net/http to use Socket.connect_nonblock than to spawn all these threads?
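
For context on what the thread-free alternative could look like, here is a minimal sketch using `Socket#connect_nonblock` and `IO.select`. It is an assumption about the general shape of such an approach, not code from net/http or from this PR, and `connect_first_nonblocking` is a hypothetical name.

```ruby
require "socket"

# Hypothetical helper for illustration only.
# Starts a non-blocking connect to every address, then waits with IO.select
# and returns the first connected Socket, or nil if none connects in time.
def connect_first_nonblocking(addresses, port, timeout: 2)
  pending = []
  winner = nil

  addresses.each do |address|
    begin
      addrinfo = Addrinfo.tcp(address, port)
      socket = Socket.new(addrinfo.afamily, Socket::SOCK_STREAM)
      if socket.connect_nonblock(addrinfo, exception: false) == :wait_writable
        pending << socket # connect is in progress
      else
        winner = socket   # connected immediately
        break
      end
    rescue StandardError
      socket&.close       # this address failed right away; try the others
    end
  end

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until winner || pending.empty?
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    break if remaining <= 0

    ready = IO.select(nil, pending, nil, remaining)
    break unless ready # nil means the wait timed out

    ready[1].each do |socket|
      pending.delete(socket)
      # A socket becomes writable on success and on failure; SO_ERROR tells which.
      connected = socket.getsockopt(Socket::SOL_SOCKET, Socket::SO_ERROR).int.zero?
      if winner.nil? && connected
        winner = socket
      else
        socket.close
      end
    end
  end

  winner
ensure
  pending.each(&:close) # drop attempts that lost the race or never finished
end
```

The trade-off against the threaded approach is that all attempts are multiplexed on one thread, at the cost of handling the connect state machine by hand.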

@deivid-rodriguez
Member

It sounds like you want to report an issue/suggestion to https://github.com/ruby/net-http?

@benlangfeld

@deivid-rodriguez I was asking a question.

@deivid-rodriguez
Member

Oh, sorry, I didn't mean to be rude. I don't know the answer to your question. The current implementation seems to be working just fine, and we might not even need to enable this by default at all, since it seems like it will be fixed in ruby.

But your question seems like it could lead to an enhancement to net/http that might have a broader scope than this issue, so I think it's worth reaching out to the net/http maintainers with your idea.
