Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix request.rb for intermittently raised error using TLS 1.3 #2486

Closed
wants to merge 1 commit into from

Conversation

MSP-Greg
Copy link
Contributor

Description:

When running tests with a Ruby built with OpenSSL 1.1.1 or later, one test has intermittently failed both here and in Ruby trunk testing. With trunk, sometimes the failure occurs when testing parallel, but passes on retry. The test is:

TestGemRemoteFetcher#test_do_not_allow_invalid_client_cert_auth_connection

An error should be raised in the test, but this one starts with OpenSSL, passes thru net/http, and should be caught in request.rb. Since it seems to only occur with OpenSSL 1.1.1/TLSv1_3, code somewhat accounts for that...

Two examples of test failure:
https://ci.appveyor.com/project/rubygems/rubygems/builds/20388929/job/lk27qkfajexyciuj#L30
https://ci.appveyor.com/project/MSP-Greg/ruby-loco/builds/20465475#L2933

Tasks:

  • Describe the problem / feature
  • Write tests
  • Write code to solve the problem
  • Get code review from coworkers / friends

I will abide by the code of conduct.

@MSP-Greg
Copy link
Contributor Author

As of today 2018-11-21, the only 'standard' rubies available that are using OpenSSL 1.1.1 are windows Ruby builds 2.5.3 or greater. The current Travis rubies (both Ubuntu & OSX/MacOS) are using earlier versions.

OpenSSL 1.1.1 is the first version with TLSv1_3, and it operates a little differently than TLSv1_2. Some of the differences are only an issue with CI where a server and a client are both being created for testing...

@bronzdoc
Copy link
Member

@MSP-Greg can you add a test? thanks!

@MSP-Greg
Copy link
Contributor Author

See RG issue 2388.

@bronzdoc

can you add a test? thanks!

If I could consistently repo the error, yes. Since I can't, no.

As mentioned in the first msg, while using a new version of OpenSSL/TLS, this test has intermittently failed/errored due to a particular OpenSSL error being raised and not caught.

But, an error should be raised. I could remove the code from lib/rubygems/request.rb and add the error the code in test_gem_remote_fetcher.rb?

@bronzdoc
Copy link
Member

bronzdoc commented Nov 27, 2018

Maybe we can mock the error with Object.send(:remove_const, :TLS1_3_VERSION) ?

@MSP-Greg
Copy link
Contributor Author

MSP-Greg commented Nov 27, 2018

@bronzdoc

Maybe we can mock the error with Object.send(:remove_const, :TLS1_3_VERSION) ?

The issue occurs when TLS v1.3 exists. On Windows, I don't believe I've ever repo'd it locally. Appveyor fails occasionally on it (both testing rubygems/rubygems & ruby/ruby).

Note that in ruby/openssl issue 227, it's stated that there are systems where the error being raised is not intermittent. Also note the statement "TLS 1.3, which is new in OpenSSL 1.1.1, handles a client certificate differently from TLS 1.2."

This is messy, as until more repos test with more OS's and OpenSSL 1.1.1 configurations, I don't think it's well defined. Importantly, this issue here is about OpenSSL / net/http / RubyGems correctly sensing a client security error, and stopping it. The problem is an error is raised by OpenSSL that isn't allowed for or caught.

I was an early adopter of 1.1.1, and was building & testing prereleases. I used it with ruby, puma, & eventmachine, and was involved in updating them & MSYS2 to use it. By no means am I an OpenSSL expert, but I've seen what adjustments have been needed with TLS 1.3.

Long story short, I doubt this is the only change that will be needed with TLS 1.3, but it gets the CI to be consistent. As mentioned earlier, the test code could be changed, rather than request.rb.

@MSP-Greg
Copy link
Contributor Author

@bronzdoc

Was looking to see about other trunk CI testing (make install before make test?), and came across the following:

http://rubyci.s3.amazonaws.com/ubuntu/ruby-trunk/log/20181129T003002Z.log.html.gz#rubyspec

Scroll up a bit, and you'll see this error on an Ubuntu 18.10 build...

rescue (defined?(OpenSSL::SSL) ? OpenSSL::SSL::SSLError : nil) => e
# As of 2018-11, error is intermittently raised when using TLS 1.3 with
# improper client authentication
if OpenSSL::SSL.const_defined? :TLS1_3_VERSION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be checked earlier in

rescue defined?(OpenSSL::SSL) ? OpenSSL::SSL::SSLError : Errno::EHOSTDOWN,
since we already checking if OpenSSL::SSL is defined there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe all of the caller listings show hitting RG first at request.rb:222. The rescue in 240 catches errors in line 222.

@@ -237,6 +237,14 @@ def perform_request(request) # :nodoc:
verbose "fatal error"

raise Gem::RemoteFetcher::FetchError.new('fatal error', @uri)
rescue (defined?(OpenSSL::SSL) ? OpenSSL::SSL::SSLError : nil) => e
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we rescuing for nil? if OpenSSL::SSL is not defined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I, possibly incorrectly, assumed that rescue was a form of is_a?. It's been a few days since I did the PR, but I think I locally tested to see if it was catching other errors.

For the test to pass, it has to receive a Gem::RemoteFetcher::FetchError error, and if the CI passed for Ruby versions without TLS v1.3, nil is not catching other errors.

@bronzdoc
Copy link
Member

Thanks for the explanation @MSP-Greg i think i understand better now, i left some comments

@MSP-Greg
Copy link
Contributor Author

@bronzdoc You're welcome, and thanks for the thanks.

@MSP-Greg
Copy link
Contributor Author

MSP-Greg commented Dec 3, 2018

Intermittent test issues are difficult to test. @hsbt did a commit that supersedes this, see ruby/ruby r66163 (GitHub edbac1b).

Closing.

@MSP-Greg MSP-Greg closed this Dec 3, 2018
ghost pushed a commit that referenced this pull request Dec 4, 2018
2507: Fixed test fails with the newer version of OpenSSL r=hsbt a=hsbt

# Description:

This backport fixes #2486 .

I already fixed at ruby/ruby@edbac1b
______________

# Tasks:

- [ ] Describe the problem / feature
- [ ] Write tests
- [ ] Write code to solve the problem
- [ ] Get code review from coworkers / friends

I will abide by the [code of conduct](https://github.com/rubygems/rubygems/blob/master/CODE_OF_CONDUCT.md).


Co-authored-by: hsbt <hsbt@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
@MSP-Greg MSP-Greg deleted the request.rb_tlsv1_3_error branch May 30, 2024 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants