Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let Net::HTTP decompress the index instead of doing it manually #4081

Merged
merged 2 commits into from
Jan 7, 2021

Conversation

eregon
Copy link
Contributor

@eregon eregon commented Nov 23, 2020

As a result, bundler no longer uses stringio, so the require for it can be removed avoid potential issues with the stringio default gem. So this PR also closes #3860.

Make sure the following tasks are checked

@eregon
Copy link
Contributor Author

eregon commented Nov 23, 2020

Looks like there is a strange bug in RuboCop, I didn't touch those lines: https://github.com/rubygems/rubygems/pull/4081/checks?check_run_id=1442980333

@eregon
Copy link
Contributor Author

eregon commented Dec 4, 2020

Could someone review this PR?

Copy link
Member

@deivid-rodriguez deivid-rodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @eregon. I checked the net/http code and I think I get it.

If "Accept-Encoding" is set, like we're doing, net/http sets a flag to avoid handling decoding of the response (@decode_content = false):

https://github.com/ruby/net-http/blob/71ba18910eab3876b153cf8e4f1cf1045ed69501/lib/net/http/generic_request.rb#L73

If it's not set, net/http sets it to gzip and sets the flag to handle it internally (@decode_content = true):

https://github.com/ruby/net-http/blob/71ba18910eab3876b153cf8e4f1cf1045ed69501/lib/net/http/generic_request.rb#L34-L44

Then when it comes to handling the response, it skips decompression if the @decode_content flag is set, and also removes the Content-Encoding header from the response:

https://github.com/ruby/net-http/blob/71ba18910eab3876b153cf8e4f1cf1045ed69501/lib/net/http/response.rb#L257-L287

So in that case, the block below where we handle it ourselves will be skipped since Content-Encoding won't be set.

What I'm thinking is, since net/http also handles availability of zlib automatically, could we skip setting the Accept-Encoding header altogether and also our handling of the response at

if response["Content-Encoding"] == "gzip"
content = Zlib::GzipReader.new(StringIO.new(content)).read
end

?

@eregon
Copy link
Contributor Author

eregon commented Dec 11, 2020

I think we could do that, except maybe for Net::HTTP::HAVE_ZLIB == false cases.
I'm not sure if that happens in practice, but it seems possible if e.g. the zlib C extension was not compiled.
OTOH if zlib was not compiled, it would already not work because there is require "zlib" at the top of this file, so maybe we don't care about that case and raise an exception if it's false?

And anyway, if zlib is not available, Net::HTTP would not set Accept-Encoding: gzip, so it seems we can safely remove anything related to zlib in this file, and let Net::HTTP do the right thing.
The only concern seems then to be if some older net/http doesn't properly handle that gzip stuff, I'll check it for Ruby 2.3.

@deivid-rodriguez
Copy link
Member

I think we just run into this issue after upgrading to truffleruby 20.3: https://github.com/rubygems/rubygems/runs/1658684625?

I rebased this and removed zlib usages.

I could probably have gone all the way and also remove the require and the rescue Zlib::GzipFile. I think the require can go too, and regarding the rescue, it seems useful to avoid some errors: rubygems/bundler#6261. But if it's having an effect, it's probably hiding server side issues, so we might want to add a warning a leave it there for a while to make sure nobody complains before we removing it 🤷‍♂️. Anyways, for another time. I'll check CI here with the current changs for now.

@eregon
Copy link
Contributor Author

eregon commented Jan 7, 2021

Looks good to me. Yes, that should help with that error.

We're also working on fixing this in general in TruffleRuby, but that will need to wait the 21.2 release probably.
So it'd be nice to have this fix as a Bundler release, so that it's also applied when users install the latest Bundler on TruffleRuby.

@eregon
Copy link
Contributor Author

eregon commented Jan 7, 2021

And thanks for rebasing and improving this PR! (I had it as a TODO but I have too many TODOs right now.)

* Setting the Accept-Encoding header actually disables the automatic
  decompression as documented in Net::HTTP.
  https://github.com/ruby/ruby/blob/79a8ed07650dcbb36ec4b49a22596275e6c0fe23/lib/net/http/response.rb#L105-L107
  https://github.com/ruby/ruby/blob/c5eb24349a4535948514fe765c3ddb0628d81004/lib/net/http/request.rb#L11-L12
* This is also more memory efficient as Net::HTTP decompresses chunk
  by chunk instead of having the entire compressed String in memory
  and also wrapped in a StringIO which is particularly inefficient.
* See oracle/truffleruby#2127 (comment)
  for more details.
@deivid-rodriguez
Copy link
Member

I think stringio is no longer used, so we can safely remove that require too, and this PR would also close #3860.

@eregon
Copy link
Contributor Author

eregon commented Jan 7, 2021

Even better 👍

@deivid-rodriguez
Copy link
Member

Thanks @eregon!

@deivid-rodriguez deivid-rodriguez merged commit a5e96ef into rubygems:master Jan 7, 2021
@deivid-rodriguez deivid-rodriguez mentioned this pull request Jan 7, 2021
4 tasks
deivid-rodriguez added a commit that referenced this pull request Jan 11, 2021
Let Net::HTTP decompress the index instead of doing it manually

(cherry picked from commit a5e96ef)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fails with stringio gem
3 participants