Use IO.copy_stream when reading, writing #6958

martinemde · 2023-09-14T04:24:27Z

What was the end-user or developer problem that led to this PR?

The aim is to make reading and writing more efficient. I'm not sure this is actually faster, but I believe efficiency is the purpose of IO.copy_stream.

What is your fix for the problem, implemented in this PR?

Use IO.copy_stream when writing gems to files from the gem archive.
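
For reference, a minimal sketch of the pattern (not the actual diff). The StringIO source is a hypothetical stand-in for an entry read from the gem archive, and the variable names are made up for illustration:

```ruby
require "stringio"
require "tempfile"

# Hypothetical stand-in for a tar entry: any object that responds to #read.
payload = "binary\x00payload".b * 1024
entry = StringIO.new(payload)

# Tempfile.create returns the value of the block.
copied, written = Tempfile.create("copy_stream_demo") do |dst|
  dst.binmode
  # IO.copy_stream returns the number of bytes it copied.
  n = IO.copy_stream(entry, dst)
  dst.flush
  [n, File.binread(dst.path)]
end

puts "copied #{copied} bytes, round-trip ok: #{written == payload}"
```

The point of the change is that the caller no longer manages the read/write loop itself and leaves buffering to the runtime.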

Make sure the following tasks are checked

Describe the problem / feature
Write tests for features and bug fixes
Write code to solve the problem
Make sure you follow the current code style and write meaningful commit messages without tags

segiddins

I think that's fundamentally what IO.copy_stream is doing under the hood. Agreed, it's what we should be doing.

martinemde · 2023-09-15T06:27:04Z

Strangely, truffleruby fails every time, but only on this branch. Is there an IO.copy_stream discrepancy? If I'm not mistaken, it fails because a binary file from a gem (a file that would pass through this change) produces a different checksum. The other rubies are presumably checking this same checksum.

segiddins · 2023-12-10T21:58:06Z

lib/rubygems/package.rb

@@ -712,6 +712,16 @@ def verify_gz(entry) # :nodoc:
  rescue Zlib::GzipFile::Error => e
    raise Gem::Package::FormatError.new(e.message, entry.full_name)
  end
+
+  if RUBY_ENGINE == "truffleruby"


Add a Ruby engine version check as well?

Is there a fixed version? I haven't heard anything. A ticket was mentioned, but it had to do with the return value of copy_stream. I don't think this is caused by the return value.

oracle/truffleruby#3280 (comment) -- 23.1.2, to be released next month

Oh, that makes sense. If it leaves our tar reader pos at the beginning, we can't possibly read the files right.
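
To illustrate why the source position matters, here is a small sketch using StringIO in place of the tar reader (an assumption for illustration; the real reader is a different class). On CRuby the source ends up at the end of the data that was copied, which is what a sequential reader would rely on to find whatever comes next:

```ruby
require "stringio"

src = StringIO.new("a" * 100_000)
dst = StringIO.new

IO.copy_stream(src, dst)

# After the copy, the source has been consumed: its position has advanced
# past everything that was written to dst.
puts "src.pos  = #{src.pos}"
puts "src.eof? = #{src.eof?}"
puts "dst size = #{dst.string.bytesize}"
```

If an implementation copied the bytes but left `src.pos` where it started, any later read from the same source would see the same data again.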

segiddins · 2023-12-10T22:02:10Z

lib/rubygems/package.rb

+
+  if RUBY_ENGINE == "truffleruby"
+    def copy_stream(src, dst) # :nodoc:
+      dst.write src.read 16_384 until src.eof?


If tests keep failing, maybe change this to dst.write src.read?

We have to assume it's the second change that's failing, reading from the tar.gz. This is the original code for the first change.

martinemde

Maybe full read then write is required when reading from tgz in truffleruby.
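
The two fallbacks being weighed here can be sketched side by side. The helper names below are hypothetical and not part of lib/rubygems/package.rb; the 16_384 chunk size is the one from the diff above:

```ruby
require "stringio"

# Chunked fallback: read and write fixed-size pieces until the source is drained.
def copy_in_chunks(src, dst, chunk_size = 16_384)
  total = 0
  total += dst.write(src.read(chunk_size)) until src.eof?
  total
end

# Full-read fallback: pull the whole source into memory, then write it once.
def copy_all_at_once(src, dst)
  dst.write(src.read)
end

data = "x" * 50_000
chunked = StringIO.new
whole = StringIO.new

n_chunked = copy_in_chunks(StringIO.new(data), chunked)
n_whole = copy_all_at_once(StringIO.new(data), whole)

puts "chunked: #{n_chunked} bytes, whole: #{n_whole} bytes"
puts "same output: #{chunked.string == whole.string}"
```

The full read trades memory for simplicity: it holds the entire entry in one string, which is the cost of sidestepping any chunking quirk in the source.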

lib/rubygems/package.rb

martinemde · 2023-12-12T06:24:16Z

If this doesn't work I'll just make one change at a time.

The befuddlement-to-code ratio of this change is surprisingly high.

segiddins · 2023-12-18T02:17:44Z

@martinemde I think this is ready to merge?

martinemde · 2023-12-18T02:22:32Z

I'll merge now. Once the fixed truffleruby is released and we know it works, we can follow up with an engine version check.
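
A possible shape for that follow-up, as a sketch only. It assumes 23.1.2 (the version mentioned in the review thread) turns out to be the first fixed release; the constant and method names are hypothetical:

```ruby
require "rubygems"
require "stringio"

# Assumption: first truffleruby release where IO.copy_stream behaves as needed.
FIXED_TRUFFLERUBY_VERSION = Gem::Version.new("23.1.2")

# The version is only parsed when the engine is truffleruby, thanks to &&.
def needs_copy_stream_workaround?(engine = RUBY_ENGINE, engine_version = RUBY_ENGINE_VERSION)
  engine == "truffleruby" &&
    Gem::Version.new(engine_version) < FIXED_TRUFFLERUBY_VERSION
end

def copy_stream(src, dst)
  if needs_copy_stream_workaround?
    dst.write(src.read) # full read then write, as tried in this PR
  else
    IO.copy_stream(src, dst)
  end
end

out = StringIO.new
copy_stream(StringIO.new("hello"), out)
puts out.string
```

Taking the engine and version as parameters keeps the check testable on any Ruby, without needing a truffleruby install.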

Use IO.copy_stream when reading, writing (cherry picked from commit 558f516)

martinemde force-pushed the martinemde/io-copy-stream branch from 77ae281 to 846dd72 · September 14, 2023 16:17

segiddins approved these changes Sep 14, 2023

martinemde changed the title ~~Use IO.copy_stream when reading, then writing~~ Use IO.copy_stream when reading, writing Sep 14, 2023

martinemde force-pushed the martinemde/io-copy-stream branch from c047809 to 2a7d462 · October 7, 2023 16:06

simi approved these changes Oct 7, 2023

martinemde added the status: blocked / backlog label Oct 12, 2023

segiddins force-pushed the martinemde/io-copy-stream branch from 2a7d462 to f94a88a · November 27, 2023 04:15

segiddins mentioned this pull request Dec 8, 2023

Fewer allocations in gem installation #6975

Merged

4 tasks

Use IO.copy_stream when reading, then writing

5ae7af7

martinemde force-pushed the martinemde/io-copy-stream branch from f94a88a to 5ae7af7 · December 10, 2023 00:36

Compensate for truffleruby IO.copy_stream

52d7719

segiddins approved these changes Dec 11, 2023

martinemde commented Dec 12, 2023

lib/rubygems/package.rb (outdated)

try full read then write.

cea0fbb

segiddins approved these changes Dec 12, 2023

segiddins removed the status: blocked / backlog label Dec 18, 2023

martinemde merged commit 558f516 into master Dec 18, 2023
72 checks passed

martinemde deleted the martinemde/io-copy-stream branch December 18, 2023 02:22

deivid-rodriguez added the rubygems: performance label Dec 21, 2023

deivid-rodriguez pushed a commit that referenced this pull request Dec 21, 2023

Merge pull request #6958 from rubygems/martinemde/io-copy-stream

8f5ad64

Use IO.copy_stream when reading, writing (cherry picked from commit 558f516)
