Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential performance regression with Ruby 3.3 #336

Closed
tagliala opened this issue Dec 24, 2023 · 14 comments
Closed

Potential performance regression with Ruby 3.3 #336

tagliala opened this issue Dec 24, 2023 · 14 comments
Labels
known bug Maybe YOU are the one who wants to fix it?

Comments

@tagliala
Copy link
Contributor

tagliala commented Dec 24, 2023

Ruby 3.3.0(x64) Memory

Also tested against ruby 3.4.0dev (2023-12-30T03:14:38Z master 8e32c01742) [arm64-darwin23]

$ ./test/profile_memory_rep.rb 
Total allocated: 72562889 bytes (723858 objects)
Total retained:  871400 bytes (10173 objects)

allocated memory by gem
-----------------------------------
  72431203  caxlsx/lib
     74310  htmlentities-4.3.4
     39200  fileutils
      5065  open3
      4598  shellwords
      3473  tmpdir
      2720  other
      2160  rubyzip-2.3.2
       160  set

allocated memory by file
-----------------------------------
  52801024  /Users/geremia/dev/caxlsx/lib/axlsx/workbook/worksheet/cell.rb

Ruby 3.2.2 (x64) Memory

$ ./test/profile_memory_rep.rb 
Total allocated: 49296532 bytes (503449 objects)
Total retained:  871353 bytes (10175 objects)

allocated memory by gem
-----------------------------------
  49182491  caxlsx/lib
     44870  htmlentities-4.3.4
     36016  fileutils
     22442  other
      4689  open3
      2577  tmpdir
      2327  shellwords
      1080  rubyzip-2.3.2
        40  set

allocated memory by file
-----------------------------------
  32000768  /Users/geremia/dev/caxlsx/lib/axlsx/workbook/worksheet/cell.rb

1.47x memory

Also, there seems to be a performance degradation in IPS

Ruby 3.3.0 (x64) IPS

                                     user     system      total        real
axlsx_merged_cells               1.451173   0.002135   1.453308 (  1.454905)
axlsx_noautowidth                0.455097   0.001215   0.456312 (  0.456981)
axlsx_autowidth                  0.454843   0.001212   0.456055 (  0.456636)
axlsx_shared                     0.466123   0.001508   0.467631 (  0.468300)
axlsx_stream                     0.455901   0.000726   0.456627 (  0.457245)
axlsx_zip_command                0.446355   0.013546   0.522661 (  0.526744)
csv                              0.067128   0.002678   0.069806 (  0.069974)

Ruby 3.2.2 (x64) IPS

                                     user     system      total        real
axlsx_merged_cells               1.452900   0.002088   1.454988 (  1.458400)
axlsx_noautowidth                0.445243   0.001518   0.446761 (  0.447651)
axlsx_autowidth                  0.443028   0.001141   0.444169 (  0.445178)
axlsx_shared                     0.443346   0.000950   0.444296 (  0.445360)
axlsx_stream                     0.443291   0.001001   0.444292 (  0.445203)
axlsx_zip_command                0.429552   0.011412   0.503254 (  0.510625)
csv                              0.062434   0.003259   0.065693 (  0.065968)

1.03x slower

Ref: ruby/ruby#8123

@tagliala
Copy link
Contributor Author

tagliala commented Dec 24, 2023

Cannot run profile because of a segmentation fault

@numFmts = SimpleTypedList.new NumFmt, 'numFmts'

$ ./test/profile_rep.rb 
./lib/axlsx/stylesheet/styles.rb:503: [BUG] Segmentation fault at 0x0000000000000004
ruby 3.3.0rc1 (2023-12-11 master a49643340e) [x86_64-darwin23]

-- Control frame information -----------------------------------------------
c:0010 p:0008 s:0048 e:000044 METHOD ./lib/axlsx/stylesheet/styles.rb:503
c:0009 p:0003 s:0040 e:000039 METHOD ./lib/axlsx/stylesheet/styles.rb:121 [FINISH]
c:0008 p:---- s:0036 e:000035 CFUNC  :new
c:0007 p:0008 s:0032 e:000031 METHOD ./lib/axlsx/workbook/workbook.rb:232 [FINISH]
c:0006 p:---- s:0027 e:000026 CFUNC  :new
c:0005 p:0009 s:0023 e:000022 METHOD ./lib/axlsx/package.rb:66
c:0004 p:0010 s:0019 e:000018 BLOCK  ./test/profile_rep.rb:16 [FINISH]
c:0003 p:---- s:0015 e:000014 CFUNC  :profile
c:0002 p:0080 s:0011 E:000f08 EVAL   ./test/profile_rep.rb:14 [FINISH]
c:0001 p:0000 s:0003 E:0001f0 DUMMY  [FINISH]

Ref: ruby-prof/ruby-prof#326

@kiskoza
Copy link
Member

kiskoza commented Dec 30, 2023

I've found a significant difference in how MatchData is handled in

elsif v.respond_to?(:to_f) && (Axlsx::SAFE_FLOAT_REGEX.match?(v.to_s) || ((matchdata = MAYBE_FLOAT_REGEX.match(v.to_s)) && matchdata[:exp].to_i.between?(Float::MIN_10_EXP, Float::MAX_10_EXP)))

Minimal code to reproduce the difference:

require 'memory_profiler'

MAYBE_FLOAT_REGEX = /\A[-+]?[0-9]*\.?[0-9]+[eE](?<exp>[-+]?[0-9]{3})\Z/.freeze
row = Array.new(1000) { (rand / 10_000).to_s }

report = MemoryProfiler.report do
  row.each { |v| matchdata = MAYBE_FLOAT_REGEX.match(v.to_s) }
end

report.pretty_print
Ruby 3.2.2
Total allocated: 192 bytes (1 objects)
Total retained:  0 bytes (0 objects)

allocated memory by gem
-----------------------------------
       192  other

allocated memory by file
-----------------------------------
       192  matchdata.rb
Ruby 3.3.0
Total allocated: 104256 bytes (1001 objects)
Total retained:  0 bytes (0 objects)

allocated memory by gem
-----------------------------------
    104256  other

allocated memory by file
-----------------------------------
    104256  matchdata.rb

Could be other places as well, but it seems to be a nice spot to start the investigation

@tagliala
Copy link
Contributor Author

tagliala commented Dec 30, 2023

Thanks.

I'm able to run profiler against this reduced test case. Bringing it to the extreme, it happens on 3.3 when there is not a match

No match

require 'memory_profiler'

report = MemoryProfiler.report do
  /\A[A-Z]+\Z/.match('1')
end

report.pretty_print

Ruby 3.2:

Total allocated: 0 bytes (0 objects)
Total retained:  0 bytes (0 objects)

Ruby 3.3:

Total allocated: 104 bytes (1 objects)
Total retained:  0 bytes (0 objects)

Match (A instead of 1)

Ruby 3.2:

Total allocated: 168 bytes (1 objects)
Total retained:  168 bytes (1 objects)

Ruby 3.3:

Total allocated: 168 bytes (1 objects)
Total retained:  168 bytes (1 objects)

Should we bisect Ruby?

@kiskoza
Copy link
Member

kiskoza commented Dec 30, 2023

I did the bisect, here are the results: ruby/ruby@7193b40 seems to be the culprit.

$> git bisect run fish -c "rm -rf $HOME/.rubies/ruby-bisect/ && cd build && ../configure --prefix=$HOME/.rubies/ruby-bisect --disable-install-doc && make install && cd $HOME/.rubies/ruby-bisect/bin && ./gem install memory_profiler && ./ruby ~/Codes/ruby/matchdata.rb"

7193b404a1a56e50f8046d0382914907020c1559 is the first bad commit
commit 7193b404a1a56e50f8046d0382914907020c1559
Author: Peter Zhu <peter@peterzhu.ca>
Date:   Wed Jul 26 15:57:03 2023 -0400

    Add function rb_reg_onig_match
    
    rb_reg_onig_match performs preparation, error handling, and cleanup for
    matching a regex against a string. This reduces repetitive code and
    removes the need for StringScanner to access internal data of regex.

 ext/strscan/extconf.rb |   1 +
 ext/strscan/strscan.c  | 107 ++++++++++++++--------
 include/ruby/re.h      |  18 ++--
 re.c                   | 242 ++++++++++++++++++++-----------------------------
 regexec.c              |   8 +-
 5 files changed, 185 insertions(+), 191 deletions(-)

where the script is almost the same as you posted before:

matchdata.rb file
require 'memory_profiler'

v = '1'
report = MemoryProfiler.report { /\A[A-Z]+\Z/.match(v) }

exit report.total_allocated_memsize
git bisect log
$> git bisect log

git bisect start
# good: [e51014f9c05aa65cbf203442d37fef7c12390015] v3.2.2p53
git bisect good e51014f9c05aa65cbf203442d37fef7c12390015
# bad: [5124f9ac7513eb590c37717337c430cb93caa151] v3.3.0
git bisect bad 5124f9ac7513eb590c37717337c430cb93caa151
# good: [07c19cf551e58e1b82af5efeb0047f10588fff9f] Fix a grammar [ci skip]
git bisect good 07c19cf551e58e1b82af5efeb0047f10588fff9f
# bad: [cdb757f817b0721d79d43ae8de6f8198d6033177] Avoid synchronizing GitHub Actions from YARP
git bisect bad cdb757f817b0721d79d43ae8de6f8198d6033177
# good: [94e41822679ebd269564ad10d366596e1514d4ef] `rb_current_ractor_raw(b)`
git bisect good 94e41822679ebd269564ad10d366596e1514d4ef
# good: [3e7f5b02aeb2a55b27d402024f42821684d85ec6] [rubygems/rubygems] Improve error message in frozen mode edge case
git bisect good 3e7f5b02aeb2a55b27d402024f42821684d85ec6
# good: [210caa777b2ee2405b1d074b790b0183e0f6b27f] More RDoc for Dir.glob (#8088)
git bisect good 210caa777b2ee2405b1d074b790b0183e0f6b27f
# bad: [f07ef1d54c3bbae62e2ddd7266932c294db91daf] Lrama v0.5.3
git bisect bad f07ef1d54c3bbae62e2ddd7266932c294db91daf
# bad: [108cc38a7658bfb8e9457f95baa5cdfbd175b64d] [rubygems/rubygems] Extract polling logic into its own class
git bisect bad 108cc38a7658bfb8e9457f95baa5cdfbd175b64d
# good: [14691d39e9bc312ea813ceb895f887ca6893e5d9] Revert "GitHub Actions: fix ENOENT for jobs running in forks"
git bisect good 14691d39e9bc312ea813ceb895f887ca6893e5d9
# bad: [83f9d80c0b0b7752e490abb45145073d8d0e656d] downloader.rb: unlink broken symlink before caching to fix ENOENT
git bisect bad 83f9d80c0b0b7752e490abb45145073d8d0e656d
# good: [9bdd48597297dcb01a7e07f5a0f716c4ea9cc1e1] YJIT: Count the number of dynamic send dispatches (#8122)
git bisect good 9bdd48597297dcb01a7e07f5a0f716c4ea9cc1e1
# good: [e5effa4bd063f454f9f304e6f9fbf9dd8b353a76] YJIT: Use dynamic dispatch for megamorphic send (#8125)
git bisect good e5effa4bd063f454f9f304e6f9fbf9dd8b353a76
# bad: [69b20d1196f363b3990694951832408556b4ac04] Don't load RREGEXP_PTR twice
git bisect bad 69b20d1196f363b3990694951832408556b4ac04
# bad: [511c51e116fdd44314bef977e941d90331d07efd] Refactor err string in rb_reg_prepare_re
git bisect bad 511c51e116fdd44314bef977e941d90331d07efd
# bad: [7193b404a1a56e50f8046d0382914907020c1559] Add function rb_reg_onig_match
git bisect bad 7193b404a1a56e50f8046d0382914907020c1559
# first bad commit: [7193b404a1a56e50f8046d0382914907020c1559] Add function rb_reg_onig_match

How should we proceed with this one?

@tagliala
Copy link
Contributor Author

tagliala commented Dec 30, 2023

Sorry, I'm not sure, this is beyond my knowledge

It looks both slower and uses more memory than 3.2. Is it possible that this is an undesired regression in Ruby?

%i[ips memory].each do |benchmark|
  Benchmark.send(benchmark) do |x|
    x.report('test') { /\A[A-Z]+\Z/.match('1') }
    x.compare!
  end
end
Ruby version: 3.2.2
Warming up --------------------------------------
                test   251.171k i/100ms
Calculating -------------------------------------
                test      2.501M (± 0.3%) i/s -     12.559M in   5.021641s
Calculating -------------------------------------
                test     0.000  memsize (     0.000  retained)
                         0.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
Ruby version: 3.3.0
Warming up --------------------------------------
                test   218.089k i/100ms
Calculating -------------------------------------
                test      2.196M (± 1.0%) i/s -     11.123M in   5.066107s
Calculating -------------------------------------
                test   104.000  memsize (     0.000  retained)
                         1.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
Benchmarks in case of match
Ruby version: 3.2.2
Warming up --------------------------------------
                test   227.245k i/100ms
Calculating -------------------------------------
                test      2.268M (± 7.1%) i/s -     11.362M in   5.027076s
Calculating -------------------------------------
                test   168.000  memsize (   168.000  retained)
                         1.000  objects (     1.000  retained)
                         0.000  strings (     0.000  retained)
Ruby version: 3.3.0
Warming up --------------------------------------
                test   206.341k i/100ms
Calculating -------------------------------------
                test      2.120M (± 2.0%) i/s -     10.730M in   5.062185s
Calculating -------------------------------------
                test   168.000  memsize (   168.000  retained)
                         1.000  objects (     1.000  retained)
                         0.000  strings (     0.000  retained)

@luke-gru
Copy link

luke-gru commented Dec 30, 2023

I looked into this and created a pull request for ruby here: ruby/ruby#9398. There is also a Ruby ticket here: https://bugs.ruby-lang.org/issues/20104

@tagliala
Copy link
Contributor Author

tagliala commented Dec 31, 2023

Thanks @luke-gru! I've followed the PR to understand the fix and I've seen that it has been approved. Should it be also backported to 3.3.x stable branch?

@kiskoza would it be possible for you to run the full profile_memory script using luke-gru:re_match_no_match_no_object branch? I guess it would require much less time for you, since at the moment my development machine is not configured to build ruby at specific branches (I'm using rvm)

@kiskoza
Copy link
Member

kiskoza commented Dec 31, 2023

It looks okay again 👍 Thanks @luke-gru for the quick fix 🚀

Tested on ruby 3.4.0dev (2023-12-30T18:26:59Z re_match_no_match_.. 248dbaa34a) [x86_64-linux]

Total allocated: 49550576 bytes (503506 objects)
Total retained:  871440 bytes (10174 objects)

allocated memory by gem
-----------------------------------
  49438451  caxlsx/lib
     69630  htmlentities-4.3.4
     28640  fileutils
      4083  open3
      2888  tmpdir
      2324  shellwords
      2240  other
      2160  rubyzip-2.3.2
       160  set

@luke-gru
Copy link

luke-gru commented Jan 4, 2024

Thanks @luke-gru for the quick fix

You're welcome 😄

I've put in a request for a backport to 3.3 in the redmine ticket that's linked in my comment above. Thanks for the thorough investigation @tagliala and @kiskoza, it made finding the source of the error much easier!

@tagliala
Copy link
Contributor Author

tagliala commented Jan 7, 2024

I've compared performance between 3.2.2 and 3.4.0-dev and it looks almost the same, with a slightly advantage towards 3.4, except for axlsx_merged_cells, which seems to have a consistent 8% slowdown

$ ruby --version
ruby 3.4.0dev (2024-01-07T07:18:16Z master 83c98ead4e) [x86_64-darwin23]

                                     user     system      total        real
axlsx_merged_cells               1.676058   0.002793   1.678851 (  1.682359)
axlsx_merged_cells               1.666813   0.002912   1.669725 (  1.673564)
axlsx_merged_cells               1.676876   0.001952   1.678828 (  1.682328)
axlsx_merged_cells               1.667184   0.002927   1.670111 (  1.673674)
$ ruby --version
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-darwin22]

                                     user     system      total        real
axlsx_merged_cells               1.568698   0.001896   1.570594 (  1.573285)
axlsx_merged_cells               1.557191   0.002239   1.559430 (  1.564341)
axlsx_merged_cells               1.557767   0.002371   1.560138 (  1.566215)
axlsx_merged_cells               1.563460   0.002348   1.565808 (  1.571161)

@kiskoza
Copy link
Member

kiskoza commented Jan 12, 2024

It has been resolved already on Ruby's main branch and the backport seems to be on its way, nothing to do from our side. We should keep this ticket open until Ruby 3.3.1 arrives to verify the fix.

@kiskoza kiskoza added the known bug Maybe YOU are the one who wants to fix it? label Jan 12, 2024
@97jaz
Copy link

97jaz commented Jan 25, 2024

Does it seem likely that the unnecessary match allocation is responsible for the unbounded memory growth described here? I am not terribly familiar with the MRI internals, but I'd guess that the match is allocated on the garbage collected heap, so I wouldn't expect this actually to leak memory.

I ask, because my company's application is having some OOM issues after upgrading to 3.3.0, and I'm trying to figure out if we should revert to 3.2 for now.

@luke-gru
Copy link

@97jaz Like you mentioned, the MatchData object lives on the ruby heap so it wouldn't leak memory. Maybe try opening a ticket on redmine.ruby-lang.org with a detailed account of the problem and how much memory your environment has. Reverting also might not be a bad idea.

@kiskoza
Copy link
Member

kiskoza commented Apr 24, 2024

Ruby 3.3.1 got released 🎉

The results seems to be okay when running ruby test/profile_memory.rb on v4.1.0 with Ruby 3.3.1

Total allocated: 49550490 bytes (503506 objects)
Total retained:  871440 bytes (10174 objects)

allocated memory by gem
-----------------------------------
  49438451  caxlsx/lib
     69630  htmlentities-4.3.4
     38009  lib
      2240  other
      2160  rubyzip-2.3.2

@kiskoza kiskoza closed this as completed Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
known bug Maybe YOU are the one who wants to fix it?
Projects
None yet
Development

No branches or pull requests

4 participants