New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

match? should be faster than match #5256

Open
ahorek opened this Issue Jul 20, 2018 · 6 comments

Comments

Projects
None yet
5 participants
@ahorek
Contributor

ahorek commented Jul 20, 2018

Environment

jruby 9.2.1.0-SNAPSHOT (2.5.0) 2018-07-20 a2bc9a3 Java HotSpot(TM) 64-Bit Server VM 9.0.4+11 on 9.0.4+11 +jit [linux-x86_64]
ruby 2.6.0dev (2018-07-20 trunk 64004) [x86_64-linux]
ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-linux]

require 'benchmark/ips'

pattern = /^([0-1]?[0-9]|2[0-3])([:\-,\.])?([0-5][0-9])$/
string = '9:30'

Benchmark.ips do |x|
  x.report('match') { string.match(pattern) }
  x.report('match?') { string.match?(pattern) }
end

ruby 2.5.0

               match    926.635k (± 2.6%) i/s -      4.634M in   5.004093s
              match?      3.395M (± 5.5%) i/s -     16.945M in   5.013444s

ruby master

               match      1.024M (± 4.4%) i/s -      5.123M in   5.013162s
              match?      4.809M (± 6.9%) i/s -     23.905M in   5.001929s

ruby master + jit

               match    984.690k (± 7.8%) i/s -      4.911M in   5.018228s
              match?      5.348M (±11.2%) i/s -     26.410M in   5.009641s

jruby master

               match      1.158M (± 9.2%) i/s -      5.712M in   4.995932s
              match?      1.222M (± 5.6%) i/s -      6.092M in   5.005920s

jruby master + indy

               match      1.211M (±11.0%) i/s -      5.912M in   4.985816s
              match?      1.362M (± 4.1%) i/s -      6.782M in   4.989386s

match? should perform better https://bugs.ruby-lang.org/issues/8110
but jruby implement it just like match, without setting a back-reference on a ruby level

ruby has a specialized method for it and it looks like there has been some improvements recently
https://github.com/ruby/ruby/blob/trunk/re.c#L3324

I found match? pattern very useful on many places, so I think it's worth investigating if there's a way to improve it.

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Jul 23, 2018

Member

good find, definitely makes sense to look into "optimizing" match? at least the way MRI does ...
shouldn't be too hard - split out a method or introduce a boolean parameter on the internals to not set $~

Member

kares commented Jul 23, 2018

good find, definitely makes sense to look into "optimizing" match? at least the way MRI does ...
shouldn't be too hard - split out a method or introduce a boolean parameter on the internals to not set $~

@lopex

This comment has been minimized.

Show comment
Hide comment
@lopex

lopex Jul 23, 2018

Member

MRI has separate internal implementation for that which passes region as NULL so we'd have to make something similar in joni.

Member

lopex commented Jul 23, 2018

MRI has separate internal implementation for that which passes region as NULL so we'd have to make something similar in joni.

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Jul 24, 2018

Member

@lopex and I looked at this one today. Removing creating internal regions in joni and then deleting a bunch of code on JRuby side (since we don't need to deal with captures) MRI (2.5.1) was like 5.2M i/s and we were like 4.7-4.9M i/s from like 1.9-2.1M i/s before patching. If I removed interruptible support we went up to about 5.5M i/s. If I tried bench with those changes using graal ce rc4 we get about 7.2M i/s.

@lopex plans on doing a cleaner version of my changes to joni and we will at least close the gap quite a bit.

Even without removing regions we discovered we were calling search instead of match (which we had to call based on the current version of what match? calls through; but we ended up finding two bottlenecks overall so this was somewhat satisfying.

Member

enebo commented Jul 24, 2018

@lopex and I looked at this one today. Removing creating internal regions in joni and then deleting a bunch of code on JRuby side (since we don't need to deal with captures) MRI (2.5.1) was like 5.2M i/s and we were like 4.7-4.9M i/s from like 1.9-2.1M i/s before patching. If I removed interruptible support we went up to about 5.5M i/s. If I tried bench with those changes using graal ce rc4 we get about 7.2M i/s.

@lopex plans on doing a cleaner version of my changes to joni and we will at least close the gap quite a bit.

Even without removing regions we discovered we were calling search instead of match (which we had to call based on the current version of what match? calls through; but we ended up finding two bottlenecks overall so this was somewhat satisfying.

@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 16, 2018

Member

We are closer but still not quite there (numbers from my MBP):

[] ~/projects/jruby $ jruby blah.rb
Warming up --------------------------------------
               match    93.104k i/100ms
              match?   132.259k i/100ms
Calculating -------------------------------------
               match      2.204M (± 9.7%) i/s -     10.893M in   5.009938s
              match?      3.104M (± 6.3%) i/s -     15.474M in   5.009289s

[] ~/projects/jruby $ rvm ruby-2.5 do ruby blah.rb
Warming up --------------------------------------
               match    71.941k i/100ms
              match?   222.292k i/100ms
Calculating -------------------------------------
               match    917.639k (± 5.0%) i/s -      4.604M in   5.032212s
              match?      4.885M (± 2.1%) i/s -     24.452M in   5.007808s

[] ~/projects/jruby $ jruby -Xcompile.invokedynamic blah.rb
Warming up --------------------------------------
               match   106.672k i/100ms
              match?   152.679k i/100ms
Calculating -------------------------------------
               match      2.565M (±10.5%) i/s -     12.694M in   5.030781s
              match?      3.712M (± 6.3%) i/s -     18.474M in   5.003471s
Member

headius commented Aug 16, 2018

We are closer but still not quite there (numbers from my MBP):

[] ~/projects/jruby $ jruby blah.rb
Warming up --------------------------------------
               match    93.104k i/100ms
              match?   132.259k i/100ms
Calculating -------------------------------------
               match      2.204M (± 9.7%) i/s -     10.893M in   5.009938s
              match?      3.104M (± 6.3%) i/s -     15.474M in   5.009289s

[] ~/projects/jruby $ rvm ruby-2.5 do ruby blah.rb
Warming up --------------------------------------
               match    71.941k i/100ms
              match?   222.292k i/100ms
Calculating -------------------------------------
               match    917.639k (± 5.0%) i/s -      4.604M in   5.032212s
              match?      4.885M (± 2.1%) i/s -     24.452M in   5.007808s

[] ~/projects/jruby $ jruby -Xcompile.invokedynamic blah.rb
Warming up --------------------------------------
               match   106.672k i/100ms
              match?   152.679k i/100ms
Calculating -------------------------------------
               match      2.565M (±10.5%) i/s -     12.694M in   5.030781s
              match?      3.712M (± 6.3%) i/s -     18.474M in   5.003471s
@headius

This comment has been minimized.

Show comment
Hide comment
@headius

headius Aug 28, 2018

Member

FWIW here's an allocation profile of the match? call from the above benchmark run an arbitrary number of times. The vast majority of allocations are the Joni engine and its int[] state.

          percent          live          alloc'ed  stack class
 rank   self  accum     bytes objs     bytes  objs trace name
    1 61.37% 61.37%  38246880 239043 2637534560 16484591 377572 org.joni.ByteCodeMachine
    2 18.41% 79.79%  11474016 239042 791260272 16484589 377574 int[]
    3  0.90% 80.69%    562256 10216    562256 10216 300000 char[]
    4  0.39% 81.08%    245464 10206    245464 10206 300000 java.lang.String
    5  0.31% 81.39%    193472 4233    193472  4233 300000 java.lang.Object[]
    6  0.16% 81.55%     99016  387     99016   387 307282 byte[]
    7  0.15% 81.70%     94464 1968     94560  1970 343568 org.jruby.RubySymbol
    8  0.13% 81.83%     80512  629     80512   629 345011 org.jruby.ir.IRMethod
    9  0.13% 81.96%     78920 1973     79000  1975 343569 org.jruby.util.ByteList
   10  0.12% 82.08%     77600 1940     77600  1940 343562 org.jruby.util.ByteList
Member

headius commented Aug 28, 2018

FWIW here's an allocation profile of the match? call from the above benchmark run an arbitrary number of times. The vast majority of allocations are the Joni engine and its int[] state.

          percent          live          alloc'ed  stack class
 rank   self  accum     bytes objs     bytes  objs trace name
    1 61.37% 61.37%  38246880 239043 2637534560 16484591 377572 org.joni.ByteCodeMachine
    2 18.41% 79.79%  11474016 239042 791260272 16484589 377574 int[]
    3  0.90% 80.69%    562256 10216    562256 10216 300000 char[]
    4  0.39% 81.08%    245464 10206    245464 10206 300000 java.lang.String
    5  0.31% 81.39%    193472 4233    193472  4233 300000 java.lang.Object[]
    6  0.16% 81.55%     99016  387     99016   387 307282 byte[]
    7  0.15% 81.70%     94464 1968     94560  1970 343568 org.jruby.RubySymbol
    8  0.13% 81.83%     80512  629     80512   629 345011 org.jruby.ir.IRMethod
    9  0.13% 81.96%     78920 1973     79000  1975 343569 org.jruby.util.ByteList
   10  0.12% 82.08%     77600 1940     77600  1940 343562 org.jruby.util.ByteList
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment