New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty hash optimization #1676

Merged
merged 2 commits into from Jul 27, 2014

Conversation

Projects
None yet
3 participants
@grddev
Contributor

grddev commented May 1, 2014

While running some performance comparisons between MRI and JRuby, I realised that MRI (or at least new versions of) optimises access to empty hashes. For that purpose I implemented a small benchmark:

require 'benchmark'

N = 40_000
CHECK = 'x' * 20_000
OPERATIONS = [:[], :key?, :delete, :values_at]
SIZES = {
  'empty' => { },
  'small' => { 0 => 0, 1 => 1 },
}

class Detector
  def hash
    $hash_called = 1
  end
end

puts RUBY_DESCRIPTION
Benchmark.bmbm(10) do |x|
  OPERATIONS.each do |op|
    SIZES.each do |name, hash|
      $hash_called = 0
      hash.send(op, Detector.new)
      x.report("#{name}\##{op} #{$hash_called}") { N.times { hash.send(op, CHECK) } }
    end
  end
end

In addition to checking the performance of a few operations against an empty and a small Hash, it also tries the operation with a Detector instance, which has a side-effect in its #hash method in order to detect wether the method is actually called or not in the different Ruby versions I tested.

The pull request consists of two commits, one which implements the general performance tweak (which would certainly be enough for master) and another which strives to restrict the optimisation to the behaviour of the emulated MRI version. The solution is not perfect, but I'm also not sure it is even worthwhile to emulate the behaviour in this case. I strongly suspect that the number of times the #hash function is called is not part of any meaningful contract for Hash.

The JRuby behaviour without this pull request is in line with MRI 1.8.7, but is technically wrong for 1.9.3 and beyond (I don't have any versions between 1.8.7 and 1.9.3 to verify exactly when MRI implemented their change). Personally, I would vote for implementing the performance tweak independent of version, and potentially break an implementation relying on #hash side effects.

Below are results first for JRuby 1.7.2, and then after that MRI versions 1.8.7, 1.9.3, and 2.0.0 followed by their emulated counterparts with the pull request applied.

jruby 1.7.12 (1.9.3p392) 2014-04-15 643e292 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]

                        user     system      total        real
empty#[] 1          3.190000   0.000000   3.190000 (  3.199000)
small#[] 1          3.160000   0.010000   3.170000 (  3.166000)
empty#key? 1        3.110000   0.010000   3.120000 (  3.110000)
small#key? 1        3.150000   0.010000   3.160000 (  3.160000)
empty#delete 1      3.230000   0.010000   3.240000 (  3.241000)
small#delete 1      3.530000   0.010000   3.540000 (  3.545000)
empty#values_at 1   3.370000   0.010000   3.380000 (  3.402000)
small#values_at 1   3.260000   0.010000   3.270000 (  3.253000)

ruby 2.0.0p451 (2014-02-24 revision 45167) [x86_64-darwin12.5.0]

                        user     system      total        real
empty#[] 0          0.010000   0.000000   0.010000 (  0.009584)
small#[] 1          0.740000   0.000000   0.740000 (  0.746104)
empty#key? 0        0.010000   0.000000   0.010000 (  0.005490)
small#key? 1        0.750000   0.000000   0.750000 (  0.752317)
empty#delete 0      0.000000   0.000000   0.000000 (  0.006898)
small#delete 1      0.780000   0.000000   0.780000 (  0.788203)
empty#values_at 0   0.000000   0.000000   0.000000 (  0.008107)
small#values_at 1   0.780000   0.010000   0.790000 (  0.782744)

jruby 1.7.13-SNAPSHOT (2.0.0p195) 2014-05-01 68d4185 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]

                      user     system      total        real
empty#[] 0          0.020000   0.000000   0.020000 (  0.014000)
small#[] 1          3.170000   0.010000   3.180000 (  3.178000)
empty#key? 0        0.010000   0.000000   0.010000 (  0.008000)
small#key? 1        3.160000   0.010000   3.170000 (  3.168000)
empty#delete 0      0.000000   0.000000   0.000000 (  0.008000)
small#delete 1      3.080000   0.010000   3.090000 (  3.090000)
empty#values_at 0   0.010000   0.000000   0.010000 (  0.015000)
small#values_at 1   3.090000   0.010000   3.100000 (  3.090000)

ruby 1.9.3p545 (2014-02-24 revision 45159) [x86_64-darwin13.1.0]

                        user     system      total        real
empty#[] 0          0.830000   0.000000   0.830000 (  0.827847)
small#[] 1          0.840000   0.000000   0.840000 (  0.838575)
empty#key? 0        0.820000   0.000000   0.820000 (  0.823376)
small#key? 1        0.800000   0.000000   0.800000 (  0.799500)
empty#delete 1      0.800000   0.000000   0.800000 (  0.804521)
small#delete 1      0.830000   0.010000   0.840000 (  0.829005)
empty#values_at 1   0.830000   0.000000   0.830000 (  0.835925)
small#values_at 1   0.810000   0.000000   0.810000 (  0.812079)

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 68d4185 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]

                        user     system      total        real
empty#[] 0          0.010000   0.000000   0.010000 (  0.010000)
small#[] 1          3.160000   0.010000   3.170000 (  3.163000)
empty#key? 0        0.010000   0.000000   0.010000 (  0.008000)
small#key? 1        3.130000   0.000000   3.130000 (  3.136000)
empty#delete 1      3.160000   0.010000   3.170000 (  3.176000)
small#delete 1      3.180000   0.010000   3.190000 (  3.186000)
empty#values_at 0   0.020000   0.000000   0.020000 (  0.020000)
small#values_at 1   3.160000   0.010000   3.170000 (  3.176000)

ruby 1.8.7 (2013-06-27 patchlevel 374) [i686-darwin13.1.0]

                        user     system      total        real
empty#[] 1          1.580000   0.000000   1.580000 (  1.585427)
small#[] 1          1.660000   0.000000   1.660000 (  1.663023)
empty#key? 1        1.620000   0.000000   1.620000 (  1.624190)
small#key? 1        1.600000   0.010000   1.610000 (  1.602030)
empty#delete 1      1.630000   0.000000   1.630000 (  1.635427)
small#delete 1      1.710000   0.000000   1.710000 (  1.711169)
empty#values_at 1   1.650000   0.000000   1.650000 (  1.656964)
small#values_at 1   1.620000   0.000000   1.620000 (  1.618939)

jruby 1.7.13-SNAPSHOT (ruby-1.8.7p370) 2014-05-01 68d4185 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]

                        user     system      total        real
empty#[] 1          3.120000   0.000000   3.120000 (  3.121000)
small#[] 1          3.100000   0.010000   3.110000 (  3.110000)
empty#key? 1        3.070000   0.000000   3.070000 (  3.076000)
small#key? 1        3.070000   0.010000   3.080000 (  3.083000)
empty#delete 1      3.120000   0.000000   3.120000 (  3.119000)
small#delete 1      3.030000   0.010000   3.040000 (  3.036000)
empty#values_at 1   3.090000   0.000000   3.090000 (  3.094000)
small#values_at 1   3.160000   0.010000   3.170000 (  3.145000)

grddev added some commits May 1, 2014

Optimize key-based Hash access for empty hashes
This avoids performing the potentially expensive hash computation in
the event that the Hash is empty, as there cannot possibly be a match.
Conform to MRI behaviour for empty hashes
In the event that the `#hash` method has side effects, any optimisation
with respect to empty hashes will affect the results of the operations.

For MRI, the empty-hash optimisation seems to be applied everywhere in
2.0.0 and beyond. In 1.9.3 the #hash method is not called for #[] and
#key?, but strangely enough the performance is not improved for empty
hashes. In 1.8.7, the #hash method is always called even if empty.

This does not match the semantics of MRI exactly, as this avoids calling
the #hash method from #values_at in 1.9.3, but that would be slightly
more involved to implement.
@mjc

This comment has been minimized.

Show comment
Hide comment
@mjc

mjc Jul 26, 2014

Contributor

jruby_1_7 192cfb5, 2.0 mode, java 8u11, warm JIT, indy off:

Calculating -------------------------------------
          empty#[] 1      3316 i/100ms
          small#[] 1      3333 i/100ms
        empty#key? 1      3337 i/100ms
        small#key? 1      3357 i/100ms
      empty#delete 1      3365 i/100ms
      small#delete 1      3354 i/100ms
   empty#values_at 1      3332 i/100ms
   small#values_at 1      3336 i/100ms
-------------------------------------------------
          empty#[] 1    33892.3 (±1.6%) i/s -     172432 in   5.089000s
          small#[] 1    33457.4 (±2.2%) i/s -     169983 in   5.083000s
        empty#key? 1    33539.3 (±2.3%) i/s -     170187 in   5.077000s
        small#key? 1    34014.9 (±1.2%) i/s -     171207 in   5.034000s
      empty#delete 1    33713.8 (±1.8%) i/s -     171615 in   5.092000s
      small#delete 1    33765.4 (±2.0%) i/s -     171054 in   5.068000s
   empty#values_at 1    33699.0 (±1.6%) i/s -     169932 in   5.044000s
   small#values_at 1    33160.6 (±2.0%) i/s -     166800 in   5.032000s

same as above, with patch applied:

Calculating -------------------------------------
          empty#[] 0    236574 i/100ms
          small#[] 1      3356 i/100ms
        empty#key? 0    240100 i/100ms
        small#key? 1      3270 i/100ms
      empty#delete 0    251527 i/100ms
      small#delete 1      3283 i/100ms
   empty#values_at 0    207500 i/100ms
   small#values_at 1      3291 i/100ms
-------------------------------------------------
          empty#[] 0 11939837.5 (±4.5%) i/s -   59616648 in   5.004000s
          small#[] 1    33781.5 (±2.2%) i/s -     171156 in   5.069000s
        empty#key? 0 12914673.9 (±4.6%) i/s -   64586900 in   5.013000s
        small#key? 1    33528.9 (±2.2%) i/s -     170040 in   5.074000s
      empty#delete 0 13501938.4 (±4.4%) i/s -   67409236 in   5.003000s
      small#delete 1    33965.4 (±1.3%) i/s -     170716 in   5.027000s
   empty#values_at 0  7842808.2 (±2.9%) i/s -   39217500 in   5.005000s
   small#values_at 1    33338.8 (±2.9%) i/s -     167841 in   5.039000s

Excellent speedup. I'll test this against master too and post those results.

Contributor

mjc commented Jul 26, 2014

jruby_1_7 192cfb5, 2.0 mode, java 8u11, warm JIT, indy off:

Calculating -------------------------------------
          empty#[] 1      3316 i/100ms
          small#[] 1      3333 i/100ms
        empty#key? 1      3337 i/100ms
        small#key? 1      3357 i/100ms
      empty#delete 1      3365 i/100ms
      small#delete 1      3354 i/100ms
   empty#values_at 1      3332 i/100ms
   small#values_at 1      3336 i/100ms
-------------------------------------------------
          empty#[] 1    33892.3 (±1.6%) i/s -     172432 in   5.089000s
          small#[] 1    33457.4 (±2.2%) i/s -     169983 in   5.083000s
        empty#key? 1    33539.3 (±2.3%) i/s -     170187 in   5.077000s
        small#key? 1    34014.9 (±1.2%) i/s -     171207 in   5.034000s
      empty#delete 1    33713.8 (±1.8%) i/s -     171615 in   5.092000s
      small#delete 1    33765.4 (±2.0%) i/s -     171054 in   5.068000s
   empty#values_at 1    33699.0 (±1.6%) i/s -     169932 in   5.044000s
   small#values_at 1    33160.6 (±2.0%) i/s -     166800 in   5.032000s

same as above, with patch applied:

Calculating -------------------------------------
          empty#[] 0    236574 i/100ms
          small#[] 1      3356 i/100ms
        empty#key? 0    240100 i/100ms
        small#key? 1      3270 i/100ms
      empty#delete 0    251527 i/100ms
      small#delete 1      3283 i/100ms
   empty#values_at 0    207500 i/100ms
   small#values_at 1      3291 i/100ms
-------------------------------------------------
          empty#[] 0 11939837.5 (±4.5%) i/s -   59616648 in   5.004000s
          small#[] 1    33781.5 (±2.2%) i/s -     171156 in   5.069000s
        empty#key? 0 12914673.9 (±4.6%) i/s -   64586900 in   5.013000s
        small#key? 1    33528.9 (±2.2%) i/s -     170040 in   5.074000s
      empty#delete 0 13501938.4 (±4.4%) i/s -   67409236 in   5.003000s
      small#delete 1    33965.4 (±1.3%) i/s -     170716 in   5.027000s
   empty#values_at 0  7842808.2 (±2.9%) i/s -   39217500 in   5.005000s
   small#values_at 1    33338.8 (±2.9%) i/s -     167841 in   5.039000s

Excellent speedup. I'll test this against master too and post those results.

@mjc

This comment has been minimized.

Show comment
Hide comment
@mjc

mjc Jul 26, 2014

Contributor

master (abf3c49), JIT off, indy off, warm:

Calculating -------------------------------------
          empty#[] 1      3145 i/100ms
          small#[] 1      3089 i/100ms
        empty#key? 1      3206 i/100ms
        small#key? 1      3204 i/100ms
      empty#delete 1      3182 i/100ms
      small#delete 1      3228 i/100ms
   empty#values_at 1      3229 i/100ms
   small#values_at 1      3207 i/100ms
-------------------------------------------------
          empty#[] 1    33525.4 (±1.6%) i/s -     169830 in   5.067000s
          small#[] 1    33252.8 (±1.8%) i/s -     166806 in   5.018000s
        empty#key? 1    33049.5 (±2.3%) i/s -     166712 in   5.047000s
        small#key? 1    32837.7 (±2.5%) i/s -     166608 in   5.077000s
      empty#delete 1    32630.7 (±2.5%) i/s -     165464 in   5.074000s
      small#delete 1    32724.7 (±2.1%) i/s -     164628 in   5.033000s
   empty#values_at 1    33059.4 (±2.0%) i/s -     167908 in   5.081000s
   small#values_at 1    32231.3 (±2.6%) i/s -     163557 in   5.078000s

master (abf3c49) with patch applied, same as above:

Calculating -------------------------------------
          empty#[] 0     59391 i/100ms
          small#[] 1      3164 i/100ms
        empty#key? 0     61153 i/100ms
        small#key? 1      3177 i/100ms
      empty#delete 0     62821 i/100ms
      small#delete 1      3132 i/100ms
   empty#values_at 0     59283 i/100ms
   small#values_at 1      3187 i/100ms
-------------------------------------------------
          empty#[] 0  1481771.6 (±4.6%) i/s -    7423875 in   5.022000s
          small#[] 1    32780.6 (±2.4%) i/s -     164528 in   5.022000s
        empty#key? 0  1534521.3 (±3.9%) i/s -    7705278 in   5.030000s
        small#key? 1    33279.0 (±1.6%) i/s -     168381 in   5.061000s
      empty#delete 0  1573938.9 (±2.7%) i/s -    7915446 in   5.033000s
      small#delete 1    33430.4 (±1.3%) i/s -     169128 in   5.060000s
   empty#values_at 0  1434370.6 (±3.5%) i/s -    7173243 in   5.008000s
   small#values_at 1    33202.5 (±1.8%) i/s -     168911 in   5.089000s
Contributor

mjc commented Jul 26, 2014

master (abf3c49), JIT off, indy off, warm:

Calculating -------------------------------------
          empty#[] 1      3145 i/100ms
          small#[] 1      3089 i/100ms
        empty#key? 1      3206 i/100ms
        small#key? 1      3204 i/100ms
      empty#delete 1      3182 i/100ms
      small#delete 1      3228 i/100ms
   empty#values_at 1      3229 i/100ms
   small#values_at 1      3207 i/100ms
-------------------------------------------------
          empty#[] 1    33525.4 (±1.6%) i/s -     169830 in   5.067000s
          small#[] 1    33252.8 (±1.8%) i/s -     166806 in   5.018000s
        empty#key? 1    33049.5 (±2.3%) i/s -     166712 in   5.047000s
        small#key? 1    32837.7 (±2.5%) i/s -     166608 in   5.077000s
      empty#delete 1    32630.7 (±2.5%) i/s -     165464 in   5.074000s
      small#delete 1    32724.7 (±2.1%) i/s -     164628 in   5.033000s
   empty#values_at 1    33059.4 (±2.0%) i/s -     167908 in   5.081000s
   small#values_at 1    32231.3 (±2.6%) i/s -     163557 in   5.078000s

master (abf3c49) with patch applied, same as above:

Calculating -------------------------------------
          empty#[] 0     59391 i/100ms
          small#[] 1      3164 i/100ms
        empty#key? 0     61153 i/100ms
        small#key? 1      3177 i/100ms
      empty#delete 0     62821 i/100ms
      small#delete 1      3132 i/100ms
   empty#values_at 0     59283 i/100ms
   small#values_at 1      3187 i/100ms
-------------------------------------------------
          empty#[] 0  1481771.6 (±4.6%) i/s -    7423875 in   5.022000s
          small#[] 1    32780.6 (±2.4%) i/s -     164528 in   5.022000s
        empty#key? 0  1534521.3 (±3.9%) i/s -    7705278 in   5.030000s
        small#key? 1    33279.0 (±1.6%) i/s -     168381 in   5.061000s
      empty#delete 0  1573938.9 (±2.7%) i/s -    7915446 in   5.033000s
      small#delete 1    33430.4 (±1.3%) i/s -     169128 in   5.060000s
   empty#values_at 0  1434370.6 (±3.5%) i/s -    7173243 in   5.008000s
   small#values_at 1    33202.5 (±1.8%) i/s -     168911 in   5.089000s
@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Jul 26, 2014

Member

Sorry for dithering on this for so long. My two observations on it when I originally looked it was:

  1. MRI not having side-effects on empty structures seems like an implementation detail leaking through. So I don't think we need to match it, but I also do not see a problem with following it either.
  2. Performance tweaks like this is great if the pattern happens with any regularity. I did not just apply this when first submitted because I wondered if this was common or not.

I guess an empty check is not a huge added complexity but it would be really cool if someone could qualify how often this case happens. Personally, I am not against merging this, but it would be nice to know if this can be measured outside this microbenchmark.

Member

enebo commented Jul 26, 2014

Sorry for dithering on this for so long. My two observations on it when I originally looked it was:

  1. MRI not having side-effects on empty structures seems like an implementation detail leaking through. So I don't think we need to match it, but I also do not see a problem with following it either.
  2. Performance tweaks like this is great if the pattern happens with any regularity. I did not just apply this when first submitted because I wondered if this was common or not.

I guess an empty check is not a huge added complexity but it would be really cool if someone could qualify how often this case happens. Personally, I am not against merging this, but it would be nice to know if this can be measured outside this microbenchmark.

@grddev

This comment has been minimized.

Show comment
Hide comment
@grddev

grddev Jul 27, 2014

Contributor

@enebo, in terms of motivation, I think the Ruby idiom def method ..., options = {} in itself is enough to justify the empty hash optimization. In the particular instance where I noticed it was a performance bottleneck was with multi_json, but a quick grep revealed that the above pattern occurs over 100 times in the bundled JRuby standard library alone.

Regarding the side-effects, I take it you'd prefer if I remove the second commit?

Contributor

grddev commented Jul 27, 2014

@enebo, in terms of motivation, I think the Ruby idiom def method ..., options = {} in itself is enough to justify the empty hash optimization. In the particular instance where I noticed it was a performance bottleneck was with multi_json, but a quick grep revealed that the above pattern occurs over 100 times in the bundled JRuby standard library alone.

Regarding the side-effects, I take it you'd prefer if I remove the second commit?

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Jul 27, 2014

Member

@grddev I am cool with the second commit. I mostly was just saying following it is not what I consider to be semantics (so more qualitative statement about behavior like this not being mandatory).

As for the justification I was hoping for some actual usage statistics. I know active_support has it a lot in code but that does not mean it is actually used a lot (consider many methods pass in an options hash so opts = {} never actually executes. That said, I have talked too much about this as it is a simple optimization and MRI decided it is worth doing. Sorry to drag too long on this :)

Member

enebo commented Jul 27, 2014

@grddev I am cool with the second commit. I mostly was just saying following it is not what I consider to be semantics (so more qualitative statement about behavior like this not being mandatory).

As for the justification I was hoping for some actual usage statistics. I know active_support has it a lot in code but that does not mean it is actually used a lot (consider many methods pass in an options hash so opts = {} never actually executes. That said, I have talked too much about this as it is a simple optimization and MRI decided it is worth doing. Sorry to drag too long on this :)

enebo added a commit that referenced this pull request Jul 27, 2014

@enebo enebo merged commit e992b91 into jruby:jruby-1_7 Jul 27, 2014

1 check failed

continuous-integration/travis-ci The Travis CI build could not complete due to an error
Details

@enebo enebo added this to the JRuby 1.7.14 milestone Jul 27, 2014

@enebo enebo modified the milestones: JRuby 1.7.14, JRuby 1.7.15 Aug 27, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment