Faster and more readable implementation of Hash#deep_merge #30275

msimonborg · 2017-08-15T21:41:55Z

Summary

This is a re-implementation of Hash#deep_merge. Using Ruby's built-in optional block for merge to handle duplicate keys makes the code shorter and more readable, and since the block is only run when a duplicate key is encountered it also improves performance. Performance benefits fluctuate case by case and are felt more strongly when merging hashes with fewer duplicate keys. All tests pass after this change.

Benchmark (MacOS 10.12.6, ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin16])

In a gist

require 'active_support'
require 'benchmark/ips'

class Hash
  def new_deep_merge(other_hash, &block)
    dup.new_deep_merge!(other_hash, &block)
  end

  def new_deep_merge!(other, &block)
    merge!(other) do |key, old_val, new_val|
      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
        old_val.new_deep_merge(new_val, &block)
      elsif block_given?
        block.call(key, old_val, new_val)
      else
        new_val
      end
    end
  end
end

h1 = { a: true, b: { c: [1, 2, 3] }, d: 1 }
h2 = { a: false, b: { c: [3, 4, 5] }, d: 2 }
h3 = { a: false, b: { x: [3, 4, 5] }, e: 2, f: { y: false } }
h4 = { e: 2, f: { y: false }, g: 'hi', h: true }

Benchmark.ips do |bm|
  bm.report('old_deep_merge_all_dupes') { h1.deep_merge(h2) }
  bm.report('new_deep_merge_all_dupes') { h1.new_deep_merge(h2) }
  bm.compare!
end

Benchmark.ips do |bm|
  bm.report('old_deep_merge_some_dupes') { h1.deep_merge(h3) }
  bm.report('new_deep_merge_some_dupes') { h1.new_deep_merge(h3) }
  bm.compare!
end

Benchmark.ips do |bm|
  bm.report('old_deep_merge_no_dupes') { h1.deep_merge(h4) }
  bm.report('new_deep_merge_no_dupes') { h1.new_deep_merge(h4) }
  bm.compare!
end

Warming up --------------------------------------
old_deep_merge_all_dupes
                        21.602k i/100ms
new_deep_merge_all_dupes
                        23.242k i/100ms
Calculating -------------------------------------
old_deep_merge_all_dupes
                        248.052k (± 3.3%) i/s -      1.253M in   5.056553s
new_deep_merge_all_dupes
                        264.029k (± 3.8%) i/s -      1.325M in   5.024762s

Comparison:
new_deep_merge_all_dupes:   264028.7 i/s
old_deep_merge_all_dupes:   248052.3 i/s - same-ish: difference falls within error

Warming up --------------------------------------
old_deep_merge_some_dupes
                        19.705k i/100ms
new_deep_merge_some_dupes
                        22.753k i/100ms
Calculating -------------------------------------
old_deep_merge_some_dupes
                        209.495k (± 5.0%) i/s -      1.064M in   5.092063s
new_deep_merge_some_dupes
                        249.642k (± 3.6%) i/s -      1.251M in   5.019540s

Comparison:
new_deep_merge_some_dupes:   249642.2 i/s
old_deep_merge_some_dupes:   209494.5 i/s - 1.19x  slower

Warming up --------------------------------------
old_deep_merge_no_dupes
                        30.064k i/100ms
new_deep_merge_no_dupes
                        40.112k i/100ms
Calculating -------------------------------------
old_deep_merge_no_dupes
                        330.682k (± 3.9%) i/s -      1.654M in   5.008077s
new_deep_merge_no_dupes
                        471.547k (± 3.2%) i/s -      2.367M in   5.023926s

Comparison:
new_deep_merge_no_dupes:   471547.3 i/s
old_deep_merge_no_dupes:   330682.5 i/s - 1.43x  slower

rails-bot · 2017-08-15T21:41:58Z

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @rafaelfranca (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

rafaelfranca · 2017-08-15T21:52:54Z

Can you use benchmark-ips in your benchmark?

rafaelfranca · 2017-08-15T21:55:02Z

activesupport/lib/active_support/core_ext/hash/deep_merge.rb

+  def deep_merge!(other, &block)
+    merge!(other) do |_key, old_val, new_val|
+      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
+        old_val.dup.deep_merge!(new_val, &block)


Why not call deep_merge here?

You're right, I'll fix that in a new commit

msimonborg · 2017-08-15T22:10:40Z

Fixed that LOC per your comment and changed benchmarks to benchmark/ips. Let me know if there's anything else you'd want to see!

lugray · 2017-08-16T17:34:08Z

activesupport/lib/active_support/core_ext/hash/deep_merge.rb

-
-      self[current_key] = if this_value.is_a?(Hash) && other_value.is_a?(Hash)
-        this_value.deep_merge(other_value, &block)
+    merge!(other_hash) do |_key, this_val, other_val|


This should be key, not _key since it's used in L28.

lugray

Can you squash down your commits?

jonathanhefner · 2017-08-16T22:18:56Z

I ran the benchmark on my machine (Ruby 2.2), and here are the numbers I got:

new_deep_merge_all_dupes:    99330.4 i/s
old_deep_merge_all_dupes:    94990.2 i/s - 1.05x  slower

new_deep_merge_some_dupes:   105237.5 i/s
old_deep_merge_some_dupes:    91174.9 i/s - 1.15x  slower

new_deep_merge_no_dupes:    95207.6 i/s
old_deep_merge_no_dupes:    84227.6 i/s - same-ish: difference falls within error

Which is still an improvement. 👍

Also, a while back, I wrote a gem (benchmark-inputs) to help with benchmarking multiple use cases, particularly when mutation is involved. Using it to benchmark the bang method, here are the numbers I got:

NEW deep_merge!:    183292.1 i/s
OLD deep_merge!:    152630.4 i/s - 1.20x slower

And here's the script to reproduce that:

require 'active_support'
require 'benchmark/inputs'

class Hash
  def new_deep_merge(other_hash, &block)
    dup.new_deep_merge!(other_hash, &block)
  end

  def new_deep_merge!(other, &block)
    merge!(other) do |key, old_val, new_val|
      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
        old_val.new_deep_merge(new_val, &block)
      elsif block_given?
        block.call(key, old_val, new_val)
      else
        new_val
      end
    end
  end
end

HASHES = [
  { a: false, b: { c: [3, 4, 5] }, d: 2 },
  { a: false, b: { x: [3, 4, 5] }, e: 2, f: { y: false } },
  { e: 2, f: { y: false }, g: 'hi', h: true },
]

OTHER = { a: true, b: { c: [1, 2, 3] }, d: 1 }

Benchmark.inputs(HASHES) do |job|
  job.dup_inputs = true
  job.report('OLD deep_merge!'){|h| h.deep_merge!(OTHER) }
  job.report('NEW deep_merge!'){|h| h.new_deep_merge!(OTHER) }
  job.compare!
end

add missing newline call #deep_merge instead of #dup.deep_merge! make variable and parameter naming more consistent change `_key` to `key` faster implementation of Hash#deep_merge

msimonborg · 2017-08-16T22:44:39Z

Rebased and squashed. Thanks @lugray !

rails-bot assigned rafaelfranca Aug 15, 2017

rafaelfranca reviewed Aug 15, 2017

View reviewed changes

lugray reviewed Aug 16, 2017

View reviewed changes

lugray approved these changes Aug 16, 2017

View reviewed changes

faster implementation of Hash#deep_merge

2eacb1c

add missing newline call #deep_merge instead of #dup.deep_merge! make variable and parameter naming more consistent change `_key` to `key` faster implementation of Hash#deep_merge

rafaelfranca merged commit 9876196 into rails:master Aug 17, 2017

msimonborg deleted the deep_merge_patch branch August 19, 2017 17:35

nateberkopec mentioned this pull request Aug 19, 2017

Implement new deep_merge getsentry/sentry-ruby#754

Closed

richardvenneman mentioned this pull request Sep 12, 2017

Trestle options merge stack level too deep on Rails master TrestleAdmin/trestle#36

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster and more readable implementation of Hash#deep_merge #30275

Faster and more readable implementation of Hash#deep_merge #30275

msimonborg commented Aug 15, 2017 •

edited

Loading

rails-bot commented Aug 15, 2017

rafaelfranca commented Aug 15, 2017

rafaelfranca Aug 15, 2017

msimonborg Aug 15, 2017

msimonborg commented Aug 15, 2017

lugray Aug 16, 2017

lugray left a comment

jonathanhefner commented Aug 16, 2017

msimonborg commented Aug 16, 2017

Faster and more readable implementation of Hash#deep_merge #30275

Faster and more readable implementation of Hash#deep_merge #30275

Conversation

msimonborg commented Aug 15, 2017 • edited Loading

Summary

Benchmark (MacOS 10.12.6, ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin16])

rails-bot commented Aug 15, 2017

rafaelfranca commented Aug 15, 2017

rafaelfranca Aug 15, 2017

Choose a reason for hiding this comment

msimonborg Aug 15, 2017

Choose a reason for hiding this comment

msimonborg commented Aug 15, 2017

lugray Aug 16, 2017

Choose a reason for hiding this comment

lugray left a comment

Choose a reason for hiding this comment

jonathanhefner commented Aug 16, 2017

msimonborg commented Aug 16, 2017

msimonborg commented Aug 15, 2017 •

edited

Loading