New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster and more readable implementation of Hash#deep_merge #30275

Merged
merged 1 commit into from Aug 17, 2017

Conversation

Projects
None yet
5 participants
@msimonborg
Contributor

msimonborg commented Aug 15, 2017

Summary

This is a re-implementation of Hash#deep_merge. Using Ruby's built-in optional block for merge to handle duplicate keys makes the code shorter and more readable, and since the block is only run when a duplicate key is encountered it also improves performance. Performance benefits fluctuate case by case and are felt more strongly when merging hashes with fewer duplicate keys. All tests pass after this change.

Benchmark (MacOS 10.12.6, ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin16])

In a gist

require 'active_support'
require 'benchmark/ips'

class Hash
  def new_deep_merge(other_hash, &block)
    dup.new_deep_merge!(other_hash, &block)
  end

  def new_deep_merge!(other, &block)
    merge!(other) do |key, old_val, new_val|
      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
        old_val.new_deep_merge(new_val, &block)
      elsif block_given?
        block.call(key, old_val, new_val)
      else
        new_val
      end
    end
  end
end

h1 = { a: true, b: { c: [1, 2, 3] }, d: 1 }
h2 = { a: false, b: { c: [3, 4, 5] }, d: 2 }
h3 = { a: false, b: { x: [3, 4, 5] }, e: 2, f: { y: false } }
h4 = { e: 2, f: { y: false }, g: 'hi', h: true }

Benchmark.ips do |bm|
  bm.report('old_deep_merge_all_dupes') { h1.deep_merge(h2) }
  bm.report('new_deep_merge_all_dupes') { h1.new_deep_merge(h2) }
  bm.compare!
end

Benchmark.ips do |bm|
  bm.report('old_deep_merge_some_dupes') { h1.deep_merge(h3) }
  bm.report('new_deep_merge_some_dupes') { h1.new_deep_merge(h3) }
  bm.compare!
end

Benchmark.ips do |bm|
  bm.report('old_deep_merge_no_dupes') { h1.deep_merge(h4) }
  bm.report('new_deep_merge_no_dupes') { h1.new_deep_merge(h4) }
  bm.compare!
end

Warming up --------------------------------------
old_deep_merge_all_dupes
                        21.602k i/100ms
new_deep_merge_all_dupes
                        23.242k i/100ms
Calculating -------------------------------------
old_deep_merge_all_dupes
                        248.052k (± 3.3%) i/s -      1.253M in   5.056553s
new_deep_merge_all_dupes
                        264.029k (± 3.8%) i/s -      1.325M in   5.024762s

Comparison:
new_deep_merge_all_dupes:   264028.7 i/s
old_deep_merge_all_dupes:   248052.3 i/s - same-ish: difference falls within error

Warming up --------------------------------------
old_deep_merge_some_dupes
                        19.705k i/100ms
new_deep_merge_some_dupes
                        22.753k i/100ms
Calculating -------------------------------------
old_deep_merge_some_dupes
                        209.495k (± 5.0%) i/s -      1.064M in   5.092063s
new_deep_merge_some_dupes
                        249.642k (± 3.6%) i/s -      1.251M in   5.019540s

Comparison:
new_deep_merge_some_dupes:   249642.2 i/s
old_deep_merge_some_dupes:   209494.5 i/s - 1.19x  slower

Warming up --------------------------------------
old_deep_merge_no_dupes
                        30.064k i/100ms
new_deep_merge_no_dupes
                        40.112k i/100ms
Calculating -------------------------------------
old_deep_merge_no_dupes
                        330.682k (± 3.9%) i/s -      1.654M in   5.008077s
new_deep_merge_no_dupes
                        471.547k (± 3.2%) i/s -      2.367M in   5.023926s

Comparison:
new_deep_merge_no_dupes:   471547.3 i/s
old_deep_merge_no_dupes:   330682.5 i/s - 1.43x  slower
@rails-bot

This comment has been minimized.

rails-bot commented Aug 15, 2017

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @rafaelfranca (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@rafaelfranca

This comment has been minimized.

Member

rafaelfranca commented Aug 15, 2017

Can you use benchmark-ips in your benchmark?

activesupport/lib/active_support/core_ext/hash/deep_merge.rb Outdated
def deep_merge!(other, &block)
merge!(other) do |_key, old_val, new_val|
if old_val.is_a?(Hash) && new_val.is_a?(Hash)
old_val.dup.deep_merge!(new_val, &block)

This comment has been minimized.

@rafaelfranca

rafaelfranca Aug 15, 2017

Member

Why not call deep_merge here?

This comment has been minimized.

@msimonborg

msimonborg Aug 15, 2017

Contributor

You're right, I'll fix that in a new commit

@msimonborg

This comment has been minimized.

Contributor

msimonborg commented Aug 15, 2017

Fixed that LOC per your comment and changed benchmarks to benchmark/ips. Let me know if there's anything else you'd want to see!

activesupport/lib/active_support/core_ext/hash/deep_merge.rb Outdated
self[current_key] = if this_value.is_a?(Hash) && other_value.is_a?(Hash)
this_value.deep_merge(other_value, &block)
merge!(other_hash) do |_key, this_val, other_val|

This comment has been minimized.

@lugray

lugray Aug 16, 2017

Contributor

This should be key, not _key since it's used in L28.

@lugray

lugray approved these changes Aug 16, 2017

Can you squash down your commits?

@jonathanhefner

This comment has been minimized.

jonathanhefner commented Aug 16, 2017

I ran the benchmark on my machine (Ruby 2.2), and here are the numbers I got:

new_deep_merge_all_dupes:    99330.4 i/s
old_deep_merge_all_dupes:    94990.2 i/s - 1.05x  slower

new_deep_merge_some_dupes:   105237.5 i/s
old_deep_merge_some_dupes:    91174.9 i/s - 1.15x  slower

new_deep_merge_no_dupes:    95207.6 i/s
old_deep_merge_no_dupes:    84227.6 i/s - same-ish: difference falls within error

Which is still an improvement. 👍

Also, a while back, I wrote a gem (benchmark-inputs) to help with benchmarking multiple use cases, particularly when mutation is involved. Using it to benchmark the bang method, here are the numbers I got:

NEW deep_merge!:    183292.1 i/s
OLD deep_merge!:    152630.4 i/s - 1.20x slower

And here's the script to reproduce that:

require 'active_support'
require 'benchmark/inputs'

class Hash
  def new_deep_merge(other_hash, &block)
    dup.new_deep_merge!(other_hash, &block)
  end

  def new_deep_merge!(other, &block)
    merge!(other) do |key, old_val, new_val|
      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
        old_val.new_deep_merge(new_val, &block)
      elsif block_given?
        block.call(key, old_val, new_val)
      else
        new_val
      end
    end
  end
end

HASHES = [
  { a: false, b: { c: [3, 4, 5] }, d: 2 },
  { a: false, b: { x: [3, 4, 5] }, e: 2, f: { y: false } },
  { e: 2, f: { y: false }, g: 'hi', h: true },
]

OTHER = { a: true, b: { c: [1, 2, 3] }, d: 1 }

Benchmark.inputs(HASHES) do |job|
  job.dup_inputs = true
  job.report('OLD deep_merge!'){|h| h.deep_merge!(OTHER) }
  job.report('NEW deep_merge!'){|h| h.new_deep_merge!(OTHER) }
  job.compare!
end
faster implementation of Hash#deep_merge
add missing newline

call #deep_merge instead of #dup.deep_merge!

make variable and parameter naming more consistent

change `_key` to `key`

faster implementation of Hash#deep_merge

@msimonborg msimonborg force-pushed the msimonborg:deep_merge_patch branch to 2eacb1c Aug 16, 2017

@msimonborg

This comment has been minimized.

Contributor

msimonborg commented Aug 16, 2017

Rebased and squashed. Thanks @lugray !

@rafaelfranca rafaelfranca merged commit 9876196 into rails:master Aug 17, 2017

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
codeclimate 2 fixed issues
Details

@msimonborg msimonborg deleted the msimonborg:deep_merge_patch branch Aug 19, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment