Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster and more readable implementation of Hash#deep_merge #30275

Merged
merged 1 commit into from Aug 17, 2017
Merged

Faster and more readable implementation of Hash#deep_merge #30275

merged 1 commit into from Aug 17, 2017

Conversation

msimonborg
Copy link
Contributor

@msimonborg msimonborg commented Aug 15, 2017

Summary

This is a re-implementation of Hash#deep_merge. Using Ruby's built-in optional block for merge to handle duplicate keys makes the code shorter and more readable, and since the block is only run when a duplicate key is encountered it also improves performance. Performance benefits fluctuate case by case and are felt more strongly when merging hashes with fewer duplicate keys. All tests pass after this change.

Benchmark (MacOS 10.12.6, ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-darwin16])

In a gist

require 'active_support'
require 'benchmark/ips'

class Hash
  def new_deep_merge(other_hash, &block)
    dup.new_deep_merge!(other_hash, &block)
  end

  def new_deep_merge!(other, &block)
    merge!(other) do |key, old_val, new_val|
      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
        old_val.new_deep_merge(new_val, &block)
      elsif block_given?
        block.call(key, old_val, new_val)
      else
        new_val
      end
    end
  end
end

h1 = { a: true, b: { c: [1, 2, 3] }, d: 1 }
h2 = { a: false, b: { c: [3, 4, 5] }, d: 2 }
h3 = { a: false, b: { x: [3, 4, 5] }, e: 2, f: { y: false } }
h4 = { e: 2, f: { y: false }, g: 'hi', h: true }

Benchmark.ips do |bm|
  bm.report('old_deep_merge_all_dupes') { h1.deep_merge(h2) }
  bm.report('new_deep_merge_all_dupes') { h1.new_deep_merge(h2) }
  bm.compare!
end

Benchmark.ips do |bm|
  bm.report('old_deep_merge_some_dupes') { h1.deep_merge(h3) }
  bm.report('new_deep_merge_some_dupes') { h1.new_deep_merge(h3) }
  bm.compare!
end

Benchmark.ips do |bm|
  bm.report('old_deep_merge_no_dupes') { h1.deep_merge(h4) }
  bm.report('new_deep_merge_no_dupes') { h1.new_deep_merge(h4) }
  bm.compare!
end

Warming up --------------------------------------
old_deep_merge_all_dupes
                        21.602k i/100ms
new_deep_merge_all_dupes
                        23.242k i/100ms
Calculating -------------------------------------
old_deep_merge_all_dupes
                        248.052k (± 3.3%) i/s -      1.253M in   5.056553s
new_deep_merge_all_dupes
                        264.029k (± 3.8%) i/s -      1.325M in   5.024762s

Comparison:
new_deep_merge_all_dupes:   264028.7 i/s
old_deep_merge_all_dupes:   248052.3 i/s - same-ish: difference falls within error

Warming up --------------------------------------
old_deep_merge_some_dupes
                        19.705k i/100ms
new_deep_merge_some_dupes
                        22.753k i/100ms
Calculating -------------------------------------
old_deep_merge_some_dupes
                        209.495k (± 5.0%) i/s -      1.064M in   5.092063s
new_deep_merge_some_dupes
                        249.642k (± 3.6%) i/s -      1.251M in   5.019540s

Comparison:
new_deep_merge_some_dupes:   249642.2 i/s
old_deep_merge_some_dupes:   209494.5 i/s - 1.19x  slower

Warming up --------------------------------------
old_deep_merge_no_dupes
                        30.064k i/100ms
new_deep_merge_no_dupes
                        40.112k i/100ms
Calculating -------------------------------------
old_deep_merge_no_dupes
                        330.682k (± 3.9%) i/s -      1.654M in   5.008077s
new_deep_merge_no_dupes
                        471.547k (± 3.2%) i/s -      2.367M in   5.023926s

Comparison:
new_deep_merge_no_dupes:   471547.3 i/s
old_deep_merge_no_dupes:   330682.5 i/s - 1.43x  slower

@rails-bot
Copy link

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @rafaelfranca (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@rafaelfranca
Copy link
Member

Can you use benchmark-ips in your benchmark?

def deep_merge!(other, &block)
merge!(other) do |_key, old_val, new_val|
if old_val.is_a?(Hash) && new_val.is_a?(Hash)
old_val.dup.deep_merge!(new_val, &block)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not call deep_merge here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I'll fix that in a new commit

@msimonborg
Copy link
Contributor Author

Fixed that LOC per your comment and changed benchmarks to benchmark/ips. Let me know if there's anything else you'd want to see!


self[current_key] = if this_value.is_a?(Hash) && other_value.is_a?(Hash)
this_value.deep_merge(other_value, &block)
merge!(other_hash) do |_key, this_val, other_val|
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be key, not _key since it's used in L28.

Copy link

@lugray lugray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you squash down your commits?

@jonathanhefner
Copy link
Member

I ran the benchmark on my machine (Ruby 2.2), and here are the numbers I got:

new_deep_merge_all_dupes:    99330.4 i/s
old_deep_merge_all_dupes:    94990.2 i/s - 1.05x  slower

new_deep_merge_some_dupes:   105237.5 i/s
old_deep_merge_some_dupes:    91174.9 i/s - 1.15x  slower

new_deep_merge_no_dupes:    95207.6 i/s
old_deep_merge_no_dupes:    84227.6 i/s - same-ish: difference falls within error

Which is still an improvement. 👍

Also, a while back, I wrote a gem (benchmark-inputs) to help with benchmarking multiple use cases, particularly when mutation is involved. Using it to benchmark the bang method, here are the numbers I got:

NEW deep_merge!:    183292.1 i/s
OLD deep_merge!:    152630.4 i/s - 1.20x slower

And here's the script to reproduce that:

require 'active_support'
require 'benchmark/inputs'

class Hash
  def new_deep_merge(other_hash, &block)
    dup.new_deep_merge!(other_hash, &block)
  end

  def new_deep_merge!(other, &block)
    merge!(other) do |key, old_val, new_val|
      if old_val.is_a?(Hash) && new_val.is_a?(Hash)
        old_val.new_deep_merge(new_val, &block)
      elsif block_given?
        block.call(key, old_val, new_val)
      else
        new_val
      end
    end
  end
end

HASHES = [
  { a: false, b: { c: [3, 4, 5] }, d: 2 },
  { a: false, b: { x: [3, 4, 5] }, e: 2, f: { y: false } },
  { e: 2, f: { y: false }, g: 'hi', h: true },
]

OTHER = { a: true, b: { c: [1, 2, 3] }, d: 1 }

Benchmark.inputs(HASHES) do |job|
  job.dup_inputs = true
  job.report('OLD deep_merge!'){|h| h.deep_merge!(OTHER) }
  job.report('NEW deep_merge!'){|h| h.new_deep_merge!(OTHER) }
  job.compare!
end

add missing newline

call #deep_merge instead of #dup.deep_merge!

make variable and parameter naming more consistent

change `_key` to `key`

faster implementation of Hash#deep_merge
@msimonborg
Copy link
Contributor Author

Rebased and squashed. Thanks @lugray !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants