`Hash#deep_diff`, the recursive difference between two hashes #8142

Closed
wants to merge 1 commit into
from

Projects

None yet

6 participants

@nikitug
Contributor
nikitug commented Nov 8, 2012

diff vs deep_diff vs MiniTest#diff

Here is deep_diff usage example:

{ a: { b: 1 }}.deep_diff(a: { b: 1 })        # => {}
{ a: { b: 1, c: 2 } }.deep_diff(a: { b: 1 }) # => { a: { c: 2 } }
{ a: { b: 1 } }.deep_diff(a: { b: 1, c: 2 }) # => { a: { c: 2 } }

diff was introduced in @891a962 to improve failed assertion message. So deep_diff will be more useful when you need descriptive messages. So I'm wondering are there any such cases when diff is a better choice than deep_diff?

Also here is a couple of strings from Rails using diff:

# FIXME: minitest does object diffs, do we need to have our own?
message ||= sprintf("The recognized options <%s> did not match <%s>, difference: <%s>",
  request.path_parameters, expected_options, expected_options.diff(request.path_parameters))
assert_equal(expected_options, request.path_parameters, message)

FIXME says that there is a MiniTest alternative which is way less descriptive:

[15] pry(main)> diff({a: 1, b: 2}, {b: 2})
=> "Expected: {:a=>1, :b=>2}\n  Actual: {:b=>2}"

So WDYT is the best way we should act here?

@steveklabnik
Member

Can you show us some code where Hash#deep_diff improves it significantly?

@nikitug
Contributor
nikitug commented Nov 8, 2012

Here is the first thing that came to mind:

big_json = JSON.parse(response)
expected = {some: expected_data}
assert_equal expected, big_json, "Good diff: #{expected.deep_diff(big_json)}"

But I'm pretty sure there are more cases when deep_diff would be more helpful than diff.

@carlosantoniodasilva carlosantoniodasilva and 1 other commented on an outdated diff Nov 8, 2012
activesupport/lib/active_support/core_ext/hash/diff.rb
@@ -10,4 +10,25 @@ def diff(other)
delete_if { |k, v| other[k] == v }.
merge!(other.dup.delete_if { |k, v| has_key?(k) })
end
+
+ # Returns a hash that represents the recursive difference between two hashes.
+ #
+ # {a: {b: 1}}.deep_diff(a: {b: 1}) # => {}
+ # {a: {b: 1, c: 2}}.deep_diff(a: {b: 1}) # => {a: {c: 2}}
+ # {a: {b: 1}}.deep_diff(a: {b: 1, c: 2}) # => {a: {c: 2}}
+ def deep_diff(other)
+ {}.tap do |diff|
+ other_diff = other.dup
+ each_pair do |k,v|
+ if self[k].is_a?(Hash) && other[k].is_a?(Hash)
@carlosantoniodasilva
carlosantoniodasilva Nov 8, 2012 Member

self[k] == v, right? You can probably use v here and in the other places below.

@nikitug
nikitug Nov 8, 2012 Contributor

sure, my bad

@carlosantoniodasilva carlosantoniodasilva and 1 other commented on an outdated diff Nov 8, 2012
activesupport/lib/active_support/core_ext/hash/diff.rb
+ # {a: {b: 1, c: 2}}.deep_diff(a: {b: 1}) # => {a: {c: 2}}
+ # {a: {b: 1}}.deep_diff(a: {b: 1, c: 2}) # => {a: {c: 2}}
+ def deep_diff(other)
+ {}.tap do |diff|
+ other_diff = other.dup
+ each_pair do |k,v|
+ if self[k].is_a?(Hash) && other[k].is_a?(Hash)
+ h = self[k].deep_diff(other[k])
+ diff[k] = h unless h.empty?
+ elsif self[k] != other[k]
+ diff[k] = self[k]
+ end
+ other_diff.delete(k)
+ end
+ diff.merge!(other_diff)
+ end
@carlosantoniodasilva
carlosantoniodasilva Nov 8, 2012 Member

I don't think tap buys us anything in here, it'd probably be better to just set diff = {} and go on.

@nikitug
nikitug Nov 8, 2012 Contributor

It just seems more clear to me, but I'll make the change, thanks!

@carlosantoniodasilva carlosantoniodasilva commented on an outdated diff Nov 8, 2012
activesupport/lib/active_support/core_ext/hash/diff.rb
@@ -10,4 +10,25 @@ def diff(other)
delete_if { |k, v| other[k] == v }.
merge!(other.dup.delete_if { |k, v| has_key?(k) })
end
+
+ # Returns a hash that represents the recursive difference between two hashes.
+ #
+ # {a: {b: 1}}.deep_diff(a: {b: 1}) # => {}
+ # {a: {b: 1, c: 2}}.deep_diff(a: {b: 1}) # => {a: {c: 2}}
+ # {a: {b: 1}}.deep_diff(a: {b: 1, c: 2}) # => {a: {c: 2}}
@carlosantoniodasilva
carlosantoniodasilva Nov 8, 2012 Member

Keep spaces inside {}, as in the changelog examples.

@carlosantoniodasilva carlosantoniodasilva commented on an outdated diff Nov 8, 2012
activesupport/test/core_ext/hash_ext_test.rb
@@ -656,7 +656,17 @@ def test_reverse_merge
end
def test_diff
- assert_equal({ :a => 2 }, { :a => 2, :b => 5 }.diff({ :a => 1, :b => 5 }))
+ assert_equal({ a: 2 }, { a: 2, b: 5 }.diff({ a: 1, b: 5 }))
+ end
+
+ def test_deep_diff_acts_like_diff
+ assert_equal({ a: 2 }, { a: 2, b: 5 }.deep_diff({ a: 1, b: 5 }))
+ assert_equal({}, { a: { b: 1 } }.deep_diff(a: { b: 1 }))
+ end
+
+ def test_deep_diff
+ assert_equal({ a: { c: 2 } }, { a: { b: 1, c: 2 } }.deep_diff(a: { b: 1 }))
+ assert_equal({ a: { c: 2 } }, { a: { b: 1 } }.deep_diff(a: { b: 1, c: 2 }))
end
@carlosantoniodasilva
carlosantoniodasilva Nov 8, 2012 Member

I think you can put everything inside one test only.

@carlosantoniodasilva

I've made some minor comments. I'm not sure whether we should add such a method to AS (I've personally never used Hash#diff I guess), but in any case, lets wait for more comments. Thanks!

@nikitug
Contributor
nikitug commented Nov 8, 2012

@carlosantoniodasilva thanks for your comments! For example, for me it's handy in some test cases to use diffs instead of explicit checks by keys (to cut a corners in JSON API testing mostly). Anyway let's just wait for some feedback, yep.

@nikitug
Contributor
nikitug commented Nov 8, 2012

By the way, diff was introduced in @891a962 to improve failed assertion message. So deep_diff will be more useful when you need descriptive messages. So I'm wondering are there any such cases when diff is a better choice than deep_diff?

Also here is a couple of strings from Rails using diff:

# FIXME: minitest does object diffs, do we need to have our own?
message ||= sprintf("The recognized options <%s> did not match <%s>, difference: <%s>",
  request.path_parameters, expected_options, expected_options.diff(request.path_parameters))
assert_equal(expected_options, request.path_parameters, message)

FIXME says that there is a MiniTest alternative which is way less descriptive:

[15] pry(main)> diff({a: 1, b: 2}, {b: 2})
=> "Expected: {:a=>1, :b=>2}\n  Actual: {:b=>2}"

So WDYT is the best way we should act here?

@t3hk0d3

But what happens if i still want my empty hash?

Owner

@t3hk0d3 please, provide some code to describe the problem

Nevermind. However there is another problem:

assert_equal(ActiveSupport::HashWithIndifferentAccess, { a: { b: 1 }}.with_indifferent_access.deep_diff({ a: { b: { c: 1}}}).class)
Owner

Nice catch, but it seems to be not so clear what class is expected in the case of diff between HashWithIndifferentAccess and Hash.

And there is another problem (or bug?) with HashWithIndifferentAccess in diff as well: {a: 1}.diff(HashWithIndifferentAccess.new(a: 1, b: 2)) #=> {"a"=>1, "b"=>2}.

@rafaelfranca
Member

Why not just diff to do the deep diff? I think this is the desirable behavior when you are doing a diff.

@nikitug
Contributor
nikitug commented Nov 9, 2012

@rafaelfranca same for me, but I just didnot want to break any code. @carlosantoniodasilva @steveklabnik what do you think about this behavior change?

@fxn
Member
fxn commented Nov 9, 2012

Wouldn't that be more naturally implemented as JSON diff? In principle I see here some gem rather than an AS extension. As for XML diff.

@nikitug
Contributor
nikitug commented Nov 9, 2012

@fxn do you mean that we should diff Ruby Hashes as JSON-encoded strings? MiniTest acts same way, it pretty-prints objects and just diff the strings. So given bad pp format (which is to_s by default) you will get bad diff.

@fxn
Member
fxn commented Nov 9, 2012

I mean, when the output is XML I use something similar to https://github.com/postmodern/nokogiri-diff, and in the use case above where you want to test JSON in principle I'd use something like https://github.com/a2design-company/json-compare, that is specific for JSON.

@nikitug
Contributor
nikitug commented Nov 9, 2012

@fxn got it, thanks! But we still have some usecases similar to the Rails code given above.

@steveklabnik
Member

There is a JSON-patch format currently in Internet-Draft: http://tools.ietf.org/html/draft-ietf-appsawg-json-patch-06

@rafaelfranca
Member

I have to agree with @fxn. This fits more as a gem. I don't see myself using this in an application

@steveklabnik
Member

I agree. Thanks @nikitug , but just gem-ify this for now. If lots of people use it, we can always add it later.

@nikitug
Contributor
nikitug commented Nov 9, 2012

@rafaelfranca @steveklabnik ok, no problem. Just one more question about diffs:

The only place in Rails where Hash#diff is used is:

# FIXME: minitest does object diffs, do we need to have our own?
message ||= sprintf("The recognized options <%s> did not match <%s>, difference: <%s>",
  request.path_parameters, expected_options, expected_options.diff(request.path_parameters))
assert_equal(expected_options, request.path_parameters, message)

So do we really need Hash#diff at all? Maybe it's ok to drop this method support in AS and just use MiniTest's one (with proper descriptive formatting)?

@rafaelfranca
Member

Yes, I think we can deprecate that method.

@steveklabnik
Member

I've submitted a pull request for this here: #8158

@steveklabnik steveklabnik referenced this pull request May 23, 2013
Merged

Deprecate Hash#diff. #8158

@sgerrand sgerrand pushed a commit to sgerrand/rails that referenced this pull request Nov 2, 2013
@steveklabnik steveklabnik Deprecate Hash#diff.
It's no longer used in Rails any more.

See rails#8142\#issuecomment-10227297 for more
88d59de
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment