#match_array / #=~ has a bad error message on equal hashes #191

Closed
phiggins opened this Issue Dec 2, 2012 · 9 comments

Comments

Projects
None yet
3 participants
@phiggins
Contributor

phiggins commented Dec 2, 2012

As discussed with @myronmarston on twitter[1][2], when using the #=~ operator on hashes that are equal the error message displayed is confusing. This is demonstrated in an example[3].

Since hashes in ruby don't respond to #to_ary, the #=~ method isn't really applicable to them, however the error message should at least be improved. I can think of two simple ways to remedy this:

  1. Better detection of incompatible types for #=~, so using it with things that don't respond to #to_ary displays a message like "Don't use this method with this class.".
  2. Change #=~ to try #== first and report success if the things to be compared are equal.

What do you think?

1: https://twitter.com/pete_higgins/status/275320169910374401
2: https://twitter.com/pete_higgins/status/275320308058185728
3: https://gist.github.com/4190543

@myronmarston

This comment has been minimized.

Show comment
Hide comment
@myronmarston

myronmarston Dec 2, 2012

Member

Thanks for reporting this, @phiggins.

I would vote for your first idea. I think it's confusing to use a match_array matcher to match a hash.

Note, however, that I think it's OK for match_array to be used against other non-array collections that act array like, as long as the expected argument passed by the user is an array. some_custom_collection_type.should match_array([1, 2, 3]) reads OK I think, but [1, 2, 3].should match_array(some_non_array_object) doesn't make sense, IMO.

(That said, if you use the should =~ form, it's not nearly so obvious that it only supports passing an array as the expected argument).

Looking at the match_array matcher, it already includes code to handle this case:

https://github.com/rspec/rspec-expectations/blob/v2.12.0/lib/rspec/matchers/built_in/match_array.rb#L19

...so I was initially confused by the fact that you're not seeing that error message (which corresponds to your first idea).

However, thinking about it some more, I found this:

https://github.com/rspec/rspec-expectations/blob/v2.12.0/lib/rspec/matchers.rb#L687

As you can see, the =~ matcher delegates to match_array if and only if it is operating on an Array...otherwise, it calls =~ on the given object. Hash=~ returns false, even for identical hashes:

1.9.3p327 :001 > { "a" => 3 } =~ { "a" => 3 }
 => nil 

...and that's why you're seeing the failures you're seeing. Among other things, this demonstrates again that the operator matchers, as nice as they read, add complexity, and it confirms that our choice not to support operator matchers with the new expect syntax was the right one. If you were using expect(hash).to match_array you would get the expected failure message.

My first instinct is that we can get a nice, clear failure message by forcing hash.should =~ to use our match_array matcher:

module RSpec
  module Matchers
    OperatorMatcher.register(Hash, '=~', BuiltIn::MatchArray)
  end 
end

However, this has the potential to be a breaking change. If any users have defined Hash#=~ to behave in a particular way, hash.should =~ some_other_hash may be working fine for them, and changing it so that it delegates to match_array would break their spec suites. That said, I've never heard of any users doing that, and 99% of the time that a user tries hash.should =~, they probably expect it to use the match_array behavior (or something similar). So making this change would solve the issue for most users at the risk of breaking a very small percent of users.

Thoughts?

Member

myronmarston commented Dec 2, 2012

Thanks for reporting this, @phiggins.

I would vote for your first idea. I think it's confusing to use a match_array matcher to match a hash.

Note, however, that I think it's OK for match_array to be used against other non-array collections that act array like, as long as the expected argument passed by the user is an array. some_custom_collection_type.should match_array([1, 2, 3]) reads OK I think, but [1, 2, 3].should match_array(some_non_array_object) doesn't make sense, IMO.

(That said, if you use the should =~ form, it's not nearly so obvious that it only supports passing an array as the expected argument).

Looking at the match_array matcher, it already includes code to handle this case:

https://github.com/rspec/rspec-expectations/blob/v2.12.0/lib/rspec/matchers/built_in/match_array.rb#L19

...so I was initially confused by the fact that you're not seeing that error message (which corresponds to your first idea).

However, thinking about it some more, I found this:

https://github.com/rspec/rspec-expectations/blob/v2.12.0/lib/rspec/matchers.rb#L687

As you can see, the =~ matcher delegates to match_array if and only if it is operating on an Array...otherwise, it calls =~ on the given object. Hash=~ returns false, even for identical hashes:

1.9.3p327 :001 > { "a" => 3 } =~ { "a" => 3 }
 => nil 

...and that's why you're seeing the failures you're seeing. Among other things, this demonstrates again that the operator matchers, as nice as they read, add complexity, and it confirms that our choice not to support operator matchers with the new expect syntax was the right one. If you were using expect(hash).to match_array you would get the expected failure message.

My first instinct is that we can get a nice, clear failure message by forcing hash.should =~ to use our match_array matcher:

module RSpec
  module Matchers
    OperatorMatcher.register(Hash, '=~', BuiltIn::MatchArray)
  end 
end

However, this has the potential to be a breaking change. If any users have defined Hash#=~ to behave in a particular way, hash.should =~ some_other_hash may be working fine for them, and changing it so that it delegates to match_array would break their spec suites. That said, I've never heard of any users doing that, and 99% of the time that a user tries hash.should =~, they probably expect it to use the match_array behavior (or something similar). So making this change would solve the issue for most users at the risk of breaking a very small percent of users.

Thoughts?

@phiggins

This comment has been minimized.

Show comment
Hide comment
@phiggins

phiggins Dec 3, 2012

Contributor

(That said, if you use the should =~ form, it's not nearly so obvious that it only supports passing an array as the expected argument).

That's how I got into this mess, I didn't know it was Array specific before this.

As you can see, the =~ matcher delegates to match_array if and only if it is operating on an Array...

I'll admit I am not very familiar with how rspec works internally and I didn't realize rspec was only activating the #match_array code when #=~ was called on an array.

otherwise, it calls =~ on the given object. Hash=~ returns false, even for identical hashes:

... and I had no idea until this morning that everything in ruby inherits #=~ from object. Interesting!

My first instinct is that we can get a nice, clear failure message by forcing hash.should =~ to use our match_array matcher:

In addition to your point about unexpected breakage, if this change were made we would have worked around this for Hash but other types, for example, Set, would still have the confusing error message. Making rspec treat #=~ differently for Hash shifts the inconsistency around rather than fixing it. This might have ramifications I don't understand, but what about doing something like this instead:

module RSpec
  module Matchers
    OperatorMatcher.register(Enumerable, '=~', BuiltIn::MatchArray)
  end 
end

It seems in my experimenting that non-Enumerables already have a clear enough error message (that #=~ reported false) and it is Enumerables that get the confusing one. Perhaps there is some rspec code I haven't seen that decides Enumerables are diff-able, and that's why they get the confusing error message?

This is not nearly as simple as I had thought originally. ;)

Contributor

phiggins commented Dec 3, 2012

(That said, if you use the should =~ form, it's not nearly so obvious that it only supports passing an array as the expected argument).

That's how I got into this mess, I didn't know it was Array specific before this.

As you can see, the =~ matcher delegates to match_array if and only if it is operating on an Array...

I'll admit I am not very familiar with how rspec works internally and I didn't realize rspec was only activating the #match_array code when #=~ was called on an array.

otherwise, it calls =~ on the given object. Hash=~ returns false, even for identical hashes:

... and I had no idea until this morning that everything in ruby inherits #=~ from object. Interesting!

My first instinct is that we can get a nice, clear failure message by forcing hash.should =~ to use our match_array matcher:

In addition to your point about unexpected breakage, if this change were made we would have worked around this for Hash but other types, for example, Set, would still have the confusing error message. Making rspec treat #=~ differently for Hash shifts the inconsistency around rather than fixing it. This might have ramifications I don't understand, but what about doing something like this instead:

module RSpec
  module Matchers
    OperatorMatcher.register(Enumerable, '=~', BuiltIn::MatchArray)
  end 
end

It seems in my experimenting that non-Enumerables already have a clear enough error message (that #=~ reported false) and it is Enumerables that get the confusing one. Perhaps there is some rspec code I haven't seen that decides Enumerables are diff-able, and that's why they get the confusing error message?

This is not nearly as simple as I had thought originally. ;)

@myronmarston

This comment has been minimized.

Show comment
Hide comment
@myronmarston

myronmarston Dec 4, 2012

Member

I like your Enumerable idea, but it suffers from a few problems:

  • As the OperatorMatcher registry currently exists, it only supports exact matches on the direct class of the object...not based on other superclasses/modules in the ancestor chain. We could update it to support entries in the ancestor chain but we're generally moving away from operator matchers.
  • String is an Enumerable on 1.8 (and we don't have plans to drop 1.8.7 support any time soon), but "a string".should =~ /str/ needs to delegate to String#=~ as it currently does, and your proposal would break that :(.
Member

myronmarston commented Dec 4, 2012

I like your Enumerable idea, but it suffers from a few problems:

  • As the OperatorMatcher registry currently exists, it only supports exact matches on the direct class of the object...not based on other superclasses/modules in the ancestor chain. We could update it to support entries in the ancestor chain but we're generally moving away from operator matchers.
  • String is an Enumerable on 1.8 (and we don't have plans to drop 1.8.7 support any time soon), but "a string".should =~ /str/ needs to delegate to String#=~ as it currently does, and your proposal would break that :(.
@alindeman

This comment has been minimized.

Show comment
Hide comment
@alindeman

alindeman Dec 4, 2012

Contributor

As the OperatorMatcher registry currently exists, it only supports exact matches on the direct class of the object...not based on other superclasses/modules in the ancestor chain. We could update it to support entries in the ancestor chain but we're generally moving away from operator matchers.

This option would actually be pretty appealing for rspec-rails, as I've written a bit of unfortunate code to support the =~ operator matcher in Rails 4. See the commit message for more details.

I actually need to make sure the change doesn't leak memory if Rails code is reloaded. An OperatorMatcher registry that knew about ancestors would allay this concern too.

Contributor

alindeman commented Dec 4, 2012

As the OperatorMatcher registry currently exists, it only supports exact matches on the direct class of the object...not based on other superclasses/modules in the ancestor chain. We could update it to support entries in the ancestor chain but we're generally moving away from operator matchers.

This option would actually be pretty appealing for rspec-rails, as I've written a bit of unfortunate code to support the =~ operator matcher in Rails 4. See the commit message for more details.

I actually need to make sure the change doesn't leak memory if Rails code is reloaded. An OperatorMatcher registry that knew about ancestors would allay this concern too.

@myronmarston

This comment has been minimized.

Show comment
Hide comment
@myronmarston

myronmarston Dec 4, 2012

Member

This option would actually be pretty appealing for rspec-rails, as I've written a bit of unfortunate code to support the =~ operator matcher in Rails 4. See the commit message for more details.

I actually need to make sure the change doesn't leak memory if Rails code is reloaded. An OperatorMatcher registry that knew about ancestors would allay this concern too.

I think that fixing the operator matcher registory to support ancestors is a better option. Let's do that.

I'm still not sure what the best solution for @phiggins' original issue is, though. The fact that strings are enumerable on 1.8 is a problem for the current suggestion. Any ideas, @alindeman?

Member

myronmarston commented Dec 4, 2012

This option would actually be pretty appealing for rspec-rails, as I've written a bit of unfortunate code to support the =~ operator matcher in Rails 4. See the commit message for more details.

I actually need to make sure the change doesn't leak memory if Rails code is reloaded. An OperatorMatcher registry that knew about ancestors would allay this concern too.

I think that fixing the operator matcher registory to support ancestors is a better option. Let's do that.

I'm still not sure what the best solution for @phiggins' original issue is, though. The fact that strings are enumerable on 1.8 is a problem for the current suggestion. Any ideas, @alindeman?

@alindeman

This comment has been minimized.

Show comment
Hide comment
@alindeman

alindeman Dec 4, 2012

Contributor

I'm still not sure what the best solution for @phiggins' original issue is, though. The fact that strings are enumerable on 1.8 is a problem for the current suggestion. Any ideas, @alindeman?

I don't have strong feelings, but I'm leaning toward supporting leaving it as-is. =~ was kind of magical to begin with; I think match_array is better. And since writing expect(hash).to match_array clearly makes no sense, I would hope folks would simply not write it in the first place, and therefore not get confused.

Propagating =~ to work correctly for Hashes or all Enumerables strikes me as risky. If the only gain is to avoid confusion with error messages in the specific case of hash.should =~, I'm leaning against it.

But, always willing to be convinced. Thoughts?

Contributor

alindeman commented Dec 4, 2012

I'm still not sure what the best solution for @phiggins' original issue is, though. The fact that strings are enumerable on 1.8 is a problem for the current suggestion. Any ideas, @alindeman?

I don't have strong feelings, but I'm leaning toward supporting leaving it as-is. =~ was kind of magical to begin with; I think match_array is better. And since writing expect(hash).to match_array clearly makes no sense, I would hope folks would simply not write it in the first place, and therefore not get confused.

Propagating =~ to work correctly for Hashes or all Enumerables strikes me as risky. If the only gain is to avoid confusion with error messages in the specific case of hash.should =~, I'm leaning against it.

But, always willing to be convinced. Thoughts?

@myronmarston

This comment has been minimized.

Show comment
Hide comment
@myronmarston

myronmarston Dec 4, 2012

Member

I thought of a potential solution that would allow us to do @phiggins' idea (registering the operator for Enumerable) while ensuring we don't interfere with enumerable types like String that define =~: before looking up an object's class in the registry, we can check the owner of its operator method, and if the owner is not Kernel, it means the object must have a meaningful definition for that operator, and we should not use the registered matcher in the place of the operator.

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

Member

myronmarston commented Dec 4, 2012

I thought of a potential solution that would allow us to do @phiggins' idea (registering the operator for Enumerable) while ensuring we don't interfere with enumerable types like String that define =~: before looking up an object's class in the registry, we can check the owner of its operator method, and if the owner is not Kernel, it means the object must have a meaningful definition for that operator, and we should not use the registered matcher in the place of the operator.

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

@phiggins

This comment has been minimized.

Show comment
Hide comment
@phiggins

phiggins Dec 4, 2012

Contributor

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

That seems like a good solution.

Since the proposed solutions involve modifying large bits of rails internals across projects, I'll leave you guys to it. I'm eager to see what you come up with.

Contributor

phiggins commented Dec 4, 2012

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

That seems like a good solution.

Since the proposed solutions involve modifying large bits of rails internals across projects, I'll leave you guys to it. I'm eager to see what you come up with.

alindeman added a commit to alindeman/rspec-expectations that referenced this issue Dec 4, 2012

@alindeman

This comment has been minimized.

Show comment
Hide comment
@alindeman

alindeman Dec 4, 2012

Contributor

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

Seems reasonable to me. I opened #192 to allow operator matcher's registry to consider ancestors.

Contributor

alindeman commented Dec 4, 2012

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

Seems reasonable to me. I opened #192 to allow operator matcher's registry to consider ancestors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment