Skip to content

Loading…

#match_array / #=~ has a bad error message on equal hashes #191

Closed
phiggins opened this Issue · 9 comments

3 participants

@phiggins

As discussed with @myronmarston on twitter[1][2], when using the #=~ operator on hashes that are equal the error message displayed is confusing. This is demonstrated in an example[3].

Since hashes in ruby don't respond to #to_ary, the #=~ method isn't really applicable to them, however the error message should at least be improved. I can think of two simple ways to remedy this:

  1. Better detection of incompatible types for #=~, so using it with things that don't respond to #to_ary displays a message like "Don't use this method with this class.".
  2. Change #=~ to try #== first and report success if the things to be compared are equal.

What do you think?

1: https://twitter.com/pete_higgins/status/275320169910374401
2: https://twitter.com/pete_higgins/status/275320308058185728
3: https://gist.github.com/4190543

@myronmarston
RSpec member

Thanks for reporting this, @phiggins.

I would vote for your first idea. I think it's confusing to use a match_array matcher to match a hash.

Note, however, that I think it's OK for match_array to be used against other non-array collections that act array like, as long as the expected argument passed by the user is an array. some_custom_collection_type.should match_array([1, 2, 3]) reads OK I think, but [1, 2, 3].should match_array(some_non_array_object) doesn't make sense, IMO.

(That said, if you use the should =~ form, it's not nearly so obvious that it only supports passing an array as the expected argument).

Looking at the match_array matcher, it already includes code to handle this case:

https://github.com/rspec/rspec-expectations/blob/v2.12.0/lib/rspec/matchers/built_in/match_array.rb#L19

...so I was initially confused by the fact that you're not seeing that error message (which corresponds to your first idea).

However, thinking about it some more, I found this:

https://github.com/rspec/rspec-expectations/blob/v2.12.0/lib/rspec/matchers.rb#L687

As you can see, the =~ matcher delegates to match_array if and only if it is operating on an Array...otherwise, it calls =~ on the given object. Hash=~ returns false, even for identical hashes:

1.9.3p327 :001 > { "a" => 3 } =~ { "a" => 3 }
 => nil 

...and that's why you're seeing the failures you're seeing. Among other things, this demonstrates again that the operator matchers, as nice as they read, add complexity, and it confirms that our choice not to support operator matchers with the new expect syntax was the right one. If you were using expect(hash).to match_array you would get the expected failure message.

My first instinct is that we can get a nice, clear failure message by forcing hash.should =~ to use our match_array matcher:

module RSpec
  module Matchers
    OperatorMatcher.register(Hash, '=~', BuiltIn::MatchArray)
  end 
end

However, this has the potential to be a breaking change. If any users have defined Hash#=~ to behave in a particular way, hash.should =~ some_other_hash may be working fine for them, and changing it so that it delegates to match_array would break their spec suites. That said, I've never heard of any users doing that, and 99% of the time that a user tries hash.should =~, they probably expect it to use the match_array behavior (or something similar). So making this change would solve the issue for most users at the risk of breaking a very small percent of users.

Thoughts?

@phiggins

(That said, if you use the should =~ form, it's not nearly so obvious that it only supports passing an array as the expected argument).

That's how I got into this mess, I didn't know it was Array specific before this.

As you can see, the =~ matcher delegates to match_array if and only if it is operating on an Array...

I'll admit I am not very familiar with how rspec works internally and I didn't realize rspec was only activating the #match_array code when #=~ was called on an array.

otherwise, it calls =~ on the given object. Hash=~ returns false, even for identical hashes:

... and I had no idea until this morning that everything in ruby inherits #=~ from object. Interesting!

My first instinct is that we can get a nice, clear failure message by forcing hash.should =~ to use our match_array matcher:

In addition to your point about unexpected breakage, if this change were made we would have worked around this for Hash but other types, for example, Set, would still have the confusing error message. Making rspec treat #=~ differently for Hash shifts the inconsistency around rather than fixing it. This might have ramifications I don't understand, but what about doing something like this instead:

module RSpec
  module Matchers
    OperatorMatcher.register(Enumerable, '=~', BuiltIn::MatchArray)
  end 
end

It seems in my experimenting that non-Enumerables already have a clear enough error message (that #=~ reported false) and it is Enumerables that get the confusing one. Perhaps there is some rspec code I haven't seen that decides Enumerables are diff-able, and that's why they get the confusing error message?

This is not nearly as simple as I had thought originally. ;)

@myronmarston
RSpec member

I like your Enumerable idea, but it suffers from a few problems:

  • As the OperatorMatcher registry currently exists, it only supports exact matches on the direct class of the object...not based on other superclasses/modules in the ancestor chain. We could update it to support entries in the ancestor chain but we're generally moving away from operator matchers.
  • String is an Enumerable on 1.8 (and we don't have plans to drop 1.8.7 support any time soon), but "a string".should =~ /str/ needs to delegate to String#=~ as it currently does, and your proposal would break that :(.
@alindeman

As the OperatorMatcher registry currently exists, it only supports exact matches on the direct class of the object...not based on other superclasses/modules in the ancestor chain. We could update it to support entries in the ancestor chain but we're generally moving away from operator matchers.

This option would actually be pretty appealing for rspec-rails, as I've written a bit of unfortunate code to support the =~ operator matcher in Rails 4. See the commit message for more details.

I actually need to make sure the change doesn't leak memory if Rails code is reloaded. An OperatorMatcher registry that knew about ancestors would allay this concern too.

@myronmarston
RSpec member

This option would actually be pretty appealing for rspec-rails, as I've written a bit of unfortunate code to support the =~ operator matcher in Rails 4. See the commit message for more details.

I actually need to make sure the change doesn't leak memory if Rails code is reloaded. An OperatorMatcher registry that knew about ancestors would allay this concern too.

I think that fixing the operator matcher registory to support ancestors is a better option. Let's do that.

I'm still not sure what the best solution for @phiggins' original issue is, though. The fact that strings are enumerable on 1.8 is a problem for the current suggestion. Any ideas, @alindeman?

@alindeman

I'm still not sure what the best solution for @phiggins' original issue is, though. The fact that strings are enumerable on 1.8 is a problem for the current suggestion. Any ideas, @alindeman?

I don't have strong feelings, but I'm leaning toward supporting leaving it as-is. =~ was kind of magical to begin with; I think match_array is better. And since writing expect(hash).to match_array clearly makes no sense, I would hope folks would simply not write it in the first place, and therefore not get confused.

Propagating =~ to work correctly for Hashes or all Enumerables strikes me as risky. If the only gain is to avoid confusion with error messages in the specific case of hash.should =~, I'm leaning against it.

But, always willing to be convinced. Thoughts?

@myronmarston
RSpec member

I thought of a potential solution that would allow us to do @phiggins' idea (registering the operator for Enumerable) while ensuring we don't interfere with enumerable types like String that define =~: before looking up an object's class in the registry, we can check the owner of its operator method, and if the owner is not Kernel, it means the object must have a meaningful definition for that operator, and we should not use the registered matcher in the place of the operator.

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

@phiggins

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

That seems like a good solution.

Since the proposed solutions involve modifying large bits of rails internals across projects, I'll leave you guys to it. I'm eager to see what you come up with.

@alindeman alindeman added a commit to alindeman/rspec-expectations that referenced this issue
@alindeman alindeman OperatorMatcher considers ancestor chain
* Related #191
e5e5a74
@alindeman

This would ensure that anyone using a.should =~ b with an a that is enumerable but is relying upon this expression using a.=~ would continue to see it work.

Seems reasonable to me. I opened #192 to allow operator matcher's registry to consider ancestors.

@myronmarston myronmarston added a commit that closed this issue
@myronmarston myronmarston Provide a clear failure message when using `{}.should =~ {}`.
The match_array matcher (delegated to by `=~` for enumerables) is
not meant to be used for hashes, but the `should =~` syntax doesn't
make that obvious. Previously, you would get a failure like:

     Failure/Error: actual.should =~ actual
       expected: {:foo=>"bar"}
            got: {:foo=>"bar"} (using =~)

...which is pretty confusing. Our `match_array` matcher already
includes handling for invalid arguments (such as hashes) to return
a clear failure message, but it wasn't being used for an expression
like `{}.should =~ {}` because it was only registered as an operator
matcher for `Array`. This changes it so that it is registered as an
operator matcher for any Enumerable. This improves the failure message
for enumerable types like hash and set, but on its own, it could cause
breakage for things like 1.8 strings that are Enumerable but also
define a reasonable `=~`.  The fix here changes the operator matcher
delegation logic so that it only delegates to the registered matcher
if the object has the generic Kernel implementation of the operator.
If it has a more specific implementation, we assume the user actually
wants to match using the given operator itself.

Fixes #191.
c7a91af
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.