Seeing different hash values when the hasher is wrapped in a newtype #40

Closed
ahornby opened this issue Jul 26, 2020 · 2 comments

@ahornby

ahornby commented Jul 26, 2020

I'm seeing different hash values from ahash when the AHasher is wrapped in a newtype.

I've written a test showing the difference in
ahornby@fa76d4e which is testable with cargo test.

Would you expect this to work? I was expecting only the address of the factory to matter, and whether the constructed hasher is wrapped or not to make no difference.

I don't see the difference if using the default hasher from HashMap (the commit adds a test case for that as well).

@tkaitchuck
Owner

@ahornby This is due to a combination of two factors:

  1. Your implementation does not override and delegate all of the methods of Hasher, so some of them fall back to their default implementations.
  2. The very unfortunate implementation of string's hashcode in the standard library.

String's hash function does a double deref to get a [u8] and delegates to that.
The hash for [u8] is here: https://doc.rust-lang.org/beta/src/core/hash/mod.rs.html#667-672
As you can see, it explicitly passes in the length separately. This is extremely unfortunate because it costs performance: Hasher implementations have to assume that such a call is not made and incorporate the length themselves, because this is done in other types including arrays. (The specialization feature of aHash works around this, but requires nightly.)
So this method is going to call write_usize(). aHash overrides this method, but your wrapping hasher does not, so the call instead goes to the default implementation here: https://doc.rust-lang.org/src/core/hash/mod.rs.html#322, which differs from how it is implemented in aHash.
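
A minimal sketch of the effect, using only the standard library. `ToyHasher` is a hypothetical stand-in for AHasher (it is not aHash's algorithm); the only property it shares is that it overrides `write_usize` with logic that is not equivalent to passing the same bytes to `write`. `PartialWrapper` and `hash_with` are likewise illustrative names.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for AHasher (assumption: NOT aHash's real algorithm).
#[derive(Default)]
struct ToyHasher {
    state: u64,
}

impl Hasher for ToyHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state = self.state.rotate_left(5) ^ u64::from(b);
        }
    }

    // Specialized integer path: deliberately not the same as
    // `write(&i.to_ne_bytes())`, which is what the trait default does.
    fn write_usize(&mut self, i: usize) {
        self.state = (self.state ^ i as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// Newtype that forwards only the two required methods.
#[derive(Default)]
struct PartialWrapper(ToyHasher);

impl Hasher for PartialWrapper {
    fn write(&mut self, bytes: &[u8]) {
        self.0.write(bytes);
    }

    fn finish(&self) -> u64 {
        self.0.finish()
    }
    // `write_usize` is NOT forwarded, so the trait's default runs instead
    // and ToyHasher's override is never reached.
}

/// Hash `value` with a freshly constructed hasher of type `H`.
fn hash_with<H: Hasher + Default, T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = H::default();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Hashing a byte slice feeds the length to the hasher via `write_usize`.
    let data: &[u8] = b"hello";
    let direct = hash_with::<ToyHasher, _>(data);
    let wrapped = hash_with::<PartialWrapper, _>(data);
    println!("direct = {:#x}, wrapped = {:#x}", direct, wrapped);
    assert_ne!(direct, wrapped);
}
```

The two results differ only because the length prefix takes a different path in each hasher; the bytes of the slice itself go through the same `write` in both cases.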

So you can make your test pass if you have NewTypeHasher override every method on Hasher and delegate all of them.
That would work and I would expect it to continue to work. However, it is worth noting that it is fragile. If a method with a default implementation is added to the Hasher trait (as has happened in the past), your code would still compile. But aHash could then override that new method, and suddenly you would be back in the situation you are in now, where the hashes don't match.
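
A sketch of what full delegation could look like, again with a hypothetical `ToyHasher` standing in for AHasher (not aHash's real algorithm). `FullWrapper` and `hash_with` are illustrative names, and the list of forwarded methods is the set I believe is stable on `Hasher` at the time of writing, so treat it as an assumption to re-check against the current docs.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for AHasher (assumption: NOT aHash's real algorithm).
#[derive(Default)]
struct ToyHasher {
    state: u64,
}

impl Hasher for ToyHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &b in bytes {
            self.state = self.state.rotate_left(5) ^ u64::from(b);
        }
    }

    // Overridden integer path that differs from the trait default.
    fn write_usize(&mut self, i: usize) {
        self.state = (self.state ^ i as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    }

    fn finish(&self) -> u64 {
        self.state
    }
}

/// Newtype that forwards every method, so no trait default is ever used
/// in place of an override on the inner hasher.
#[derive(Default)]
struct FullWrapper(ToyHasher);

impl Hasher for FullWrapper {
    fn finish(&self) -> u64 { self.0.finish() }
    fn write(&mut self, bytes: &[u8]) { self.0.write(bytes) }
    fn write_u8(&mut self, i: u8) { self.0.write_u8(i) }
    fn write_u16(&mut self, i: u16) { self.0.write_u16(i) }
    fn write_u32(&mut self, i: u32) { self.0.write_u32(i) }
    fn write_u64(&mut self, i: u64) { self.0.write_u64(i) }
    fn write_u128(&mut self, i: u128) { self.0.write_u128(i) }
    fn write_usize(&mut self, i: usize) { self.0.write_usize(i) }
    fn write_i8(&mut self, i: i8) { self.0.write_i8(i) }
    fn write_i16(&mut self, i: i16) { self.0.write_i16(i) }
    fn write_i32(&mut self, i: i32) { self.0.write_i32(i) }
    fn write_i64(&mut self, i: i64) { self.0.write_i64(i) }
    fn write_i128(&mut self, i: i128) { self.0.write_i128(i) }
    fn write_isize(&mut self, i: isize) { self.0.write_isize(i) }
}

/// Hash `value` with a freshly constructed hasher of type `H`.
fn hash_with<H: Hasher + Default, T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = H::default();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let data: &[u8] = b"hello";
    assert_eq!(
        hash_with::<ToyHasher, _>(data),
        hash_with::<FullWrapper, _>(data)
    );

    let pair = (7u32, 9usize);
    assert_eq!(
        hash_with::<ToyHasher, _>(&pair),
        hash_with::<FullWrapper, _>(&pair)
    );
    println!("wrapped and unwrapped hashes match");
}
```

The fragility mentioned above shows up here as the hand-maintained method list: any method this wrapper does not forward silently falls back to the trait default.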

@ahornby
Author

ahornby commented Jul 27, 2020

@tkaitchuck thanks for the great explanation! Yep, the difference goes away once all methods are wrapped.

ahornby closed this as completed Jul 27, 2020