-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
( odd x && odd y) ==> hash (x,y) == hash (negate x, negate y ) #270
Comments
also,
|
Interesting. I guess it's not a problem with vanilla FNV1 as it hashes byte at the time, so That's unfortunate. I would like to mix-in I wonder whether there are hash function with machine integers (Word64) as native input type (instead of single bytes). EDIT: Now as I think of it, given that internal state is 64bit, the 128 bit input will always have collisions. |
a quick search turned up this comment "... will provide poorer mixing of the bits" and gives the example of bit flips - I guess that's similar to flipping signs, observed in the present issue This answers the comment in the source
because ... it is bad? The code ( I am experimenting with
and it does look better (w.r.t. collisions. for run-time - did not measure but it should probably be unrolled) |
Will do (later today). I am thinking both:
This is a change (different trade-off between no. of collisions and time for computing hash function) that might affect a lot of users of hashable/unordered-containers? So, "4 rounds not 8" should be scrutinized further? (with proofs, or experiments) On the other hand, perhaps not so much code is hashing numbers and tuples. Else, people would have complained already? But the change also applies to every generic instance of Hashable (does it?) Perhaps the saving grace is that instances for bytestring (and similar) were using byte-wise computation already, and remain as they are? |
It does as we hash the constructor number. Good point, that can be changed to hash as previously. (I have to see a Generic instance of type with over 65536 constructors to compile, so hashing three zeros doesn't add much). That said, the exact values of
I should add that to the package description as well. (EDIT: Done) |
I checked your 0cfa4fd . The collisions that I was complaining about, are gone. NB: and I was thinking: for the SAT/SMT encoding folks (like me), it's a nice challenge to find a small collision (for |
What you mean by small? I guess you want to ask SMT for a counterexample of "there aren't simple bit flip collisions for hash(x,y)" |
To add to the last, don't spend too much time. I'll investigate how much breakage switching |
e.g., I did use leancheck for enumeration, but with the improved code now that takes too long (which is a good sign - there are no small collisions) or something like http://www.isthe.com/chongo/tech/comp/fnv/#zero-hash
that, of course, depends on the hashmap key type in the application. For me, right now, it's not (byte)strings but trees. I think I want hash-consing, really. |
The type now contains a hash of the module name. This hash is used for comparisons. When a TopLevelModuleName is created it is checked that the hash does not collide with previously seen hashes. The type RawTopLevelModuleName (without the hash) was added. A mapping between RawTopLevelModuleNames and ModuleNameHashes is maintained in the state. The compiler backend interface (Backend') now uses TopLevelModuleName instead of ModuleName.
this should not happen?
I think it has implications for other types, as in
this is with hashable-1.4.2.0, ghc-9.6.2
The text was updated successfully, but these errors were encountered: