As mentioned in Issue #30, Integral types now hash to themselves. This leads to several problems.
In general this behaviour would be fine if the combining function compensated for it. The current implementation, however, uses a simple xor, which has the following property:
xor 0 a = a
xor a 0 = a
This makes the following hashes collide:
hash [] == hash [0,0,0,0]
hash [1,0,0] == hash [0,1,0] == hash [1]
As mentioned in the old issue, languages like Python also hash Ints to themselves. But they got the chaining right. In Python:
hash ((0)) = 0
hash ((0,0)) = 3713080549408328131
Furthermore, the use of xor leads to miserable avalanche behaviour for chains of Ints. For example:
hash [1,2] == hash 3
hash [1,2,3] == hash 0
This leads to a ton of hash collisions and all kinds of funky errors.
This property naturally extends to generic instances, which affects a wide range of applications. For example:
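To make the collisions above easy to reproduce, here is a minimal sketch of the model described in this issue: Ints hash to themselves and element hashes are folded together with plain xor. The names hashInt and hashList are hypothetical stand-ins, not the library's actual functions, and the starting value of 0 is an assumption.

```haskell
import Data.Bits (xor)
import Data.List (foldl')

-- Hypothetical stand-in for the identity hash on Int described in the issue.
hashInt :: Int -> Int
hashInt = id

-- Hypothetical list hash: xor all element hashes together, starting from 0.
-- This models the behaviour described above; it is not the library's code.
hashList :: [Int] -> Int
hashList = foldl' xor 0 . map hashInt

main :: IO ()
main = do
  -- Lists whose elements xor to the same value get the same hash.
  print (map hashList [[], [0,0,0,0], [1,0,0], [0,1,0], [1]])
  print (hashList [1,2] == hashInt 3)    -- True, since 1 xor 2 == 3
  print (hashList [1,2,3] == hashInt 0)  -- True, since 1 xor 2 xor 3 == 0
```

Under this model neither the order nor the number of elements matters, only the xor of all element values.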
data Salary = Salary { month :: Int, salary :: Int }
hash (Salary 1 2) == hash (Salary 0 3)
hash (Salary 0 1) == hash (Salary 1 0)
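The record case can be sketched the same way. The hashSalary function below is a hypothetical hand-written hash that mirrors what this issue describes for a derived instance (identity hash per Int field, fields combined with plain xor); it is an assumption for illustration, not the generic instance itself.

```haskell
import Data.Bits (xor)

data Salary = Salary { month :: Int, salary :: Int }

-- Hypothetical hash: each Int field hashes to itself and the field
-- hashes are combined with plain xor, as described in the issue.
hashSalary :: Salary -> Int
hashSalary (Salary m s) = m `xor` s

main :: IO ()
main = do
  print (hashSalary (Salary 1 2) == hashSalary (Salary 0 3))  -- True
  print (hashSalary (Salary 0 1) == hashSalary (Salary 1 0))  -- True
```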
The combining function was good when we had a good hash function for integers. Unfortunately, the hash function got reverted back to the identity function, but the combining function remained the same as it had been :-(
We don't really care that tuples of different size (e.g. (0, 0) and (0, 0, 0)) hash to the same value, as the only way you end up using both instances simultaneously is if you use an existential wrapper type, in which case you're already in deep waters.
Aside: 0 is only a left identity of the current combine implementation (which uses multiplication and xor), not a right identity.
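For illustration, a combine of the multiply-then-xor shape might look like the sketch below. The exact definition and the constant (the 32-bit FNV prime) are assumptions made here, not quoted from the library source.

```haskell
import Data.Bits (xor)

-- Assumed shape of the combining step: multiply the accumulated hash by a
-- constant, then xor in the next hash. The constant is an assumption.
combine :: Int -> Int -> Int
combine h1 h2 = (h1 * 16777619) `xor` h2

main :: IO ()
main = do
  print (combine 0 5 == 5)  -- True: 0 on the left leaves the value unchanged
  print (combine 5 0 == 5)  -- False: 0 on the right yields 5 * 16777619
```

With this shape a 0 on the right still changes the accumulated value, so trailing zeros do not vanish the way they would under plain xor.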