Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Eq instance for HashMap #39

Closed
michalt opened this Issue · 8 comments

3 participants

@michalt

The current Eq instance for HashMap simply uses toList. But this doesn't quite work when the hash for two different keys is the same. Consider:

module Bug where

import Data.List ( sort )

import Data.Hashable
import qualified Data.HashMap.Strict as HashMap

a = (1, -1) :: (Int, Int)
b = (-1, 1) :: (Int, Int)

l1 = [a, b] :: [(Int, Int)]
l2 = [b, a] :: [(Int, Int)]

hm1 = HashMap.fromList (l1 `zip` [1, 1 :: Int ..])
hm2 = HashMap.fromList (l2 `zip` [1, 1 :: Int ..])

l1' = sort $ HashMap.toList hm1
l2' = sort $ HashMap.toList hm2

in GHCI:

*Bug> hash a
-34
*Bug> hash b
-34
*Bug> hm1
fromList [((1,-1),1),((-1,1),1)]
*Bug> hm2
fromList [((-1,1),1),((1,-1),1)]
*Bug> hm1 == hm2
False
*Bug> l1' == l2'
True

Also, the current instance has a comment:

-- NOTE: This is just a placeholder.
instance (Eq k, Eq v) => Eq (HashMap k v) where
    a == b = toList a == toList b

But I'm not sure how to interpret that (I'm guessing that the correct instance is on a TODO list).. Anyway, it seems to me that it would be probably better to avoid providing the above one.

@tibbe
Owner

Thanks for the bug report. The implementation is indeed a place-holder, but it was not intended to be incorrect (just slow.) I'm a bit snowed under at the moment. If you have the time to attempt a fix, please go ahead. A good implementation of Eq would traverse the tree, comparing elements as we go (just like e.g. Data.Map does.)

@michalt michalt referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@michalt michalt referenced this issue from a commit
Commit has since been removed from the repository and is no longer available.
@michalt michalt referenced this issue from a commit in michalt/unordered-containers
@michalt michalt Fix Eq instance for HashMap (issue #39). 6129d50
@tibbe
Owner

Fixed in 6129d50.

@tibbe tibbe closed this
@tibbe
Owner

Gabriele Sales writes:

I'm using HashMap.Strict from unordered-containers-0.2.2.1 with ghc-7.4.2.

I may have found an issue with the filter function. On my data the I'm
getting the following:

map1 == filter pred map2 => False
map1 == fromList . toList $ filter pred map2 => True

In the first case the two maps resulted to be different, but if I
repack the second after the call to filter I'm getting that they are
equal.

I would like to give you some example data to reproduce the issue, but
at this point my code is rather large and extracting the relevant part
would require some effort. So I was thinking about an alternative.

My guess is that the equality check fails because the two maps hold
the same elements, but the internal (hidden) structures differ. Do you
happen to have some debug code pretty-printing the internal nodes of
an HashMap?

@michalt Do you mind taking a look?

@tibbe tibbe reopened this
@michalt michalt referenced this issue from a commit in michalt/unordered-containers
@michalt michalt Don't leave a BitmapIndexed with only one child (issue #39). 0c329b2
@michalt

The patch should fix it (I've also opened a pull request).

@gbrsales

I've put together a QuickCheck test that reliably reproduces the issue for me:

https://raw.github.com/gbrsales/unordered-containers-issue39/master/EqAfterDelete.hs

@tibbe tibbe referenced this issue from a commit
@tibbe Add second regression test for issue #39
Contributed by Gabriele Sales.
0b7ef83
@michalt michalt referenced this issue from a commit in michalt/unordered-containers
@michalt michalt Don't leave BitmapIndexed with a single Leaf child (issue #39). 74a1314
@michalt michalt referenced this issue from a commit in michalt/unordered-containers
@michalt michalt Don't depend on structural equality in Eq (issue #39) df6c50e
@michalt

Referenced above is my attempt at the Eq instance without structural equality. The idea is quite simple -- it's doing almost the same thing as the initial version with toList. The difference is that the case of Collision is handled with more care.

So it doesn't require structural equality, but does depend on the assumption that the lists of Leafs and Collisions when read from left to right (i.e., like a yield in a parse tree) are the same in equal HashMaps.

Also, it passes all tests.

What do you think about it? I wasn't sure if you want to merge it or just have a backup plan, so I didn't open a pull request (it's in my eq2 branch).

@tibbe
Owner

I think I want to merge it. I also want the change that deforests the structure when there's only a single leaf in a BitmapIndexed node. I think the latter is a nice optimization. I don't yet want to enforce the invariant that this deforestation happens (which is required by your earlier == implementation). The code base isn't ready for that yet.

Could you please make the needed pull requests?

@tibbe
Owner

Fixed in df6c50e.

@tibbe tibbe closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.