Eq instance for HashMap #39

Closed
michalt opened this Issue May 6, 2012 · 8 comments

Comments

Projects
None yet
3 participants
@michalt
Contributor

michalt commented May 6, 2012

The current Eq instance for HashMap simply uses toList. But this doesn't quite work when the hash for two different keys is the same. Consider:

module Bug where

import Data.List ( sort )

import Data.Hashable
import qualified Data.HashMap.Strict as HashMap

a = (1, -1) :: (Int, Int)
b = (-1, 1) :: (Int, Int)

l1 = [a, b] :: [(Int, Int)]
l2 = [b, a] :: [(Int, Int)]

hm1 = HashMap.fromList (l1 `zip` [1, 1 :: Int ..])
hm2 = HashMap.fromList (l2 `zip` [1, 1 :: Int ..])

l1' = sort $ HashMap.toList hm1
l2' = sort $ HashMap.toList hm2

in GHCI:

*Bug> hash a
-34
*Bug> hash b
-34
*Bug> hm1
fromList [((1,-1),1),((-1,1),1)]
*Bug> hm2
fromList [((-1,1),1),((1,-1),1)]
*Bug> hm1 == hm2
False
*Bug> l1' == l2'
True

Also, the current instance has a comment:

-- NOTE: This is just a placeholder.
instance (Eq k, Eq v) => Eq (HashMap k v) where
    a == b = toList a == toList b

But I'm not sure how to interpret that (I'm guessing that the correct instance is on a TODO list).. Anyway, it seems to me that it would be probably better to avoid providing the above one.

@tibbe

This comment has been minimized.

Show comment Hide comment
@tibbe

tibbe May 7, 2012

Owner

Thanks for the bug report. The implementation is indeed a place-holder, but it was not intended to be incorrect (just slow.) I'm a bit snowed under at the moment. If you have the time to attempt a fix, please go ahead. A good implementation of Eq would traverse the tree, comparing elements as we go (just like e.g. Data.Map does.)

Owner

tibbe commented May 7, 2012

Thanks for the bug report. The implementation is indeed a place-holder, but it was not intended to be incorrect (just slow.) I'm a bit snowed under at the moment. If you have the time to attempt a fix, please go ahead. A good implementation of Eq would traverse the tree, comparing elements as we go (just like e.g. Data.Map does.)

michalt added a commit to michalt/unordered-containers that referenced this issue May 8, 2012

@tibbe

This comment has been minimized.

Show comment Hide comment
@tibbe

tibbe Sep 6, 2012

Owner

Fixed in 6129d50.

Owner

tibbe commented Sep 6, 2012

Fixed in 6129d50.

@tibbe tibbe closed this Sep 6, 2012

@tibbe

This comment has been minimized.

Show comment Hide comment
@tibbe

tibbe Oct 9, 2012

Owner

Gabriele Sales writes:

I'm using HashMap.Strict from unordered-containers-0.2.2.1 with ghc-7.4.2.

I may have found an issue with the filter function. On my data the I'm
getting the following:

map1 == filter pred map2 => False
map1 == fromList . toList $ filter pred map2 => True

In the first case the two maps resulted to be different, but if I
repack the second after the call to filter I'm getting that they are
equal.

I would like to give you some example data to reproduce the issue, but
at this point my code is rather large and extracting the relevant part
would require some effort. So I was thinking about an alternative.

My guess is that the equality check fails because the two maps hold
the same elements, but the internal (hidden) structures differ. Do you
happen to have some debug code pretty-printing the internal nodes of
an HashMap?

@michalt Do you mind taking a look?

Owner

tibbe commented Oct 9, 2012

Gabriele Sales writes:

I'm using HashMap.Strict from unordered-containers-0.2.2.1 with ghc-7.4.2.

I may have found an issue with the filter function. On my data the I'm
getting the following:

map1 == filter pred map2 => False
map1 == fromList . toList $ filter pred map2 => True

In the first case the two maps resulted to be different, but if I
repack the second after the call to filter I'm getting that they are
equal.

I would like to give you some example data to reproduce the issue, but
at this point my code is rather large and extracting the relevant part
would require some effort. So I was thinking about an alternative.

My guess is that the equality check fails because the two maps hold
the same elements, but the internal (hidden) structures differ. Do you
happen to have some debug code pretty-printing the internal nodes of
an HashMap?

@michalt Do you mind taking a look?

@tibbe tibbe reopened this Oct 9, 2012

michalt added a commit to michalt/unordered-containers that referenced this issue Oct 9, 2012

@michalt

This comment has been minimized.

Show comment Hide comment
@michalt

michalt Oct 9, 2012

Contributor

The patch should fix it (I've also opened a pull request).

Contributor

michalt commented Oct 9, 2012

The patch should fix it (I've also opened a pull request).

@gbrsales

This comment has been minimized.

Show comment Hide comment
@gbrsales

gbrsales Oct 10, 2012

I've put together a QuickCheck test that reliably reproduces the issue for me:

https://raw.github.com/gbrsales/unordered-containers-issue39/master/EqAfterDelete.hs

I've put together a QuickCheck test that reliably reproduces the issue for me:

https://raw.github.com/gbrsales/unordered-containers-issue39/master/EqAfterDelete.hs

tibbe added a commit that referenced this issue Oct 10, 2012

Add second regression test for issue #39
Contributed by Gabriele Sales.

michalt added a commit to michalt/unordered-containers that referenced this issue Oct 10, 2012

michalt added a commit to michalt/unordered-containers that referenced this issue Oct 17, 2012

@michalt

This comment has been minimized.

Show comment Hide comment
@michalt

michalt Oct 17, 2012

Contributor

Referenced above is my attempt at the Eq instance without structural equality. The idea is quite simple -- it's doing almost the same thing as the initial version with toList. The difference is that the case of Collision is handled with more care.

So it doesn't require structural equality, but does depend on the assumption that the lists of Leafs and Collisions when read from left to right (i.e., like a yield in a parse tree) are the same in equal HashMaps.

Also, it passes all tests.

What do you think about it? I wasn't sure if you want to merge it or just have a backup plan, so I didn't open a pull request (it's in my eq2 branch).

Contributor

michalt commented Oct 17, 2012

Referenced above is my attempt at the Eq instance without structural equality. The idea is quite simple -- it's doing almost the same thing as the initial version with toList. The difference is that the case of Collision is handled with more care.

So it doesn't require structural equality, but does depend on the assumption that the lists of Leafs and Collisions when read from left to right (i.e., like a yield in a parse tree) are the same in equal HashMaps.

Also, it passes all tests.

What do you think about it? I wasn't sure if you want to merge it or just have a backup plan, so I didn't open a pull request (it's in my eq2 branch).

@tibbe

This comment has been minimized.

Show comment Hide comment
@tibbe

tibbe Oct 18, 2012

Owner

I think I want to merge it. I also want the change that deforests the structure when there's only a single leaf in a BitmapIndexed node. I think the latter is a nice optimization. I don't yet want to enforce the invariant that this deforestation happens (which is required by your earlier == implementation). The code base isn't ready for that yet.

Could you please make the needed pull requests?

Owner

tibbe commented Oct 18, 2012

I think I want to merge it. I also want the change that deforests the structure when there's only a single leaf in a BitmapIndexed node. I think the latter is a nice optimization. I don't yet want to enforce the invariant that this deforestation happens (which is required by your earlier == implementation). The code base isn't ready for that yet.

Could you please make the needed pull requests?

@tibbe

This comment has been minimized.

Show comment Hide comment
@tibbe

tibbe Oct 18, 2012

Owner

Fixed in df6c50e.

Owner

tibbe commented Oct 18, 2012

Fixed in df6c50e.

@tibbe tibbe closed this Oct 18, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment