Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Ord instances for IntSet and IntMap #470

Open
treeowl opened this issue Jan 1, 2018 · 14 comments
Open

Improve Ord instances for IntSet and IntMap #470

treeowl opened this issue Jan 1, 2018 · 14 comments

Comments

@treeowl
Copy link
Contributor

treeowl commented Jan 1, 2018

The current Ord instances for IntSet and IntMap don't take advantage of the way these types are structured at all. We may therefore be leaving some performance on the table. In particular:

  1. I don't think we should need to convert the maps to lists for comparison.
  2. We can recognize certain special arrangements of keys, prefixes, and perhaps masks to avoid following a bunch of pointers for nothing. For example, if one map has a negative prefix and the other has a (strictly) positive prefix, we are done.
@Zemyla
Copy link

Zemyla commented Jul 17, 2019

If nothing else, we should have use lists with unpacked Ints.

data KeyList
  = LEnd
  | LCons {-# UNPACK #-} !Key KeyList
  deriving (Eq, Ord)

data KeyPairList a
  = PEnd
  | PCons {-# UNPACK #-} !Key a (KeyPairList a)
  deriving (Eq, Ord)

And then you have pre-inlined functions to turn an IntSet into a KeyList and an IntMap into a KeyPairList.

Though what would probably be even better, at least for the KeyList case, would be:

data KeyList
  = LEnd
  | LCons {-# UNPACK #-} !Prefix {-# UNPACK #-} !BitMap KeyList
  deriving (Eq)

instance Ord KeyList where
  compare LEnd LEnd = EQ
  compare LEnd _ = LT
  compare (LCons prea bita la) (LCons preb bitb lb)
    | prea < preb = LT
    | prea > preb = GT
    | otherwise = case compare (revNat bita) (revNat bitb) of
        EQ -> compare la lb
        c  -> c

This way, there's only one entry per Tip node in each IntSet. This sort of thing would probably also speed up Show IntSet and Show IntMap as well.

EDIT: However, for those, the best bet may be just rewriting it so that the fold over the container constructs the desired string directly, such as:

instance Show IntSet where
  showsPrec p is = showParen (p > 10) $ \str -> showString "fromList [" $ either id id $ foldr showOne (Left (']':str)) is where
    showOne !i r = Right $ shows i $ either id ((:) ',') r

I'm pretty sure the definition for showList @Int isn't going to change from the default any time in the near future, is it? Similarly, because the liftShowWith2 instance for (,) is the default as well, we can inline the Show instance for IntMap the same way.

instance Show a => Show (IntMap a) where
  showsPrec p im = showParen (p > 10) $ \s -> showString "fromList [" $ either id id $ foldrWithKey showOne (Left (']':s)) im where
    showOne !i a r = Right $ showChar '(' $ shows i $ showChar ',' $ shows a $ showChar ')' $ either id ((:) ',') r

This means that we don't have to worry about creating and then destroying intermediate lists, boxed Ints, and tuples.

@treeowl
Copy link
Contributor Author

treeowl commented Jul 17, 2019

I'm not so sure about revNat being the right approach (it's kind of heavy). I think something like this might work:

instance Ord KeyList where
  compare LEnd LEnd = EQ
  compare LEnd _ = LT
  compare (LCons prea bita la) (LCons preb bitb lb)
    | prea < preb = LT
    | prea > preb = GT
    | lbm <- lowestBitMask (bita `xor` bitb)
    = compare (bita && lbm) (bitb && lbm) <> compare la lb

Still, it would be nice to use the IntMap/IntSet structure a bit...

@jwaldmann
Copy link
Contributor

Benchmarks? Ideally, derived from actual use cases.

Where do we need Ord on Sets? When we take a Set (or Map) of sets.
As in the power-set construction, for making deterministic automata?

I have something at https://gitlab.imn.htwk-leipzig.de/autotool/all0/blob/master/fa/src/Autolib/NFA/Det.hs but it needs to be extracted before being useful.

jwaldmann pushed a commit to jwaldmann/containers that referenced this issue Jul 18, 2019
@jwaldmann
Copy link
Contributor

jwaldmann commented Jul 18, 2019

stand-alone NFA determinisation in this branch: https://github.com/jwaldmann/containers/commits/instance-Ord-IntSet

Example profile - see below. I think it shows that indeed, Data.IntSet.Internal.compare is exercised here.

        Fri Jul 19 00:41 2019 Time and Allocation Profiling Report  (Final)

           ord-intset-benchmarks +RTS -P -RTS -q det/hard/n=16

        total time  =        7.55 secs   (7550 ticks @ 1000 us, 1 processor)
        total alloc = 13,662,565,528 bytes  (excludes profiling overheads)

COST CENTRE        MODULE                            SRC                                                %time %alloc  ticks     bytes

shiftRL            Utils.Containers.Internal.BitUtil src/Utils/Containers/Internal/BitUtil.hs:100:1-22   27.2   18.1   2055 2470760640
revNat             Data.IntSet.Internal              src/Data/IntSet/Internal.hs:(1464,1)-(1469,98)      26.1   33.4   1971 4568460432
toAscList          Data.IntSet.Internal              src/Data/IntSet/Internal.hs:1016:1-24               18.1   37.1   1365 5068974496
shiftLL            Utils.Containers.Internal.BitUtil src/Utils/Containers/Internal/BitUtil.hs:101:1-22   15.4    7.7   1162 1055309648
compare            Data.IntSet.Internal              src/Data/IntSet/Internal.hs:1160:5-57                6.4    0.0    483        16
findWithDefault.go Data.IntMap.Internal              src/Data/IntMap/Internal.hs:(625,5)-(630,16)         1.4    0.0    107         0
det.go.ts.next     Main                              benchmarks/OrdIntSet.hs:(64,26)-(66,72)              0.5    1.0     34 138936336

But - it really is never a good idea to use Set IntSet (or similar). It's at least as expensive as Set [Int].

NB: I wonder how poeople do implement the powerset construction (from NFA to DFA). Perhaps the answer is "they don't" (and go from regular expression to DFA directly).

So, are there other applications where one needs an (optimized) instance Ord IntSet?

@jwaldmann
Copy link
Contributor

I am trying this https://github.com/jwaldmann/containers/blob/instance-Ord-IntSet/containers-tests/benchmarks/OrdIntSet.hs#L76

It seems to be working for natural numbers, not for negative keys, since they break the assumption that in Bin p m l r, all the keys in l are smaller than all in r.

*Main> Bin p m l r = fromList [-1,0]
*Main> (l,r)
(fromList [0],fromList [-1])

But these cases should be easy to detect? I could use some ideas here (also for relateBM) @treeowl @int-e

jwaldmann added a commit to jwaldmann/containers that referenced this issue Jul 21, 2019
jwaldmann added a commit to jwaldmann/containers that referenced this issue Jul 21, 2019
@jwaldmann
Copy link
Contributor

Yes! I get a nice speed-up (around 3), and much reduced allocation.

@treeowl
Copy link
Contributor Author

treeowl commented Jul 21, 2019 via email

@jwaldmann
Copy link
Contributor

In my benchmarks (NFA -> DFA) I think the IntSet always fits into one Tip. I will spread out the numbers and see what happens there.

jwaldmann added a commit to jwaldmann/containers that referenced this issue Jul 22, 2019
@jwaldmann
Copy link
Contributor

I now have a version that I think is correct (by enumerative testing, it agrees with the original version, and for the NFA to DFA conversion, automata have expected size). Performance is roughly

  • for small trees (Tip only): 30 percent runtime
  • for large trees (each Tip is singleton): 70 percent runtime (both versions walk the tree in the same way but I don't do any allocation)

I am sure the code can be golfed and micro-optimised (using equivalences of bitwise operations). I don't have much experience with that.

@treeowl
Copy link
Contributor Author

treeowl commented Jul 22, 2019 via email

@jwaldmann
Copy link
Contributor

actually it's three parts

  • implementation
  • test for correctness (compare s t == compare (toAscList s) (toAscList t))
  • benchmark

@treeowl
Copy link
Contributor Author

treeowl commented Jul 22, 2019 via email

@jwaldmann
Copy link
Contributor

Are these properties (that Eq and Ord of IntSet go via toAscList) announced anywhere? If not, that deserves a separate discussion?

@jwaldmann
Copy link
Contributor

jwaldmann commented Jul 22, 2019

Oh I see now that intset-properties.hs already contains

prop_ord :: IntSet -> IntSet -> Bool
prop_ord s1 s2 = s1 `compare` s2 == toList s1 `compare` toList s2

but it's hidden deep down the file. I'd expected it to go at the top (following the order of the declarations:
data type, instances, operations).

And it really should be toAscList.

jwaldmann added a commit to jwaldmann/containers that referenced this issue Jul 22, 2019
that avoids toAscList and walks the tree directly. See haskell#470
treeowl pushed a commit that referenced this issue Dec 22, 2019
* Test that instances for Eq and Ord agree with going via toAscList

* Add benchmark for "instance Ord IntSet", using "Set IntSet"

* Improve implementation of "instance Ord IntSet"
that avoids toAscList and walks the tree directly. See #470
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants