Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some inner loops may benefit from (manual) static argument transformation #373

Open
sjakobi opened this issue Mar 9, 2022 · 2 comments
Open

Comments

@sjakobi
Copy link
Member

sjakobi commented Mar 9, 2022

Many inner loops in this package have a lot of static arguments, for example lookupCont:

lookupCont absent present !h0 !k0 !s0 !m0 = go h0 k0 s0 m0
where
go :: Eq k => Hash -> k -> Int -> HashMap k v -> r
go !_ !_ !_ Empty = absent (# #)
go h k _ (Leaf hx (L kx x))
| h == hx && k == kx = present x (-1)
| otherwise = absent (# #)
go h k s (BitmapIndexed b v)
| b .&. m == 0 = absent (# #)
| otherwise =
go h k (s+bitsPerSubkey) (A.index v (sparseIndex b m))
where m = mask h s
go h k s (Full v) =
go h k (s+bitsPerSubkey) (A.index v (index h s))
go h k _ (Collision hx v)
| h == hx = lookupInArrayCont absent present k v
| otherwise = absent (# #)
{-# INLINE lookupCont #-}

In bytestring we've sped up some functions by removing these arguments from the inner loop. See e.g. haskell/bytestring#273, haskell/bytestring#347. It would be good to check if u-c can benefit from similar transformations.

Note that GHC also has a -fstatic-argument-transformation option, but issues like GHC19285 suggest that it may not be in a usable state.

This GHC wiki page is interesting though: https://gitlab.haskell.org/ghc/ghc/-/wikis/static-argument-transformation

@sjakobi
Copy link
Member Author

sjakobi commented Mar 15, 2022

two.go might be a good candidate:

two :: Shift -> Hash -> k -> v -> Hash -> HashMap k v -> ST s (HashMap k v)
two = go
where
go s h1 k1 v1 h2 t2
| bp1 == bp2 = do
st <- go (s+bitsPerSubkey) h1 k1 v1 h2 t2
ary <- A.singletonM st
return $ BitmapIndexed bp1 ary
| otherwise = do
mary <- A.new 2 $! Leaf h1 (L k1 v1)
A.write mary idx2 t2
ary <- A.unsafeFreeze mary
return $ BitmapIndexed (bp1 .|. bp2) ary
where
bp1 = mask h1 s
bp2 = mask h2 s
idx2 | index h1 s < index h2 s = 1
| otherwise = 0

All but the s argument are static.

@sjakobi
Copy link
Member Author

sjakobi commented Mar 20, 2022

Thinking about this issue, I noticed an important difference to bytestring: In this library, most loops are bounded with a fairly small limit:

  • Loops over the arrays of BitmapIndexed or Full have at most 32 iterations.
  • Loops to the depth of the tree have at most ceiling (64 / 5) = 13 iterations (if I correctly understand the data structure)

For such loops the overhead of the static arguments may be fairly negligible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant