Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RULES for Data.IntMap.alterF #467

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

m-renaud
Copy link
Contributor

@m-renaud m-renaud commented Dec 29, 2017

Adds afterF rewrite RULES for the Identity functor.

Effect on Performance

Identity

Since Identity a is isomorphic to a we can rewrite alterF to to use the more efficient alter which has complexity O(min(n,w)) compared to O(log n) of alterF. alter also has a smaller constant factor because it only needs to traverse the map once whereas alterF must do so twice (once to find the element and then again to delete).

Testing

  • Adds unit tests for alterF for [], Identity, and Const
  • Property test comparing the rewritten versions to versions operating on isomorphic structures

Benchmark

Benchmarks performed using GHC 8.2.2

benchmarking alterF_IdentityNoRewrite
time                 1.101 ms   (1.089 ms .. 1.113 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 1.099 ms   (1.091 ms .. 1.111 ms)
std dev              31.16 μs   (22.66 μs .. 48.93 μs)
variance introduced by outliers: 18% (moderately inflated)
               
benchmarking alterF_IdentityRewrite
time                 903.7 μs   (865.3 μs .. 946.8 μs)
                     0.989 R²   (0.981 R² .. 0.996 R²)
mean                 864.9 μs   (848.6 μs .. 886.1 μs)
std dev              63.00 μs   (50.28 μs .. 95.92 μs)
variance introduced by outliers: 59% (severely inflated)
               
benchmarking alterF_Const
time                 19.25 μs   (18.28 μs .. 20.68 μs)
                     0.971 R²   (0.962 R² .. 0.985 R²)
mean                 20.66 μs   (19.74 μs .. 21.61 μs)
std dev              3.234 μs   (2.766 μs .. 3.837 μs)
variance introduced by outliers: 94% (severely inflated)

No rewrite RULES were added for Const because the benchmark results showed no meaningful difference.

This addresses #325.

These ensure that the rewritten rules perform the same as the non-rewritten counterparts.
Benchmark results:

benchmarking alter
time                 698.5 μs   (676.3 μs .. 726.1 μs)
                     0.994 R²   (0.991 R² .. 0.999 R²)
mean                 674.8 μs   (668.6 μs .. 686.6 μs)
std dev              28.63 μs   (14.53 μs .. 45.31 μs)
variance introduced by outliers: 34% (moderately inflated)

benchmarking alterF
time                 969.5 μs   (962.1 μs .. 977.3 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 956.7 μs   (952.8 μs .. 961.1 μs)
std dev              13.98 μs   (11.21 μs .. 17.72 μs)

benchmarking alterFIdentity
time                 660.7 μs   (654.5 μs .. 673.4 μs)
                     0.994 R²   (0.990 R² .. 0.997 R²)
mean                 674.1 μs   (660.4 μs .. 691.3 μs)
std dev              49.44 μs   (38.40 μs .. 57.64 μs)
variance introduced by outliers: 61% (severely inflated)

benchmarking alterFConst
time                 14.98 μs   (13.94 μs .. 16.02 μs)
                     0.976 R²   (0.972 R² .. 0.990 R²)
mean                 15.12 μs   (14.57 μs .. 15.84 μs)
std dev              2.104 μs   (1.813 μs .. 2.236 μs)
variance introduced by outliers: 92% (severely inflated)
@m-renaud
Copy link
Contributor Author

Latest failure is due to CI issue, failed to install ghc for one of the versions.

Copy link
Contributor

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work so far! Just a few little quibbles.

Data/IntMap/Internal.hs Outdated Show resolved Hide resolved
Data/IntMap/Internal.hs Outdated Show resolved Hide resolved
Data/IntMap/Internal.hs Outdated Show resolved Hide resolved
tests/intmap-properties.hs Outdated Show resolved Hide resolved
tests/intmap-properties.hs Outdated Show resolved Hide resolved
@treeowl
Copy link
Contributor

treeowl commented Dec 30, 2017

@m-renaud, you can click the restart button (circle with arrow) next to the test that failed in the Travis interface. Or at least I think you can. If not, let me know and I'll see about getting permissions fixed.

@m-renaud
Copy link
Contributor Author

Found it, thanks! I originally wasn't logged in and was on my phone so it wasn't very apparent. Addressed your comments, PTAL.

#if MIN_VERSION_base(4,9,0)
{-# RULES
"Const specialize alterF" forall (f :: Maybe a -> Const x (Maybe a)) k m.
alterF f k m = coerce . f $ lookup k m
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for coerce here! If you really need Const . getConst here, then use it, but maybe give it a comment or a type signature. We want to use coerce only when needed to prevent silly eta expansion from adding closures and indirection at runtime.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, of course, to re-type a whole structure from the outside.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, either Const . getConst or coerce is needed because f :: Maybe a -> Const c (Maybe a) but the function returns Const c (IntMap a). Since the second type parameter for Const is phantom I would have thought that would be an appropriate use of coerce, but I also just learned about Coercible this morning so I may very well be wrong 😛.

I've changed it back to Const . getConst but I was wondering, would Const . getConst have runtime overhead or should the optimizer figure out that this is just changing a phantom parameter? It seems to me that Const . getConst is equivalent to coerce id for Const.

@treeowl
Copy link
Contributor

treeowl commented Dec 30, 2017

Const, getConst, Const . getConst, and runIdentity should all be very reliably equivalent to coerce, at least when optimizations are enabled. The issue that comes with the Identity rule is that

runIdentity . f = coerce . f =?= coerce f

Why is that last equality questionable?

coerce . undefined = \x -> coerce (undefined x) = \ _ -> undefined  /=   undefined

Yuck! Because of this, GHC will (generally at least) only turn g (runIdentity . f) into coerce g f if it either

  1. Knows that f has an arity of at least 1, or
  2. Knows that g only uses f by applying it to an argument, and doesn't do something shady like using seq on it. I believe this currently requires that g be inlined, but I could be wrong.

In practice, GHC is pretty good at this stuff, but sometimes it goofs. In those circumstances where there's a risk, it's probably best to help it out. In first-order circumstances, there's generally no reason to bother (and make the code less clear).

@m-renaud
Copy link
Contributor Author

Thanks for the explanation! Do you have any more comments, or is this ready to be merged?

@treeowl
Copy link
Contributor

treeowl commented Jan 3, 2018

I don't understand the benchmark results. What does each of them mean? Please clarify in the commit message and PR text. How does the rewrite rule change performance for Identity and for Const? Be careful; if you define your own versions of Identity and Const to test with, be sure to copy the Functor and Applicative instances from base; there's some coercion magic about, as I recall.

@treeowl
Copy link
Contributor

treeowl commented Jan 3, 2018

For Identity:

instance Functor Identity where
  fmap = coerce
instance Applicative Identity where
  pure = Identity
  (<*>) = coerce
  liftA2 = coerce

For Const,

instance Monoid m => Applicative (Const m) where
    pure _ = Const mempty
    liftA2 _ (Const x) (Const y) = Const (x `mappend` y)
    (<*>) = coerce (mappend :: m -> m -> m)

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 4, 2018

Added a section to the PR text, I can add it to the squashed commit when the branch is merged as well.

be sure to copy the Functor and Applicative instances from base

Good catch, thanks for pointing that out! The TestConst functor instance was the same, but I've updated TestIdentityto use coerce. There's no need to define an applicative instance because alterF only requires Functor.

@treeowl
Copy link
Contributor

treeowl commented Jan 4, 2018

No, I mean what does each line in the benchmark results actually represent? Most especially, what does the alterF line mean?

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 4, 2018

Ohhh, sorry, I misunderstood. I'll rename (and rework the benchmarks) as soon as I get a chance. I was thinking:

  • alterF_IdentityNoRewrite (using TestIdentity)
  • alterF_IdentityRewrite (using Identity)
  • alterF_ConstNoRewrite (using TestConst)
  • alterF_ConstRewrite (using Const)

Does that sound better?

@treeowl
Copy link
Contributor

treeowl commented Jan 4, 2018 via email

@treeowl
Copy link
Contributor

treeowl commented Jan 5, 2018

You have bona fide CI failures. It looks like you forgot to import Data.Coerce (coerce) some places.

@treeowl
Copy link
Contributor

treeowl commented Jan 5, 2018

By the way, there are buttons to restart or cancel CI runs on the Travis page. Do you see those? Please restart your own when there are silly problems, and please try to cancel your own when they're out of date or doomed to fail. Canceling useless builds saves time for the whole Haskell organization. Speaking of which, as I think I said, you should use [ci skip] where appropriate. In particular, your whole tutorial project should be [ci skip] unless and until we make Travis build those docs. Since we don't even make Travis build the Haddocks, I don't think that's going to happen anytime soon.

@treeowl
Copy link
Contributor

treeowl commented Jan 8, 2018

Can you fix up the tiny problem so this can pass CI? I want it merged!

Results:

benchmarking alterF_IdentityNoRewrite
time                 1.177 ms   (1.171 ms .. 1.184 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.162 ms   (1.156 ms .. 1.169 ms)
std dev              20.18 μs   (16.39 μs .. 26.88 μs)

benchmarking alterF_IdentityRewrite
time                 818.9 μs   (814.4 μs .. 823.2 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 808.8 μs   (804.7 μs .. 815.8 μs)
std dev              17.13 μs   (10.29 μs .. 26.35 μs)
variance introduced by outliers: 11% (moderately inflated)

benchmarking alterF_ConstNoRewrite
time                 63.80 μs   (62.22 μs .. 65.81 μs)
                     0.996 R²   (0.993 R² .. 1.000 R²)
mean                 62.13 μs   (61.61 μs .. 63.07 μs)
std dev              2.440 μs   (1.271 μs .. 4.050 μs)
variance introduced by outliers: 42% (moderately inflated)

benchmarking alterF_ConstRewrite
time                 17.24 μs   (16.35 μs .. 18.38 μs)
                     0.979 R²   (0.974 R² .. 0.990 R²)
mean                 17.92 μs   (17.18 μs .. 18.71 μs)
std dev              2.446 μs   (2.125 μs .. 2.578 μs)
variance introduced by outliers: 92% (severely inflated)
@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 8, 2018

Sorry for the delay, I only had a few minutes here and there over the weekend to look at this stuff. I've re-organized the benchmarks as discussed and fixed the import.

It also appears that the import for Data.Coerce in IntMap/Internal.hs was redundant since it's been included in __GLASGOW_HASKELL__ >= 709, do I need to add another conditional to import if base >= 4.8 and GLASGOW_HASKELL > 709?

This improves the Const benchmark for the non-rewritten case:

Before
======

benchmarking alterF_IdentityNoRewrite
time                 1.177 ms   (1.171 ms .. 1.184 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 1.162 ms   (1.156 ms .. 1.169 ms)
std dev              20.18 μs   (16.39 μs .. 26.88 μs)

benchmarking alterF_IdentityRewrite
time                 818.9 μs   (814.4 μs .. 823.2 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 808.8 μs   (804.7 μs .. 815.8 μs)
std dev              17.13 μs   (10.29 μs .. 26.35 μs)
variance introduced by outliers: 11% (moderately inflated)

benchmarking alterF_ConstNoRewrite
time                 63.80 μs   (62.22 μs .. 65.81 μs)
                     0.996 R²   (0.993 R² .. 1.000 R²)
mean                 62.13 μs   (61.61 μs .. 63.07 μs)
std dev              2.440 μs   (1.271 μs .. 4.050 μs)
variance introduced by outliers: 42% (moderately inflated)

benchmarking alterF_ConstRewrite
time                 17.24 μs   (16.35 μs .. 18.38 μs)
                     0.979 R²   (0.974 R² .. 0.990 R²)
mean                 17.92 μs   (17.18 μs .. 18.71 μs)
std dev              2.446 μs   (2.125 μs .. 2.578 μs)
variance introduced by outliers: 92% (severely inflated)

After
=====

benchmarking alterF_IdentityNoRewrite
time                 1.134 ms   (1.116 ms .. 1.151 ms)
                     0.998 R²   (0.997 R² .. 0.999 R²)
mean                 1.100 ms   (1.092 ms .. 1.112 ms)
std dev              31.57 μs   (25.08 μs .. 46.04 μs)
variance introduced by outliers: 17% (moderately inflated)

benchmarking alterF_IdentityRewrite
time                 849.4 μs   (844.1 μs .. 853.8 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 836.1 μs   (831.8 μs .. 841.5 μs)
std dev              15.55 μs   (11.92 μs .. 24.54 μs)

benchmarking alterF_ConstNoRewrite
time                 21.12 μs   (20.26 μs .. 21.66 μs)
                     0.992 R²   (0.983 R² .. 0.999 R²)
mean                 20.71 μs   (19.90 μs .. 21.08 μs)
std dev              1.685 μs   (953.0 ns .. 2.410 μs)
variance introduced by outliers: 79% (severely inflated)

benchmarking alterF_ConstRewrite
time                 17.04 μs   (16.48 μs .. 18.07 μs)
                     0.977 R²   (0.969 R² .. 0.987 R²)
mean                 18.76 μs   (17.82 μs .. 19.84 μs)
std dev              3.275 μs   (2.764 μs .. 3.809 μs)
variance introduced by outliers: 95% (severely inflated)

[ci skip]
@treeowl
Copy link
Contributor

treeowl commented Jan 8, 2018

Um.... I wasn't telling you to just change that. I was suggesting that you check what that does to the benchmarks. I want to understand why there's a difference for Const, which I don't think there should be.

This also checks that the two results are 'valid'.
@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 8, 2018

I originally had a todo to see if alterF should be marked INLINABLE [1] or NOINLINE [1] but it somehow got clobbered (I went with NOINLINE to be safe). The difference in Const mostly goes away (see benchmark results in commit description), I think the remaining difference may just be noise (but I wanted to see what you thought). With such a small difference I'm not sure the custom rules for Const are necessary now, wdyt?

Both spellings work but the docs use "INLINABLE".

[ci skip]
@treeowl
Copy link
Contributor

treeowl commented Jan 8, 2018

Oh, I missed the new benchmark results in your commit. I think you should run more tests. See what happens if you reverse the order the benchmarks run in, for example. If the difference remains, it would be nice to understand why, but we should probably keep the rule. If the difference is really just noise, scrap the rule.

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 8, 2018

I ran it several more times and changed the order of benchmarks but the results remain about the same.

@treeowl
Copy link
Contributor

treeowl commented Jan 8, 2018 via email

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 8, 2018

Woah woah woah, I ran it a few more times for good measure and got the below results which indicate that the rewrite is now slower than the non-rewritten version :/

benchmarking alterF_ConstRewrite
time                 23.26 μs   (22.48 μs .. 23.70 μs)
                     0.993 R²   (0.986 R² .. 0.997 R²)
mean                 22.07 μs   (21.30 μs .. 22.63 μs)
std dev              2.207 μs   (1.692 μs .. 2.816 μs)
variance introduced by outliers: 85% (severely inflated)
             
benchmarking alterF_ConstNoRewrite
time                 18.75 μs   (17.94 μs .. 19.86 μs)
                     0.978 R²   (0.968 R² .. 0.992 R²)
mean                 20.02 μs   (19.11 μs .. 21.07 μs)
std dev              3.218 μs   (2.715 μs .. 3.618 μs)
variance introduced by outliers: 94% (severely inflated)

@treeowl
Copy link
Contributor

treeowl commented Jan 8, 2018 via email

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 9, 2018

Done. I've left a few of the tests and benchmarks in case we want to revisit this in the future.

foldl' (\m k -> runTestIdentity $ M.alterF (TestIdentity . id) k m) m xs

altFIdentity :: [Int] -> M.IntMap Int -> M.IntMap Int
altFIdentity xs m = foldl' (\m k -> runIdentity $ M.alterF (pure . id) k m) m xs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waait a minute. You're only testing one way of using alterF! Please copy over the relevant Data.Map benchmarks and test properly! Also, pure . id is the same as pure.....

@treeowl
Copy link
Contributor

treeowl commented Jan 9, 2018

Also, please indicate in the commit what GHC version you used for benchmarking, and use GHC >= 8.2. I don't know if the new join point system affects this stuff, but I would not be at all surprised. NB: when you squash this thing, make sure the final commit message is the one with the latest/correctest benchmarks.

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 9, 2018

Updated PR description with new benchmark results and note on what GHC version used. Is rewording an already pushed commit allowed? I thought it may mess up history or something.

@treeowl
Copy link
Contributor

treeowl commented Jan 9, 2018

Speaking of GHC changes, I should probably rerun the benchmarks that led to my choice of alterF Identity specialization for Data.Map.

@treeowl
Copy link
Contributor

treeowl commented Jan 9, 2018 via email

@treeowl
Copy link
Contributor

treeowl commented Jan 9, 2018

You still need the additional benchmarks ported from Data.Map.

These benchmarks tell a different story from the previous
benchmarks. Specifically, the rewritten alterF for Identity appears to perform
better in casses when the element being altered is present, but worse when the
element is absent. Specifically:
  better for: "benchmarking alterF delete present"
  	      "benchmarking alterF alter insert"
	      "benchmarking alterF alter update"
	      "benchmarking alterF alter delete"
  worse for:  "benchmarking alterF delete absent"
  	      "benchmarking alterF alter absent"

Benchmark Results:

benchmarking alterF lookup absent
time                 123.7 μs   (123.4 μs .. 124.0 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 123.2 μs   (122.8 μs .. 124.0 μs)
std dev              1.702 μs   (961.3 ns .. 2.891 μs)

benchmarking alterF lookup present
time                 126.9 μs   (125.4 μs .. 129.2 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 127.0 μs   (126.4 μs .. 128.3 μs)
std dev              2.608 μs   (1.219 μs .. 4.415 μs)
variance introduced by outliers: 15% (moderately inflated)

benchmarking alterF no rules lookup absent
time                 123.1 μs   (122.7 μs .. 123.6 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 122.7 μs   (122.3 μs .. 123.4 μs)
std dev              1.738 μs   (1.162 μs .. 2.847 μs)

benchmarking alterF no rules lookup present
time                 125.5 μs   (125.0 μs .. 126.1 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 125.3 μs   (124.8 μs .. 126.3 μs)
std dev              2.381 μs   (1.366 μs .. 4.219 μs)
variance introduced by outliers: 13% (moderately inflated)

benchmarking alterF insert absent
time                 286.4 μs   (283.2 μs .. 291.0 μs)
                     0.996 R²   (0.992 R² .. 0.999 R²)
mean                 285.4 μs   (281.3 μs .. 292.4 μs)
std dev              16.74 μs   (10.04 μs .. 23.84 μs)
variance introduced by outliers: 55% (severely inflated)

benchmarking alterF insert present
time                 268.8 μs   (266.0 μs .. 273.2 μs)
                     0.999 R²   (0.997 R² .. 1.000 R²)
mean                 266.6 μs   (264.5 μs .. 272.0 μs)
std dev              10.35 μs   (3.766 μs .. 21.79 μs)
variance introduced by outliers: 35% (moderately inflated)

benchmarking alterF no rules insert absent
time                 278.2 μs   (275.3 μs .. 282.1 μs)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 274.2 μs   (273.0 μs .. 276.4 μs)
std dev              4.920 μs   (3.263 μs .. 7.465 μs)
variance introduced by outliers: 11% (moderately inflated)

benchmarking alterF no rules insert present
time                 263.8 μs   (262.9 μs .. 264.9 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 261.2 μs   (260.4 μs .. 262.3 μs)
std dev              3.298 μs   (2.541 μs .. 4.858 μs)

benchmarking alterF delete absent
time                 229.7 μs   (228.4 μs .. 232.3 μs)
                     0.995 R²   (0.986 R² .. 0.999 R²)
mean                 234.9 μs   (230.4 μs .. 244.7 μs)
std dev              21.96 μs   (11.47 μs .. 39.15 μs)
variance introduced by outliers: 77% (severely inflated)

benchmarking alterF delete present
time                 263.6 μs   (263.0 μs .. 264.4 μs)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 264.2 μs   (261.9 μs .. 268.9 μs)
std dev              10.23 μs   (2.598 μs .. 17.63 μs)
variance introduced by outliers: 35% (moderately inflated)

benchmarking alterF no rules delete absent
time                 121.0 μs   (120.6 μs .. 121.4 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 120.6 μs   (120.1 μs .. 122.3 μs)
std dev              2.556 μs   (872.5 ns .. 5.536 μs)
variance introduced by outliers: 15% (moderately inflated)

benchmarking alterF no rules delete present
time                 375.4 μs   (373.4 μs .. 377.8 μs)
                     0.998 R²   (0.995 R² .. 1.000 R²)
mean                 377.0 μs   (372.5 μs .. 394.4 μs)
std dev              24.73 μs   (3.491 μs .. 49.28 μs)
variance introduced by outliers: 60% (severely inflated)

benchmarking alterF alter absent
time                 241.9 μs   (235.2 μs .. 251.3 μs)
                     0.994 R²   (0.990 R² .. 1.000 R²)
mean                 237.5 μs   (234.9 μs .. 242.1 μs)
std dev              10.98 μs   (5.881 μs .. 16.30 μs)
variance introduced by outliers: 44% (moderately inflated)

benchmarking alterF alter insert
time                 265.0 μs   (263.4 μs .. 267.1 μs)
                     0.997 R²   (0.994 R² .. 0.999 R²)
mean                 274.0 μs   (267.5 μs .. 283.6 μs)
std dev              25.20 μs   (18.18 μs .. 33.32 μs)
variance introduced by outliers: 76% (severely inflated)

benchmarking alterF alter update
time                 260.8 μs   (253.6 μs .. 270.1 μs)
                     0.996 R²   (0.992 R² .. 1.000 R²)
mean                 254.8 μs   (252.4 μs .. 258.6 μs)
std dev              9.796 μs   (5.279 μs .. 15.32 μs)
variance introduced by outliers: 35% (moderately inflated)

benchmarking alterF alter delete
time                 263.9 μs   (262.8 μs .. 265.6 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 262.8 μs   (261.7 μs .. 265.0 μs)
std dev              4.963 μs   (2.882 μs .. 8.375 μs)
variance introduced by outliers: 12% (moderately inflated)

benchmarking alterF no rules alter absent
time                 142.6 μs   (137.4 μs .. 148.4 μs)
                     0.991 R²   (0.988 R² .. 0.996 R²)
mean                 137.5 μs   (135.2 μs .. 141.0 μs)
std dev              9.531 μs   (6.703 μs .. 12.40 μs)
variance introduced by outliers: 67% (severely inflated)

benchmarking alterF no rules alter insert
time                 293.5 μs   (291.7 μs .. 295.5 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 291.1 μs   (289.8 μs .. 293.1 μs)
std dev              5.100 μs   (3.695 μs .. 7.448 μs)

benchmarking alterF no rules alter update
time                 409.7 μs   (407.8 μs .. 412.1 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 406.3 μs   (404.4 μs .. 408.4 μs)
std dev              6.193 μs   (5.148 μs .. 7.883 μs)

benchmarking alterF no rules alter delete
time                 388.3 μs   (386.0 μs .. 391.0 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 386.6 μs   (384.5 μs .. 396.3 μs)
std dev              12.23 μs   (3.727 μs .. 26.30 μs)
variance introduced by outliers: 25% (moderately inflated)
@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 9, 2018

Sooo, these benchmarks tell a different story than the other ones. TL;DR; rewrite rules perform
better in casses when the element being altered is present, but worse when the element is absent. See the commit message for more details.

@treeowl
Copy link
Contributor

treeowl commented Jan 9, 2018

Hrmmm. Yeah, this kind of weirdness complicated my choices for Map as well. I ended up rewriting the Identity case not to alter but to something that (for reasons I no longer remember) had performance characteristics somewhat more closely matched to the alterF version. Or something like that. Been a long time. Maybe you can come up with something? I think we should probably consider this longer and probably let it be till the next release. The benefit isn't nearly as great as for Map because IntMap lookups are really, really cheap.

@m-renaud
Copy link
Contributor Author

m-renaud commented Jan 9, 2018 via email

@treeowl
Copy link
Contributor

treeowl commented Jan 10, 2018

The biggest reason you're running into trouble is that if the passed function takes Nothing to Nothing and the key is absent, then

  • alterF does nothing
  • alter rebuilds a path in the map

One option is to use pointer equality in alter the way we do for many functions in Data.Map: if the new subtree is the same as the old, we return our argument unchanged. Another thing to experiment with is GHC's new unboxed sum feature.

type Maybe# a = (# Void# | a #)

pattern Nothing# :: Maybe# a
pattern Nothing# <- (# _ | #) where
  Nothing# = (# void# | #)

pattern Just# :: a -> Maybe# a
pattern Just# a = (# | a #)

Now we can make recursive calls in alter return Just# a new map or Nothing#, without having to allocate a real Just constructor like we would for Maybe. We'll have to experiment to see how that performs.

@treeowl
Copy link
Contributor

treeowl commented Jan 10, 2018

To avoid confusion: Void# is not an unboxed version of Void. For hysterical raisins, it's an unboxed version of ().

@m-renaud
Copy link
Contributor Author

So just using ptrEq in alter's zero and otherwise case makes the benchmarks worse (somewhat expected, since it's doing a ptrEq at every level). I'm going to try using unboxed sum approach and see if that helps.

@treeowl
Copy link
Contributor

treeowl commented Jan 31, 2018 via email

@m-renaud
Copy link
Contributor Author

m-renaud commented Feb 1, 2018

Well, if anything I was able to make alter twice as fast using unboxed sums :) I'll move that to its own PR.

@sjakobi sjakobi added the IntMap label Jul 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants