Overlays without the empty leaf at the end #103
This is rather annoying. So far I think this is the only aspect of benchmarking that feels to be done wrong. What do people do when benchmarking functions like |
If possible, I'd like to have a solution that covers both |
I don't think there is a possible solution: we need to test a rather small, finite set of values, so we have to choose them. For example: https://github.com/haskell-perf/sequences#indexing The sequences have a size of
Yes foldr1 :: (a -> a -> a) -> t a -> a
foldr1 f xs = fromMaybe (errorWithoutStackTrace "foldr1: empty structure")
(foldr mf Nothing xs)
where
mf x m = Just (case m of
Nothing -> x
Just y -> f x y) All values are encapsulated into Indeed this seems to be a problem, since there is for example no better version for |
Aha, interesting. Maybe in benchmarks we should also list several points, separately, and omit the very first and last elements, which give rather unstable results. As for the regression suite, I think we should measure the worst-case time -- what do you think?
Yes, indeed. Not sure what the right solution is here. |
I will drop the very first and very last points, you are right. I am already using several points (3 in the graphs and 1 outside for
The regression suite is, for me, just a warning, and thus needs to benchmark as many cases as possible. We can explain the results afterwards. For example, if we take this foolish
hasVertex :: Eq a => a -> Graph a -> Bool
hasVertex x g = (foldg False (==x) (||) (||) g) && (not $ foldg True (/=x) (&&) (&&) g)
The worst-case runtime (a vertex not in the graph) is the same for the actual
We can leave |
👍
Yes, I think you are right. As long as false positives are not too frequent, we can live with them.
This looks strangely asymmetric. Furthermore, |
Ok this one was very tricky ^^. The idea is to use: foldr1f :: (a -> a -> a) -> (b -> a) -> b -> [b] -> a
foldr1f k f = go
where
go y ys =
case ys of
[] -> f y
(x:xs) -> f y `k` go x xs which is a slightly modified overlays :: [Graph a] -> Graph a
overlays [] = empty
overlays (x:xs) = foldr1fId overlay x xs
{-# INLINE [0] overlays #-}
{-# RULES
"overlays/map"
forall f xs.
overlays (map f xs) =
case xs of
[] -> empty
(y:ys) -> foldr1f overlay f y ys
#-} (where Strangely, even with the rewrite rules firing, the time for Note also that |
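To make the idea behind the rewrite rule easier to follow, here is a minimal sketch showing that `foldr1f` maps and combines in one pass while producing the same right-nested result as `foldr1` after `map`. The `paren` combiner and the names `fused` and `unfused` are hypothetical helpers introduced only for illustration; they are not part of the library.

```haskell
-- foldr1f as quoted in the comment above.
foldr1f :: (a -> a -> a) -> (b -> a) -> b -> [b] -> a
foldr1f k f = go
  where
    go y ys =
      case ys of
        []     -> f y
        (x:xs) -> f y `k` go x xs

-- Hypothetical combiner that records how the fold associates.
paren :: String -> String -> String
paren l r = "(" ++ l ++ "+" ++ r ++ ")"

-- Fused traversal: each element is mapped with 'show' as it is combined.
fused :: String
fused = foldr1f paren show (1 :: Int) [2, 3]

-- Unfused reference: build the mapped list first, then fold it.
unfused :: String
unfused = foldr1 paren (map show [1 :: Int, 2, 3])
```

Both expressions nest to the right and never introduce an extra terminating element, which is the property the rule relies on.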
src/Algebra/Graph.hs
Outdated
@@ -303,7 +306,7 @@ vertices = overlays . map vertex
 -- 'edgeCount' . edges == 'length' . 'Data.List.nub'
 -- @
 edges :: [(a, a)] -> Graph a
-edges = overlays . map (uncurry edge)
+edges = overlays . map edge'
This change seems to be unrelated to the rest of the PR? Do you really need this? Presumably, fusion doesn't care if the function you map is edge' or uncurry edge.
Indeed, I was too happy to find something here ^^. I will open a separate issue
Interesting, this looks like a step in the right direction :)
A couple of things are unclear to me:
|
Of course, this is optimized away after!
(In the new commit) I don't know how all of this will be inlined, and I cannot restrain the inlining of
Since I use New benchs:
|
Indeed -- I didn't notice that the rewrite rules are a bit different: they use Does performance stay the same if we add separate new functions |
First of all, apologies for the benchmarks in my last comment: they are faulty, please ignore them (I was using another branch of my repo, with the same Now two things:
connects :: [Graph a] -> Graph a
connects = concatg connect
concatg :: (Graph a -> Graph a -> Graph a) -> [Graph a] -> Graph a
concatg combine = maybe empty (foldr1fId combine) . nonEmpty
{-# INLINE [0] concatg #-}
{-# RULES
"concatg/map"
forall c f xs.
concatg c (map f xs) = concatgMap c f xs
#-}
-- | Utility function for rewrite rules of 'overlays' and 'connects'
concatgMap :: (Graph a -> Graph a -> Graph a) -> (b -> Graph a) -> [b] -> Graph a
concatgMap combine f = maybe empty (foldr1f combine f) . nonEmpty
So there is now only one (pretty simple) rewrite rule, which is better!
edge' :: (a,a) -> Graph a
edge' = uncurry edge
{-# INLINE edge' #-} but of course without the
Comparing this actual PR with
|
That's great!
Could this be common subexpression elimination, mentioned by Simon in your ticket? If we use
I think yes.
So Now with
Here
I don't know if we always want |
Well, the programmer (we) decided to inline this function, manually, but GHC effectively undid the inlining by factoring it out into a new function, which reduced the performance of our program. How is this not a bug? I guess this might happen quite often with CSE, so what shall programmers do to prevent this? Introduce functions like |
Do we ? overlays . map (uncurry edge)
Here GHC only factorized up |
Yes, we wrote it inlined as
It is inlined! Compared to the same expression with
I don't understand what you mean here, but it looks like we are mixing multiple separate concerns. Let me double check: you are saying that simply by going from ... overlays . map (uncurry edge) ... to ... overlays . map edge' ...
edge' :: (a,a) -> Graph a
edge' = uncurry edge
{-# INLINE edge' #-} we get a performance increase. Is this correct? |
Yes, but remember the rewrite rule! overlays . map (uncurry edge) is changed into maybe empty (foldr1f overlay edge') . nonEmpty Without the rewrite rule there is no change in performance :)
@snowleopard The problem is mainly on little graphs, and my solution wasn't one :( I removed the
I tried but sadly,
That could be great, but a problem comes with foldMap :: Monoid m => (a -> m) -> t a -> m
{-# INLINE foldMap #-}
-- This INLINE allows more list functions to fuse. See Trac #9848.
foldMap f = foldr (mappend . f) mempty
So it adds an empty leaf at the end... And I don't know if there is a way to use rewrite rules since there is an
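A small sketch may help to see where the trailing empty leaf comes from. It assumes a purely structural `Shape` type (hypothetical, and deliberately not a lawful monoid) so that the shape of the fold stays visible, and it copies the quoted default definition into a list-specialised `foldMapList` rather than relying on any particular `Foldable` instance.

```haskell
-- Hypothetical structural type: it only records how values were combined.
-- It does NOT satisfy the monoid laws; it is used here purely for illustration.
data Shape = Nil | Leaf Int | Shape :+: Shape
  deriving (Eq, Show)

infixr 5 :+:

instance Semigroup Shape where
  (<>) = (:+:)

instance Monoid Shape where
  mempty = Nil

-- Same body as the default 'foldMap' quoted above, specialised to lists.
foldMapList :: Monoid m => (a -> m) -> [a] -> m
foldMapList f = foldr (mappend . f) mempty

-- The fold ends with 'mempty', so the result carries a trailing 'Nil'.
shape :: Shape
shape = foldMapList Leaf [1, 2]
```

Here `shape` is `Leaf 1 :+: (Leaf 2 :+: Nil)`: the final `Nil` plays the role of the empty leaf discussed in this thread.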
Which You seem to have found a generally useful primitive, and I'd like to apply it uniformly everywhere where we do folding of lists with graphs and need to avoid
I didn't mean we should be using the standard |
About concatg c (map f xs) But because
Since
Totally, I will update the PR soon :) |
OK, this doesn't build anymore with old GHC, but I wanted advice: newtype Overlaying a = Overlaying {getOverlaying :: Graph a}
deriving (Foldable, Functor, Show, Traversable)
instance Semigroup (Overlaying a) where
(<>) = coerce (overlay :: Graph a -> Graph a -> Graph a)
stimes = stimesIdempotentMonoid
instance Monoid (Overlaying a) where
mempty = Overlaying empty
newtype Connecting a = Connecting {getConnecting :: Graph a}
deriving (Foldable, Functor, Show, Traversable)
instance Semigroup (Connecting a) where
(<>) = coerce (connect :: Graph a -> Graph a -> Graph a)
stimes = stimesMonoid
instance Monoid (Connecting a) where
mempty = Connecting empty
-- [...]
edges :: [(a, a)] -> Graph a
edges = overlays . map (uncurry edge)
overlays :: [Graph a] -> Graph a
overlays = getOverlaying . sconcatM . coerce
{-# INLINE [0] overlays #-}
connects :: [Graph a] -> Graph a
connects = getConnecting . sconcatM . coerce
{-# INLINE [0] connects #-}
{-# RULES
"overlays/map" forall f xs. overlays (map f xs) = getOverlaying (sconcatMap (coerce . f) xs);
"connects/map" forall f xs. connects (map f xs) = getConnecting (sconcatMap (coerce . f) xs)
#-}
sconcatM :: Monoid m => [m] -> m
sconcatM = maybe mempty sconcat . nonEmpty
-- | Utility function for rewrite rules of 'overlays' and 'connects'
sconcatMap :: Monoid m => (b -> m) -> [b] -> m
sconcatMap f = maybe mempty (sconcatf f) . nonEmpty
-- | Function allowing fusion between 'sconcat' and a composed 'map'
sconcatf :: Semigroup a => (b -> a) -> NonEmpty b -> a
sconcatf f (a :| as) = go a as
where
go b (c:cs) = f b <> go c cs
go b [] = f b
{-# INLINABLE sconcatf #-} Is this approximately what you had in mind ? The rewrite rules are becoming a bit more complicated sadly. |
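As a quick sanity check of what `sconcatf` is meant to do, the sketch below compares it with `sconcat` applied after `fmap` on a small non-empty list, using the list semigroup. The names `viaSconcatf` and `viaSconcat` are hypothetical and exist only for this comparison.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import Data.Semigroup (sconcat)

-- sconcatf as quoted in the comment above.
sconcatf :: Semigroup a => (b -> a) -> NonEmpty b -> a
sconcatf f (a :| as) = go a as
  where
    go b (c:cs) = f b <> go c cs
    go b []     = f b

-- Map and combine in a single pass.
viaSconcatf :: [Int]
viaSconcatf = sconcatf (\x -> [x, x]) (1 :| [2, 3])

-- Reference version: map first, then combine with the standard 'sconcat'.
viaSconcat :: [Int]
viaSconcat = sconcat (fmap (\x -> [x, x]) (1 :| [2, 3]))
```

Both values are `[1,1,2,2,3,3]`, and neither needs a `mempty`, since the input is non-empty by construction.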
Interesting. Do you think this may be a mistake in What I don't understand is where the overlays1 :: NonEmpty (NonEmptyGraph a) -> NonEmptyGraph a
overlays1 = getOverlaying . sconcat . coerce
connects1 :: NonEmpty (NonEmptyGraph a) -> NonEmptyGraph a
connects1 = getConnecting . sconcat . coerce
vertices1 :: NonEmpty a -> NonEmptyGraph a
vertices1 = getOverlaying . sconcatf vertex . coerce
edges1 :: NonEmpty (a, a) -> NonEmptyGraph a
edges1 = getOverlaying . sconcatf (uncurry edge) . coerce There is no
I guess you mean we can't write simply
Yes! I would prefer to later rename |
By the way, are you sure these rules can match
Presumably, the |
I don't know at all :) It does not seem to be intentional... There is no comment concerning the inlining.
See the last paragraph.
Totally agree, these are very bad names ^^
Good question, maybe with the
I think they can't because |
I suggest you try to find this out :) Perhaps, just raise a ticket, explaining the problem. You could use the classic map fusion example -- one can't write a rule for
Yes, it might work! |
Indeed something like: newtype Overlaying a = Overlaying {getOverlaying :: a}
instance Graph a => Semigroup (Overlaying a) where
(<>) = coerce (overlay :: a -> a -> a)
stimes = stimesIdempotentMonoid
instance Graph a => Monoid (Overlaying a) where
mempty = Overlaying empty
-- [...]
overlays :: Graph g => [g] -> g
overlays = getOverlaying . sconcatM . coerce
typechecks. Thus it might certainly work ^^. I never played with the
@nobrakal Nice!
That's a problem, actually. I did quite a lot of work to decouple For now, I suggest we use the simple version, without any |
@snowleopard I think I found the solution, and without rewrite rules :) The main function is foldr1Safe :: (a -> a -> a) -> [a] -> Maybe a
foldr1Safe f = foldr mf Nothing
where
mf x m = Just (case m of
Nothing -> x
Just y -> f x y)
{-# INLINE foldr1Safe #-} So, this is how is implemented We now have edges :: [(a, a)] -> Graph a
edges = overlays . map (uncurry edge)
overlays :: [Graph a] -> Graph a
overlays = concatg overlay
concatg :: (Graph a -> Graph a -> Graph a) -> [Graph a] -> Graph a
concatg f = fromMaybe empty . foldr1Safe f And for non-empty graphs: edges1 :: NonEmpty (a, a) -> NonEmptyGraph a
edges1 = overlays1 . fmap (uncurry edge)
overlays1 :: NonEmpty (NonEmptyGraph a) -> NonEmptyGraph a
overlays1 (x :| xs) = maybe x (overlay x) $ foldr1Safe overlay xs
Benchmarks:
So what happened ?
The only trick is to |
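For readers who want to try the final approach in isolation, here is a minimal sketch on a toy graph type. The `G` type and the names `overlaysToy` and `naive` are assumptions made for this example; the real library type and functions differ.

```haskell
import Data.Maybe (fromMaybe)

-- Toy stand-in for the library's graph type (an assumption for illustration).
data G a = Empty | Vertex a | Overlay (G a) (G a)
  deriving (Eq, Show)

-- foldr1Safe as quoted in the comment above.
foldr1Safe :: (a -> a -> a) -> [a] -> Maybe a
foldr1Safe f = foldr mf Nothing
  where
    mf x m = Just (case m of
                     Nothing -> x
                     Just y  -> f x y)

-- Hypothetical analogue of 'overlays': 'Empty' is used only for an empty input.
overlaysToy :: [G a] -> G a
overlaysToy = fromMaybe Empty . foldr1Safe Overlay

-- Plain right fold, which always ends with an 'Empty' leaf.
naive :: [G a] -> G a
naive = foldr Overlay Empty
```

With the input `[Vertex 1, Vertex 2]`, `overlaysToy` gives `Overlay (Vertex 1) (Vertex 2)`, while `naive` gives `Overlay (Vertex 1) (Overlay (Vertex 2) Empty)`.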
@nobrakal This is awesome! Much cleaner solution :-) One inconsistency is that you use |
You are totally right, I have added a couple of comments too, to remind the standard equivalent behind |
@nobrakal Great! I'm happy to merge this PR if you think it's ready. |
For me all is green :) |
Merged -- thanks again! |
Hi,
As discussed in #99
This is not applied to connects because it would need to use foldl, and I don't think this is well optimized afterwards (anyway, we don't use connects in the library code).
Note that this will reduce the performance of hasEdge because of a switch from the best case to the worst case (as well as hasVertex).
The rewrite rules do not seem to interfere with other rules (like fusion of two maps after overlays).