-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve edges function of Algebraic.Graph #71
Comments
Of course graph-creation will be slightly longer, maybe allow the user to choose (between a more "alga" structure and a quick one) can be a good idea. |
One little but significant advantage of the current implementation of I think it's reasonable to add a new function, perhaps called In general, this is an interesting research question, how to efficiently turn a list of edges into a compact algebraic representation. One of the most promising approaches is to use modular decomposition. |
So concerning edgesOrd :: Ord a => [(a, a)] -> Graph a
edgesOrd = Map.foldrWithKey (\k v acc -> Overlay acc $ Connect (Vertex k) $ vertices1 v) Empty . Map.fromListWith (<>) . map (fmap return)
where
vertices1 (x :| xs) = foldr (Overlay . vertex) (vertex x) xs It build an adjacency map and convert it to a Then benchmarks are rather good, except for some functions ^^:
So the main comment is that it is a good idea, but can lead to some problems (see
|
@nobrakal Interesting :) I think your implementation is not lazy enough (you build the whole map before producing the graph) and also tries to do too much work by eliminating duplicated edges, which leads to significant slowdown on some graph algorithms. Does switching to What about trying an intermediate solution, where you build something like Perhaps, you could even use something simpler like https://hackage.haskell.org/package/base-4.11.1.0/docs/Data-List.html#v:group Keep in mind that |
Am I trying to remove duplicated edges ? The type of the map is But you are right, I will explore others solutions :) |
Oops, I didn't spot that you use |
Yep, you were right, using edgesEq :: Eq a => [(a, a)] -> Graph a
edgesEq = foldr (\(v,l) g -> Overlay g $ Connect (Vertex v) l) Empty . groupByWithVertices
groupByWithVertices :: Eq a => [(a,a)] -> [(a,Graph a)]
groupByWithVertices [] = []
groupByWithVertices (x:xs) = (fst x, vertices1 x ys) : groupByWithVertices zs
where
eq = on (==) fst
(ys,zs) = span (eq x) xs
vertices1 x = foldr (Overlay . vertex . snd) (vertex $ snd x) This behave almost like the previous try, but require only an Because it is the same idea than my previous comment, resulting graphs are the same than in my previous benchmarks, and thus the benchmarks are equivalents for the version of this comment. |
@nobrakal Good! So, you are saying with the new version, |
Here is a detailed benchmark of
I am a bit lost since the slowdown really come from "simple" Even stranger, for a |
@nobrakal You can test this hypothesis very easily: just generate a balanced P.S.: Comparing to |
Here are some benchs, using:
So the balanced option is less efficient than the one obtained with Also, I spotted something, with the benchs: I was not testing the last vertex of the domain, and this test is faster (the same problem that we had when improving
But even for the "better positioned" vertex, the time is not so great (comparing to the |
Your comment about the difference between
The function used was: edges :: [(a, a)] -> Graph a
edges = overlays . map (uncurry edge)
edgesEq :: Eq a => [(a, a)] -> Graph a
edgesEq = foldr (\(v,l) -> Overlay (Connect (Vertex v) l)) Empty . groupByWithVertices
groupByWithVertices :: Eq a => [(a,a)] -> [(a,Graph a)]
groupByWithVertices [] = []
groupByWithVertices (x:xs) = (fst x, vertices1 x ys) : groupByWithVertices zs
where
eq = on (==) fst
(ys,zs) = span (eq x) xs
vertices1 x = foldr (Overlay . vertex . snd) (vertex $ snd x) I am currently running the whole suite to see if it is introducing other problems ! |
Now |
@nobrakal So, speaking very roughtly, the algorithms get a small (say 30%) speed up at the cost of 2x slower graph construction. I guess this is a good tradeoff, but I hope we can find further improvements for the |
Yep, this is certainly possible, requiring an I played a bit, and came to an horrible function I won't show here (one can find it at: https://github.com/nobrakal/alga/blob/edgesOrd/src/Algebra/Graph.hs#L313 ). Roughly the idea I had was to build an adjacency map, and this time, to use it: Each time I am trying to connect two vertices x and y, I search if the two are connected to some vertices, c d e, so I can create: So it compresses greatly:
But, sadly, I cannot come to very descent performances:
Using specialized
The benchmarks:
Again an issue with I don't know if it worth investigating and optimizing this solution |
Ok, I have managed to find a better version of
It is 10 times slower than a regular Code is here: https://github.com/nobrakal/alga/blob/edgesOrd/src/Algebra/Graph.hs#L314 I will investigate further improvements :) |
I made some further improvements (https://github.com/nobrakal/alga/blob/edgesOrd/src/Algebra/Graph.hs#L312) With a general
Specialized to
|
@nobrakal Many thanks for ongoing experiments! The latest solution gives around 50% speed up across API, but it's quite complex and doesn't seem to provide any useful guarantees/properties for resulting expressions. We could, of course, include it into the API with a comment "This function is around 5x slower than If we are prepared to pay a lot in terms of performance and complexity, perhaps, it is worth looking into a more principled approach, for example, based on modular graph decomposition? Then we will at least be able to provide certain guarantees about the result, for example, that all graph modules have been collapsed into sub-expressions. Of course, it's tempting to add P.S.: By the way, I guess you could use the |
Perhaps, (But your latest implementation may be doing more than just |
Yep, we only know that Note that the previous
When building a graph with
Obviously this is the better thing to do :)
It certainly worth a try, but as you said this a complex thing, and the implementation has to be a bit optimized if don't want to take many time to build a graph !
I build a list of |
I don't understand. Let's assume we add the following function to the API: stars :: [(a, [a])] -> Graph a
stars = overlays . map (uncurry star) Then, given |
Ah this way, it is ok ! I just need to replace |
Sadly, using
(As a reminder, without
) |
@nobrakal Strange. Presumably, you should have exactly the same code in your implementation, since you are getting a bunch of stars in the end, isn't it? Could you explain the slowdown? |
You are right. I have reworked it to use non-empty lists when possible and thus avoiding a pattern match on the empty list, and I dropped the result to a:
What left of the slowdown comes I think to an optimization that does not happen in these intricate mix of foldr/map |
I would expect everything to be the same if |
I investigated about poor results of
alga
in https://github.com/haskell-perf/graphsI found that the
edges
function fromAlgebraic.Graph
is a bit naive:This can lead to very poor graph representation: For example with a
clique [0..9]
Doing a
edges $ edgeList $ clique [0..9]
will produce:This representation can lead to very poor performance on big graphs.
I just tried a little amelioration, converting the list
[(a,a)]
into an adjacency mapMap a [a]
, and then producing a graph usingfoldrWithKey (\k v acc -> Overlay acc $ Connect (Vertex k) $ vertices v)
.Only this little trick divided the running-time of function like
hasEdge
by two on big graphs. I am sure finding a better implementation foredges
can improvealga
times a lot.The text was updated successfully, but these errors were encountered: