Child groups of "equal elevation"? #85

fosskers · 2018-06-27T21:57:48Z

Hi there, I'm the author of a particular package manager written in Haskell. About 6 months ago you offered your help for when I reached the stage of evaluating algebraic-graphs. I'm there now, so hi!

Here's the general problem I'm looking to solve:

Packages on a Linux system have dependencies, and these dependency relationships form a graph. I'd like to implement the function:

-- | A `Package` contains a list of other `Package` names (Text) it depends on,
-- which is a good starting point for a real Graph.
sortPackages :: [Package] -> [[Package]]

the return value of which corresponds to the following:

Each cyan box of packages are sibling-like and "safe" to build and install at the same time, since they aren't interdependent. For this example, I'd expect the output of sortPackages to resemble:

[ [F, G], [C, D, E], [A, B] ]

Looking at your API, I suspect these functions are probably what I'm looking for. In the current version of my code, I'm using topSort from Data.Graph to naively get the overall safe ordering, then building one-at-a-time since topSort loses other "elevation information".

Can algebraic-graphs help me here?

Please and thanks,
Colin

The text was updated successfully, but these errors were encountered:

snowleopard · 2018-06-28T14:59:31Z

Hey @fosskers! Sure, happy to help :)

I think what we really need here is breadth-first search, but we could start with a simpler but less efficient approach:

-- Find the vertices that have no dependencies
-- O(n) complexity
leaves :: Ord a => AdjacencyMap a -> Set a
leaves x = Set.filter (null . flip postSet x) $ vertexSet x

-- Split a graph into batches of mutually independent vertices
-- Probably O(m * n * log(n)) complexity
batch :: Ord a => AdjacencyMap a -> [Set a]
batch g
  | isEmpty g = []
  | otherwise = ls : batch (induce (`Set.notMember` ls) g)
  where
    ls = leaves g

Seems to do what you need, although produces the list in the reverse order.

> g = edges [('A','C'), ('A','D'), ('B','D'), ('B','E'), ('C','F'), ('D','F'), ('E','G')]
> batch g
[fromList "FG",fromList "CDE",fromList "AB"]

This is very suboptimal in terms of performance. BFS could do this in O(n + m) time.

How large are your graphs? Perhaps this implementation is sufficient? It's advantage is that it's simple. If it's too slow, let's try to implement a BFS-based algorithm.

snowleopard · 2018-06-28T15:15:22Z

I edited my comment above. I initially misjudged the complexity of my solution, but then remembered that postSet is much faster than preSet for AdjacencyMap.

fosskers · 2018-06-28T18:36:08Z

These graphs won't be too big for the average user (dozens of vertices at most), but I have gotten reports of people with ~300 packages needing updates, would could balloon into a 500+ vertex graph.

snowleopard · 2018-06-28T19:02:34Z

I think my naive implementation will take sub-second time for graphs with <1000 vertices, so if you are going to do anything non-trivial with the packages, e.g. download or build them, it will be just noise.

But I'm still interested in implementing a fast version, so I'll keep thinking.

fosskers · 2018-06-28T19:05:12Z

Oh yeah, this app is completely IO-bound in terms of perf, so I suppose this isn't a worry.

fosskers · 2018-07-04T18:10:05Z

Alright, the algorithm as-is does work. For performance, I'm comparing the solving of dependencies for a very complicated package tree (ros-lunar-desktop, which summons ~200 other packages).

Timings

Using original, incomplete Data.Graph version:

444.07user 41.78system 2:20.92elapsed 344%CPU (0avgtext+0avgdata 360228maxresident)k
0inputs+0outputs (0major+8934531minor)pagefaults 0swaps

Using algabraic-graphs:

443.73user 42.09system 2:20.94elapsed 344%CPU (0avgtext+0avgdata 359852maxresident)k
0inputs+0outputs (0major+8936655minor)pagefaults 0swaps

Sweet, so no real slowdown for big trees.

snowleopard · 2018-07-04T18:24:21Z

@fosskers Great! I've been thinking about the algorithm I suggested and there is one peculiarity I'd like to highlight, since it may be surprising.

The current implementation maximises concurrency among leaves, instead of maximising concurrency among roots. For example, if you have a graph with three vertices a, b, c and dependency a -> b then the algorithm will produce [[a], [b,c]] instead of [[a,c], b] (the latter may seem more natural).

Note that in general, neither of these two schedules is better than the other. If the tasks take time (1,2,2) then the cost of the first schedule is 3 and the cost of the second one is 4. On the other hand, if the times are (2,1,2) then the situation is the opposite.

fosskers · 2018-07-04T18:39:18Z

The "leaves first" approach is probably fine, considering that the leaves here are package dependencies that need to be built and installed before their parents can be built.

snowleopard · 2018-07-04T18:58:41Z

I see, sounds good.

Do you need any further help with Alga? If not, I guess we can close this issue.

fosskers · 2018-07-04T19:01:19Z

Yup, case closed, I think! Thank you very much.

snowleopard · 2018-07-04T19:04:59Z

No problem, happy to help again in future.

fosskers mentioned this issue Jun 29, 2018

Grouped package building fosskers/aura#483

Merged

fosskers closed this as completed Jul 4, 2018

fosskers mentioned this issue Aug 2, 2018

Add isAcyclic, topSort, dfs and other graph analysis functions to ToGraph #105

Closed

snowleopard mentioned this issue Dec 4, 2018

Add support for acyclic graphs and associated operations #154

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Child groups of "equal elevation"? #85

Child groups of "equal elevation"? #85

fosskers commented Jun 27, 2018

snowleopard commented Jun 28, 2018 •

edited

Loading

snowleopard commented Jun 28, 2018

fosskers commented Jun 28, 2018

snowleopard commented Jun 28, 2018

fosskers commented Jun 28, 2018

fosskers commented Jul 4, 2018

snowleopard commented Jul 4, 2018

fosskers commented Jul 4, 2018

snowleopard commented Jul 4, 2018

fosskers commented Jul 4, 2018

snowleopard commented Jul 4, 2018

Child groups of "equal elevation"? #85

Child groups of "equal elevation"? #85

Comments

fosskers commented Jun 27, 2018

snowleopard commented Jun 28, 2018 • edited Loading

snowleopard commented Jun 28, 2018

fosskers commented Jun 28, 2018

snowleopard commented Jun 28, 2018

fosskers commented Jun 28, 2018

fosskers commented Jul 4, 2018

Timings

snowleopard commented Jul 4, 2018

fosskers commented Jul 4, 2018

snowleopard commented Jul 4, 2018

fosskers commented Jul 4, 2018

snowleopard commented Jul 4, 2018

snowleopard commented Jun 28, 2018 •

edited

Loading