Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A more efficient Graph representation #936

Open
meooow25 opened this issue Mar 4, 2023 · 4 comments
Open

A more efficient Graph representation #936

meooow25 opened this issue Mar 4, 2023 · 4 comments

Comments

@meooow25
Copy link
Contributor

meooow25 commented Mar 4, 2023

Bodigrim's comment about improved allocations #909 (comment) and #928 makes me wonder if it would be good to represent graphs as Array Int (UArray Int Int) or something similar, instead of Array Int [Int]. We all know that [Int] is not a great way to store Ints.

Downsides: Breaking change, and construction would be very complicated. For example we can't just use accumArray to construct from edges in buildG.
But memory used should be a lot better, 3 words out of 4 saved per edge if I understand correctly. And dfs algorithms should not suffer in any way.

@konsumlamm
Copy link
Contributor

IF we change the implementation, I'd argue that primitive is even better suited for arrays than array, as it doesn't store boxed indices. One possible problem is if Data.Graph uses slicing (by adjusting the start/stop indices), but it looks like array doesn't even provide a function for that.

It might be possible to optimize the layout even more by using the unboxed array directly instead of a nested array, but that would require a second array to store the indices where the vertex lists start.

It seems like a massive oversight to make Graph a type alias instead of a newtype, but I'm afraid it might be too late to change that now, at least not without a big announcement and a migration strategy (even though the amount of breakage is probably small).

@treeowl
Copy link
Contributor

treeowl commented Mar 4, 2023

@konsumlamm, I fear the amount of breakage is likely not small, though the number of programs affected likely is. I don't think the (few) users of the module have ever shied away from using what they know about the representation.

@meooow25
Copy link
Contributor Author

meooow25 commented Mar 4, 2023

I doubt we can use primitive, since it's not a GHC boot library. We can make our own type for an Int array with limited operations if we really want.

@treeowl so do you think this is not practical to do, given the amount of breakage?

@treeowl
Copy link
Contributor

treeowl commented Mar 13, 2023

@meooow25 You could attempt an impact assessment, but I'm not terribly optimistic. You might be better off exploring those ideas in another package, and try to see if you get good performance using that for, e.g., GHC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants