Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hcat is really *terrible* performance-wise #28

Closed
fryguybob opened this issue Aug 4, 2012 · 3 comments
Closed

hcat is really *terrible* performance-wise #28

fryguybob opened this issue Aug 4, 2012 · 3 comments

Comments

@fryguybob
Copy link
Member

(Imported from http://code.google.com/p/diagrams/issues/detail?id=57. Original issue from felipe.l...@gmail.com on November 2, 2011, 06:27:13 PM UTC)

Hello!

I'm sorry that I didn't do my homework to investigate the "why", but I've included a very simple test case. You may compile and run the test as follows:

$ ghc -O --make -fforce-recomp -rtsopts bug.hs
$ ./bug -o t.png --selection=??? +RTS -s

where ??? may be "hcat", "foldl1", "foldr1" or "foldTree". The program just concatenates 1,000 (rect 1 1)s. On my computer, I get the following:

hcat: 23s, 130 MiB
foldr1: 30s, 69 MiB
foldl1: 27s, 62 MiB
foldTree: 1s, 7 MiB

Concatenating one thousand unit squares shouldn't take more than 20 seconds =). This is a showstopper for me, so I've reimplemented hcat (see hcatB at the end of [1])

Cheers!

[1] https://patch-tag.com/r/felipe/hierarchical-clustering-diagrams/snapshot/hash/20111102171621-0b5ec-1f98cd6a796d46b46f54dc2e168e826f459572ea/content/pretty/src/Diagrams/Dendrogram.hs

@fryguybob
Copy link
Member Author

(Imported. Original comment by felipe.l...@gmail.com on November 2, 2011, 06:33:47 PM UTC)

Perhaps the problem is with (|||), which behaves has these worst cases. But fixing hcat may be easier, and maybe even profitable on a better world where (|||) too has better performance.

@fryguybob
Copy link
Member Author

(Imported. Original comment by byor...@gmail.com on November 3, 2011, 02:35:31 AM UTC)

Thanks for the report! And nice dendrogram package. =)

I've changed the implementation of cat' to use a balanced fold so now cat', cat, hcat, and vcat are all much faster (running bug.hs with 'hcat' is down to 0.5s / 7MB on my machine, the same time+memory as 'foldTree'). I also improved the implementation of 'beside' (and hence of (===) and (|||)), which brought the running time of 'foldr1' on my machine down from about 30s to 15s.

Wed Nov 2 22:12:46 EDT 2011 Brent Yorgey byorgey@cis.upenn.edu

  • D.Combinators: implement cat' using a balanced folding scheme (fixes Add NullBackend for Ellipsoid #57)

    Thanks to Felipe Lessa for the idea. The usual linear fold was O(n^2)
    because appending each new diagram to the accumulated list has to
    query its bounding function, which is built up as the iterated maximum
    of the O(n) bounding functions of the component diagrams. Doing a
    balanced fold is thus only O(n log n) (and seems to give a big speedup
    in practice). There is also probably more room for optimizing
    computation of bounding functions (using memoization and/or retaining
    a bounding box for the common special cases of querying in a direction
    parallel to an axis) but that can be left to future work.

    The new implementation is also considerably prettier and shorter than
    the old, so it's a win all around. =)

Wed Nov 2 17:43:21 EDT 2011 Brent Yorgey byorgey@cis.upenn.edu

  • improve performance of 'beside'
    Instead of doing an alignment on both arguments, superimposing, then translating
    the origin back, we just compute the proper offset, translate the second diagram,
    and superimpose. Leads to much better performance when the bounding functions
    are consulted later.

@fryguybob
Copy link
Member Author

(Imported. Original comment by felipe.l...@gmail.com on November 3, 2011, 02:46:27 AM UTC)

Thanks a lot for being so fast on fixing this issue! =)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant