Quadratic performance issues #32

jwaldmann · 2016-02-05T21:47:54Z

Try this:

import Text.PrettyPrint
import System.Environment
main = do
  [ s ] <- getArgs
  print $ iterate ( \ x -> fsep [ text "a" , x <+> text "b" ] ) empty !! read s

On my machine:

input | runtime in sec
100   |  0.1
200   |  1.1
300   |  4.4
400   | 11.4

This testcase is simplified from https://ghc.haskell.org/trac/ghc/ticket/7666
I think this bug (gut feeling - some quadratic behaviour) has been sitting there for a long time.
I do think this is serious. I think it also hurts haddock.

Versions used: pretty-1.1.3.2 for ghc-8.0.0.20160111 and pretty-1.1.2.0 for ghc-7.10.3.


thomie · 2016-02-06T18:11:45Z

Changing $! to $ in the function beside helps quite a bit:

input	before (s)	after (s)
100	0.19	0.12
200	1.7	0.35
400	17	1.3

This is with ghc-7.10.3 -O and pretty-1.1.3.2.

git bisect points at 6e01b2e and ghc/ghc@ac88f11.

I tested the same change in GHC's copy of pretty (compiler/utils/Pretty), hoping it would make GHC massively faster! Unfortunately, it just had a negative effect on T3294:

  tests/alloc/T3294     2679798176  + 4.21% 2792570824  bytes

jwaldmann · 2016-02-06T19:00:55Z

Interesting. Well, the bug seems to hit only for certain kinds of nesting. But I have a few applications where I think it matters. And I have seen horrendous ghc and/or haddock runtimes for modules with hundreds of identifiers, e.g., in OpenGLRaw, and I have a hunch that this is related. You could try your patched ghc on these.

I think a 1 second runtime for outputting 2k chars is still way too high. For a similar test case with wl-pprint, it is more like 10 milliseconds (but the combinators have different behaviour).

ndmitchell · 2016-06-01T14:02:40Z

I started debugging a stack overflow in Hoogle, see ndmitchell/hoogle#167, and it brought me to exactly this location. Specifically, given the operation:

import Text.PrettyPrint.Annotated.HughesPJ
main = print $ hsep $ replicate 1000 $ text "neil"

This takes an unbounded amount of stack. If I replace:

- beside (TextBeside t p)    g q   = TextBeside t $! rest
+ beside (TextBeside t p)    g q   = TextBeside t $ rest

Then it takes a constant amount of stack. It turns an O(n) memory algorithm into an O(1) algorithm, so seems like it should be considered a fix. @dterei, any thoughts?
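To illustrate why a single $! can change space behaviour, here is a minimal sketch on a toy type. It is not pretty's Doc or its beside function: Chain, strictChain, lazyChain and takeWords are hypothetical names invented for this example. The strict builder forces the whole tail before it returns the first constructor, while the lazy builder returns the first constructor immediately.

```haskell
-- Toy model only: this is NOT the pretty library's Doc type.
data Chain = Nil | Link String Chain

-- Strict variant: the tail is forced to WHNF before the Link is returned,
-- so building an n-element chain needs n nested evaluations.
strictChain :: [String] -> Chain
strictChain []       = Nil
strictChain (w : ws) = Link w $! strictChain ws

-- Lazy variant: the Link is returned immediately; the tail stays a thunk.
lazyChain :: [String] -> Chain
lazyChain []       = Nil
lazyChain (w : ws) = Link w $ lazyChain ws

-- Consume only the first k words of a chain.
takeWords :: Int -> Chain -> [String]
takeWords k (Link w rest) | k > 0 = w : takeWords (k - 1) rest
takeWords _ _                     = []

main :: IO ()
main = do
  -- Terminates: only the links that are demanded get constructed.
  print (takeWords 3 (lazyChain (repeat "neil")))
  -- Both variants agree on finite input.
  print (takeWords 3 (strictChain (replicate 1000 "neil")))
  -- takeWords 3 (strictChain (repeat "neil")) would never return.
```

In this toy, the strict builder's nesting depth grows with the length of the input, which is the same shape of problem as the unbounded stack use described above.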

dterei · 2016-06-01T17:07:40Z

Sure. Do you mind submitting a PR with a test-case please?

ndmitchell · 2016-06-01T17:54:37Z

Sure. @thomie, I intend to remove that one bang, do you have any others I should include in the ticket?

I found an even better test case: take 10 $ render $ hsep $ repeat $ text "neil" works with the new formulation and loops with the old.

thomie · 2016-06-01T18:15:00Z

Fine by me.

In case you know the answer, it would be great if you could add a comment or Note explaining why the other $!s are necessary (or even a test). They were all added together in 6e01b2e.

ndmitchell · 2016-06-01T18:42:48Z

To be clear, I can demonstrate that one particular bang is harmful - I have no opinion on the others - but my guess is someone had bangs and thought they added performance so sprayed them with a machine gun. It's a standard Haskell optimisation-without-benchmarking trick 😞

jwaldmann · 2016-06-02T07:14:06Z

I am glad that this issue is getting some attention.

The worrisome thing is that my benchmark above is basically an iteration of a random (!) small context (\ x -> ... ). So even if behaviour improves for this particular case, others should be tested (as ndm did), and it's probably best to enumerate contexts in a SmallCheck style. Let me see if I find the time to do that.

ndmitchell · 2016-06-02T08:15:26Z

@jwaldmann - so I have one specific place which causes a more discrete regression (overly strict, different space complexity) - so is worth fixing independent of any performance measurements. Is it worth waiting for your benchmarks or shall I pull request it today?

jwaldmann · 2016-06-02T11:38:36Z

Don't wait for me. I made this for testing: https://github.com/jwaldmann/pretty-test

ndmitchell · 2016-06-02T12:18:24Z

Pull request for my piece as #35.

jwaldmann · 2016-06-02T14:39:37Z

For reference, here's the "winner" (most expensive context with depth 3 -- according to SmallCheck).
I write it in the form that you can execute in ghci:

import Text.PrettyPrint.HughesPJ

:set +s
length $ render $ iterate ( \ hole ->  sep [text  "l", cat [hole], text  "l"] ) (text  "l") !! 1000

4001
(10.42 secs, 9,684,556,640 bytes)

This is quite consistent for ghc-7.6.3, 7.8.4, 7.10.3, while ghc-8.0.1 is better by one second.
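The ghci measurement above could also be run as a standalone program. The following is a rough sketch, assuming the pretty package is available: nested, renderedSize and timeRender are hypothetical helper names, and the chosen sizes are arbitrary. It only uses render, sep, cat and text from Text.PrettyPrint.HughesPJ plus getCPUTime and evaluate from base.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)
import Text.PrettyPrint.HughesPJ (Doc, cat, render, sep, text)

-- The depth-3 context quoted above, applied n times to the seed document.
nested :: Int -> Doc
nested n = iterate (\hole -> sep [text "l", cat [hole], text "l"]) (text "l") !! n

-- Length of the rendered output; computing it forces the whole layout.
renderedSize :: Int -> Int
renderedSize = length . render . nested

-- CPU seconds spent rendering the n-fold nesting (hypothetical helper).
timeRender :: Int -> IO Double
timeRender n = do
  start <- getCPUTime
  _ <- evaluate (renderedSize n)
  end <- getCPUTime
  pure (fromIntegral (end - start) / 1e12)  -- getCPUTime reports picoseconds

main :: IO ()
main = mapM_ report [100, 200, 400]
  where
    report n = do
      secs <- timeRender n
      putStrLn (show n ++ "\t" ++ show secs)
```

If the cost is quadratic in the number of iterations, doubling n should roughly quadruple the reported time; single-run CPU timings are noisy, so treat the numbers as indicative only.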

jwaldmann · 2016-06-02T16:11:47Z

What does the original prettyprint paper ( http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.8777 ) say about (asymptotic) performance?

There are no exact claims or proofs, but end of Section 9 has "seems to grow slightly faster than linearly" and "far superior to O(n^2)".

Really? Here are the runtimes for rendering iterations of the context mentioned previously (x axis: iterations, y axis: time in seconds, averaged over 3 executions). (Forgive the crude rendering.)

Looks accidentally quadratic?

I think that for a (pretty-)printing library, anything more than linear is unacceptable, and as long as the behaviour is not fixed, "dangerous" combinators must be red-flagged in the API docs.

What are these dangerous combinators? Everything that has an "either ... or"? (sep: either hsep or vcat, etc.)

Detection of non-linearity in pretty's code would really make a nice test case for automated analysis of runtime complexity of programs. I can advertise this at http://cl-informatik.uibk.ac.at/users/tpowell/lca but don't expect immediate results ...

Two technical points here: linear, quadratic, etc. - in what parameter exactly? Size of input? Size of output?

The natural measure is the number of API calls (the size of the abstract syntax tree). This includes calls to, and processing of, empty, which produces no output but has a cost.
There are cases where output size is more than linear in input size (because of nesting, progressively larger amounts of whitespace could be produced). This does not happen for the context I presented above (everything starts in column 0). But then, since we render for a fixed page width, there cannot be too much whitespace in the output? Or we could count whitespace in some extra way.
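One way to make "number of API calls" concrete is to mirror the combinators in a first-order data type and count nodes before rendering. This is only a sketch: the Ast type and the size, toDoc and nested functions are hypothetical and not part of pretty, and only the four combinators needed for the context above are mirrored.

```haskell
import Text.PrettyPrint.HughesPJ (Doc, cat, empty, render, sep, text)

-- Hypothetical first-order mirror of a few pretty combinators, so that
-- the number of API calls can be counted before rendering.
data Ast
  = Text String
  | Empty
  | Sep [Ast]
  | Cat [Ast]

-- One node per combinator call, including Empty, which renders to nothing.
size :: Ast -> Int
size (Text _) = 1
size Empty    = 1
size (Sep ds) = 1 + sum (map size ds)
size (Cat ds) = 1 + sum (map size ds)

-- Interpret the mirror type with the real combinators.
toDoc :: Ast -> Doc
toDoc (Text s) = text s
toDoc Empty    = empty
toDoc (Sep ds) = sep (map toDoc ds)
toDoc (Cat ds) = cat (map toDoc ds)

-- The context from the earlier comment, applied n times.
nested :: Int -> Ast
nested n = iterate (\hole -> Sep [Text "l", Cat [hole], Text "l"]) (Text "l") !! n

main :: IO ()
main = mapM_ report [10, 100]
  where
    report n =
      let ast = nested n
      in putStrLn (show (size ast) ++ " nodes, "
                   ++ show (length (render (toDoc ast))) ++ " output chars")
```

For this particular context each iteration adds four nodes, so input size grows linearly with the iteration count, and a runtime that grows quadratically in the iteration count is also quadratic in this measure.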

ndmitchell · 2016-06-02T16:20:20Z

From what I've read of the code I would be shocked if it wasn't quadratic. It seems to regularly be traversing the entire tree to ensure global properties, whereas I would expect it to consider all Doc values in some normal form and then use smart constructors to guarantee that without traversing downwards.
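The smart-constructor idea can be sketched on a toy document type. This is an illustration under stated assumptions, not pretty's representation: the Doc type, the invariant and the beside and render functions below are all made up for the example. The constructor keeps a normal form by inspecting only the roots of its arguments, so no later pass has to traverse the tree to re-establish it.

```haskell
-- Toy document type, not pretty's real representation.
data Doc
  = Empty
  | Text String
  | Beside Doc Doc
  deriving (Eq, Show)

-- Invariant (normal form): Empty never appears below a Beside node.

-- Smart constructor: restores the invariant by inspecting only the two
-- roots, assuming both arguments are already in normal form. O(1) per call.
beside :: Doc -> Doc -> Doc
beside Empty q = q
beside p Empty = p
beside p q     = Beside p q

-- Rendering relies on the invariant and never re-normalises.
render :: Doc -> String
render Empty        = ""
render (Text s)     = s
render (Beside p q) = render p ++ render q

main :: IO ()
main = do
  let doc = foldr beside Empty [Text "a", Empty, Text "b", Empty]
  print doc             -- Beside (Text "a") (Text "b")
  putStrLn (render doc) -- ab
```

Note the trade-off in this toy: matching on the right argument forces it, so this beside is strict in its second argument and would not work on an infinite right-nested document. A real design would have to balance the normal form against the laziness discussed earlier in this thread.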

This is backport of [1] for GHC's copy of Pretty. See Note [Differences between libraries/pretty and compiler/utils/Pretty.hs]. [1] http://git.haskell.org/packages/pretty.git/commit/bbe9270c5f849a5bb74c9166a5f4202cfb0dba22 haskell/pretty#32 haskell/pretty#35 Reviewers: bgamari, austin Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D2397 GHC Trac Issues: #12227

This is backport of [1] for GHC's copy of Pretty. See Note [Differences between libraries/pretty and compiler/utils/Pretty.hs]. [1] http://git.haskell.org/packages/pretty.git/commit/bbe9270c5f849a5bb74c9166a5f4202cfb0dba22 haskell/pretty#32 haskell/pretty#35 Reviewers: bgamari, austin Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D2397 GHC Trac Issues: #12227 (cherry picked from commit 89a8be7)


ghc-mirror pushed a commit to ghc/ghc that referenced this issue Feb 6, 2016

Experimental fix for pretty:32

d21262e

haskell/pretty#32

ndmitchell mentioned this issue Jun 1, 2016

Stack overflow when processing telegram-api ndmitchell/hoogle#167

Open

ndmitchell mentioned this issue Jun 2, 2016

Remove harmful $! forcing in beside #35

Merged

dterei changed the title ~~performance bug ( > 10 seconds for < 2k output chars)~~ Quadratic performance issues Jun 2, 2016

thomie mentioned this issue Sep 30, 2016

Replace GHC copy #1

Open

jwaldmann mentioned this issue Mar 25, 2017

is cyp's work (polynomially, linearly) bounded? noschinl/cyp#21

Open

quchen added a commit to quchen/prettyprinter that referenced this issue Jun 9, 2017

Add performance test for nested fillSep

9bbe14c

The pretty package had an issue here; luckily, prettyprinter does not. See haskell/pretty#32
