Replace GHC copy #1

Open
dterei opened this Issue Nov 6, 2011 · 23 comments

Projects

None yet

8 participants

@dterei
Member
dterei commented Nov 6, 2011

GHC has an internal copy of pretty. I've gone and merged any improvements from GHC into pretty, so we can be safe that the code in pretty is the best code. However GHC uses some different base types for performance. We need to offer better base types than String. Say ByteString and Text and also offer a builder approach to the render function for performance. After that improvement should be able to change GHC to use the pretty library.

@dterei dterei was assigned Nov 6, 2011
@hvr
Member
hvr commented Oct 27, 2013

Isn't GHC using upstream pretty by now?

@dterei
Member
dterei commented Oct 27, 2013

No, when I started on this it highlighted a choice that you can make in pretty that is ambiguous by the laws pretty uses. GHC makes the choice one way and external pretty makes it the other. That blocked me at the time and now the internal GHC pretty has diverged further I believe. Not hard to resolve, but it isn't.

@thomie
Member
thomie commented Jun 8, 2014

For future reference: GHC's internal copy of pretty.

@elliottt
Contributor

GHC moved away from using literate haskell recently, the module has moved here.

@elliottt
Contributor

Is the only thing blocking GHC adoption of the external pretty library someone doing the work?

@dterei
Member
dterei commented Mar 11, 2015

No also blocking is a slight difference in behavior. There is one situation where the laws for pretty are ambiguous and leave room for choice. GHC decided one way and pretty the other.

I may just close this ticket as the downside of ghc having its own copy is very small

On Mar 10, 2015, at 11:03 PM, Trevor Elliott notifications@github.com wrote:

GHC moved away from using literate haskell recently, the module has moved here.


Reply to this email directly or view it on GitHub.

@elliottt
Contributor

Ah, dang. I was hoping to extend the GHC pretty printer with the annotations, to get functionality like in the idris repl. Maybe extending their forked version with the annotations is the right path forward for that work :)

@dterei
Member
dterei commented Mar 11, 2015

Well if you're going to be putting effort into the pretty fork in GHC, then it may make sense to try to unify the two. It's an easy attempt, just try using the pretty library instead of GHC's forked version and see if any GHC output breaks and why.

Especially as pretty now has support for annotations in HEAD.

@elliottt
Contributor

Yes, I was reluctant to start until the package had been released, hence #21.

@thomie
Member
thomie commented Aug 4, 2015

I am working on making GHC's compiler/utils/Pretty.hs as similar as possible to src/Text/PrettyPrint/HughesPJ.hs from pretty-1.1.2.1 (before the annotation changes). This is a first step towards reconcillation between the two.

Edit: see https://ghc.haskell.org/trac/ghc/ticket/10735

Some obvious differences:

Edit: more differences, GHC's version defines:

quotes p       = char '`' <> p <> char '\''
rational n = text (show (fromRat n :: Double))
@dterei
Member
dterei commented Aug 5, 2015

Great! Thanks for doing this thomie.

@dterei
Member
dterei commented Jun 2, 2016

@thomie What's the status of your work? Is the only difference now with pretty and ghc the use of FastString?

@thomie
Member
thomie commented Jun 3, 2016

Status: I don't know how to proceed.

ghc's version resembles pretty-1.1.2.0. To see the differences, do this in a ghc git tree:

$ cd libraries/pretty
$ git checkout v1.1.2.0
$ cd ../../
$ vimdiff compiler/utils/Pretty.hs libraries/pretty/src/Text/PrettyPrint/HughesPJ.hs

Notable differences:

  • ghc Pretty starts with a giant changelog comment
  • ghc TextDetails uses Faststring

Other difference are minor. Both copies define some extra functions and instances not defined in the other copy.

GHC performance is very sensitive to Pretty changes. That FastString hack is absolutely necessary.

As mentioned in https://ghc.haskell.org/trac/ghc/ticket/10735#comment:23, for parity with pretty-1.1.2.1 the following two commits would also be needed. I did not land them, because they cause more allocation in the compiler (see discussion in ​​#9):

  • "Resolve foldr-strictness stack overflow bug"
  • "Special-case reduce for horiz/vert"

After that come the Annotated/ changes, which is where I stopped.

@dterei
Member
dterei commented Jun 3, 2016

OK, thanks for the update. Would supporting ByteString in Pretty be sufficient you think to provide the performance GHC needs?

@dterei
Member
dterei commented Jun 3, 2016

Given FastString is just a ByteString with a hash associated with it for faster Eq, seems like that would be fine.

@jwaldmann

obvious side remark (but it's really a separate issue): is this "ByteString with a hash associated with it for faster Eq" worthy of becoming a stand-alone library?

I take it that Eq is not used for "pretty" but I can think of several other use cases. But then, ByteString has O(1) slicing (take, drop), and this would not work with the hashes.

@dterei
Member
dterei commented Jun 3, 2016

Maybe, if it doesn't exist someone can just create it on Hackage, not really much point discussing it here. The alternative would be to pull FastString out of GHC into it's own library, which is a discussion for ghc-dev mailing list. I think that wouldn't be worth while though.

@thomie
Member
thomie commented Jun 3, 2016 edited

Would supporting ByteString in Pretty be sufficient you think to provide the performance GHC needs?

I guess so.

@bgamari has been toying with the idea of using a different pretty printing library than Pretty (https://ghc.haskell.org/trac/ghc/ticket/8809#comment:21), you might want to discuss with him.

Maybe switching to a different library would give better performance also?

@niteria
niteria commented Sep 30, 2016

GHC performance is very sensitive to Pretty changes. That FastString hack is absolutely necessary.

@thomie: Do you understand why that is? My intuition fails me here, I'd expect pretty printing speed to not matter in practice.

@thomie
Member
thomie commented Sep 30, 2016

@niteria: you can probably ask @simonmar more about it, I know he's aware of the problem.

Subtle performance issues I've encountered:

My guess: there is a simply a lot of code to prettyprint (cmm, assembly, llvm). Combine that with nobody really understanding the HughesPJ algorithm or the Pretty source code, and it is very easy to introduce an "accidentally quadratic".

@simonmar
Member

The Pretty library runs in two modes: the normal mode where it is nicely indenting things, and a fast mode where it doesn't indent anything. The fast mode is used for generating assembly and LLVM, and we really really care about it being fast.

The fast way should probably use a ByteString builder instead of the current hack, actually.

@niteria
niteria commented Oct 2, 2016

Thank you @thomie and @simonmar for context! I profiled GHC while compiling one of the pretty files and pprNativeCode is 13% of allocations and reduceDoc which pprNativeCode uses through bufLeftRender is 6.7%. It seems like we can do better here, especially since the output format doesn't have to be pretty. I've tried to get rid of reduceDoc in this case, but different kinds of empty Docs make it difficult and I don't understand all the semantics yet to have anything correct.

@dalaing
dalaing commented Jan 26, 2017

I'm currently looking at modifying the Doc type to take an extra parameter, so that TextDetails can be swapped out for ByteString or Text. I've got the basics of the abstraction together, I just need to get the pragmas in the right pace to inline / specialize things, and then check the core that comes out of that and some benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment