Restructure code base.

haskell · Mar 6, 2012 · a75ee60 · a75ee60
1 parent 0a22cc0
commit a75ee60
Show file tree

Hide file tree

Showing 10 changed files with 1,386 additions and 41 deletions.
diff --git a/CHANGELOG b/CHANGELOG
@@ -0,0 +1,169 @@
+======== CHANGE LOG ==========
+
+Pretty library change log.
+
+========= Version 4.0, 24 August 2011 ==========
+
+* Big change to the structure of the library. Now we don't have a fixed
+  TextDetails data type for storing the various String types that we
+  support. Instead we have changed that to be a type class that just
+  provides a way to convert String and Chars to an arbitary type. This
+  arbitary type is now provided by the user of the library so that they
+  can implement support very easily for any String type they want.
+
+  This new code lives in Text.PrettyPrint.Core and the Text.PrettyPrint
+  module uses it to implement the old API. The Text.PrettyPrint.HughesPJ
+  module has been left unchanged for a compatability module but deprecated.
+
+========= Version 3.0, 28 May 1987 ==========
+
+* Cured massive performance bug. If you write:
+
+    foldl <> empty (map (text.show) [1..10000])
+
+  You get quadratic behaviour with V2.0. Why? For just the same
+  reason as you get quadratic behaviour with left-associated (++)
+  chains.
+
+  This is really bad news. One thing a pretty-printer abstraction
+  should certainly guarantee is insensitivity to associativity. It
+  matters: suddenly GHC's compilation times went up by a factor of
+  100 when I switched to the new pretty printer.
+
+  I fixed it with a bit of a hack (because I wanted to get GHC back
+  on the road). I added two new constructors to the Doc type, Above
+  and Beside:
+
+    <> = Beside
+    $$ = Above
+
+  Then, where I need to get to a "TextBeside" or "NilAbove" form I
+  "force" the Doc to squeeze out these suspended calls to Beside and
+  Above; but in so doing I re-associate. It's quite simple, but I'm
+  not satisfied that I've done the best possible job. I'll send you
+  the code if you are interested.
+
+* Added new exports:
+    punctuate, hang
+    int, integer, float, double, rational,
+    lparen, rparen, lbrack, rbrack, lbrace, rbrace,
+
+* fullRender's type signature has changed. Rather than producing a
+  string it now takes an extra couple of arguments that tells it how
+  to glue fragments of output together:
+
+    fullRender :: Mode
+               -> Int                       -- Line length
+               -> Float                     -- Ribbons per line
+               -> (TextDetails -> a -> a)   -- What to do with text
+               -> a                         -- What to do at the end
+               -> Doc
+               -> a                         -- Result
+
+  The "fragments" are encapsulated in the TextDetails data type:
+
+    data TextDetails = Chr  Char
+                     | Str  String
+                     | PStr FAST_STRING
+
+  The Chr and Str constructors are obvious enough. The PStr
+  constructor has a packed string (FAST_STRING) inside it. It's
+  generated by using the new "ptext" export.
+
+  An advantage of this new setup is that you can get the renderer to
+  do output directly (by passing in a function of type (TextDetails
+  -> IO () -> IO ()), rather than producing a string that you then
+  print.
+
+
+
+========= Version 3.0, 28 May 1987 ==========
+
+* Made empty into a left unit for <> as well as a right unit;
+  it is also now true that
+    nest k empty = empty
+  which wasn't true before.
+
+* Fixed an obscure bug in sep that occasionally gave very weird behaviour
+
+* Added $+$
+
+* Corrected and tidied up the laws and invariants
+
+
+
+========= Version 1.0 ==========
+
+Relative to John's original paper, there are the following new features:
+
+1. There's an empty document, "empty". It's a left and right unit for
+   both <> and $$, and anywhere in the argument list for
+   sep, hcat, hsep, vcat, fcat etc.
+
+   It is Really Useful in practice.
+
+2. There is a paragraph-fill combinator, fsep, that's much like sep,
+   only it keeps fitting things on one line until it can't fit any more.
+
+3. Some random useful extra combinators are provided.
+     <+> puts its arguments beside each other with a space between them,
+         unless either argument is empty in which case it returns the other
+
+
+     hcat is a list version of <>
+     hsep is a list version of <+>
+     vcat is a list version of $$
+
+     sep (separate) is either like hsep or like vcat, depending on what fits
+
+     cat  behaves like sep,  but it uses <> for horizontal composition
+     fcat behaves like fsep, but it uses <> for horizontal composition
+
+     These new ones do the obvious things:
+       char, semi, comma, colon, space,
+       parens, brackets, braces,
+       quotes, doubleQuotes
+
+4. The "above" combinator, $$, now overlaps its two arguments if the
+   last line of the top argument stops before the first line of the
+   second begins.
+
+     For example:  text "hi" $$ nest 5 (text "there")
+     lays out as
+                   hi   there
+     rather than
+                   hi
+                        there
+
+   There are two places this is really useful
+
+     a) When making labelled blocks, like this:
+            Left ->   code for left
+            Right ->  code for right
+            LongLongLongLabel ->
+                      code for longlonglonglabel
+        The block is on the same line as the label if the label is
+        short, but on the next line otherwise.
+
+     b) When laying out lists like this:
+            [ first
+            , second
+            , third
+            ]
+        which some people like. But if the list fits on one line you
+        want [first, second, third]. You can't do this with John's
+        original combinators, but it's quite easy with the new $$.
+
+   The combinator $+$ gives the original "never-overlap" behaviour.
+
+5. Several different renderers are provided:
+     * a standard one
+     * one that uses cut-marks to avoid deeply-nested documents
+       simply piling up in the right-hand margin
+     * one that ignores indentation (fewer chars output; good for machines)
+     * one that ignores indentation and newlines (ditto, only more so)
+
+6. Numerous implementation tidy-ups
+   Use of unboxed data types to speed up the implementation
+
+
diff --git a/README.md b/README.md
@@ -6,6 +6,10 @@ choosing. This is useful for compilers and related tools. The library was
 originally designed by John Hughes's and has since been heavily modified by
 Simon Peyton Jones.
 
+It is based on the pretty-printer outlined in the  paper 'The Design of a
+Pretty-printing Library' in Advanced Functional Programming, Johan Jeuring and
+Erik Meijer (eds), LNCS 925 <http://www.cs.chalmers.se/~rjmh/Papers/pretty.ps>
+
 The library uses the Cabal build system, so building is simply a matter of
 running 'cabal install' or 'cabal configure && cabal build'.
 

diff --git a/Setup.hs b/Setup.hs
@@ -4,3 +4,4 @@ import Distribution.Simple
 
 main :: IO ()
 main = defaultMain
+
diff --git a/pretty.cabal b/pretty.cabal
@@ -18,14 +18,35 @@ homepage:      http://github.com/haskell/pretty
 bug-reports:   http://hackage.haskell.org/trac/ghc/newticket?component=libraries/pretty
 stability:     Stable
 build-type:    Simple
+Extra-Source-Files: README CHANGELOG
 Cabal-Version: >= 1.6
 
 Library
+    hs-source-dirs: src
     exposed-modules:
         Text.PrettyPrint
         Text.PrettyPrint.HughesPJ
     build-depends: base >= 3 && < 5
     extensions: CPP
+    ghc-options: -Wall -Werror -O -fwarn-tabs
+
+Test-Suite test-pretty
+    type: exitcode-stdio-1.0
+    hs-source-dirs: test
+                    src
+    build-depends: base >= 3 && < 5,
+                   QuickCheck == 1.*
+    main-is: Test.hs
+    extensions: CPP
+    include-dirs: src/Text/PrettyPrint
+
+-- Executable Bench1
+--     Main-Is: Bench1.hs
+--     Other-Modules:
+--         Text.PrettyPrint
+--         Text.PrettyPrint.HughesPJ
+--         Text.PrettyPrint.Core
+--     ghc-options: -Wall -Werror -O -fwarn-tabs
 
 source-repository head
     type:     git

diff --git a/Text/PrettyPrint.hs → src/Text/PrettyPrint.hs b/Text/PrettyPrint.hs → src/Text/PrettyPrint.hs
diff --git a/Text/PrettyPrint/HughesPJ.hs → src/Text/PrettyPrint/HughesPJ.hs b/Text/PrettyPrint/HughesPJ.hs → src/Text/PrettyPrint/HughesPJ.hs
@@ -21,8 +21,6 @@
 -- Johan Jeuring and Erik Meijer (eds), LNCS 925
 -- <http://www.cs.chalmers.se/~rjmh/Papers/pretty.ps>
 --
--- Heavily modified by Simon Peyton Jones (December 1996).
---
 -----------------------------------------------------------------------------
 module Text.PrettyPrint.HughesPJ (
 
@@ -79,8 +77,6 @@ import Data.String ( IsString(fromString) )
 -- ---------------------------------------------------------------------------
 -- The Doc calculus
 
--- The Doc combinators satisfy the following laws:
-
 {-
 Laws for $$
 ~~~~~~~~~~~
@@ -177,36 +173,36 @@ data Doc
   | Above Doc Bool Doc                               -- True <=> never overlap
 
 {-
-  Here are the invariants:
+Here are the invariants:
 
-  1) The argument of NilAbove is never Empty. Therefore
-     a NilAbove occupies at least two lines.
+1) The argument of NilAbove is never Empty. Therefore
+   a NilAbove occupies at least two lines.
 
-  2) The argument of @TextBeside@ is never @Nest@.
+2) The argument of @TextBeside@ is never @Nest@.
 
-  3) The layouts of the two arguments of @Union@ both flatten to the same
-     string.
+3) The layouts of the two arguments of @Union@ both flatten to the same
+   string.
 
-  4) The arguments of @Union@ are either @TextBeside@, or @NilAbove@.
+4) The arguments of @Union@ are either @TextBeside@, or @NilAbove@.
 
-  5) A @NoDoc@ may only appear on the first line of the left argument of an
-     union. Therefore, the right argument of an union can never be equivalent
-     to the empty set (@NoDoc@).
+5) A @NoDoc@ may only appear on the first line of the left argument of an
+   union. Therefore, the right argument of an union can never be equivalent
+   to the empty set (@NoDoc@).
 
-  6) An empty document is always represented by @Empty@.  It can't be
-     hidden inside a @Nest@, or a @Union@ of two @Empty@s.
+6) An empty document is always represented by @Empty@.  It can't be
+   hidden inside a @Nest@, or a @Union@ of two @Empty@s.
 
-  7) The first line of every layout in the left argument of @Union@ is
-     longer than the first line of any layout in the right argument.
-     (1) ensures that the left argument has a first line.  In view of
-     (3), this invariant means that the right argument must have at
-     least two lines.
+7) The first line of every layout in the left argument of @Union@ is
+   longer than the first line of any layout in the right argument.
+   (1) ensures that the left argument has a first line.  In view of
+   (3), this invariant means that the right argument must have at
+   least two lines.
 
- Notice the difference between
-         * NoDoc (no documents)
-         * Empty (one empty document; no height and no width)
-         * text "" (a document containing the empty string;
-                    one line high, but has no width)
+Notice the difference between
+   * NoDoc (no documents)
+   * Empty (one empty document; no height and no width)
+   * text "" (a document containing the empty string;
+              one line high, but has no width)
 -}
 
 
@@ -331,7 +327,6 @@ lbrack :: Doc -- ^ A '[' character
 rbrack :: Doc -- ^ A ']' character
 lbrace :: Doc -- ^ A '{' character
 rbrace :: Doc -- ^ A '}' character
-semi   = char ';'
 comma  = char ','
 colon  = char ':'
 space  = char ' '
@@ -503,9 +498,8 @@ above (Above p g1 q1)  g2 q2 = above p g1 (above q1 g2 q2)
 above p@(Beside _ _ _) g  q  = aboveNest (reduceDoc p) g 0 (reduceDoc q)
 above p g q                  = aboveNest p             g 0 (reduceDoc q)
 
-aboveNest :: RDoc a -> Bool -> Int -> RDoc a -> RDoc a
 -- Specfication: aboveNest p g k q = p $g$ (nest k q)
-
+aboveNest :: RDoc a -> Bool -> Int -> RDoc a -> RDoc a
 aboveNest _                   _ k _ | k `seq` False = undefined
 aboveNest NoDoc               _ _ _ = NoDoc
 aboveNest (p1 `Union` p2)     g k q = aboveNest p1 g k q `union_`
@@ -525,15 +519,13 @@ aboveNest (TextBeside s sl p) g k q = textBeside_ s sl rest
 aboveNest (Above {})          _ _ _ = error "aboveNest Above"
 aboveNest (Beside {})         _ _ _ = error "aboveNest Beside"
 
-nilAboveNest :: Bool -> Int -> RDoc a -> RDoc a
 -- Specification: text s <> nilaboveNest g k q
 --              = text s <> (text "" $g$ nest k q)
-
+nilAboveNest :: Bool -> Int -> RDoc a -> RDoc a
 nilAboveNest _ k _           | k `seq` False = undefined
 nilAboveNest _ _ Empty       = Empty
                                -- Here's why the "text s <>" is in the spec!
 nilAboveNest g k (Nest k1 q) = nilAboveNest g (k + k1) q
-
 nilAboveNest g k q           | not g && k > 0      -- No newline if no overlap
                              = textBeside_ (Str (indent k)) k q
                              | otherwise           -- Put them really above
@@ -562,9 +554,8 @@ beside_ p _ Empty = p
 beside_ Empty _ q = q
 beside_ p g q     = Beside p g q
 
-beside :: Doc -> Bool -> RDoc a -> RDoc a
 -- Specification: beside g p q = p <g> q
-
+beside :: Doc -> Bool -> RDoc a -> RDoc a
 beside NoDoc               _ _   = NoDoc
 beside (p1 `Union` p2)     g q   = beside p1 g q `union_` beside p2 g q
 beside Empty               _ q   = q
@@ -580,10 +571,9 @@ beside (TextBeside s sl p) g q   = textBeside_ s sl $! rest
                                            Empty -> nilBeside g q
                                            _     -> beside p g q
 
-nilBeside :: Bool -> RDoc a -> RDoc a
 -- Specification: text "" <> nilBeside g p
 --              = text "" <g> p
-
+nilBeside :: Bool -> RDoc a -> RDoc a
 nilBeside _ Empty         = Empty -- Hence the text "" in the spec
 nilBeside g (Nest _ p)    = nilBeside g p
 nilBeside g p | g         = textBeside_ space_text 1 p
@@ -613,7 +603,6 @@ sepX x (p:ps) = sep1 x (reduceDoc p) 0 ps
 -- Specification: sep1 g k ys = sep (x : map (nest k) ys)
 --                            = oneLiner (x <g> nest k (hsep ys))
 --                              `union` x $$ nest k (vcat ys)
-
 sep1 :: Bool -> RDoc a -> Int -> [Doc] -> RDoc a
 sep1 _ _                   k _  | k `seq` False = undefined
 sep1 _ NoDoc               _ _  = NoDoc
@@ -629,16 +618,15 @@ sep1 g (TextBeside s sl p) k ys = textBeside_ s sl (sepNB g p (k - sl) ys)
 sep1 _ (Above {})          _ _  = error "sep1 Above"
 sep1 _ (Beside {})         _ _  = error "sep1 Beside"
 
-sepNB :: Bool -> Doc -> Int -> [Doc] -> Doc
 -- Specification: sepNB p k ys = sep1 (text "" <> p) k ys
 -- Called when we have already found some text in the first item
 -- We have to eat up nests
-
+sepNB :: Bool -> Doc -> Int -> [Doc] -> Doc
 sepNB g (Nest _ p) k ys
   = sepNB g p k ys -- Never triggered, because of invariant (2)
 sepNB g Empty k ys
   = oneLiner (nilBeside g (reduceDoc rest)) `mkUnion`
--- XXX: PRETTY: Used True here
+    -- XXX: TODO: PRETTY: Used True here
     nilAboveNest False k (reduceDoc (vcat ys))
   where
     rest | g         = hsep ys
@@ -703,7 +691,7 @@ fillNB g p k ys             = fill1 g p k ys
 fillNBE :: Bool -> Int -> Doc -> [Doc] -> Doc
 fillNBE g k y ys
   = nilBeside g (fill1 g ((elideNest . oneLiner . reduceDoc) y) k' ys)
--- XXX: PRETTY: Used True here
+    -- XXX: TODO: PRETTY: Used True here
     `mkUnion` nilAboveNest False k (fill g (y:ys))
   where k' = if g then k - 1 else k