Pretty printing implemented* #1047

Closed
wants to merge 11 commits into
from

Conversation

Projects
None yet
7 participants
@manpages
Contributor

manpages commented May 12, 2013

Closes #965.

wadler.ex implements Hughes-Wadler document algebra;
binary/inspect.ex now pretty-prints.

Commentaries about implementation are avaliable upon the request.

Rebased properly, thanks @alco.
Fixed issue existed on case insensitive platforms, thanks @yrashk.

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

How about modifying IEx.inspect_opts to have pretty printing enabled by default?

Contributor

yrashk commented May 12, 2013

How about modifying IEx.inspect_opts to have pretty printing enabled by default?

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

Also, how about using :io.columnsfor :width upon a printout in IEx?

Contributor

yrashk commented May 12, 2013

Also, how about using :io.columnsfor :width upon a printout in IEx?

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Contributor

I have a deja vu of answering that already, but in case I haven't — :io.columns is injected in operator return that wraps outputs of inspect implementations and takes care of some opts. For instance, return decides which ot two output formats — docentity or binary — to use.

As for IEx.inspect_opts, I'll see what it is and what I can do.

Contributor

manpages commented May 12, 2013

I have a deja vu of answering that already, but in case I haven't — :io.columns is injected in operator return that wraps outputs of inspect implementations and takes care of some opts. For instance, return decides which ot two output formats — docentity or binary — to use.

As for IEx.inspect_opts, I'll see what it is and what I can do.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Contributor

Okay, so it's basically one line in mix.exs for iex.

Will push to branch pretty in a sec.

Contributor

manpages commented May 12, 2013

Okay, so it's basically one line in mix.exs for iex.

Will push to branch pretty in a sec.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Owner

@yrashk, do you think this is a wrong place to have call :io.columns?
Do you think that we should default width to the terminal width and not to min(80, maxwidth)?

Well, it would be better to put it under else so that :io.columns get called once
per inspect call, but I don't think that there is a place we can move call to io.

@yrashk, do you think this is a wrong place to have call :io.columns?
Do you think that we should default width to the terminal width and not to min(80, maxwidth)?

Well, it would be better to put it under else so that :io.columns get called once
per inspect call, but I don't think that there is a place we can move call to io.

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

I think :io.columns should only be called in IO.inspect, not inspect...

Also min(80, maxwidth) — what if my terminal is 20 characters wide?

Contributor

yrashk commented May 12, 2013

I think :io.columns should only be called in IO.inspect, not inspect...

Also min(80, maxwidth) — what if my terminal is 20 characters wide?

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

I am not entirely sure what's happening, but when I pulled the last IEx patch, IEx didn't start prettifying by default.

Contributor

yrashk commented May 12, 2013

I am not entirely sure what's happening, but when I pulled the last IEx patch, IEx didn't start prettifying by default.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Contributor

@yrashk it worked for me.

Sorry to ask such a stupid question, but have you run make clean test && bin/iex?

sweater@thoughtflare ~/github/mine/elixir $ bin/iex 
Erlang R15B02 (erts-5.9.2) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false]

Interactive Elixir (0.8.3.dev) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> {{},{{}},{{},{{}}}}
{{},{{}},{{},{{}}}}
iex(2)> {v(1),v(1),v(1),v(1)}
{ {{},{{}},{{},{{}}}},
  {{},{{}},{{},{{}}}},
  {{},{{}},{{},{{}}}},
  {{},{{}},{{},{{}}}} }
Contributor

manpages commented May 12, 2013

@yrashk it worked for me.

Sorry to ask such a stupid question, but have you run make clean test && bin/iex?

sweater@thoughtflare ~/github/mine/elixir $ bin/iex 
Erlang R15B02 (erts-5.9.2) [source] [64-bit] [smp:8:8] [async-threads:0] [hipe] [kernel-poll:false]

Interactive Elixir (0.8.3.dev) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> {{},{{}},{{},{{}}}}
{{},{{}},{{},{{}}}}
iex(2)> {v(1),v(1),v(1),v(1)}
{ {{},{{}},{{},{{}}}},
  {{},{{}},{{},{{}}}},
  {{},{{}},{{},{{}}}},
  {{},{{}},{{},{{}}}} }
@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Contributor

As for the :io.columns dilemma —
I'll have a look at IO.inspect and most certainly put call to :io.columns there.

Contributor

manpages commented May 12, 2013

As for the :io.columns dilemma —
I'll have a look at IO.inspect and most certainly put call to :io.columns there.

lib/elixir/lib/binary/inspect.ex
@@ -17,6 +18,34 @@ end
defmodule Binary.Inspect.Utils do
@moduledoc false
+ ## groups aware of depth

This comment has been minimized.

@josevalim

josevalim May 12, 2013

Member

Can we add documentation to those functions too? Maybe move them to the Wadler module? It seems like most inspect implementations are relying on them. :)

@josevalim

josevalim May 12, 2013

Member

Can we add documentation to those functions too? Maybe move them to the Wadler module? It seems like most inspect implementations are relying on them. :)

This comment has been minimized.

@manpages

manpages May 14, 2013

Contributor

@josevalim We shouldn't move that to the Wadler module because it has nothing to do with the algebra itself.
Instead it's an optimization that prevents grouping of docentities that are deep enough.
OTOH we can make Wadler.Utils module or something like that which will have all the convinience functions such as brackets/3, group_maybe/2,3, folddoc/2 and leave only the core algebra implementation in Wadler.
I'm not sure what to do about it, but I'm definitely against moving group*_maybe to wadler.ex.
I'll make a longer commentary about group depth optimization in inspect.ex for now and will
wait for your decision on where should we place those functions. Your word is last. :)

@manpages

manpages May 14, 2013

Contributor

@josevalim We shouldn't move that to the Wadler module because it has nothing to do with the algebra itself.
Instead it's an optimization that prevents grouping of docentities that are deep enough.
OTOH we can make Wadler.Utils module or something like that which will have all the convinience functions such as brackets/3, group_maybe/2,3, folddoc/2 and leave only the core algebra implementation in Wadler.
I'm not sure what to do about it, but I'm definitely against moving group*_maybe to wadler.ex.
I'll make a longer commentary about group depth optimization in inspect.ex for now and will
wait for your decision on where should we place those functions. Your word is last. :)

This comment has been minimized.

@manpages

manpages May 14, 2013

Contributor

@josevalim I gave it more thought and realized that I can move group_maybe to wadler.ex but instead of getting opts and extracting opts[:depth] I can simply and move the duty of looking up current depth to inspect implementations. That way a call to group_maybe will look like

# in inspect implementation
opts = inc_depth(opts)
# ...
group_maybe(
  concat(
    # ...
  ),
  opts[:depth]
)

If you're OK with that, that's what I'll do.

@manpages

manpages May 14, 2013

Contributor

@josevalim I gave it more thought and realized that I can move group_maybe to wadler.ex but instead of getting opts and extracting opts[:depth] I can simply and move the duty of looking up current depth to inspect implementations. That way a call to group_maybe will look like

# in inspect implementation
opts = inc_depth(opts)
# ...
group_maybe(
  concat(
    # ...
  ),
  opts[:depth]
)

If you're OK with that, that's what I'll do.

@@ -0,0 +1,515 @@
+defmodule Wadler do
+ @moduledoc """

This comment has been minimized.

@josevalim

josevalim May 12, 2013

Member

Let's add a link to the paper. :)

@josevalim

josevalim May 12, 2013

Member

Let's add a link to the paper. :)

This comment has been minimized.

@manpages

manpages May 14, 2013

Contributor

@josevalim Oh, I wanted to but forgot to do that as it seems. Will do in a minute.

@manpages

manpages May 14, 2013

Contributor

@josevalim Oh, I wanted to but forgot to do that as it seems. Will do in a minute.

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

@manpages mea culpa, it works

Contributor

yrashk commented May 12, 2013

@manpages mea culpa, it works

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

@manpages can this algorithm be used to split long lines into a sequence of "stri" <> "ngs" ?

Contributor

yrashk commented May 12, 2013

@manpages can this algorithm be used to split long lines into a sequence of "stri" <> "ngs" ?

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

(or is it a bad idea... @josevalim ?)

Contributor

yrashk commented May 12, 2013

(or is it a bad idea... @josevalim ?)

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

Another question:

iex(15)> IO.inspect List.duplicate(String.duplicate("a",10), 10), pretty: true, width: 100
[ "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa" ]

Can this algorithm be smarter and print a few entries a line or is this out of the algorithm's reach?

Contributor

yrashk commented May 12, 2013

Another question:

iex(15)> IO.inspect List.duplicate(String.duplicate("a",10), 10), pretty: true, width: 100
[ "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa" ]

Can this algorithm be smarter and print a few entries a line or is this out of the algorithm's reach?

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Contributor

@yrashk it's possible. I'll implement that later though.
Implementing that both features can be done in very similar way, so I'll do it at once.

Contributor

manpages commented May 12, 2013

@yrashk it's possible. I'll implement that later though.
Implementing that both features can be done in very similar way, so I'll do it at once.

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

@manpages what is possible — string splitting or entry coupling within width?

Contributor

yrashk commented May 12, 2013

@manpages what is possible — string splitting or entry coupling within width?

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 12, 2013

Contributor

@yrashk both.

Contributor

manpages commented May 12, 2013

@yrashk both.

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim May 12, 2013

Member

Is string splitting really a good idea? If we are planning to go with it, we should break strings into many lines using the <> operator, like:

"this is a tooooo long string"

should become something like:

"this is a toooo" <> "o long string"

But I am still not decided it is a good idea.

Member

josevalim commented May 12, 2013

Is string splitting really a good idea? If we are planning to go with it, we should break strings into many lines using the <> operator, like:

"this is a tooooo long string"

should become something like:

"this is a toooo" <> "o long string"

But I am still not decided it is a good idea.

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

I am not sure it is a good idea either, but <> is exactly what I suggested if this is to be done

Contributor

yrashk commented May 12, 2013

I am not sure it is a good idea either, but <> is exactly what I suggested if this is to be done

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim May 12, 2013

Member

So let's not do anything right now regarding String splitting, we can
always add it later. :)

Member

josevalim commented May 12, 2013

So let's not do anything right now regarding String splitting, we can
always add it later. :)

@yrashk

This comment has been minimized.

Show comment
Hide comment
@yrashk

yrashk May 12, 2013

Contributor

agree

On Sun, May 12, 2013 at 1:25 PM, José Valim notifications@github.comwrote:

So let's not do anything right now regarding String splitting, we can
always add it later. :)


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-17784691
.

Contributor

yrashk commented May 12, 2013

agree

On Sun, May 12, 2013 at 1:25 PM, José Valim notifications@github.comwrote:

So let's not do anything right now regarding String splitting, we can
always add it later. :)


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-17784691
.

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim May 14, 2013

Member

It seems we are good to go. Thanks @manpages! @yrashk any more feedback?

One last question, are we handling this scenario already?

iex(15)> IO.inspect List.duplicate(String.duplicate("a",10), 10), pretty: true, width: 100
[ "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa" ]

❤️

Member

josevalim commented May 14, 2013

It seems we are good to go. Thanks @manpages! @yrashk any more feedback?

One last question, are we handling this scenario already?

iex(15)> IO.inspect List.duplicate(String.duplicate("a",10), 10), pretty: true, width: 100
[ "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa",
  "aaaaaaaaaa" ]

❤️

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 14, 2013

Contributor

@josevalim not yet. It's not trivial (if we want to be effective). I'd go with another issue/pull req cycle.

Contributor

manpages commented May 14, 2013

@josevalim not yet. It's not trivial (if we want to be effective). I'd go with another issue/pull req cycle.

@alco

This comment has been minimized.

Show comment
Hide comment
@alco

alco May 14, 2013

Member

Fancy some stress testing? Here's something completely nuts. https://gist.github.com/alco/5579369

This is documentation for Elixir in one Erlang list. You can load it into IEx like this:

{:ok, terms} = :file.consult 'docs.erl'
IO.inspect terms

It takes a couple of seconds for the current IO.inspect to print. With your patch, I wasn't able to see it finish. It eats 100% CPU and gradually eats up memory, then release a piece of memory, then it's it up again -- in a cycle. So it seems to me that it gets into an infinite loop somewhere during the process.

If it's not possible to pretty-print this much data, I would expect to at least be presented with something like initial 100 characters of output and then a message "too much data to format. Use inspect: raw".

Member

alco commented May 14, 2013

Fancy some stress testing? Here's something completely nuts. https://gist.github.com/alco/5579369

This is documentation for Elixir in one Erlang list. You can load it into IEx like this:

{:ok, terms} = :file.consult 'docs.erl'
IO.inspect terms

It takes a couple of seconds for the current IO.inspect to print. With your patch, I wasn't able to see it finish. It eats 100% CPU and gradually eats up memory, then release a piece of memory, then it's it up again -- in a cycle. So it seems to me that it gets into an infinite loop somewhere during the process.

If it's not possible to pretty-print this much data, I would expect to at least be presented with something like initial 100 characters of output and then a message "too much data to format. Use inspect: raw".

@alco

This comment has been minimized.

Show comment
Hide comment
@alco

alco May 14, 2013

Member

More precisely, it takes half a minute for the current IO.inspect to print this data on my machine.

Member

alco commented May 14, 2013

More precisely, it takes half a minute for the current IO.inspect to print this data on my machine.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 15, 2013

Contributor

@alco I made some stress tesing with data structures with definitions up to 5MB.
Here is a sample result for something around 5MB definition — https://gist.github.com/manpages/5558956

Maybe I forgot to make default opts[:limit] for binary?!
Let me check.

Contributor

manpages commented May 15, 2013

@alco I made some stress tesing with data structures with definitions up to 5MB.
Here is a sample result for something around 5MB definition — https://gist.github.com/manpages/5558956

Maybe I forgot to make default opts[:limit] for binary?!
Let me check.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 15, 2013

Contributor

Fixed as it seems —

iex(17)> File.read("/usr/bin/ls")             
{ :ok,
  <<127,69,76,70,2,1,1,0,0,0,0,0,0,0,0,0,2,0,62,0,1,0,0,0,120,72,64,0,0,0,0,0,64,0,0,0,0,0,0,0,16,167,1,0,0,0,0,0,0,0,...>> }
Contributor

manpages commented May 15, 2013

Fixed as it seems —

iex(17)> File.read("/usr/bin/ls")             
{ :ok,
  <<127,69,76,70,2,1,1,0,0,0,0,0,0,0,0,0,2,0,62,0,1,0,0,0,120,72,64,0,0,0,0,0,64,0,0,0,0,0,0,0,16,167,1,0,0,0,0,0,0,0,...>> }
@alco

This comment has been minimized.

Show comment
Hide comment
@alco

alco May 15, 2013

Member

@manpages Have you tested it with my file? Still doesn't work for me.

Member

alco commented May 15, 2013

@manpages Have you tested it with my file? Still doesn't work for me.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 15, 2013

Contributor

@alco oops, sorry. Testing now!

Thank you very much for testing and input!

Contributor

manpages commented May 15, 2013

@alco oops, sorry. Testing now!

Thank you very much for testing and input!

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 15, 2013

Contributor

Yep, it doesn't work here as well. I'll debug tonight.

Contributor

manpages commented May 15, 2013

Yep, it doesn't work here as well. I'll debug tonight.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 18, 2013

Contributor

@alco, fixed by making grouping less deep. It brought up another issue though — the algorithm really freaks out on width overflow, not quite sure how to deal with it.

When no overflow occurs stuff gets printed just fine —

iex(4)> { ExDoc.ModuleNode, List.Chars, "List.Chars", "The List.Chars protocol is responsible for\nconverting a structure to a list (only if applicable).\nThe only function required to be implemented is\n`to_char_list` which does the conversion.\n\nThe `to_char_list` function automatically imported\nby Kernel invokes this protocol.\n", [ { ExDoc.FunctionNode, :to_char_list, 1, "to_char_list/1", nil, "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L14", :def, 14, [ { :thing, [ line: 14 ], nil } ] } ], "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L1", [], :protocol, "List.Chars", 1, nil } 
{ ExDoc.ModuleNode,
  List.Chars,
  "List.Chars",
  "The List.Chars protocol is responsible for\nconverting a structure to a list (only if applicable).\nThe only function required to be implemented is\n`to_char_list` which does the conversion.\n\nThe `to_char_list` function automatically imported\nby Kernel invokes this protocol.\n",
  [ { ExDoc.FunctionNode,
      :to_char_list,
      1,
      "to_char_list/1",
      nil,
      "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L14",
      :def,
      14,
      [{:thing,[line: 14],nil}] } ],
  "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L1",
  [],
  :protocol,
  "List.Chars",
  1,
  nil }
Contributor

manpages commented May 18, 2013

@alco, fixed by making grouping less deep. It brought up another issue though — the algorithm really freaks out on width overflow, not quite sure how to deal with it.

When no overflow occurs stuff gets printed just fine —

iex(4)> { ExDoc.ModuleNode, List.Chars, "List.Chars", "The List.Chars protocol is responsible for\nconverting a structure to a list (only if applicable).\nThe only function required to be implemented is\n`to_char_list` which does the conversion.\n\nThe `to_char_list` function automatically imported\nby Kernel invokes this protocol.\n", [ { ExDoc.FunctionNode, :to_char_list, 1, "to_char_list/1", nil, "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L14", :def, 14, [ { :thing, [ line: 14 ], nil } ] } ], "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L1", [], :protocol, "List.Chars", 1, nil } 
{ ExDoc.ModuleNode,
  List.Chars,
  "List.Chars",
  "The List.Chars protocol is responsible for\nconverting a structure to a list (only if applicable).\nThe only function required to be implemented is\n`to_char_list` which does the conversion.\n\nThe `to_char_list` function automatically imported\nby Kernel invokes this protocol.\n",
  [ { ExDoc.FunctionNode,
      :to_char_list,
      1,
      "to_char_list/1",
      nil,
      "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L14",
      :def,
      14,
      [{:thing,[line: 14],nil}] } ],
  "https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/list/chars.ex#L1",
  [],
  :protocol,
  "List.Chars",
  1,
  nil }
@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim May 18, 2013

Member

What do you mean by "the algorithm really freaks out on width overflow"? The pasted example seems fine.

Member

josevalim commented May 18, 2013

What do you mean by "the algorithm really freaks out on width overflow"? The pasted example seems fine.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 18, 2013

Contributor

@josevalim let me paste somewhere pretty printed version of @alco's docs.erl. :-)

Contributor

manpages commented May 18, 2013

@josevalim let me paste somewhere pretty printed version of @alco's docs.erl. :-)

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 18, 2013

Contributor

Here is the pasted iex output — https://gist.github.com/manpages/5603803.
I'm not sure why this happens but I'll investigate when I've got the time.

Contributor

manpages commented May 18, 2013

Here is the pasted iex output — https://gist.github.com/manpages/5603803.
I'm not sure why this happens but I'll investigate when I've got the time.

@devinus

This comment has been minimized.

Show comment
Hide comment
@devinus

devinus May 30, 2013

Contributor

My only criticism: can we rename the module from wadler.ex to pretty_printer.ex or something? Wadler exposes implementation detail, which could change any time. And I want people to be able to use this module. @manpages @josevalim

Contributor

devinus commented May 30, 2013

My only criticism: can we rename the module from wadler.ex to pretty_printer.ex or something? Wadler exposes implementation detail, which could change any time. And I want people to be able to use this module. @manpages @josevalim

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 31, 2013

Contributor

Dear @devinus!

From my point of view, this module is not pretty printer. It's document
builder to be used in pretty printing implementations.

Currently if implementation of document builder changes and it will no
longer implement Hughes-Wadler document algebra,
then API will be broken and all the pretty printers that use standard
document builder are going to be rewritten.

On the other hand — if API doesn't change then it is still a variation of
Hughes-Wadler document algebra, no matter what's
inside.

On top of things, when we were discussing the question of naming the
module, @josevalim and some other people (logs?)
really enjoyed the idea of calling this module in honor of Dr. Wadler.

~ @manpages.

Contributor

manpages commented May 31, 2013

Dear @devinus!

From my point of view, this module is not pretty printer. It's document
builder to be used in pretty printing implementations.

Currently if implementation of document builder changes and it will no
longer implement Hughes-Wadler document algebra,
then API will be broken and all the pretty printers that use standard
document builder are going to be rewritten.

On the other hand — if API doesn't change then it is still a variation of
Hughes-Wadler document algebra, no matter what's
inside.

On top of things, when we were discussing the question of naming the
module, @josevalim and some other people (logs?)
really enjoyed the idea of calling this module in honor of Dr. Wadler.

~ @manpages.

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim May 31, 2013

Member

I agree with @manpages. Btw, @manpages, did you have some time to investigate the slow down in the current implementation? :)

Member

josevalim commented May 31, 2013

I agree with @manpages. Btw, @manpages, did you have some time to investigate the slow down in the current implementation? :)

@meh

This comment has been minimized.

Show comment
Hide comment
@meh

meh May 31, 2013

Contributor

I agree with @manpages too.

Contributor

meh commented May 31, 2013

I agree with @manpages too.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages May 31, 2013

Contributor

I have fixed performance issue.
What really prevents us to merge this is the fact that long lines (see the
test data provided by truedroid) make pretty printer print something really
strange while on "sane" data it works ok.
Hard to debug such thing and I don't have enough time, so sorry T_T.
On May 31, 2013 6:58 AM, "José Valim" notifications@github.com wrote:

I agree with @manpages https://github.com/manpages. Btw, @manpageshttps://github.com/manpages,
did you have some time to investigate the slow down in the current
implementation? :)


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-18722226
.

Contributor

manpages commented May 31, 2013

I have fixed performance issue.
What really prevents us to merge this is the fact that long lines (see the
test data provided by truedroid) make pretty printer print something really
strange while on "sane" data it works ok.
Hard to debug such thing and I don't have enough time, so sorry T_T.
On May 31, 2013 6:58 AM, "José Valim" notifications@github.com wrote:

I agree with @manpages https://github.com/manpages. Btw, @manpageshttps://github.com/manpages,
did you have some time to investigate the slow down in the current
implementation? :)


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-18722226
.

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 7, 2013

Contributor

It seems that the implementation described in Haskell by Wadler for his document algebra relies on lazy evaluation to solve the problem. If the performance issue still persists it might be a good shot trying out Lindig's[1] approach, where an efficient implementation of the document algebra for a strict language (OCaml) is presented.

1: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.2200

Contributor

brunoro commented Jun 7, 2013

It seems that the implementation described in Haskell by Wadler for his document algebra relies on lazy evaluation to solve the problem. If the performance issue still persists it might be a good shot trying out Lindig's[1] approach, where an efficient implementation of the document algebra for a strict language (OCaml) is presented.

1: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.2200

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages Jun 7, 2013

Contributor

Good catch.
I love how Hughes-Wadler algebra is easy and beautiful though.
I'll see how I can reimplement it.
On Jun 7, 2013 3:44 AM, "Gustavo Brunoro" notifications@github.com wrote:

It seems that the implementation described in Haskell by Wadler for his
document algebra relies on lazy evaluation to solve the problem. If the
performance issue still persists it might be a good shot trying out
Lindig's[1] approach, where an efficient implementation of the document
algebra for a strict language (OCaml) is presented.

1: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.2200


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-19083227
.

Contributor

manpages commented Jun 7, 2013

Good catch.
I love how Hughes-Wadler algebra is easy and beautiful though.
I'll see how I can reimplement it.
On Jun 7, 2013 3:44 AM, "Gustavo Brunoro" notifications@github.com wrote:

It seems that the implementation described in Haskell by Wadler for his
document algebra relies on lazy evaluation to solve the problem. If the
performance issue still persists it might be a good shot trying out
Lindig's[1] approach, where an efficient implementation of the document
algebra for a strict language (OCaml) is presented.

1: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.34.2200


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-19083227
.

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 12, 2013

Contributor

I tried implementing the strict version of the Wadler document algebra based on @manpages code: https://gist.github.com/brunoro/5762689

It seems to work fine, but it's still missing functions providing the same interface as the code from this pull request.
I can work on that later this week so we could try out that huge documentation input file.

Contributor

brunoro commented Jun 12, 2013

I tried implementing the strict version of the Wadler document algebra based on @manpages code: https://gist.github.com/brunoro/5762689

It seems to work fine, but it's still missing functions providing the same interface as the code from this pull request.
I can work on that later this week so we could try out that huge documentation input file.

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages Jun 12, 2013

Contributor

Wow, thank you, sir!
Now we have some code bases to merge, but that's way easier when we have a
working implementation.
I'll try to get to this ticket ASAP (next weekend-ish) and update the pull
request using your Wadler-Lindig module.
Again, thanks!

On Wed, Jun 12, 2013 at 6:56 AM, Gustavo Brunoro
notifications@github.comwrote:

I tried implementing the strict version of the Wadler document algebra
based on @manpages https://github.com/manpages code:
https://gist.github.com/brunoro/5762689

It seems to work fine, but it's still missing functions providing the same
interface as the code from this pull request.
I can work on that later this week so we could try out that huge
documentation input file.


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-19305753
.

Fly safe,
Sloz

Contributor

manpages commented Jun 12, 2013

Wow, thank you, sir!
Now we have some code bases to merge, but that's way easier when we have a
working implementation.
I'll try to get to this ticket ASAP (next weekend-ish) and update the pull
request using your Wadler-Lindig module.
Again, thanks!

On Wed, Jun 12, 2013 at 6:56 AM, Gustavo Brunoro
notifications@github.comwrote:

I tried implementing the strict version of the Wadler document algebra
based on @manpages https://github.com/manpages code:
https://gist.github.com/brunoro/5762689

It seems to work fine, but it's still missing functions providing the same
interface as the code from this pull request.
I can work on that later this week so we could try out that huge
documentation input file.


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-19305753
.

Fly safe,
Sloz

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim Jun 12, 2013

Member

Excellent work @brunoro! Would you mind doing your work on a fork of @manpages? This way, @manpages can just merge your commits or we can just merge your fork, so everyone gets the proper credentials at the end! :D

Member

josevalim commented Jun 12, 2013

Excellent work @brunoro! Would you mind doing your work on a fork of @manpages? This way, @manpages can just merge your commits or we can just merge your fork, so everyone gets the proper credentials at the end! :D

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 15, 2013

Contributor

I've implemented the code on that gist and integrated it with the Wadler module from this pull request on my fork (https://github.com/brunoro/elixir).

However @manpages's implementation still shows better performance than my implementation of the strict pretty printer proposed by Lindig.
The results (mean of 10 runs per data structure depth):

$ wadler-lindig-elixir/bin/elixir measure.ex 
depth   mean        std
50      9489.20     10099.79
100     18204.40    227.90
150     38173.80    1220.10
200     85592.20    1777.51
250     154213.70   3656.89
300     232109.40   4574.05
350     386209.10   12664.53
400     569718.80   16954.76
450     861126.90   26857.86
500     1202718.60  33078.20

$ wadler-elixir/bin/elixir measure.ex 
depth   mean        std
50      13520.10    12023.12
100     16894.90    1507.81
150     29082.20    2415.77
200     65267.90    4072.26
250     109254.30   8464.41
300     127620.10   1093.82
350     162079.80   2341.23
400     241817.50   2446.05
450     392840.30   1899.71
500     547663.20   7887.47

The code I used on the benchmarks is here: https://gist.github.com/brunoro/5786164
Both implementations got OOM killed on my machine using those Erlang docs posted before =/

Contributor

brunoro commented Jun 15, 2013

I've implemented the code on that gist and integrated it with the Wadler module from this pull request on my fork (https://github.com/brunoro/elixir).

However @manpages's implementation still shows better performance than my implementation of the strict pretty printer proposed by Lindig.
The results (mean of 10 runs per data structure depth):

$ wadler-lindig-elixir/bin/elixir measure.ex 
depth   mean        std
50      9489.20     10099.79
100     18204.40    227.90
150     38173.80    1220.10
200     85592.20    1777.51
250     154213.70   3656.89
300     232109.40   4574.05
350     386209.10   12664.53
400     569718.80   16954.76
450     861126.90   26857.86
500     1202718.60  33078.20

$ wadler-elixir/bin/elixir measure.ex 
depth   mean        std
50      13520.10    12023.12
100     16894.90    1507.81
150     29082.20    2415.77
200     65267.90    4072.26
250     109254.30   8464.41
300     127620.10   1093.82
350     162079.80   2341.23
400     241817.50   2446.05
450     392840.30   1899.71
500     547663.20   7887.47

The code I used on the benchmarks is here: https://gist.github.com/brunoro/5786164
Both implementations got OOM killed on my machine using those Erlang docs posted before =/

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim Jun 15, 2013

Member

@brunoro good work! The performance on both cases is still much lower than expected, considering that erlang inspect is employing the same algorithm. There are things that will make Elixir's one a bit slower, but not considerably slower.

@brunoro could you please benchmark also the implementation we have in master today? Maybe the issue is not with the pretty print algorithm but the current inspect mechanism. Maybe there is a list append or bad binary concat hidden somewhere.

Member

josevalim commented Jun 15, 2013

@brunoro good work! The performance on both cases is still much lower than expected, considering that erlang inspect is employing the same algorithm. There are things that will make Elixir's one a bit slower, but not considerably slower.

@brunoro could you please benchmark also the implementation we have in master today? Maybe the issue is not with the pretty print algorithm but the current inspect mechanism. Maybe there is a list append or bad binary concat hidden somewhere.

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 15, 2013

Contributor

Here are the results for the current implementation on master, there's no sign of any performance bottlenecks:

$ elixir measure.ex 
depth   mean        std
50      1915.60     2461.86
100     2317.00     222.63
150     2689.40     161.73
200     4085.30     197.06
250     6466.10     226.91
300     6224.70     147.17
350     9047.50     229.80
400     10661.90    383.54
450     13150.60    384.30
500     15310.70    297.61

I'll profile both pretty printing implementations we have now looking for optimization oportunities.

EDIT: I've plotted the results here

Contributor

brunoro commented Jun 15, 2013

Here are the results for the current implementation on master, there's no sign of any performance bottlenecks:

$ elixir measure.ex 
depth   mean        std
50      1915.60     2461.86
100     2317.00     222.63
150     2689.40     161.73
200     4085.30     197.06
250     6466.10     226.91
300     6224.70     147.17
350     9047.50     229.80
400     10661.90    383.54
450     13150.60    384.30
500     15310.70    297.61

I'll profile both pretty printing implementations we have now looking for optimization oportunities.

EDIT: I've plotted the results here

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 15, 2013

Contributor

String concatenation on the render function was taking most of time. I changed that function to write a string list and then join it using Enum.join and got amazing results (:

$ wadler-lindig/bin/elixir pprint/benchmark.ex 
depth   mean        std
50      7074.60     7205.38
100     8636.70     499.15
150     9770.10     264.16
200     14181.40    307.37
250     19421.50    553.42
300     21570.70    1162.76
350     28332.20    3893.04
400     32111.80    3262.23
450     35212.60    1590.58
500     43002.60    1711.86

Already pushed those results to my fork. I'll also try the same optimization on the code from the pull request and post the results later.

EDIT: another plot here

Contributor

brunoro commented Jun 15, 2013

String concatenation on the render function was taking most of time. I changed that function to write a string list and then join it using Enum.join and got amazing results (:

$ wadler-lindig/bin/elixir pprint/benchmark.ex 
depth   mean        std
50      7074.60     7205.38
100     8636.70     499.15
150     9770.10     264.16
200     14181.40    307.37
250     19421.50    553.42
300     21570.70    1162.76
350     28332.20    3893.04
400     32111.80    3262.23
450     35212.60    1590.58
500     43002.60    1711.86

Already pushed those results to my fork. I'll also try the same optimization on the code from the pull request and post the results later.

EDIT: another plot here

@manpages

This comment has been minimized.

Show comment
Hide comment
@manpages

manpages Jun 15, 2013

Contributor

Hello, I have just read the discussion.
Great that things get done and we've got some proper profiling!
I'll prioritize this issue when I get home from EUC.

Thank you so much, Mr. Brunoro.
On Jun 16, 2013 12:52 AM, "Gustavo Brunoro" notifications@github.com
wrote:

String concatenation on the render function was taking most of time. I
changed that function to write a string list and then join it using
Enum.join and got amazing results (:

$ wadler-lindig/bin/elixir pprint/benchmark.ex
depth mean std
50 7074.60 7205.38
100 8636.70 499.15
150 9770.10 264.16
200 14181.40 307.37
250 19421.50 553.42
300 21570.70 1162.76
350 28332.20 3893.04
400 32111.80 3262.23
450 35212.60 1590.58
500 43002.60 1711.86

Already pushed those results to my fork. I'll also try the same
optimization on the code from the pull request and post the results later.


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-19504817
.

Contributor

manpages commented Jun 15, 2013

Hello, I have just read the discussion.
Great that things get done and we've got some proper profiling!
I'll prioritize this issue when I get home from EUC.

Thank you so much, Mr. Brunoro.
On Jun 16, 2013 12:52 AM, "Gustavo Brunoro" notifications@github.com
wrote:

String concatenation on the render function was taking most of time. I
changed that function to write a string list and then join it using
Enum.join and got amazing results (:

$ wadler-lindig/bin/elixir pprint/benchmark.ex
depth mean std
50 7074.60 7205.38
100 8636.70 499.15
150 9770.10 264.16
200 14181.40 307.37
250 19421.50 553.42
300 21570.70 1162.76
350 28332.20 3893.04
400 32111.80 3262.23
450 35212.60 1590.58
500 43002.60 1711.86

Already pushed those results to my fork. I'll also try the same
optimization on the code from the pull request and post the results later.


Reply to this email directly or view it on GitHubhttps://github.com/elixir-lang/elixir/pull/1047#issuecomment-19504817
.

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 15, 2013

Contributor

Just wrapping up here, I've implemented the same optimization on @manpages code and got really good results too.

An experiment with deeper data structures and discarding the results from the versions using the <> operator shows that the stric implementation is still slightly faster than the one relying on lazy evaluation. Here are the results; note that we have linear performance for both implementations now (:

Contributor

brunoro commented Jun 15, 2013

Just wrapping up here, I've implemented the same optimization on @manpages code and got really good results too.

An experiment with deeper data structures and discarding the results from the versions using the <> operator shows that the stric implementation is still slightly faster than the one relying on lazy evaluation. Here are the results; note that we have linear performance for both implementations now (:

@devinus

This comment has been minimized.

Show comment
Hide comment
@devinus

devinus Jun 16, 2013

Contributor

@brunoro @josevalim I have a feeling that a lot of the performance could be improved by simply prepending to a iolist and using iolist_to_binary at the end, which is supposed to be Super Fast(™️). It looks like there's a lot of binaries being created.

Contributor

devinus commented Jun 16, 2013

@brunoro @josevalim I have a feeling that a lot of the performance could be improved by simply prepending to a iolist and using iolist_to_binary at the end, which is supposed to be Super Fast(™️). It looks like there's a lot of binaries being created.

@devinus

This comment has been minimized.

Show comment
Hide comment
@devinus

devinus Jun 16, 2013

Contributor

@brunoro Can you benchmark an implementation that uses iolist_to_binary instead of Enum.join?

Contributor

devinus commented Jun 16, 2013

@brunoro Can you benchmark an implementation that uses iolist_to_binary instead of Enum.join?

@brunoro

This comment has been minimized.

Show comment
Hide comment
@brunoro

brunoro Jun 16, 2013

Contributor

@devinus here it goes: http://i.imgur.com/EUzREMC.png. There's a tiny performance gain using iolist_to_binary, which is always good.

We should continue this discussion on #1267

Contributor

brunoro commented Jun 16, 2013

@devinus here it goes: http://i.imgur.com/EUzREMC.png. There's a tiny performance gain using iolist_to_binary, which is always good.

We should continue this discussion on #1267

@devinus

This comment has been minimized.

Show comment
Hide comment
@devinus

devinus Jun 17, 2013

Contributor

@brunoro Awesome! I'm also willing to bet there's a memory savings as well because less binaries are created on the heap.

Contributor

devinus commented Jun 17, 2013

@brunoro Awesome! I'm also willing to bet there's a memory savings as well because less binaries are created on the heap.

@josevalim

This comment has been minimized.

Show comment
Hide comment
@josevalim

josevalim Jun 17, 2013

Member

Closing this one in favor of the new pull request, thanks everyone! ❤️ 💚 💙 💛 💜

Member

josevalim commented Jun 17, 2013

Closing this one in favor of the new pull request, thanks everyone! ❤️ 💚 💙 💛 💜

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment