Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vertically align tokens #41

Closed
HarrisonGrodin opened this issue Dec 25, 2021 · 18 comments
Closed

Vertically align tokens #41

HarrisonGrodin opened this issue Dec 25, 2021 · 18 comments
Labels
aesthetics Improvement to code formatter output

Comments

@HarrisonGrodin
Copy link
Collaborator

We should at least align delimiters, imo, e.g.:

val apple  = 0
and orange = 1

val () =
  case foo of
    (NONE, _)        => ()
  | (_, NONE)        => ()
  | (SOME x, SOME y) => ()

datatype apple = Foo | Bar
and orange     = Baz

I would also prefer right-aligning keywords before name definitions so names are aligned, e.g.:

eqtype apple
   and orange

datatype apple  = Foo | Bar
     and orange = Baz

Unsure if this is a popular opinion, though.

@HarrisonGrodin HarrisonGrodin added the aesthetics Improvement to code formatter output label Dec 25, 2021
@HarrisonGrodin
Copy link
Collaborator Author

@shwestrick Thinking about how to implement this. Instinctively, I want to compute the length of each column and (left or right) align by padding with spaces. However, I'm not sure how this would work with comments and non-atomic components (e.g., ('a, 'b, 'verylong...) apple). Do you think something like this would be workable, or perhaps this alignment should be a doc operation?

@shwestrick
Copy link
Owner

shwestrick commented Dec 26, 2021

Yeah I've thought a bit about this and I'm not sure what the best solution is.

The simplest solution, from the layout perspective, is something like

datatype doc =
...
| Table of {numRows: int, numCols: int, elem: (int * int) -> token}

And you could imagine augmenting it with padding/justification info, like a function int -> {Left | Right | Center} which describes how to justify each column.

But this kinda sucks for actually writing formatter cases.

@shwestrick
Copy link
Owner

It might be possible to instead allow each element of the table to be a doc, with nested tables. Not sure how to lay those out.

@shwestrick
Copy link
Owner

I would also prefer right-aligning keywords before name definitions so names are aligned

Personally I find this really difficult to read, because I look to the left to see where the next declaration begins, rather than at the names.

This maybe isn't too bad?

eqtype apple
and    orange

datatype apple  = Foo | Bar
and      orange = Baz

@HarrisonGrodin
Copy link
Collaborator Author

It might be possible to instead allow each element of the table to be a doc, with nested tables. Not sure how to lay those out.

Hmm, yeah... I think we'd need to support at least a few tokens in a row (Token.t Seq.t?), due to type parameters:

datatype ('a, 'b) apple = Foo 'a | Bar of 'b
and orange              = Baz of (int, int) apple

@HarrisonGrodin
Copy link
Collaborator Author

HarrisonGrodin commented Dec 27, 2021

Personally I find this really difficult to read, because I look to the left to see where the next declaration begins, rather than at the names.

Ah, I see. My eye tends to center around the sort/name, I think (datatype foo, and bar), which makes right alignment read more naturally.

I wonder how each of our approaches fare as the length of type names vary.

Some sample cases:

datatype foo = Foo
and bar      = Bar

(* left/left *)
datatype foo = Foo
and      bar = Bar

(* right/left, right/right *)
datatype foo = Foo
     and bar = Bar
datatype ('a, 'b) apple = Foo
and orange              = Bar

(* left/left *)
datatype ('a, 'b) apple = Foo
and      orange         = Bar

(* right/left *)
datatype ('a, 'b) apple = Foo
     and orange         = Bar

(* right/right *)
datatype ('a, 'b) apple = Foo
     and         orange = Bar
datatype apple      = Foo
and ('a, 'b) orange = Bar

(* left/left *)
datatype apple           = Foo
and      ('a, 'b) orange = Bar

(* right/left *)
datatype apple           = Foo
     and ('a, 'b) orange = Bar

(* right/right *)
datatype           apple = Foo
     and ('a, 'b) orange = Bar

(Edit: for type/eqtype/datatype, I suppose there's also the question of whether type parameters would be in a column of their own...)

@HarrisonGrodin
Copy link
Collaborator Author

HarrisonGrodin commented Dec 27, 2021

Just realizing - the left vs. right justification issue only arises in a few relatively uncommon cases for and: afaict val rec, type, eqtype, datatype, exception, structure, signature, functor. The other cases, val and fun (most common?) don't matter anyway, since they're three letters.

EDIT: where, too.

@HarrisonGrodin
Copy link
Collaborator Author

Just remembered another use case for this (in specs):

val twelve : int
and thirteen : int

(* formatted to *)
val twelve   : int
and thirteen : int

If values are declared together, should probably align their :s, so asymmetries aren't introduced simply due to variable name length.

@HarrisonGrodin
Copy link
Collaborator Author

Perhaps also:

type t =
  { apple : t
  , watermelon : t
  , orange : t
  }

(* formatted to *)
type t =
  { apple      : t
  , watermelon : t
  , orange     : t
  }

@shwestrick
Copy link
Owner

@HarrisonGrodin I'm thinking we could try adding this to the interface

val table: {numRows: int, numCols: int, elem: {row:int, col:int} -> doc} -> doc

I'm picturing that each doc element would be forcibly flattened, so that there are no "multirows" of the resulting layout. Tables themselves would be rigid. And, we would throw a runtime error if an element of a table is rigid (disallowing e.g. nested tables but of course other problematic elements as well).

I don't think this would be too difficult to implement. But it leaves the question of whether or not it is usable for writing pretty cases. One hangup is that you would need to be careful to not put possibly rigid things inside a table. Thoughts?

@shwestrick
Copy link
Owner

Here's one place where this would be a pain.

val x : foo
val y : bar longbar very long bar
          definitely needs multiple
          lines bar
val z : baz

Essentially this is a case of "wrapping" within a cell, but of course it's not the same semantics as table cell wrapping.

@HarrisonGrodin
Copy link
Collaborator Author

And, we would throw a runtime error if an element of a table is rigid (disallowing e.g. nested tables but of course other problematic elements as well).

This seems problematic to me; it seems important that all code have at least some autoformatted output.

Forcible flattening sounds somewhat reasonable, although this probably shouldn't apply to the last doc on a line. For example:

(* should align ='s *)
val pat1 = exp1
and pat2 = exp2

val x   =
  case bar of
    nil     => 0
  | x :: xs => x + prod xs
and foo = exp2

A similar issue shows up for many other and-able keywords, such as fun, type, structure, and datatype, all of which can have multi-line/possibly rigid docs at the end of a row.

@HarrisonGrodin
Copy link
Collaborator Author

HarrisonGrodin commented Jan 4, 2022

Oh, just realized - we probably want nested tables, as long as they're nested only at the end of another table, in cases like this:

type short            =
  { label1 : t1
  , label2 : t2
  }
and  veryveryverylong =
  { label3 : t3
  , label4 : t4
  , label5 : t5
  }

and this:

case foo of
  pat       => 1
| longpat   => (
    case bar of
      pat       => 2
    | longpat   => 3
    | longerpat => 4
  )
| longerpat => 5

@shwestrick
Copy link
Owner

shwestrick commented Jan 4, 2022

And, we would throw a runtime error if an element of a table is rigid (disallowing e.g. nested tables but of course other problematic elements as well).

This seems problematic to me; it seems important that all code have at least some autoformatted output.

Well, keep in mind these are implementation details. It restricts how you use tables, not what the pretty-printer can do. If there was a runtime error due to rigid inside table, that would be a bug in the implementation of the pretty printer.

Forcible flattening sounds somewhat reasonable, although this probably shouldn't apply to the last doc on a line.

You can do this without changing the table definition. If the last doc shouldn't be flattened, then we can just take it out of the table, and prettify it separately.

type short            =
  { label1 : t1
  , label2 : t2
  }
and  veryveryverylong =
  { label3 : t3
  , label4 : t4
  , label5 : t5
  }

More of an aesthetics point, but I think this is a great example of where we need to be careful with vertical alignment. The first equals definitely should not be way out there like that ;)

@HarrisonGrodin
Copy link
Collaborator Author

If there was a runtime error due to rigid inside table, that would be a bug in the implementation of the pretty printer.

Ohh, I see what you mean. 👍

You can do this without changing the table definition. If the last doc shouldn't be flattened, then we can just take it out of the table, and prettify it separately.

Not sure I follow this - if the last doc in each row shouldn't be flattened, how could we insert it after each table row if the table was a single compound doc?

The first equals definitely should not be way out there like that ;)

Hmm... I can't decide if I agree or not! I like the principle that if definitions are given with and, they're "combined", and thus asymmetries induced due to name lengths should be modded out by; this means that programmers feel comfortable using any names, rather than trying to align word lengths to bring out the symmetry (e.g., hi/lo instead of high/low).
My instinct is that it's unusual that you would have short and veryveryverylong defined together via and, but if you do, it makes sense for the ='s to be aligned.

@HarrisonGrodin
Copy link
Collaborator Author

Another case for potential alignment I was just thinking about - of in datatype declarations. For example:

datatype foo =
  Apple of int
| Pineapple of string

(* should perhaps be formatted to *)
datatype foo =
  Apple     of int
| Pineapple of string

I think this is probably sensible, since it's dual to aligning the :s in a record definition.

@shwestrick
Copy link
Owner

You can do this without changing the table definition. If the last doc shouldn't be flattened, then we can just take it out of the table, and prettify it separately.

Not sure I follow this - if the last doc in each row shouldn't be flattened, how could we insert it after each table row if the table was a single compound doc?

You would need to have in hand the last doc of the table anyway, in order to define what the elements of the table are. So all you need to do is just not give that particular doc to the table, and instead do something like beside (mytable, makePretty last).

it makes sense for the ='s to be aligned.

I think I agree only if the =s are touching, i.e. no extra rows in between. Otherwise, the symmetry is lost.

Actually, this might be a nice general principle: vertical alignment only when the aligned tokens are on adjacent lines.

@HarrisonGrodin
Copy link
Collaborator Author

HarrisonGrodin commented Jan 4, 2022

You would need to have in hand the last doc of the table anyway, in order to define what the elements of the table are. So all you need to do is just not give that particular doc to the table, and instead do something like beside (mytable, makePretty last).

Think I might be missing something here; given last1 through lastn, we'd want beside (row1, last1) $$ ... $$ beside (rown, lastn), right?

I think I agree only if the =s are touching, i.e. no extra rows in between. Otherwise, the symmetry is lost.

Actually, this might be a nice general principle: vertical alignment only when the aligned tokens are on adjacent lines.

Hmm, that's interesting; I think I'm okay with either. (I still feel a bit of the symmetry across multiple lines, but your suggestion might be the less obscure style.) It would make table formatting easier, I suppose (given multiline rows, just display in sequence with no alignment?). What should happen if some of the =s are touching but not others, though? e.g.,

type foo = int
and  short =
  { label1 : t1
  , label2 : t2
  }
and  veryveryverylong =
  { label3 : t3
  , label4 : t4
  , label5 : t5
  }

Should the first two =s be aligned; all three; none? Similar issue with =>'s in case expressions:

case foo of
  pat =>
    long multiline expression
| longpat => 1
| longerpat => 2

I'm perfectly happy with always aligning the =>s no matter what (and I'd prefer this to no alignment), since on a large scale it suggests that the case arms are parallel to one another, even if one happens to be particularly long. Perhaps there's a clever middle ground, though?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aesthetics Improvement to code formatter output
Projects
None yet
Development

No branches or pull requests

2 participants