dhall-format removes comments #145

psibi · 2017-09-29T08:51:09Z

Example file:

--- This is dhall file
{ foo = "jax" }

The output it gives:

{ foo = "jax" }


psibi · 2017-09-29T08:56:58Z

Going through the code, it seems to be a known issue. :) Keeping this open anyway.

Gabriella439 · 2017-09-29T13:37:39Z

One thing I can do as an immediate workaround is to have the formatter preserve leading whitespace and comments at the beginning of the file. Then, if you need comments for a subexpression, you at least have the workaround of splitting it out into another file.

Partial fix for #145. Before this change, `dhall-format` would get rid of all comments when formatting code. After this change, `dhall-format` preserves any leading comments and whitespace so that users can at least add top-level comment headers to their files.

f-f · 2018-11-01T09:53:48Z

@Gabriel439 What's the main challenge to tackle in order to fix this? As far as I remember the main problem is that it'd be hard to associate comment fragments to other AST nodes, right? (as in: a comment line that comes before some code goes with the next node, but what to do in case of a comment fragment on the same line? Also how about the alignments? And does this mean that comments would slow down all operations, since it's a whole lot of packing-unpacking dummy AST comment nodes?)

ocharles · 2018-11-01T10:09:09Z

The GHC approach is not to put comments in the AST, but to have a separate map, `Map (NodeType, SrcLoc) Annotation`. As you traverse the AST you have `SrcLoc`s and can look up the annotations for the node you're at. That's one approach. Or we could take a "trees that grow" approach and erase comments from the AST for anything but formatting.
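
A minimal sketch of the side-table idea, written in Python for brevity rather than Haskell. The `SrcLoc`, `Node`, and `collect_comments` names are hypothetical and only illustrate the lookup pattern; this is not GHC's or dhall's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical types for illustration only.
@dataclass(frozen=True)
class SrcLoc:
    line: int
    column: int

@dataclass
class Node:
    kind: str                      # e.g. "RecordLit", "Field"
    loc: SrcLoc
    children: list = field(default_factory=list)

def collect_comments(node, annotations):
    """Walk the AST and pair each node with the comments stored for it in the side table."""
    found = []
    comments = annotations.get((node.kind, node.loc))
    if comments:
        found.append((node.kind, comments))
    for child in node.children:
        found.extend(collect_comments(child, annotations))
    return found

# The AST itself carries no comments; they live in a separate map
# keyed by (node kind, source location).
tree = Node("RecordLit", SrcLoc(1, 1), [Node("Field", SrcLoc(1, 3))])
annotations = {("Field", SrcLoc(1, 3)): ["-- Used in radio selects"]}
result = collect_comments(tree, annotations)
```

Code that does not care about comments (type checking, normalization) never touches `annotations`, which is the appeal of keeping them out of the tree.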


Gabriella439 · 2018-11-01T14:48:03Z

There is also a third solution, which is a "low-tech" formatter that doesn't actually parse the AST but rather just scans the text and formats as it goes. I believe this is how go fmt works.

However, I think the simplest and most robust solution is the second one that @ocharles mentioned which is just to preserve parsed whitespace in the syntax tree using the Noted constructors.

That still doesn't get us all the way, though, because intelligently preserving comments when formatting code is still difficult even when your AST preserves whitespace and comments. Here are some pathological cases to consider when designing a comment-preserving formatting algorithm:

The formatter wants to format an expression as one line since it's less than 80 characters, but there is a line comment in the way:
```
{ x = 1  -- !
, y = 2
}
```

Same as the previous example, but now there is a multi-line comment in the way:

```
{ x = 1  {- !
            !
         -}
, y = 2
}
```

The formatter wants to reindent a value but doing so might break the indentation of a multi-line comment:
```
[ 1
, 2
...
,     99 {- !
            !
         -}
]
```
Same as the previous example, except that the user wrote the multi-line comment using multiple single-line comments (i.e. `--`), so the formatter doesn't know that they are supposed to be indented together:
```
[ 1
, 2
...
,     99 -- !
         -- !
]
```
User inserts comments preceding syntactic elements that we want to horizontally align:
```
[ 1
, 2
...
{- ! -} , 99
]
```

User adds a comment that goes beyond the desired column limit:

```
1 -- !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
```

User adds whitespace before a comment pushing it beyond the desired column limit:

```
1                                                                                          -- !
```

macalinao · 2019-03-17T02:02:17Z

I'd like to have comments on record fields if possible. This doesn't work right now.

let Block
	: Type
	= { id :
		  Text
		  -- Used in radio selects, checkbox selects
	  , label :
		  Text
	  , fieldType :
		  Text
	  , options :
		  List { id : Text, label : Text }
	  }

gets nuked.

feliksik · 2019-07-06T08:53:03Z

If we compare different styles and techniques of formatting: one aspect of gofmt is that it preserves line breaks. It may compact many line breaks to `\n\n`, but other than that, the user has some freedom in formatting. gofmt then mainly manages indentation, not breaks.

I think this is useful. The fact that some dhall maps are formatted in one line and some are not (based on line lengths) is not something I enjoy.

This may warrant a separate issue, but if you want to preserve line breaks, go for AST-style formatting, and include comments, then line breaks would also have to be added to the AST.
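
The line-break compaction described above can be sketched in a few lines. This is an illustration in Python of the general idea only, not gofmt's actual algorithm; `compact_blank_lines` is a hypothetical name, and the sketch ignores whitespace-only lines and multi-line string literals.

```python
import re

def compact_blank_lines(source: str) -> str:
    """Collapse every run of blank lines to a single blank line, keeping all other breaks."""
    # Three or more consecutive newlines mean two or more blank lines in a row.
    return re.sub(r"\n{3,}", "\n\n", source)

example = "let x = 1\n\n\n\nlet y = 2\nin  x + y\n"
compacted = compact_blank_lines(example)
```

The single break between `let y = 2` and `in  x + y` is left alone, so the author still decides where lines end.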

Gabriella439 · 2019-07-06T22:49:23Z

@feliksik: The Haskell AST does preserve the original source code for all nodes, including whitespace/newlines, but it's still not clear to me if only dealing with indentation is enough.

For example, one difference between Dhall and Go is that Dhall is less whitespace-sensitive than Go is. For example, in Go, this is syntactically valid:

func main() {
	fmt.Println("Hello, world!")
}

... but this is not valid

func main()
{
	fmt.Println("Hello, world!")
}

If the latter were valid, then gofmt would have to delete the newline between `main()` and the opening brace to format the code according to the standard style.

Because Dhall is not as restrictive as Go, it would have to accommodate all sorts of weirdness if it took greater care to preserve the original newlines.

feliksik · 2019-07-07T18:55:40Z

@Gabriel439 thank you for your response.

> Because Dhall is not as restrictive as Go, it would have to accommodate all sorts of weirdness if it took greater care to preserve the original newlines.

This is a good point! Indeed, only dealing with indentation may not be enough to make it great. If I understand correctly, you would need some rules (that are ultimately arbitrary), and these would need to be a bit more sophisticated than with Go.

I tried to get this straight for myself, so I can just as well share the examples I made:

This would be fine (it's current formatting):

λ(x : Text) → { a = x, b = x }

As would possibly:

λ(x : Text) →
  { a = x
  , b = x
  }

but this would probably be considered too weird:

λ(
  x : Text) →
  {
   a = x,
   b = x
  }

But you may consider such rules are undesirable (apart from the existing 80-char-line rewrite logic).

Note that Go also implements such a heuristic, to a degree. The following is already gofmt-formatted, but one newline would be removed if the comment were removed:

package main

import "fmt"

func main() {
        a := map[string]string{"k1": "v1", "k2": "v2"}

        b := map[string]string{
                "k1": "v1", "k2": "v2",
        }

        c := map[string]string{
                "k1": "v1",
                "k2": "v2",
        }
        d := map[string]string{
                "k1": "v1", "k2": // note that this newline WOULD be removed if you remove this comment
                "v2",
        }
        e := map[string]string{
                "k1": "v1",

                "k2": "v2", // extra line before
        }

        fmt.Println(
                "Hello, world!", a, b,

                c,
                d, e,
        )
}

So to summarize: if I get it right, there are two take-home points here:

- Line rewrite heuristics can be formulated, and it's arbitrary how far you want to take this.
- The main point of this ticket (preserving comments) could possibly be implemented (and made easier?) by only respecting newlines that are preceded by a comment (`--`).

Gabriella439 · 2019-07-07T19:13:30Z

@feliksik: Is there a document or code describing the go fmt algorithm? It would help a lot if I could crib from that when working on the dhall format equivalent

feliksik · 2019-07-07T21:22:14Z

Hmm I am not aware of such a document.
The code is not too long, though: https://golang.org/src/cmd/gofmt/gofmt.go . I didn't study it yet, though.

I did find http://journal.stuffwithstuff.com/2015/09/08/the-hardest-program-ive-ever-written/ , which elaborates how hard this problem can become, based on the requirements. But this aims for much more sophisticated results.

By all means, don't let this spoil the work that can be done on preserving comments in the shorter term. It will be of great value, and much more important than the newlines.

Profpatsch · 2019-07-09T11:12:50Z

Personally, I do like the current dhall fmt behaviour, especially since dhall is by design a line-break and leading whitespace insensitive language.

But I think this discussion is derailing the original issue too much and should move to a different issue.

lorenzo · 2019-07-09T15:09:56Z

What about starting with simple rules for preserving comments such as "only comments in their own lines are preserved"?

I think this would go a long way, at least for documenting functions and types at the top level
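
A rough sketch of that rule, in Python for illustration: keep only `--` comments that sit alone on their line. The function name `own_line_comments` is hypothetical, and the sketch deliberately ignores block comments (`{- -}`) and lines that happen to fall inside multi-line string literals.

```python
def own_line_comments(source: str):
    """Return (line_number, text) for each `--` comment that sits alone on its line."""
    kept = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("--"):
            kept.append((number, stripped))
    return kept

example = '-- Identifier of the block\nlet id = "a"  -- dropped under this rule\nin  id\n'
found = own_line_comments(example)
```

Under this rule the header comment survives, while the comment that shares a line with code is still lost.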

Gabriella439 · 2019-07-09T15:31:03Z

I really liked the idea of preserving newlines that follow a comment. I'm currently trying to see how easy it would be to implement

Gabriella439 · 2019-07-09T16:07:04Z

Alright, here's the rough idea I have for how to preserve one trailing comment per AST node.

First, some background: every node in the AST currently keeps track of its source code, including trailing whitespace (but not leading whitespace).

So what we can do for each node is to check the trailing whitespace and behave differently under the following three conditions:

1. No trailing comment

   In this case, don't preserve the trailing whitespace for that node at all, which is the current behavior.

2. Trailing comment beginning on the same line

   Preserve the comment and align it against the right-hand side of the expression (indenting/dedenting the comment as necessary if it's a multi-line comment).

For example, this expression:
```
let x =   1   {- Foo
                 Bar
              -}



in  x
```
... would be formatted as:
```
let x =
      1 {- Foo
           Bar
        -}

in  x
```
3. Trailing comment beginning on another line

   Preserve the comment and align it below the expression to the left-hand side (indenting/dedenting the comment as necessary).

For example, this expression:
```
let x = 1


{- Foo
   Bar
-}

in  x
```
... would be formatted as:
```
let x =
      1
      {- Foo
         Bar
      -}

in  x
```

We could also add special cases for preserving comments right before a let binding or a record/union key since we can preserve their interior whitespace, too.
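
The three-way case split above can be sketched as a small classifier over a node's trailing text. This is a Python illustration under the assumption that the trailing text contains only whitespace and comments; `classify_trailing` is a hypothetical name, not part of the dhall codebase.

```python
def classify_trailing(whitespace: str) -> str:
    """Classify a node's trailing whitespace into the three cases described above."""
    # Find where the first comment starts, whether line (`--`) or block (`{-`).
    starts = [i for i in (whitespace.find("--"), whitespace.find("{-")) if i != -1]
    if not starts:
        return "no comment"
    first = min(starts)
    # A newline before the comment means the comment begins on a later line.
    if "\n" in whitespace[:first]:
        return "another line"
    return "same line"

same = classify_trailing("   {- Foo\n                 Bar\n              -}\n\n\n\n")
other = classify_trailing("\n\n\n{- Foo\n   Bar\n-}\n\n")
none = classify_trailing("\n\n\n")
```

The two comment-bearing inputs correspond to the trailing text after `1` in the two examples above.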

Related to #145 Note that this also refactors `Let` to use `Binding` in order to avoid having to duplicate `Src`-related fields in two places.

Gabriella439 · 2019-09-02T18:52:39Z

Alright, I made some progress on this here by fixing dhall format to at least preserve comments inside of a let binding (but not immediately before):

#1273

So, for example, it will preserve the comment if you do this:

let x = 1

let {- Example
       comment
    -}
    y = 1

in  x + y

... but it will not preserve the comment if you do this:

let x = 1

{- Example
   comment
-}
let y = 1

in  x + y

To provide a quick update: the general tack I suggested in my previous comment did not work. The issue is that the same trailing whitespace would be preserved by multiple nodes in the syntax tree, so it was difficult to get a preserved comment to show up exactly once (i.e. there were many errors with comments that were duplicated or missing).

However, the trick that does work (which is partially implemented in the above pull request) is just adding Src spans directly to the Expr constructors everywhere that there is whitespace within the grammar. The above pull request only implements it for Let but I can slowly add support for preserving comments to more constructors as time goes on.

For now, I'm only doing this for Let to minimize disruption to downstream libraries and I'll try to add support for a few more constructors with each release to ease the migration process for people using the Expr API. Once I've covered all constructors then we can close out this issue.
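
The duplication problem with the earlier per-node approach can be reproduced with a toy example. This Python sketch assumes a hypothetical `Node` that records only where its own text ends; it is not the dhall `Expr` type. A parent and its last child end at the same offset, so both see the same trailing comment.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str                      # source text of the node, without trailing whitespace
    end: int                       # offset just past the node's own text
    children: list = field(default_factory=list)

def trailing_comment(source: str, node: Node) -> str:
    """Hypothetical helper: the block comment, if any, that directly follows the node."""
    rest = source[node.end:].lstrip()
    return rest if rest.startswith("{-") else ""

def naive_comments(source: str, node: Node) -> list:
    """Emit the trailing comment of every node, one node at a time."""
    out = [c for c in [trailing_comment(source, node)] if c]
    for child in node.children:
        out.extend(naive_comments(source, child))
    return out

source = "x + y {- note -}"
x = Node("x", end=1)
y = Node("y", end=5)
plus = Node("x + y", end=5, children=[x, y])   # ends at the same offset as `y`
emitted = naive_comments(source, plus)
```

Here the single comment is emitted twice, once for `x + y` and once for `y`. Attaching the whitespace to one specific position in the grammar avoids having to decide which node "owns" it.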


Gabriella439 · 2020-01-21T16:09:39Z

@ari-becker: Another work-around that dhall lint will not remove is this:

{ field =
    let -- foo/bar is for bazzes
        result = "value"

    in  result
}

sjakobi · 2020-07-18T12:58:18Z

Now that #1908 is merged, the parser output also contains the comments that precede labels in records and record types. e.g. the A and B comments in these examples:

{ {- A -} x = 0, {- B -} y = 1 }

{ {- A -}
  x : T
, {- B -}
  y : U
}

However the pretty-printer still needs to be updated to output these comments. @german1608's initial work (including tests) on this is on this branch, mixed with parts of #1908 and #1926.

If there are no other volunteers, I'd take a stab at finishing the pretty-printer update. :)

It should also be fairly simple to retain the comments between labels and the field value or type now, e.g.

{ x {- A -} = {- B -} 0 }

and

{ x {- A -} : {- B -} T }

PierreR · 2020-08-04T19:12:33Z

@sjakobi this issue hasn't been resolved in 1.34, has it? I was very excited, believing record comments would no longer be removed by the linter. My hopes were high on this one because some of our users have expressed their disappointment on realizing that their comments on record fields were unceremoniously discarded.

german1608 · 2020-08-04T19:20:19Z

@PierreR dhall format doesn't preserve those comments on 1.34.0. The work to record those on the AST is done, but the formatter needs an update to not remove them.

PierreR · 2020-08-04T19:22:31Z

@german1608 thanks for the info. Will look forward to it. Amazing work !

sjakobi · 2020-08-04T23:34:12Z

@PierreR A lot of preparatory work has been done, most importantly #1465.

I hope to get to the pretty-printer next week. But I also wouldn't mind if someone else would take over.

sjakobi · 2020-09-10T17:41:01Z

Now that #2021 has been merged, dhall format can preserve comments in several positions in record literals and record types. The expression below demonstrates all the positions where comments are now preserved:

{- Header
-}
let {- A -} letBinding
    {- B -} : {- C -} T
    = {- D -} x

let recordType =
      { {- E -}
        x
        {- F -}
        : {- G -}
          T
      }

let pun = x

let record =
      { {- H -}
        simple
        {- I -}
        = {- J -}
          z
      , {- K -}
        pun
        {- L -}
      ,   {- M -}
          dotted
          {- N -}
        . {- O -}
          fields
          {- P -}
        = {- R -}
          x
      }

in  {=}

I assume that these changes will be included in v1.35.0 (#2016).

As @TristanCacqueray has pointed out in #2021, it would be relatively easy to preserve comments in union types now. E.g.

< {- A -} X {- B -} : {- C -} Y >

Also, @german1608 has already laid the groundwork for preserving comments on:

- functions: `λ({- A -} a {- B -} : {- C -} T) -> e`
- function types: `∀({- A -} a {- B -} : {- C -} T) -> e`
- field selections: `e . {- A -} x {- B -}`

Gabriella439 · 2020-09-10T18:04:02Z

@sjakobi: Yes, I plan to cut the 1.35.0 branch tonight, which will incorporate the formatting improvements

hanshoglund · 2020-10-26T15:11:52Z

EDIT: Retracting my original comment as it was badly formulated and sounded like a complaint which I didn't intend.

What I meant to say was "It would be great to get this fully resolved to facilitate greater adoption." 😄

Gabriella439 · 2020-10-26T15:22:31Z

@hanshoglund: I think my main question is: which places do you still need to preserve comments? The most recent release preserves comments on let bindings and record value/type fields

hanshoglund-da · 2020-10-26T16:23:36Z

@Gabriel439 That is a great improvement, but to minimize negative first-user reactions I think it is still crucial to ensure that all comments accepted by the parser are also preserved by the formatter. Maybe the solution is to make the parser accept fewer comments?

I'm sure this is being worked on, just wanted to add another data point :D.

JohannesRudolph · 2020-10-26T16:44:02Z

I can say for sure that it has also caused some weird reactions on our team and is something we had to explicitly educate everyone about, to the point that we had long discussions on figuring out the best "tricks" (e.g. artificial let bindings) for using comments to document our Dhall types. This was before the most recent improvements in 1.35+ and the introduction of dhall docs (which is still very rough).

Anyhow, I'm all for a "canonical formatter" and the benefits of this hugely outweigh the cons. In particular I also appreciate that the formatter enforces e.g. comment whitespace/alignment to some degree.

I like the idea proposed by @hanshoglund-da to think about how the "write comment & format file" experience can be improved to yield less surprising results.

Gabriella439 · 2020-10-29T02:51:11Z

@hanshoglund: I don't think disallowing other comments would work, mainly because that would be a highly breaking change to the standard and the standard isn't intended to track details specific to the Haskell implementation (such as the formatter).

hanshoglund-da · 2020-10-29T06:47:05Z

@Gabriel439 Yes, avoiding breaking changes to the standard is very reasonable indeed :D.

PierreR · 2020-10-29T07:29:56Z

@Gabriel439 it is puzzling to see that the evolution of the Dhall language won't meet such a sensible request knowing how young it is.

Maybe the formatter should also be part of the standard?

Quoting Guy Steele:

> I should not design a small language, and I should not design a large one. I need to design a language that can grow.

When you say "breaking changes", do you mean you expect this change to be too big a burden for the language binding implementers? If so, not being in charge of maintaining a language binding for Dhall, I very much respect the concern.

As an educated Dhall user I don't mind this issue. On the other hand, as a Dhall lover/promoter I've witnessed the "negative first-user reaction" more than once.

Of course another reason "not to do this for now" is a question of ROI/priority/available resources ...

Gabriella439 · 2020-10-29T14:28:24Z

@PierreR: What I meant was that disallowing comments that the Haskell formatter does not support would break a lot of users' Dhall code in the wild

PierreR · 2020-11-02T11:19:36Z

What I find surprising is the fact that comments are removed when added after the term.

For instance:

let letBinding
    = x {- A -}

let record =
      { simple = z {- B -}
      }

The fact that the Dhall Language Support vscode extension is not highlighting correctly many of the suggested options by @sjakobi does not help.

PierreR · 2020-11-02T11:28:43Z

FWIW another nitpick that might be worth considering: the line after the comment is indented for let while it is not so within record.

Gabriella439 · 2020-11-02T15:49:41Z

@PierreR: Issues with the VSCode highlighting should be reported at https://github.com/dhall-lang/vscode-language-dhall

PierreR · 2020-11-02T17:18:34Z

Done here: dhall-lang/vscode-language-dhall#9. Thanks for pointing this out.

Well, sort of. We left `rule` and `topic` untouched for now because they have a bunch of comments in `List Topic` literals, which `dhall format` removes. Related: <dhall-lang/dhall-haskell#145>

GandelXIV · 2023-03-18T20:10:02Z

Pardon my ignorance, but could there be a CLI option to disable any comment manipulations when using dhall format until this bug gets fixed?

Gabriella439 · 2023-03-21T02:07:56Z

There isn't a good way to disable comment manipulations using the current implementation (that we're aware of). If there were such an option we would have already implemented it because we're not trying to intentionally delete comments if we can help it

Gabriella439 mentioned this issue Sep 29, 2017

Preserve leading comments/whitespace when formatting code #146

Merged

Gabriella439 mentioned this issue Jan 21, 2018

dhall-format does not retain comments at end of file #219

Closed

Gabriella439 added the help wanted label Jun 8, 2018

f-f mentioned this issue Jan 7, 2019

dhall format removes comments dhall-lang/dhall-lang#339

Closed

psibi mentioned this issue Jan 23, 2019

Auto-reformatting nukes comments psibi/dhall-mode#22

Open

sjakobi mentioned this issue Aug 4, 2019

Restore the CI check for dhall lint dhall-lang/dhall-lang#679

Merged

sjakobi mentioned this issue Sep 1, 2019

dhall format should remove leading newlines #1267

Closed

Gabriella439 added a commit that referenced this issue Sep 2, 2019

Fix dhall format to preserve let comments

e9b802e

Related to #145 Note that this also refactors `Let` to use `Binding` in order to avoid having to duplicate `Src`-related fields in two places.

Gabriella439 mentioned this issue Sep 2, 2019

Fix dhall format to preserve let comments #1273

Merged

Gabriella439 added a commit that referenced this issue Sep 5, 2019

Fix dhall format to preserve let comments (#1273)

96921f0

Related to #145 Note that this also refactors `Let` to use `Binding` in order to avoid having to duplicate `Src`-related fields in two places.

sjakobi mentioned this issue Jul 11, 2020

feat(comments): support prefix comments on Record's key-value pairs. #1908

Merged

german1608 mentioned this issue Aug 5, 2020

dhall-docs jump to definition of names introduced by Lam and Pi constructor #1978

Open

ggilmore mentioned this issue Aug 11, 2020

change package to export CloudProvider, EnvVar fields sourcegraph/deploy-sourcegraph-dhall-archived#23

Merged

sjakobi mentioned this issue Sep 2, 2020

Pretty-print RecordField comments #2021

Merged

SiriusStarr mentioned this issue Jan 8, 2022

dhall format removes shebang #2361

Closed

sjakobi mentioned this issue Feb 1, 2022

Allow Comments To Be Emitted In Dhall's Output #2375

Closed

philderbeast mentioned this issue Mar 31, 2022

Generate waspc.cabal using hpack-dhall. wasp-lang/wasp#528

Closed

