dhall-format removes comments #145
Comments
Going through the code, this seems to be a known issue. :) Keeping this open anyway. |
One thing I can do as an immediate workaround is to have the formatter preserve leading whitespace and comments at the beginning of the file. Then, if you need comments on a subexpression, you can at least work around the problem by splitting that subexpression out into another file |
Partial fix for #145 Before this change `dhall-format` would get rid of all comments when formatting code. After this change `dhall-format` will preserve any leading comments and whitespace so that users can at least add top-level comment headers to their files
@Gabriel439 What's the main challenge to tackle in order to fix this? As far as I remember the main problem is that it'd be hard to associate comment fragments to other AST nodes, right? (as in: a comment line that comes before some code goes with the next node, but what to do in case of a comment fragment on the same line? Also how about the alignments? And does this mean that comments would slow down all operations, since it's a whole lot of packing-unpacking dummy AST comment nodes?) |
The GHC approach is not to put comments in the AST, but to keep a separate map
Map (NodeType, SrcLoc) Annotation. As you traverse the AST you have
SrcLocs and can look up the annotations for the node you're at. That's one
approach. Alternatively, we could take a "trees that grow" approach and erase
comments from the AST for everything except formatting.
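For illustration, the side-table idea might look like the following minimal Python sketch. It is not GHC's or dhall-haskell's actual API: `SrcLoc`, `Node`, `Annotations`, and `render` are hypothetical names, and the "formatter" here only indents children and prints any comments attached to a node before the node's own text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SrcLoc:
    """Hypothetical source location; frozen so it can be used as a dict key."""
    line: int
    column: int


@dataclass
class Node:
    """Hypothetical AST node. Note that it carries no comment fields at all."""
    kind: str
    loc: SrcLoc
    text: str
    children: List["Node"] = field(default_factory=list)


# Side table: comments live outside the tree, keyed by (node kind, location).
Annotations = Dict[Tuple[str, SrcLoc], List[str]]


def render(node: Node, annotations: Annotations, depth: int = 0) -> List[str]:
    """Print the tree, emitting any comments registered for a node before it."""
    indent = "  " * depth
    lines = [indent + comment for comment in annotations.get((node.kind, node.loc), [])]
    lines.append(indent + node.text)
    for child in node.children:
        lines.extend(render(child, annotations, depth + 1))
    return lines
```

Because the tree never mentions comments, every operation other than formatting can ignore the table entirely, which is the appeal of this design.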
|
There is also a third solution, which is a "low-tech" formatter that doesn't actually parse the AST but rather just scans the text and formats as it goes. I believe this is how However, I think the simplest and most robust solution is the second one that @ocharles mentioned which is just to preserve parsed whitespace in the syntax tree using the That still doesn't get us all the way, though, because intelligently preserving comments when formatting code is still difficult even when your AST preserves whitespace and comments. Here are some pathological cases to consider when designing a comment-preserving formatting algorithm:
|
I'd like to have comments on record fields if possible. This doesn't work right now:
let Block
    : Type
    = { id :
          Text
      -- Used in radio selects, checkbox selects
      , label :
          Text
      , fieldType :
          Text
      , options :
          List { id : Text, label : Text }
      }
The comment gets nuked. |
If we compare different styles and techniques of formatting: one aspect of gofmt is that it preserves line breaks. So it may compact many line breaks to I think this is useful. The fact that some Dhall maps are formatted on one line and some are not (based on line length) is not something I enjoy. This may warrant a separate issue, but if you want to preserve line breaks, go for AST-style formatting, and include comments, then line breaks would also have to be added to the AST. |
@feliksik: The Haskell AST does preserve the original source code for all nodes, including whitespace/newlines, but it's still not clear to me if only dealing with indentation is enough. One difference between Dhall and Go is that Dhall is less whitespace-sensitive than Go is. For example, in Go, this is syntactically valid:
func main() {
    fmt.Println("Hello, world!")
}
... but this is not valid:
func main()
{
    fmt.Println("Hello, world!")
}
If the latter were valid then Go would have to delete the newline in between Because Dhall is not as restrictive as Go, it would have to accommodate all sorts of weirdness if it took greater care to preserve the original newlines. |
@Gabriel439 thank you for your response.
This is a good point! Indeed, only dealing with indentation may not be enough to make it great. If I understand correctly, you would need some rules (which are ultimately arbitrary), and these would need to be a bit more sophisticated than with Go. I tried to get this straight for myself, so I may as well share the examples I made: This would be fine (it's the current formatting):
As would possibly:
but this would probably be considered too weird:
But you may consider such rules undesirable (apart from the existing 80-char-line rewrite logic). Note that Go also implements such a heuristic, to a degree: The following is gofmt'ed, but one newline would be removed if the comment is removed:
So to summarize: if I get it right, there are two take-home points here:
|
@feliksik: Is there a document or code describing the |
Hmm I am not aware of such a document. I did find http://journal.stuffwithstuff.com/2015/09/08/the-hardest-program-ive-ever-written/ , which elaborates how hard this problem can become, based on the requirements. But this aims for much more sophisticated results. By all means, don't let this spoil the work that can be done on preserving comments in the shorter term. It will be of great value, and much more important than the newlines. |
Personally, I do like the current dhall fmt behaviour, especially since dhall is by design a line-break and leading whitespace insensitive language. But I think this discussion is derailing the original issue too much and should move to a different issue. |
What about starting with simple rules for preserving comments such as "only comments on their own lines are preserved"? I think this would go a long way, at least for documenting functions and types at the top level |
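A rough sketch of that rule, in Python for brevity: it keeps only `--` comments that occupy a whole line. The function name `own_line_comments` is made up for this example, and the scan is deliberately naive: it ignores `{- -}` block comments and would misfire on a line starting with `--` inside a multi-line text literal.

```python
from typing import List, Tuple


def own_line_comments(source: str) -> List[Tuple[int, str]]:
    """Return (line number, text) for each `--` comment alone on its line."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        # A trailing comment such as `let f = 1 -- note` does not start
        # with `--` once stripped, so it is skipped under this rule.
        if stripped.startswith("--"):
            found.append((number, stripped))
    return found
```

A formatter following this rule would still have to decide which node each surviving comment attaches to, most simply the next node that starts after the comment's line.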
I really liked the idea of preserving newlines that follow a comment. I'm currently trying to see how easy it would be to implement |
Alright, here's the rough idea I have for how to preserve one trailing comment per AST node. First, some background: every node in the AST currently keeps track of its source code, including trailing whitespace (but not leading whitespace). So what we can do for each node is to check the trailing whitespace and behave differently under the following three conditions:
We could also add special cases for preserving comments right before a |
Related to #145 Note that this also refactors `Let` to use `Binding` in order to avoid having to duplicate `Src`-related fields in two places.
Alright, I made some progress on this here by fixing So, for example, it will preserve the comment if you do this:
let x = 1
let {- Example
comment
-}
y = 1
in x + y
... but it will not preserve the comment if you do this:
let x = 1
{- Example
comment
-}
let y = 1
in x + y
To provide a quick update: the general tack I suggested in my previous comment did not work. The issue is that the same trailing whitespace would be preserved by multiple nodes in the syntax tree, so it was difficult to get a preserved comment to show up exactly once (i.e. there were many errors with comments that were duplicated or missing). However, the trick that does work (which is partially implemented in the above pull request) is just adding For now, I'm only doing this for |
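The duplication problem described in the comment above can be reproduced with a toy model. This is a Python sketch under stated assumptions, not the dhall-haskell data structures: `Node`, `trailing_comment`, and `collect_naively` are hypothetical, and each node simply stores its own slice of source text including trailing whitespace.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy AST node holding its source slice, trailing whitespace included."""
    src: str
    children: List["Node"] = field(default_factory=list)


def trailing_comment(src: str) -> Optional[str]:
    """Return a `--` comment found at the very end of the slice, if any."""
    match = re.search(r"(--[^\n]*)\s*$", src)
    return match.group(1) if match else None


def collect_naively(node: Node) -> List[str]:
    """Gather the trailing comment of every node, children first."""
    found: List[str] = []
    for child in node.children:
        found.extend(collect_naively(child))
    comment = trailing_comment(node.src)
    if comment is not None:
        found.append(comment)
    return found


# `x + y -- sum`: the outer expression and its right operand both end
# with the same trailing text, so the one comment is collected twice.
expr = Node("x + y -- sum\n", [Node("x "), Node("y -- sum\n")])
duplicates = collect_naively(expr)
```

Here `duplicates` contains the comment twice, once for the operand and once for the enclosing expression, which is why a scheme based purely on trailing whitespace needs some rule for deciding which node owns it.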
@ari-becker: Another work-around that
{ field =
    let -- foo/bar is for bazzes
        result = "value"
    in  result
}
|
Now that #1908 is merged, the parser output also contains the comments that precede labels in records and record types. e.g. the A and B comments in these examples:
However, the pretty-printer still needs to be updated to output these comments. @german1608's initial work (including tests) on this is on this branch, mixed with parts of #1908 and #1926. If there are no other volunteers, I'd take a stab at finishing the pretty-printer update. :) It should also be fairly simple to retain the comments between labels and the field value or type now, e.g.
and
|
@sjakobi this issue hasn't been resolved in 1.34, has it? I was very excited, believing that record comments would no longer be removed by the linter. My hopes were high on this one because some of our users have expressed their disappointment on realizing that their comments on record fields had been ruthlessly discarded. |
@PierreR |
@german1608 thanks for the info. Will look forward to it. Amazing work ! |
Now that #2021 has been merged,
{- Header
-}
let {- A -} letBinding
{- B -} : {- C -} T
= {- D -} x
let recordType =
{ {- E -}
x
{- F -}
: {- G -}
T
}
let pun = x
let record =
{ {- H -}
simple
{- I -}
= {- J -}
z
, {- K -}
pun
{- L -}
, {- M -}
dotted
{- N -}
. {- O -}
fields
{- P -}
= {- R -}
x
}
in {=}
I assume that these changes will be included in v1.35.0 (#2016). As @TristanCacqueray has pointed out in #2021, it would be relatively easy to preserve comments in union types now. E.g.
< {- A -} X {- B -} : {- C -} Y >
Also, @german1608 has already laid the groundwork for preserving comments on
|
@sjakobi: Yes, I plan to cut the |
EDIT: Retracting my original comment as it was badly formulated and sounded like a complaint which I didn't intend. What I meant to say was "It would be great to get this fully resolved to facilitate greater adoption." 😄 |
@hanshoglund: I think my main question is: which places do you still need to preserve comments? The most recent release preserves comments on |
@Gabriel439 That is a great improvement, but to minimize negative first-user reactions I think it is still crucial to ensure that all comments accepted by the parser are also preserved by the formatter. Maybe the solution is to make the parser accept fewer comments? I'm sure this is being worked on, just wanted to add another data point :D. |
I can say for sure that it has also caused some weird reactions on our team and is something we had to explicitly educate everyone about. To the point that we had long discussions on figuring out the best "tricks", e.g. artificial let bindings, for how we can use comments to document our Dhall types. This was before the most recent improvements in 1.35+ and introduction of the Anyhow, I'm all for a "canonical formatter" and the benefits of this hugely outweigh the cons. In particular I also appreciate that the formatter enforces e.g. comment whitespace/alignment to some degree. I like the idea proposed by @hanshoglund-da to think about how the "write comment & format file" experience can be improved to yield less surprising results. |
@hanshoglund: I don't think disallowing other comments would work, mainly because that would be a highly breaking change to the standard and the standard isn't intended to track details specific to the Haskell implementation (such as the formatter). |
@Gabriel439 Yes, avoiding breaking changes to the standard is very reasonable indeed :D. |
@Gabriel439 it is puzzling to see that the evolution of the Dhall language won't meet such a sensible request knowing how young it is. Maybe the formatter part should also be part of the standard ? Quoting Guy Steele:
When you say "breaking changes", do you mean you expect this change to be too big a burden for the language binding implementers? If so, not being in charge of maintaining a language binding for Dhall, I will respect the concern very much. As an educated Dhall user I don't mind this issue. On the other hand, as a Dhall lover/promoter I've witnessed the "negative first-user reaction" more than once. Of course another reason "not to do this for now" is a question of ROI/priority/available resources ... |
@PierreR: What I meant was that disallowing comments that the Haskell formatter does not support would break a lot of users' Dhall code in the wild |
What I find surprising is the fact that comments are removed when added after the term. For instance:
The fact that the |
@PierreR: Issues with the VSCode highlighting should be reported at https://github.com/dhall-lang/vscode-language-dhall |
Done here: dhall-lang/vscode-language-dhall#9. Thanks for pointing this out. |
Well, sort of. We left `rule` and `topic` untouched for now because they have a bunch of comments in `List Topic` literals, which `dhall format` removes. Related: <dhall-lang/dhall-haskell#145>
Pardon my ignorance, but could there be a CLI option to disable any comment manipulations when using |
There isn't a good way to disable comment manipulations using the current implementation (that we're aware of). If there were such an option we would have already implemented it because we're not trying to intentionally delete comments if we can help it |
Example file:
The Output it gives: