go/ast: Free-floating comments are single-biggest issue when manipulating the AST #20744
Comments
I've made (unattached) comments a node type in various toy/simple
parsers where I needed to preserve comments and it worked well for my
needs
|
@jimmyfrasche Thanks for your comment. I do have a design and prototype that is close, but not good enough for gofmt purposes. I'd be interested in how you integrate your comment nodes in the rest of a syntax tree. |
To put it in terms of the existing go/ast:
All attached comments would remain where they are on various nodes.
Comment would be both a Decl and a Stmt.
Unattached Comments would be interspersed with declarations and
statements where they appeared in the source.
For go/ast you'd also need a second CommentGroup for comments before
the comments attached to the package declaration.
If you don't parse comments, they aren't included.
If you do, sometimes you have to skip comment nodes when walking the
AST, but you can add and delete comments very easily and preserve
comments when modifying the AST just as easily.
I've done this once or twice with much simpler parsers/formatters built
for much simpler purposes, but it worked well for my uses.
|
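The "comments are nodes" idea described above might be sketched roughly as follows. This is a toy AST, not go/ast: the `Decl`, `Stmt`, `Comment` and `ExprStmt` types and the `deleteStmt` and `render` helpers are all hypothetical names invented for illustration. It only aims to show that an unattached comment stored in a statement list keeps its place when a neighbouring statement is deleted.

```go
package main

import "fmt"

// Hypothetical toy AST (not go/ast). A Comment satisfies both interfaces,
// so it can sit between declarations as well as between statements.
type Decl interface{ declNode() }
type Stmt interface{ stmtNode() }

// Comment is an unattached comment stored as an ordinary list element.
type Comment struct{ Text string }

// ExprStmt stands in for any real statement.
type ExprStmt struct{ Code string }

func (*Comment) declNode()  {}
func (*Comment) stmtNode()  {}
func (*ExprStmt) stmtNode() {}

// deleteStmt returns a copy of list without element i.
// Comments next to the deleted statement are not touched.
func deleteStmt(list []Stmt, i int) []Stmt {
	return append(list[:i:i], list[i+1:]...)
}

// render walks the list in order; no position information is consulted.
func render(list []Stmt) []string {
	var out []string
	for _, s := range list {
		switch s := s.(type) {
		case *Comment:
			out = append(out, "// "+s.Text)
		case *ExprStmt:
			out = append(out, s.Code)
		}
	}
	return out
}

func main() {
	block := []Stmt{
		&ExprStmt{"x := 1"},
		&Comment{"section two"},
		&ExprStmt{"y := 2"},
		&ExprStmt{"z := 3"},
	}
	block = deleteStmt(block, 2) // drop "y := 2"
	for _, line := range render(block) {
		fmt.Println(line)
	}
}
```

The cost noted in the comment above is visible in `render`: every walker has to decide what to do with `*Comment` elements, even when it does not care about comments.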
I should add that when I did that they were languages that only had
"until end of line" comments so there would still be complications in
cases like fmt.Println(x /* + y */)
|
@jimmyfrasche Thanks for the clarification. Because /* */ style comments can appear anywhere, they do add complexity, as it's not always clear whether they belong to the token before or after. |
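For reference, here is a small sketch of how today's go/parser records such an inline comment. The source snippet and the `inlineComments` helper are made up for illustration. The comment appears only in `File.Comments`, and its position is the only link to the surrounding call expression.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

const src = `package p

import "fmt"

func f(x int) {
	fmt.Println(x /* + y */)
}
`

// inlineComments parses src and returns the raw text of every comment
// listed on the file. Nothing on the CallExpr itself refers to them.
func inlineComments(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, group := range file.Comments {
		for _, c := range group.List {
			out = append(out, c.Text)
		}
	}
	return out, nil
}

func main() {
	texts, err := inlineComments(src)
	if err != nil {
		panic(err)
	}
	for _, t := range texts {
		fmt.Println(t)
	}
}
```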
Everything stores a position so it could be worked out, though that
makes it hard to preserve the comments when updating code with inline
comments.
|
I suppose one way around that would be, in expressions, to always have comments come after (where, for binops "after" is defined as after the binary operator so that
stores 1 on a, 2 on +, and 3 on b. Stuff like CallExpr would need to store multiple comments for code like
and statements would need before and after comments for
With that and the "comments are nodes" approach plus the extra CommentGroup on the File, I think that would get everything. Verbose and filled with cases but explicit and easy to manipulate. |
@jimmyfrasche My approach is similar. On the one extreme, each token could have a potential comment attached; and on the other, there's one comment for each node. The former is too expensive in terms of representation but would probably preserve comment positioning perfectly; and the latter is relatively cheap but may not preserve all comments in place (depending on heuristics). The stored position is helpful, but relying on it introduces the same problems we have already. Thus, any solution will have to work based on the graph alone, and ignore positions (they are used only in the beginning for the heuristics to determine how to associate a node). |
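The "one comment per node" end of that spectrum can be illustrated with a toy structure. The `Spec` type, its `Above` and `Trailing` fields, and the `sortByName` and `render` helpers are hypothetical and are not taken from any prototype mentioned in this thread. Because the nodes store no positions, the layout is derived from the tree alone and comments move with their node.

```go
package main

import (
	"fmt"
	"sort"
)

// Spec is a hypothetical node that owns its comments and stores no positions.
type Spec struct {
	Name     string
	Above    []string // comment lines printed before the node
	Trailing string   // comment printed at the end of the node's line
}

// sortByName reorders the nodes in place; attached comments travel with them.
func sortByName(specs []Spec) {
	sort.Slice(specs, func(i, j int) bool { return specs[i].Name < specs[j].Name })
}

// render derives every line from the tree structure alone.
func render(specs []Spec) []string {
	var out []string
	for _, s := range specs {
		out = append(out, s.Above...)
		line := s.Name
		if s.Trailing != "" {
			line += " // " + s.Trailing
		}
		out = append(out, line)
	}
	return out
}

func main() {
	specs := []Spec{
		{Name: "C", Trailing: "C comment"},
		{Name: "A"},
		{Name: "B", Trailing: "B comment"},
	}
	sortByName(specs)
	for _, line := range render(specs) {
		fmt.Println(line)
	}
}
```

What this sketch leaves out is the hard part discussed above: deciding, from the parsed source, which node each comment should be attached to in the first place.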
Is there any advice on how this can be done today? In my experience, trailing comments are particularly hard to handle.
const (
	C // C comment
	A
	B // B comment
)
With manual
const (
	A
	// C comment
	C
	// B comment
	B
)
I am sorry if this is just me being lame. #17108 had some progress, but it is unclear whether it has any relation to this. |
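A minimal way to reproduce this kind of result with the standard library is sketched below. It assumes a const block like the one above, with explicit values added so that it is a complete declaration, and `sortConstSpecs` is a hypothetical helper. The specs are reordered in the AST, but the comments stay in `File.Comments` with their original positions, so the printer has to interleave them by position. The exact output may vary between Go versions, so none is shown.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
	"sort"
)

const src = `package p

const (
	C = "c" // C comment
	A = "a"
	B = "b" // B comment
)
`

// sortConstSpecs sorts the specs of the first const declaration by name
// and reprints the file. Comment positions are left untouched.
func sortConstSpecs(src string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	decl, ok := file.Decls[0].(*ast.GenDecl)
	if !ok || decl.Tok != token.CONST {
		return "", fmt.Errorf("first declaration is not a const block")
	}
	sort.SliceStable(decl.Specs, func(i, j int) bool {
		a := decl.Specs[i].(*ast.ValueSpec).Names[0].Name
		b := decl.Specs[j].(*ast.ValueSpec).Names[0].Name
		return a < b
	})
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := sortConstSpecs(src)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```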
@quasilyte This issue is about fixing the problem. Let's not complicate the issue with a discussion of how to deal with the problem before it is fixed. Please take that discussion to a forum; see https://golang.org/wiki/Questions. Thanks. |
Hey guys, I'm wondering if I can help push this one along in some way or if there are any workarounds to this issue? I keep finding reasons to modify, append and delete pieces of the AST, but adding and positioning comments ends up biting pretty hard. Right now I use I'm sure you guys have thought about this a ton already, so I'm not sure how helpful this is, but I think the AST changes would have the following characteristics:
And then, any changes to these comment nodes would be reflected when |
@matthewmueller This sounds about right to me, but the devil is in the details. I think we would want each AST node to have an additional comment field (usually a nil pointer), and that field would then be used to attach comments that "belong" to (are associated with) that node. Additional information about the comment's position may be needed (as in "above", "before", "after", or "below" the node's line) for good positioning. Such associated comments would always be attached to the relevant nodes. (It's not clear to me that we can just use the existing Doc fields and expand the concept of Doc fields to all nodes, given that those have specific meaning at the moment.) Additionally, as you have observed, "free-floating" (non-associated) comments would need to act like Decls and Stmts (and presumably those would always have an empty line before and after them, as otherwise they would be associated with a node).
It should be possible to do this in a backward-compatible way: add a new *Comment (or similar) field to all nodes, and define the respective data structure. Then a new function could take the existing list of comments from the parser and associate them with the nodes. This is where a good heuristic is needed (I had a prototype for this a couple of years back, but I didn't get to make it complete). Finally, go/printer would need to be adjusted to position comments based on node association (I'd copy go/printer and create a second version that is appropriately modified; this would make sure the existing code continues to run as always).
The hard part is getting the comment association heuristic to work well. |
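As a point of comparison, go/ast already ships one position-based association heuristic, `ast.NewCommentMap`. The sketch below is not the design proposed in this comment. It only shows what "take the parser's comment list and associate it with nodes" looks like with the existing API. The `associate` helper and the sample source are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package p

const (
	C = "c" // C comment
	A = "a"
	B = "b" // B comment
)
`

// associate returns, for each const spec name, the comments that
// ast.NewCommentMap associates with that spec.
func associate(src string) (map[string][]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	cmap := ast.NewCommentMap(fset, file, file.Comments)
	out := make(map[string][]string)
	for node, groups := range cmap {
		spec, ok := node.(*ast.ValueSpec)
		if !ok {
			continue // only report comments associated with const specs
		}
		name := spec.Names[0].Name
		for _, group := range groups {
			for _, c := range group.List {
				out[name] = append(out[name], c.Text)
			}
		}
	}
	return out, nil
}

func main() {
	m, err := associate(src)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"A", "B", "C"} {
		fmt.Println(name, m[name])
	}
}
```

The map is a side table keyed by node, so the association survives reordering of the nodes, but go/printer still places comments by their stored positions. That gap is what the adjusted printer described above would need to close.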
@matthewmueller some concrete examples of cases that are currently mishandled: #21755, #22371. |
I'm currently working on a project that attempts to address this problem. github.com/dave/dst is a fork of the All the position fields have been removed from I've finished a very rough prototype that works pretty well. (Take a look at restorer_test.go - all the tests pass apart from There's several special cases that it doesn't currently handle. Right now I'm generating much of the code, so the special cases are non-trivial to implement. (e.g. Look at FuncDecl - the As @griesemer points out a big problem is where to attach the decorations so as you manipulate the tree they remain attached to the node you were expecting. My algorithm probably needs improvement here too (see decorator_test.go), but I think it currently works well enough to be useful. This issue is concerned with a replacement for the |
Change https://golang.org/cl/137076 mentions this issue: |
I'm pretty happy with how development of https://github.com/dave/dst has come on in the past few weeks. My main yardstick is a test that converts the entire Go standard library from I'm also a lot happier with the algorithm that attaches comments to nodes. It gives acceptable results in the vast majority of cases. Over the next few weeks I'll be dog-fooding the system in a couple of projects, so I'll get a better idea of how it works in the real-world. I would invite you all to take a look. I may make some tweaks to the API that controls adding / removing comments from the attachment points, and I'd very much appreciate feedback on this. |
I have just gotten a chance to test the I haven’t had the need to touch comments for my use-case (yet?), but the whitespace controls are very valuable, and I did not run into any correctness issues with Thanks for the helpful package! |
Thanks @stapelberg! I worked really hard on getting the output perfect. Let me know if you find anything unexpected and I'll look into it. |
I want to thank all of you for your work on this. It enables our team to build some very nice tooling and to support massive-scale refactoring.
|
@dave the dst package just saved my life. it's very helpful! and the dstutil module makes the migration seamless. |
@dave really appreciate dst. this was such an issue for generating good code |
This is awesome news. Thanks @adonovan ... as soon as I get in front of my laptop next week I'll have a good look through it. I'm very much looking forward to retiring the dst package! |
Thanks @dave, I appreciate the support but don't retire just yet: this is still just an early sketch of an approach. |
I am also delighted to see that you’re working on this issue! :) Feel free to reach out if you want me to give your CL a shot in a few refactoring tools where we currently use the dst package. |
What is the status of this? Is the work stuck on anything? |
https://go.dev/cl/429639 is as far as I have gotten; I plan to return to it later this year. There are a couple of failing tests in the formatter which I think I could fix in a few hours; after that I will be convinced that the approach solves the problem of faithfully recording the stream of tokens-and-comments so that AST modifications preserve that order. However, two questions remain, one easy, one hard:
|
Hello, I would like to ask: my code (130 lines) is written like this; why are the function comments not generated? |
This seems to be hanging. Is there some sort of expectation for when this may be ready to release? It would be nice to be able to depend on the standard library AST tools to manipulate syntax tree nodes rather than needing to roll your own or use third-party tools. Are there any sticking points or questions that the community could help with? In terms of preserving comment order for different field types, I think that the primary goal, from what I understand from the comments and my own use case, is to preserve the association between comments, spacing, and the lines of code (AST nodes) that they are associated with. With that in mind, I think your API that links node types to their positionally associated comments makes sense. |
Not hanging, but it's the third priority on my to-do list; I hope to start on it this month. Glad to know this will be as useful to others as it will be to us. |
It will! I hit #22371 again just now. |
Not even close, unfortunately. This will have to wait for go1.24. |
Reminder issue.
The original design of the go/ast makes it very difficult to update the AST and retain correct comment placement for printing. The culprit is that the comments are "free-floating" and printed based on their positions relative to the AST nodes. (The go/ast package is one of the earliest Go packages in existence, and this is clearly a major design flaw.)
Instead, a newer/updated ast should associate all comments (not just Doc and Comment comments) with nodes. That would make it much easier to manipulate a syntax tree and have comments correctly attached. The main complexity with that approach is the heuristic that decides which comments are attached to which nodes.
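The "free-floating" behaviour can be observed with a few lines of standard-library code. This is a sketch with a made-up source file and a hypothetical `dropFirstDecl` helper. Deleting a declaration from `File.Decls` leaves the comment that was inside it in `File.Comments`, so the printer still emits it, placed by its old position.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
)

const src = `package p

func a() {
	// about a
}

func b() {}
`

// dropFirstDecl deletes the first declaration, then returns the reprinted
// source and the number of comment groups still listed on the file.
func dropFirstDecl(src string) (string, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}
	file.Decls = file.Decls[1:] // func a is gone; its comment is not
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), len(file.Comments), nil
}

func main() {
	out, floating, err := dropFirstDecl(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("comment groups still on the file: %d\n", floating)
	fmt.Print(out)
}
```

With comments associated with nodes, as proposed here, removing `func a` would remove its comment along with it.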