go/ast: Free-floating comments are single-biggest issue when manipulating the AST #20744

Open
griesemer opened this Issue Jun 21, 2017 · 10 comments

Contributor

griesemer commented Jun 21, 2017

Reminder issue.

The original design of go/ast makes it very difficult to update the AST and retain correct comment placement for printing. The culprit is that comments are "free-floating" and printed based on their positions relative to the AST nodes. (The go/ast package is one of the earliest Go packages in existence, and this is clearly a major design flaw.)

Instead, a newer/updated ast should associate all comments (not just Doc and Comment comments) with nodes. That would make it much easier to manipulate a syntax tree and have comments correctly attached. The main complexity with that approach is the heuristic that decides which comments are attached to which nodes.
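To make the problem concrete, here is a minimal sketch using today's go/ast. It shows that only the doc comment is reachable from a node field, while the other comments live only in File.Comments and are tied to nodes by position. The helper name inspectComments and the sample source are made up for illustration; ast.NewCommentMap is the existing position-based association heuristic in the standard library.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up input with one doc comment, one comment inside the
// function body, and one trailing comment after a statement.
const src = `package p

// Doc comment for F.
func F() {
	// free-floating comment inside the body
	x := 1 // trailing comment
	_ = x
}
`

// inspectComments (hypothetical helper) parses src and reports:
//   - how many comment groups the file holds in File.Comments,
//   - the text of the doc comment attached to the first declaration,
//   - how many comment groups ast.NewCommentMap associated with nodes.
func inspectComments(src string) (groups int, doc string, mapped int, err error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "p.go", src, parser.ParseComments)
	if err != nil {
		return 0, "", 0, err
	}
	fn := f.Decls[0].(*ast.FuncDecl)

	// The comment map is computed from positions; it is a side table,
	// not part of the tree itself.
	cmap := ast.NewCommentMap(fset, f, f.Comments)
	return len(f.Comments), fn.Doc.Text(), len(cmap.Comments()), nil
}

func main() {
	groups, doc, mapped, err := inspectComments(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("comment groups in File.Comments: %d\n", groups)
	fmt.Printf("doc comment reachable from FuncDecl.Doc: %q\n", doc)
	fmt.Printf("comment groups associated by CommentMap: %d\n", mapped)
}
```

The statement nodes inside the body have no comment field at all, so moving or deleting them leaves their comments behind at the old positions.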

@griesemer griesemer added this to the Unreleased milestone Jun 21, 2017

@griesemer griesemer self-assigned this Jun 21, 2017

Contributor

griesemer commented Jun 21, 2017

@jimmyfrasche Thanks for your comment. I do have a design and prototype that is close, but not good enough for gofmt purposes. I'd be interested in how you integrate your comment nodes in the rest of a syntax tree.

Contributor

griesemer commented Jun 21, 2017

@jimmyfrasche Thanks for the clarification. Because /* */ style comments can appear anywhere, they do add complexity, as it's not always clear if they belong to the token before or after.

I suppose one way around that would be, in expressions, to always have comments come after, where for binops "after" is defined as after the binary operator, so that

a /* 1 */ + /* 2 */ b /* 3 */

stores 1 on a, 2 on +, and 3 on b.
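A rough sketch of that "comments come after" rule, working at the token level with go/scanner rather than on a real AST. The attachAfter helper and the tokComment type are hypothetical names; the only assumption about the rule is that each comment goes to the nearest preceding token, with automatically inserted semicolons ignored.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokComment (hypothetical) pairs a comment with the token it follows.
type tokComment struct {
	Tok     string
	Comment string
}

// attachAfter (hypothetical helper) scans src and attaches every comment
// to the nearest preceding non-comment token.
func attachAfter(src string) []tokComment {
	fset := token.NewFileSet()
	file := fset.AddFile("expr.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, scanner.ScanComments)

	var out []tokComment
	prev := "" // text of the last real token seen
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		switch {
		case tok == token.COMMENT:
			out = append(out, tokComment{Tok: prev, Comment: lit})
		case tok == token.SEMICOLON && lit == "\n":
			// Semicolon inserted by the scanner, not present in the
			// source: do not let it capture comments.
		case lit != "":
			prev = lit // identifiers, literals, keywords
		default:
			prev = tok.String() // operators and delimiters
		}
	}
	return out
}

func main() {
	for _, tc := range attachAfter("a /* 1 */ + /* 2 */ b /* 3 */") {
		fmt.Printf("%s <- %s\n", tc.Tok, tc.Comment)
	}
}
```

A comment that appears before the first token ends up with an empty Tok here; a real design would need a separate slot for that case.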

Stuff like CallExpr would need to store multiple comments for code like

f /* 1 */ () /* 2 */

and statements would need before and after comments for

/* before */ return /* after - but if expression is returned comment is attached to that */

With that and the "comments are nodes" approach plus the extra CommentGroup on the File, I think that would get everything.

Verbose and filled with cases but explicit and easy to manipulate.

Contributor

griesemer commented Jun 23, 2017

@jimmyfrasche My approach is similar. On the one extreme, each token could have a potential comment attached; and on the other, there's one comment for each node. The former is too expensive in terms of representation but would probably preserve comment positioning perfectly; and the latter is relatively cheap but may not preserve all comments in place (depending on heuristics). The stored position is helpful, but relying on it introduces the same problems we have already. Thus, any solution will have to work based on the graph alone, and ignore positions (they are used only in the beginning for the heuristics to determine how to associate a comment with a node).
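As a small illustration of working "based on the graph alone", here is a sketch with comments stored on the nodes and printing driven purely by tree structure. All types below (Ident, BinaryExpr) are hypothetical and deliberately tiny; they are not part of go/ast and not the prototype mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// Ident is a hypothetical identifier node that owns the comments
// following it.
type Ident struct {
	Name  string
	After []string
}

// BinaryExpr is a hypothetical binary expression; OpAfter holds the
// comments that follow the operator.
type BinaryExpr struct {
	X, Y    *Ident
	Op      string
	OpAfter []string
}

// withComments renders s followed by its attached comments, if any.
func withComments(s string, comments []string) string {
	if len(comments) == 0 {
		return s
	}
	return s + " " + strings.Join(comments, " ")
}

// String prints the expression from the tree alone; no positions are
// stored or consulted.
func (e *BinaryExpr) String() string {
	return strings.Join([]string{
		withComments(e.X.Name, e.X.After),
		withComments(e.Op, e.OpAfter),
		withComments(e.Y.Name, e.Y.After),
	}, " ")
}

func main() {
	e := &BinaryExpr{
		X:       &Ident{Name: "a", After: []string{"/* 1 */"}},
		Op:      "+",
		OpAfter: []string{"/* 2 */"},
		Y:       &Ident{Name: "b", After: []string{"/* 3 */"}},
	}
	fmt.Println(e) // a /* 1 */ + /* 2 */ b /* 3 */

	// Swapping the operands moves their comments with them.
	e.X, e.Y = e.Y, e.X
	fmt.Println(e) // b /* 3 */ + /* 2 */ a /* 1 */
}
```

This sits at the "one slot per token" end of the spectrum for this one expression, which is exactly the representation cost discussed above; a per-node design would need a heuristic to decide which slot each comment lands in.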

Contributor

Quasilyte commented Oct 16, 2017

Is there any advice on how this can be done today?

In my experience, trailing comments are particularly hard to handle.
Consider a tool that sorts constants in alphabetical order:

const (
  C // C comment
  A
  B // B comment
)

With manual Pos update and format.Node, the best result I managed to achieve is:

const (
  A
  // C comment
  C
  // B comment
  B
)

I am sorry if this is just me being lame.

#17108 had some progress, but it is unclear if it has any relation to this.

Contributor

ianlancetaylor commented Oct 16, 2017

@Quasilyte This issue is about fixing the problem. Let's not complicate the issue with a discussion of how to deal with the problem before it is fixed. Please take that discussion to a forum; see https://golang.org/wiki/Questions. Thanks.
