Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/parser: incorrect comment end position with CRLF line endings #41197

Open
stamblerre opened this issue Sep 2, 2020 · 4 comments
Open

go/parser: incorrect comment end position with CRLF line endings #41197

stamblerre opened this issue Sep 2, 2020 · 4 comments
Milestone

Comments

@stamblerre
Copy link
Contributor

@stamblerre stamblerre commented Sep 2, 2020

Full repro case here: https://play.golang.org/p/JwZQ7Zjaypp.

go/parser produces incorrect position information for multi-line comments in files with CRLF line endings. This caused #41057 in gopls.

@dmitshur
Copy link
Member

@dmitshur dmitshur commented Sep 2, 2020

/cc @griesemer per owners.

@dmitshur dmitshur added this to the Backlog milestone Sep 2, 2020
@findleyr
Copy link
Contributor

@findleyr findleyr commented Sep 16, 2020

This is a bit tricky to fix. The comment literal has '\r' stripped (from scanner.scanComment: "matching the pre-existing behavior of the scanner"), and this literal is used to compute the comment end position on the AST. In other words the comment text has normalized line endings, but positions do not, so the AST doesn't have enough information to accurately compute End for the ast.Comment.

This could of course be fixed by simply not stripping the \r from comments, but that should probably be considered a breaking change. Alternatively, I suppose we'd have to add an additional piece of information to ast.Comment -- perhaps EndSlash -- to capture the missing information. In order to populate this information and not change the Scanner, we'd have to use a similar workaround to gopls (count lines and reverse-engineer the token.Pos).

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 17, 2020

Change https://golang.org/cl/255657 mentions this issue: go/ast,go/parser: add ast.Comment.EndSlash to fix End calculation

@findleyr
Copy link
Contributor

@findleyr findleyr commented Sep 28, 2020

After jotting down the above CL, I realized that solving this bug may not be possible.

We cannot change comment text to include carriage returns. That ship has sailed. The CL above instead captures more information: the position of the 'EndSlash' in the original file. On the surface, this avoids a breaking change by simply adding new information to the AST. But of course, this information is not independent: for newline separated text, we must have the invariant that c.Slash + token.Pos(len(c.Text)) == c.EndSlash (for /*-style comments).

What I realized is that there is already a moderate amount of code out there (some in the stdlib, some not) that implicitly relies on the invariant that c.Slash + token.Pos(len(c.Text)) == c.End(). One example of such an implicit reliance would be any code that modifies ast.Comment.Text. That code is not currently doing anything wrong, and could break if we implemented the above fix.

Unfortunately, on first principles I don't think there's a fix out there that avoids this pitfall. We could be slightly more 'independent' by, for example, storing the number of carriage returns in the comment text, rather than the exact position of the ending slash. But that still breaks any rewrite that changes this number.

So this is probably not a bug that can be fixed, and code that needs exact end positions of the comment node must recompute it using the original file, as gopls does.

Unless anyone has other ideas, I'll send a CL updating the documentation for ast.Comment to make this limitation clearer, and close this issue as unfixable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.