Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/format: normalize number prefixes and exponents, like cmd/gofmt does #37476

Open
dmitshur opened this issue Feb 26, 2020 · 9 comments
Open

go/format: normalize number prefixes and exponents, like cmd/gofmt does #37476

dmitshur opened this issue Feb 26, 2020 · 9 comments
Assignees
Milestone

Comments

@dmitshur
Copy link
Member

@dmitshur dmitshur commented Feb 26, 2020

CL 160184 implemented a change to cmd/gofmt for Go 1.13 to normalize number prefixes and expontents.

The same change should be applied to the go/format package (or one of its dependencies, depending on where it's most appropriate to implement).

Also see #37453.

/cc @griesemer @cespare @heschik

@dmitshur dmitshur added the NeedsFix label Feb 26, 2020
@dmitshur dmitshur added this to the Go1.15 milestone Feb 26, 2020
@griesemer

This comment has been minimized.

Copy link
Contributor

@griesemer griesemer commented Feb 26, 2020

I wonder if this should just be done in the go/printer; possibly with a flag to control it (of course, the flag won't be exposed to gofmt).

@dmitshur

This comment has been minimized.

Copy link
Member Author

@dmitshur dmitshur commented Feb 26, 2020

I suspect go/printer may be a good place for it. Whoever is working on this issue should investigate that possibility and compare it with others.

@dmitshur

This comment has been minimized.

Copy link
Member Author

@dmitshur dmitshur commented Feb 26, 2020

A flag to control this behavior is possible, but I don't think it would be helpful, and so I prefer it's not added unless there is a good reason. We can never remove the flag and it'll adding some complexity.

If someone wants to "disable" this behavior, or produce Go code that is formatted in a way that is compatible with an older version of Go, a better way of accomplishing that is by using the cmd/gofmt binary or the go/format package from an older version of Go.

@dmitshur

This comment has been minimized.

Copy link
Member Author

@dmitshur dmitshur commented Feb 27, 2020

I've investigated this briefly, and want to share my early findings. I looked into how it'd look if the responsibility of formatting numbers was moved from cmd/go into the go/printer package.

The go/printer package has a small amount of precedent for modifying the AST it is about to print. It happens when printing *ast.ImportSpec nodes:

func (p *printer) spec(spec ast.Spec, n int, doIndent bool) {
	switch s := spec.(type) {
	case *ast.ImportSpec:
		// ...
		p.expr(sanitizeImportPath(s.Path))
		// ...
}

func sanitizeImportPath(lit *ast.BasicLit) *ast.BasicLit { ... }

I realize that printing import paths (especially ones that need to be sanitized) happens much less frequently that numbers may need to be formatted, so it's very possible the same approach would not work for this. But I wanted to try it as a first step, and see if it would at least produce correct results or not.

It was easy to perform a simple code transformation and move normalizeNumbers from cmd/gofmt into go/printer, and start using it in the printer.expr1 method as follows:

 func (p *printer) expr1(expr ast.Expr, prec1, depth int) {
 	p.print(expr.Pos())
 
 	switch x := expr.(type) {
 	case *ast.BasicLit:
-		p.print(x)
+		switch x.Kind {
+		case token.INT, token.FLOAT, token.IMAG:
+			p.print(normalizeNumber(x))
+		default:
+			p.print(x)
+		}
 }
// normalizeNumber rewrites base prefixes and exponents to
// use lower-case letters, and removes leading 0's from
// integer imaginary literals. It leaves hexadecimal digits
// alone.
func normalizeNumber(lit *ast.BasicLit) *ast.BasicLit {
	if len(lit.Value) < 2 {
		return lit // only one digit (common case) - nothing to do
	}
	// len(lit.Value) >= 2

	// ... handle all the cases, return lit if there's no change, or a new &ast.BasicLit if there is
}

What I found so far:

  1. The aforementioned code transformation seems to produce correct results. All cmd/gofmt and go/... tests are passing (after adjusting one test case in go/printer to accommodate the new formatting behavior). I have not spotted any correctness problems yet.

  2. I am concerned that one of the tasks that normalizeNumbers does: removing leading 0's from integer imaginary literals. This operation modifies the length of the basic literal value by shortening it. I worry this may cause problems because the positions of all AST nodes after the modified one may become incorrect. I don't know if this is safe or will cause problems, and whether this may invalidate the approach of performing this formatting task as part of printing the AST in go/printer. Needs more investigation.

  3. If the above concern is resolved, the next step will be to run some benchmarks and see how much of a performance difference this implementation would incur (including whether allocating more *ast.BasicLit values is a significant problem).

@griesemer

This comment has been minimized.

Copy link
Contributor

@griesemer griesemer commented Feb 27, 2020

@dmitshur Thanks for the analysis. I am not concerned about 2). The whole point of go/printer is to change the layout of space which will invariably change all positions of tokens after that. The printer can handle that. Also, note that with respect to this issue, whether the rewrite happens in gofmt or in go/printer doesn't change the outcome, does it? We must have this situation already now.

What I am concerned is that go/printer's task is to print, not to update or change the AST. Which is why I suggested a flag. That flag would basically mean: you're allowed to make approved changes. If the flag is not set, the tree's contents are not modified in any way.

@dmitshur

This comment has been minimized.

Copy link
Member Author

@dmitshur dmitshur commented Feb 28, 2020

I am not concerned about 2). The whole point of go/printer is to change the layout of space which will invariably change all positions of tokens after that. The printer can handle that. Also, note that with respect to this issue, whether the rewrite happens in gofmt or in go/printer doesn't change the outcome, does it? We must have this situation already now.

Makes sense, thanks. The current implementation in cmd/gofmt modifies the AST (by walking over it with ast.Inspect) before printing it with go/printer, so you're right that nothing would change.

What I am concerned is that go/printer's task is to print, not to update or change the AST. Which is why I suggested a flag. That flag would basically mean: you're allowed to make approved changes. If the flag is not set, the tree's contents are not modified in any way.

I'm open to the possibility of adding a flag, if it is needed. I just have some questions that I'd like to clarify first.

Note that the implementation I described above does not update or modify the AST that a user passes into Fprint. This is because new *ast.BasicLit nodes are created when a number needs to be printed with different formatting:

func normalizeNumber(lit *ast.BasicLit) *ast.BasicLit {
	// ...

	// don't modify the original lit, return a copy with a modified Value
	return &ast.BasicLit{ValuePos: lit.ValuePos, Kind: lit.Kind, Value: x}
}

I agree one of primary tasks of go/printer is to "pretty-print" Go code, optionally configured by a non-default printer.Config. To quote the documentation of Fprint (both the function and method):

Fprint "pretty-prints" an AST node to output [...]

Given that go/printer already produces output that is modified by the "pretty-printing" process (whitespace is modified to be well-aligned, semicolons are removed, imports are sorted, etc.), can you help me understand why you're suggesting a flag to control this aspect of the "pretty-printing" process specifically? Is it only because it's a recent addition, or is there another reason?

Edit: Import sorting is not done as part of the "pretty-printing" process in go/printer, it's done externally.

@griesemer

This comment has been minimized.

Copy link
Contributor

@griesemer griesemer commented Feb 28, 2020

  1. The fact that the incoming AST is not modified anymore is a clear improvement; we should definitively do that.

  2. Pretty-printing is "just" about formatting space. It's true that semicolons are omitted, but that seems to be on a different level than actually changing literals. (But then we also adjust comments in cases, something we probably should not do, either).

The only reason for the flag would be that some go/printer or go/format clients might expect the old behavior. Basically, I think it makes sense to have two modes: one where the AST is untouched but for space (and semicolons), and one where it might do some minor cleanups. There's already a Mode bit available in printer.Config - if not set, nothing changes and existing code will not notice. But gofmt and go/format will set it. Seems pretty clean while reducing the chance for surprises in clients.

(Semicolons are not even recorded in the AST, so it's harder to argue that we changed something. The printer in the compiler's syntax package - while far from the power of go/printer - has a mode where one can choose to have everything on one line (if possible), in which case semicolons are inserted. That's useful for printing AST snippets in error messages. Though admittedly, I'm not relying on it much at the moment.)

@dmitshur

This comment has been minimized.

Copy link
Member Author

@dmitshur dmitshur commented Feb 28, 2020

Thanks for elaborating, that's very helpful. I'll think more about this. I have another question for now.

Suppose we add a new Mode for turning on this behavior now, and in a future Go version there is another gofmt change that formats string literals in some way (that doesn't change behavior of course). Do you anticipate the same Mode bit would also control that future gofmt behavior, or would it require a second Mode bit to be added?

Put differently, you said:

That flag would basically mean: you're allowed to make approved changes. If the flag is not set, the tree's contents are not modified in any way.

Is "approved changes" meant to include only the current non-space/semicolon modifications, or is it expected to include future ones too?

@griesemer

This comment has been minimized.

Copy link
Contributor

@griesemer griesemer commented Feb 28, 2020

I would use the same flag. Basically, the flag means that go/printer can do the changes that are "gofmt-approved" if you will, going forward. Without setting the flag, the printer just prints w/o any changes.

@dmitshur dmitshur self-assigned this Feb 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants
You can’t perform that action at this time.