Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: bounds and overflow checks erroneously applied to non-constant expressions #24760

Closed
mdempsky opened this issue Apr 7, 2018 · 15 comments

Comments

Projects
None yet
5 participants
@mdempsky
Copy link
Member

commented Apr 7, 2018

This source file is valid Go code according to my reading of the spec and go/types accepts it:

package p

import "unsafe"

var _ = string([]byte(nil))[0]
var _ = uintptr(unsafe.Pointer(uintptr(1))) << 100

But cmd/compile rejects it with:

r.go:5:28: invalid string index 0 (out of bounds for 0-byte string)
r.go:6:45: constant 1267650600228229401496703205376 overflows uintptr

This is because cmd/compile tracks constant values for a larger set of expressions than the Go spec defines as constant expressions.

Theoretically, this is best fixed by just not folding expressions that aren't Go constants is the frontend. This should simplify const.go, and the SSA backend has a more thorough constant folding pass anyway.

@mdempsky mdempsky added the NeedsFix label Apr 7, 2018

@josharian

This comment has been minimized.

Copy link
Contributor

commented Apr 8, 2018

The front-end folding does play an important role, e.g. in doing early deadcode elimination, which impacts escape analysis and inlining. So simply punting constant folding to SSA might still have performance downsides. We’d have to evaluate case by case.

@mdempsky

This comment has been minimized.

Copy link
Member Author

commented Apr 8, 2018

Yeah, we'd still have to fold Go constants, because the language requires that.

I'm only suggesting we could stop folding expressions that involve conversion through non-Go-constant types (e.g., pointers and other nullable types). Off hand, I can't imagine real-world scenarios where dead code analysis might depend on doing that.

As far as I can tell, users would have to write comparisons like:

if []byte(nil) == nil { ... }
if string([]byte(nil)) == "" { ... }
if uintptr(unsafe.Pointer(nil)) == 0 { ... }

to be affected by this.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Apr 8, 2018

It seems to me it's assumed that the compiler must accept all valid code, but I'm not sure about that. Consider:

func main() {
        a := 0
        println(5 / a)
}

No problem wrt the specification, but a hypothetical compiler can prove that the code will panic on division by zero and no other side effects exists. Some compilers may give a warning, but gc does not (https://play.golang.org/p/PG7Tgkm_TCX). In this particular case, I'd see no problem if the compiler rejected the code with an error not due to specs violation.

Is there a reason to limit the compiler reported errors only to invalid-by-specs? Especially when the compiler already does some such useful checks?

@mdempsky

This comment has been minimized.

Copy link
Member Author

commented Apr 8, 2018

@cznic That code is valid Go code. It raises a divide-by-zero panic.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

That code is valid Go code. It raises a divide-by-zero panic.

Well, AFAICT, that's what I said. I'd rather like to know your thoughts about

In this particular case, I'd see no problem if the compiler rejected the code with an error not due to specs violation.

Is there a reason to limit the compiler reported errors only to invalid-by-specs? Especially when the compiler already does some such useful checks?

By "error not due to specs violation." I mean "valid Go code", which was perhaps not written clearly enough.

@mdempsky

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2018

Is there a reason to limit the compiler reported errors only to invalid-by-specs?

I think in its default configuration, a Go compiler should accept all valid Go programs and reject all invalid ones.

Especially when the compiler already does some such useful checks?

I'm not sure what checks you have in mind. The ones I can think of are allowed by the spec.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

I think in its default configuration, a Go compiler should accept all valid Go programs and reject all invalid ones.

That's an opinion I have no problem to respect. However, the question was if there's a reason to limit the compiler reported errors only to the invalid-by-specs ones. (A superset, invalid code is of course always rejected).

I'm not sure what checks you have in mind. The ones I can think of are allowed by the spec.

In the OP you wrote:

This source file is valid Go code according to my reading of the spec and go/types accepts it:

package p

import "unsafe"

var _ = string([]byte(nil))[0]
var _ = uintptr(unsafe.Pointer(uintptr(1))) << 100

But cmd/compile rejects it with:

r.go:5:28: invalid string index 0 (out of bounds for 0-byte string)
r.go:6:45: constant 1267650600228229401496703205376 overflows uintptr

This is because cmd/compile tracks constant values for a larger set of expressions than the Go spec defines as constant expressions.

(Emphasizes mine)

Have I understood it correctly that it sums up as "The compiler does more checks than required by the specs and thus it rejects a valid Go program"? Because that's what I'm talking about, but maybe it's just me...?

@mdempsky

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2018

Have I understood it correctly that it sums up as "The compiler does more checks than required by the specs and thus it rejects a valid Go program"?

Yes, I think that's a correct summary.

I'm afraid I don't understand your position here, and I worry we're talking past each other. Let me elaborate on my position to make sure we're on common ground.

  1. The Go spec defines what a valid Go program is. Also, programs that panic at runtime are still valid Go programs.

  2. In default configuration, Go compilers should accept all valid Go programs without emitting errors, and should reject all invalid Go programs with at least one error.

  3. As a matter of long-term convention, the cmd/compile Go compiler (again by default) does not emit warnings either. (go/types does this too; I'm not sure about gccgo.)

Applying 1, the Go snippet I posted above is valid Go code. Applying 2, cmd/compile should accept it without emitting any errors. But as demonstrated, cmd/compile instead rejects it, and hence this issue.

I can appreciate the interest in detecting a set of "valid but probably undesirable" Go code, but that task is outside of cmd/compile's intended scope, and is what tools like cmd/vet are meant to fill.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

For what it's worth, gccgo compiles the initial function with no errors.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

I'm afraid I don't understand your position here, and I worry we're talking past each other.

It's safe to assume it's my fault. I'm not a native English speaker and for the better part of my life the English I'm exposed to comes from mostly texts about programming only.

The Go spec defines what a valid Go program is. Also, programs that panic at runtime are still valid Go programs.

Agreed, but I don't think anyone here claimed otherwise.

In default configuration, Go compilers should accept all valid Go programs without emitting errors, and should reject all invalid Go programs with at least one error.

That was stated before. My response was that I respect that opinion and I asked if there's a reason to do so. Or for that matter, if there's a reason to not reject a valid program that's proven both to not have any side effects and to produce a run-time panic and which proof the gc compiler obviously is partially already capable of to a certain extent in some cases (hence this issue exists). I'm not aware of any answer to this question.

As a matter of long-term convention, the cmd/compile Go compiler (again by default) does not emit warnings either. (go/types does this too; I'm not sure about gccgo.)

Agreed, but I don't think anyone here asked for any change about this.

I can appreciate the interest in detecting a set of "valid but probably undesirable" Go code, but that task is outside of cmd/compile's intended scope, and is what tools like cmd/vet are meant to fill.

I respect the opinion that it's outside of the compiler's scope. I am trying to get the rationale for what's discussed above.

Once again, I apologize for most probably not understanding properly your writings. Thanks for your patience.

tl;dr: No problem with the current compiler's policy. Just asking about the technical reasons for limiting the compiler errors to only those caused by violation of the specs. Even in cases where any useful effect of a program provably cannot exist and the compiler can already/could easily detect that.

@mdempsky

This comment has been minimized.

Copy link
Member Author

commented Apr 9, 2018

My response was that I respect that opinion and I asked if there's a reason to do so.

If a Go compiler doesn't accept all valid Go programs, then by definition it's not a Go compiler. It would only be a most-of-Go compiler.

There's nothing inherently wrong with instead building a most-of-Go compiler, but that's not our goal with cmd/compile. By accepting exactly the set of programs specified by the Go language spec, we don't need to separately document all the cases cmd/compile allows/rejects. It also means users only need to read one document to understand how programs will (or at least should) behave when compiled with cmd/compile.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

And one reason to accept all valid Go programs is to simplify life for program generators. They can generate any random valid Go code that won't be executed, without having to worry about stepping around compiler restrictions. It's best when everybody has a clear agreement as to what is valid and what is not.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2018

@mdempsky Thanks for the answer.

If a Go compiler doesn't accept all valid Go programs, then by definition it's not a Go compiler. It would only be a most-of-Go compiler.

That refers to some definition not seen here. To accept seems here to mean "produce an executable binary", right? Not nitpicking, just verifying if that's what you mean by "to accept all valid Go programs".

That's of course a possible definition, even though a different one, like "able to correctly analyze all valid Go programs etc." would work for me as well. IOW, I would not classify a compiler not producing an otherwise unusuable binary not to be by definition a compiler-for-language-x. But that's just a different opinion.

@ianlancetaylor Your example usage is actually a pretty good reason to not reject any valid program. It reminded me of my own experience with a C to Go translator and CSmith produced test inputs. I'm now convinced that @mdempsky's position is the correct one.

Thanks to both of you.

@mdempsky

This comment has been minimized.

Copy link
Member Author

commented Apr 10, 2018

To accept seems here to mean "produce an executable binary", right?

Correct, with the further restriction that the produced binary must implement the semantics specified for the language.

@ianlancetaylor ianlancetaylor added this to the Go1.11 milestone Apr 18, 2018

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jun 30, 2018

@gopherbot

This comment has been minimized.

Copy link

commented Nov 26, 2018

Change https://golang.org/cl/151338 mentions this issue: cmd/compile: don't constant-fold non-Go constants in the frontend

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.