Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: clarify sequencing of function calls within expressions #48105

Open
mdempsky opened this issue Aug 31, 2021 · 16 comments
Open

spec: clarify sequencing of function calls within expressions #48105

mdempsky opened this issue Aug 31, 2021 · 16 comments
Assignees
Labels
Documentation NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@mdempsky
Copy link
Contributor

The program below tests the order-of-evaluation semantics of return *p, f(p). With gccgo, it prints 0 (i.e., evaluating *p before calling f(p)), and with cmd/compile it prints 2 (i.e., evaluating *p after f(p) returns).

But is it allowed to print 1? I.e., can *p be evaluated in the middle of f(p) being evaluated, after the *p = 1 and before the *p = 2?

The spec says simply:

For example, in the (function-local) assignment

y[f()], ok = g(h(), i()+x[j()], <-c), k()

the function calls and communication happen in the order f(), h(), i(), j(), <-c, g(), and k(). However, the order of those events compared to the evaluation and indexing of x and the evaluation of y is not specified.

I don't think there's any other text in the spec that requires evaluation of a function call to within an expression to be handled atomically with respect to other expressions either.

@griesemer and I agree only 0 or 2 should be allowed (i.e., 1 should not be allowed), but that the spec could be clearer about this.

package main

func main() {
	x, _ := g()
	println(x)
}

func g() (int, int) {
	p := new(int)
	return *p, f(p)
}

func f(p *int) int {
	*p = 1
	*p = 2
	return 0
}
@mdempsky mdempsky added Documentation NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Aug 31, 2021
@griesemer griesemer added this to the Backlog milestone Aug 31, 2021
@mdempsky
Copy link
Contributor Author

mdempsky commented Sep 1, 2021

FWIW, if we agree the intention is to only allow 0 or 2, I think one easy way to at least informally codify that would be to update the example from:

f := func() int { a++; return a }

to

f := func() int { a++; a++; return a }

@ianlancetaylor
Copy link
Contributor

I don't think the spec currently prevents printing 1. And my first reaction is that I don't think it should prevent that. I think that the language should either specify the order of evaluation, or it should not. And right now it does not. If f is inlined, then we have three expressions/statements to evaluate:

  1. *p
  2. *p = 1
  3. *p = 2

It is clear that *p = 1 must precede *p = 2. But there is no ordering defined between *p and the two assignment statements. Therefore, any ordering is permissible.

We could certainly add a rule saying something like "all function calls must be fully evaluated with respect to all other operations." But I don't think that helps people writing Go code. This code is already indeterminate. Making it less indeterminate, while still leaving it indeterminate, doesn't make anything clearer or easier to understand. It would help to make it fully determined, but that isn't what seems to be suggested here.

If we don't make this kind of expression fully determined, I'm inclined to think that we should have a guideline like "if a single statement both reads and writes the same variable, and there is no order of evaluation specified between the reads and writes, then the read may observe the original value of the variable or it may observe any of the values written to the variable, and exactly which value it observes is unspecified."

(I also think we should have a vet check for statements that both read and write the same variable with no ordering specified, but unfortunately that seems like a difficult check to write.)

@mdempsky
Copy link
Contributor Author

I don't think the spec currently prevents printing 1. And my first reaction is that I don't think it should prevent that.

Does your reasoning extend to something like *(*int)(nil) + f()? Does f need to be defensively coded because the nil-pointer-dereference panic can happen at any time within it?

@ianlancetaylor
Copy link
Contributor

Thanks, that's a plausible argument for introducing the rule "all function calls must be fully evaluated with respect to all other operations."

@scott-cotton
Copy link

In practice I am not bitten by these things, but looking at examples like this makes me wonder why.

I would not be opposed to specifying ordering, especially w.r.t multiple assignments and _.

(I also think we should have a vet check for statements that both read and write the same variable with no ordering specified, but unfortunately that seems like a difficult check to write.)

+1. I think we are not far from the pointer part of the complexity for go 1.17. Combine that with happens-before, especially with the memory model work going on, and things get harder. Approximations can help. Type params + pointer analysis (and + x/tools/go/ssa) is an as-of-yet unsolved problem, at least for library code AFAIK.

@mdempsky
Copy link
Contributor Author

In practice I am not bitten by these things, but looking at examples like this makes me wonder why.

There are occasional issue reports about it. E.g., cmd/compile and gccgo compile return x, f(&x) differently.

@zpavlinovic
Copy link
Contributor

In practice I am not bitten by these things, but looking at examples like this makes me wonder why.

There are occasional issue reports about it. E.g., cmd/compile and gccgo compile return x, f(&x) differently.

Regarding the vet check, perhaps an approach focusing on simple patterns like this one could work. @mdempsky, how frequent is occasional in your opinion?

A more general approach relying on pointer analysis might have a false positive rate above the vet threshold.

@mdempsky
Copy link
Contributor Author

@mdempsky, how frequent is occasional in your opinion?

Somewhere between "once ever" and "once per year". I can recall it being independently reported by at least 2 people, but I can't recall beyond that. I doubt it's frequent enough to merit a vet check.

@tdakkota
Copy link

Another example

package main

var x = 0

func f() int {
	x = 3
	return x
}

func main() {
	x = 0
	a, _ := x, f()

	x = 0
	var b, _ = x, f()
	println(a, b)
}

gc (go version devel go1.18-3da0ff8 Fri Oct 15 02:02:50 2021 +0000 linux/amd64):

3 0

gccgo (gccgo 11.2.0):

0 0

https://go.godbolt.org/z/arcv49Gzd

@cristaloleg
Copy link

Regarding mentioned vet check, there is a similar check in go-critic linter, I think it was added because such patterns make code less clear but the example above regarding gc and gccgo sounds not so good. CC: @quasilyte

ydnar added a commit to bytecodealliance/wasm-tools-go that referenced this issue Oct 17, 2023
ydnar added a commit to ydnar/scratch that referenced this issue Oct 17, 2023
@dgryski
Copy link
Contributor

dgryski commented Oct 18, 2023

Since TinyGo depends on the SSA package to parse Go code, this bug is also affecting TinyGo code generation. We just hit this with real-world code.

@mdempsky
Copy link
Contributor Author

@dgryski Can you elaborate on how this problem affects x/tools/go/ssa and TinyGo? I didn't think x/tools/go/ssa performed function inlining, so I wouldn't expect it to be affected.

@dgryski
Copy link
Contributor

dgryski commented Oct 18, 2023

Ah yes, I realize this bug is about the spec and not the SSA package. However, the SSA package implements one version of the spec and the Go compiler seems to implement another interpretation.

Here is a small Go program with different behaviour under Go and TinyGo, but note that TinyGo is doing a faithful translation of the SSA produced by the x/tools/go/ssa (via ssadump).

~/go/src/github.com/dgryski/bug/multi $ cat main.go
package main

import (
	"strings"
)

func main() {
	s1 := "hello"
	s2, err := s1, toUpper(&s1)
	println(s1, s2, err)
}

func toUpper(s *string) error {
	*s = strings.ToUpper(*s)
	return nil
}
~/go/src/github.com/dgryski/bug/multi $ tinygo run main.go
HELLO hello (0:nil)
~/go/src/github.com/dgryski/bug/multi $ go run main.go
HELLO HELLO (0x0,0x0)
~/go/src/github.com/dgryski/bug/multi $ ssadump -build=F main.go |sed -n '/func main/,/^$/p'
func main():
0:                                                                entry P:0 S:0
	t0 = new string (s1)                                            *string
	*t0 = "hello":string
	t1 = *t0                                                         string
	t2 = toUpper(t0)                                                  error
	t3 = *t0                                                         string
	t4 = println(t3, t1, t2)                                             ()
	return

Note that if we replace main with the statements produced by SSA, we get the same result with Go and TinyGo, so the issue really is coming from the SSA package not matching Go's behaviour.

func main() {
	t0 := new(string)
	*t0 = "hello"
	t1 := *t0
	t2 := toUpper(t0)
	t3 := *t0
	println(t3, t1, t2)
}

@mdempsky
Copy link
Contributor Author

@dgryski I see, thanks for elaborating. Yes, that's a known difference between cmd/compile and gccgo too, and I think there are other issues filed about it. Both behaviors are spec compliant, and our response to date has been that applications should not depend on which behavior the compiler implements.

I think it's outside the scope of what this issue is about though (namely whether expression evaluation can be interleaved with executing a function call's body, not merely scheduled before or after).

@timothy-king
Copy link
Contributor

@mdempsky beat me to responding. I don't have too much more to add other than here is a godbolt link where you can confirm gccgo and tinygo agree on the output HELLO hello: https://go.godbolt.org/z/dPesK1T8b .

@zigo101
Copy link

zigo101 commented Oct 18, 2023

As @mdempsky said, it is user's responsibility to avoid the unspecified behavior.
Now that code line can be written as

s2, err := strings.Repeat(s1, 1), toUpper(&s1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

10 participants