New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: internal compiler error: bad live variable at entry of f #28445

Closed
ALTree opened this Issue Oct 28, 2018 · 12 comments

Comments

Projects
None yet
6 participants
@ALTree
Member

ALTree commented Oct 28, 2018

$ gotip version
go version devel +80b8377049 Sat Oct 27 22:12:17 2018 +0000 linux/amd64

The following program (reduced from a gosmith-generated one):

package p

var fp = (**float64)(nil)

func f() {
	switch fp {
	case new(*float64):
		println()
	}
}

crashes the tip compiler with the following error:

# command-line-arguments
<autogenerated>:1: internal compiler error: bad live variable at entry of f: .autotmp_2 (type *float64)

goroutine 37 [running]:
runtime/debug.Stack(0x0, 0x0, 0x0)
	/home/alberto/go/src/runtime/debug/stack.go:24 +0xa7
cmd/compile/internal/gc.Fatalf(0xce2afe, 0x24, 0xc0004f1970, 0x2, 0x2)
	/home/alberto/go/src/cmd/compile/internal/gc/subr.go:182 +0x1e7
cmd/compile/internal/gc.(*Liveness).epilogue(0xc000518000)
	/home/alberto/go/src/cmd/compile/internal/gc/plive.go:1005 +0xba8
cmd/compile/internal/gc.liveness(0xc0004c6060, 0xc0004d4160, 0x0, 0x0, 0x0)
	/home/alberto/go/src/cmd/compile/internal/gc/plive.go:1548 +0xc4
cmd/compile/internal/gc.genssa(0xc0004d4160, 0xc0004b76c0)
	/home/alberto/go/src/cmd/compile/internal/gc/ssa.go:5048 +0x85
cmd/compile/internal/gc.compileSSA(0xc000358000, 0x3)
	/home/alberto/go/src/cmd/compile/internal/gc/pgen.go:288 +0x1bc
cmd/compile/internal/gc.compileFunctions.func2(0xc0004ba900, 0xc0004bc190, 0x3)
	/home/alberto/go/src/cmd/compile/internal/gc/pgen.go:342 +0x49
created by cmd/compile/internal/gc.compileFunctions
	/home/alberto/go/src/cmd/compile/internal/gc/pgen.go:340 +0x132

It compiles fine on go1.11.

@odeke-em

This comment has been minimized.

Member

odeke-em commented Oct 29, 2018

@cherrymui

This comment has been minimized.

Contributor

cherrymui commented Oct 29, 2018

Before lower,

v7 (7) = VarDef <mem> {.autotmp_2} v1
v8 (7) = LocalAddr <**float64> {.autotmp_2} v2 v7
v9 (7) = Store <mem> {*float64} v8 v6 v7
v10 (7) = LocalAddr <**float64> {.autotmp_2} v2 v9
v12 (+7) = EqPtr <bool> v5 v10

which looks correct.

After lower,

v7 (7) = VarDef <mem> {.autotmp_2} v1
v8 (7) = LEAQ <**float64> {.autotmp_2} v2 <DEAD>
v9 (7) = MOVQstoreconst <mem> {.autotmp_2} [val=0,off=0] v2 v7
v10 (7) = LEAQ <**float64> {.autotmp_2} v2
v11 (+7) = CMPQload <flags> {"".fp} v3 v10 v1

There is no more memory relationship between the LEAQ and MOVQstoreconst. Note that CMPQload has mem arg v1 (InitMem) instead of v7.

Later, schedule puts the LEAQ and the CMPQload (!) before the VarDef and the store,

v8 (7) = LEAQ <**float64> {.autotmp_2} v2
v11 (+7) = CMPQload <flags> {"".fp} v3 v8 v1
v7 (7) = VarDef <mem> {.autotmp_2} v1
v9 (7) = MOVQstoreconst <mem> {.autotmp_2} [val=0,off=0] v2 v7

which is bad.

Maybe the problem is to choose the correct memory arg at CMPQload generation?

@randall77

This comment has been minimized.

Contributor

randall77 commented Oct 29, 2018

Yuck. Even if the CMPQload has the right memory op, the LEAQ might still float up before the VARDEF.
Another possibility is to stop treating taking the address of something as a read.
(plive.go:312, remove SymAddr from the masking).
It fixes this immediate bug, but I'm not sure it is safe. Presumably all the reads after the address op depend on the memory state generated by the VARDEF, so maybe that's ok...
This would mean that we could remove the memory argument from LocalAddr also.

It might be safe since stack objects went in. plive now really only handles direct accesses, and all accesses through the pointers are indirect.

@cherrymui

This comment has been minimized.

Contributor

cherrymui commented Oct 29, 2018

Another possibility is to stop treating taking the address of something as a read.

Yeah, with stack objects, this might be safe.

@randall77

This comment has been minimized.

Contributor

randall77 commented Oct 29, 2018

I think my idea doesn't quite work either.
If the LEAQ moves up past a safepoint, then it points to junk. We can't let the GC see the results of the LEAQ until the stack object being referenced is at least zeroed.

Maybe we need an LEAQ with memory argument? That would definitely be a verbose CL, I'd like to avoid it if anyone can think of a better way.

@ALTree

This comment has been minimized.

Member

ALTree commented Dec 1, 2018

@andybons A compiler crash on a valid 8 lines program, introduced during this cycle, is not a release-blocker?

@cherrymui

This comment has been minimized.

Contributor

cherrymui commented Dec 1, 2018

LEAQ is rematerializeable, and I think this is true for similar instructions on all other architectures. It might be fine if we guarantee to always generate LEAQ as late as we can (and we never spill it). (In this particular example, the CMPQload makes the LEAQ being generated too early.)

I think if the LEAQ's result is stored into a stack object, the memory should already be initialized, as the store instruction has the correct memory dependency. I'm not sure how we perform this check in the compiler though.

@andybons

This comment has been minimized.

Member

andybons commented Dec 1, 2018

@ALTree When triaging, we decided that while the code is valid, it’s likely not common (at least in the form presented). I will leave it to @cherrymui and @randall77 to decide its severity.

@randall77

This comment has been minimized.

Contributor

randall77 commented Dec 4, 2018

I agree that the repro isn't likely code. The underlying problem, though, might be more common. Hard to be sure. I'm going to mark it as a release blocker for now.

@andybons

This comment has been minimized.

Member

andybons commented Dec 4, 2018

Thanks, Keith.

@randall77

This comment has been minimized.

Contributor

randall77 commented Dec 11, 2018

I looked at this some more. My patch to just treat LEAQ as not reading doesn't work because of this case:

var x T = ...
p := &x
f(p)
runtime.GC()
q := &x
f(q)

At the runtime.GC, we need x to be live. Otherwise, x is not scanned and its referents not retained. Later, after we do q := &x, x is live again but it still contains its old contents, including its old pointer. That pointer is no longer valid - it may point to a freed span, for example.

This live-dead-live transition is bad. Treating &x as reading x prevents this situation.

@gopherbot

This comment has been minimized.

gopherbot commented Dec 12, 2018

Change https://golang.org/cl/153641 mentions this issue: cmd/compile: don't combine load+op if the op has SymAddr arguments

@gopherbot gopherbot closed this in 05bbec7 Dec 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment