Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: sigpanic can make dead pointer live again #27518

Closed
aclements opened this issue Sep 5, 2018 · 10 comments

Comments

Projects
None yet
4 participants
@aclements
Copy link
Member

commented Sep 5, 2018

What version of Go are you using (go version)?

Demonstrated with Go 1.11, 1.10 and, 1.9. Doesn't reproduce with Go 1.8.

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/austin/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/austin/r/go"
GOPROXY=""
GORACE=""
GOROOT="/home/austin/.cache/gover/1.11"
GOTMPDIR=""
GOTOOLDIR="/home/austin/.cache/gover/1.11/pkg/tool/linux_amd64"
GCCGO="/usr/bin/gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build713199772=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://play.golang.org/p/2EdeQAVbZQK

What did you expect to see?

Nothing.

What did you see instead?

runtime: pointer 0xc00007a000 to unallocated span span.base()=0xc00007a000 span.limit=0xc00007c000 span.state=3

I knew there some something fishy about unwinding the PC back to the defer in the sigpanic parent frame. Fixing this is messy.

/cc @RLH @randall77 @ianlancetaylor

@aclements

This comment has been minimized.

Copy link
Member Author

commented Sep 5, 2018

Accidentally clicked submit before I was done writing.

Fixing this is messy. I'm planning on adding support for conservative stack frames for preemption, but we could use the same thing for the parent of a sigpanic frame. That could fix the stack scanning problem.

But what if the defer grows the stack? That might still be okay combined with conservative scanning: stack copying will think x is live and will either update it (if it points to the stack), or not update it (if it points to the heap), but nothing can actually observe x, so both of these actions are harmless.

@aclements aclements added this to the Go1.12 milestone Sep 5, 2018

@cherrymui

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2018

Will stack tracing solve this?

When it panics, the frame is essentially dead, except the variables captured by the defer closure, and the closure struct itself. Those variables are address-taken. If they are actually live in the defer closure, stack tracing will find them. The non-address-taken variables are not accessible, so we can treat them dead. So, we can just let stack tracing take care of it, without adjusting the PC?

@randall77

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2018

I don't think stack tracing will solve this. x points to heap-allocated memory. x itself doesn't have its address taken, so it is tracked using the old rules. It will certainly be collected at the GC at line 29, and be live again at the the point of the defer.
Or are you suggesting that you no longer have to walk an implicit panic back to the last defer? It may work to just use all the defer closure pointers as roots and use stack tracing to find all the live variables at that point.
We do pass some variables by value instead of reference into closures. Not sure if that would matter or not...
You'd also have to treat output arguments as live, in case of a recover.

@cherrymui

This comment has been minimized.

Copy link
Contributor

commented Sep 10, 2018

Or are you suggesting that you no longer have to walk an implicit panic back to the last defer?

Yes, this is what I was suggesting. We can treat all the non-address-taken variables dead (except the return values). So x in the example will not be live again. And we use stack tracing to find address-taken live variables.

We do pass some variables by value instead of reference into closures. Not sure if that would matter or not...

If I understand correctly, for capture-by-value variables there is a copy of the value in the closure struct. So we don't need the original variable from the dead frame.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

Looking at this a bit more, this should work. One complication is that there are locals that are live after a panic: the PAUTOHEAP variables that point to return values that escape. The code that copies these values back into the return slots on the stack run after the defers (after deferreturn), so if the defer recovers those variables are needed.

I think the right answer is to move the pc to the deferreturn call. That should have precisely the bitmap we need (args are dead, return values are live, except the ones which moved to the heap, in which case their PAUTOHEAP variables are live instead).

@aclements

This comment has been minimized.

Copy link
Member Author

commented Sep 11, 2018

I think the right answer is to move the pc to the deferreturn call.

I haven't checked, but doesn't this also include variables captured in deferred closures? If so, then if we panicked before all of the defers in the function were pushed, the liveness map at deferretrun could include variables that aren't initialized yet. We could combine this with stack tracing and perhaps stop marking those as live at the deferreturn.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2018

With stack tracing, we won't be including variables captured in deferred closures in the live map, because those are all captured by reference. Those variables will be stack objects and be tracked by pointer, so they only need to be initialized before their address is put in a closure.

@aclements

This comment has been minimized.

Copy link
Member Author

commented Sep 11, 2018

Sounds good.

@gopherbot

This comment has been minimized.

Copy link

commented Sep 11, 2018

Change https://golang.org/cl/134637 mentions this issue: runtime: on a signal, set traceback address to a deferreturn call

@gopherbot gopherbot closed this in 9dac0a8 Oct 3, 2018

@gopherbot

This comment has been minimized.

Copy link

commented Oct 4, 2018

Change https://golang.org/cl/139898 mentions this issue: cmd/link: fix deferreturn location on wasm

gopherbot pushed a commit that referenced this issue Oct 5, 2018

cmd/link: fix deferreturn location on wasm
On wasm, pcln tables are indexed by "resumption point ID" instead of
by pc offset. When finding a deferreturn call, we must find the
associated resumption point ID for the deferreturn call.

Update #27518
Fixes wasm bug introduced in CL 134637.

Change-Id: I3d178a3f5203a06c0180a1aa2309bfb7f3014f0f
Reviewed-on: https://go-review.googlesource.com/c/139898
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.