runtime: fatal error: invalid stack pointer #12253

Closed
anacrolix opened this issue Aug 21, 2015 · 15 comments

@anacrolix
Contributor

go version devel +d931716 Tue Aug 18 00:55:16 2015 +0000 linux/amd64

runtime: bad pointer in frame github.com/anacrolix/torrent/dht.(*Server).processPacket at 0xc8290e9e38: 0x10
fatal error: invalid stack pointer

runtime stack:
runtime.throw(0xacee10, 0x15)
        /root/src/go.master/src/runtime/panic.go:527 +0x90 fp=0x7f951fffe6f8 sp=0x7f951fffe6e0
runtime.adjustpointers(0xc8290e9de0, 0x7f951fffe870, 0x7f951fffea60, 0xc76968)
        /root/src/go.master/src/runtime/stack1.go:438 +0x2d4 fp=0x7f951fffe7f0 sp=0x7f951fffe6f8
runtime.adjustframe(0x7f951fffe980, 0x7f951fffea60, 0xc800000001)
        /root/src/go.master/src/runtime/stack1.go:505 +0x1b5 fp=0x7f951fffe8a8 sp=0x7f951fffe7f0
runtime.gentraceback(0x429cb0, 0xc8290e9d98, 0x0, 0xc820100c00, 0x0, 0x0, 0x7fffffff, 0xb6bdb8, 0x7f951fffea60, 0x0, ...)
        /root/src/go.master/src/runtime/traceback.go:336 +0xa69 fp=0x7f951fffe9d8 sp=0x7f951fffe8a8
runtime.copystack(0xc820100c00, 0x1000)
        /root/src/go.master/src/runtime/stack1.go:616 +0x1a7 fp=0x7f951fffeac8 sp=0x7f951fffe9d8
runtime.shrinkstack(0xc820100c00)
        /root/src/go.master/src/runtime/stack1.go:880 +0x148 fp=0x7f951fffeaf8 sp=0x7f951fffeac8
runtime.markroot(0xc82001e000, 0x17)
        /root/src/go.master/src/runtime/mgcmark.go:130 +0x1ac fp=0x7f951fffeb98 sp=0x7f951fffeaf8
runtime.parfordo(0xc82001e000)
        /root/src/go.master/src/runtime/parfor.go:110 +0x1d4 fp=0x7f951fffec20 sp=0x7f951fffeb98
runtime.gchelper()
        /root/src/go.master/src/runtime/mgc.go:1700 +0x56 fp=0x7f951fffec68 sp=0x7f951fffec20
runtime.stopm()
        /root/src/go.master/src/runtime/proc1.go:1131 +0x146 fp=0x7f951fffec90 sp=0x7f951fffec68
runtime.gcstopm()
        /root/src/go.master/src/runtime/proc1.go:1321 +0xf8 fp=0x7f951fffecc0 sp=0x7f951fffec90
runtime.schedule()
        /root/src/go.master/src/runtime/proc1.go:1596 +0x9c fp=0x7f951fffecf8 sp=0x7f951fffecc0
runtime.goschedImpl(0xc820100c00)
        /root/src/go.master/src/runtime/proc1.go:1713 +0x12a fp=0x7f951fffed10 sp=0x7f951fffecf8
runtime.gopreempt_m(0xc820100c00)
        /root/src/go.master/src/runtime/proc1.go:1728 +0x32 fp=0x7f951fffed20 sp=0x7f951fffed10
runtime.newstack()
        /root/src/go.master/src/runtime/stack1.go:786 +0xa92 fp=0x7f951fffee98 sp=0x7f951fffed20
runtime.morestack()
        /root/src/go.master/src/runtime/asm_amd64.s:330 +0x7f fp=0x7f951fffeea0 sp=0x7f951fffee98
@bradfitz
Contributor

/cc @aclements (probably a dup of something you're already tracking, but... more data)

@bradfitz bradfitz added this to the Go1.5.1 milestone Aug 21, 2015
@aclements
Member

@anacrolix, is there any use of cgo or unsafe in this program? Could you paste the rest of the stacks, or at least the stack that immediately followed the runtime stack? Did this happen just once or does it happen regularly, and if so, with what frequency? Is what you're running all open source and, if so, can you paste the commands I could use to get to the same place (and potentially reproduce the failure)?

@anacrolix
Contributor Author

The main package isn't open source, but most of the rest of it is: https://gist.github.com/anacrolix/f3434b0b3c587097ae31

I've just snipped out a few bits. It's only happened once, and I've been tracking master on and off for months. This process instance ran for ~13 hours before this happened. I've used that Go version previously. I don't use cgo or unsafe in any of my code, and I don't think any of my dependencies do.

The process is a server, and I don't know what triggered it. But if you have any good guesses I can try to run those code paths.

@aclements
Member

Thanks. Looks like the stack that's being adjusted is

goroutine 34 [copystack]:
runtime.assertE2T2(0x8a7780, 0x8a7780, 0xc8289dc990, 0xc828a3fe48, 0xc825a2b988)
    /root/src/go.master/src/runtime/iface.go:233 fp=0xc8290e9da0 sp=0xc8290e9d98
github.com/anacrolix/torrent/dht.(*Server).processPacket(0xc820082480, 0xc822f14000, 0x46, 0x10000, 0x7f9527b5cc90, 0xc823d56d80)
    /root/gopath/src/github.com/anacrolix/torrent/dht/dht.go:651 +0x278 fp=0xc8290e9e88 sp=0xc8290e9da0
github.com/anacrolix/torrent/dht.(*Server).serve(0xc820082480, 0x0, 0x0)
    /root/gopath/src/github.com/anacrolix/torrent/dht/dht.go:687 +0x312 fp=0xc8290e9f60 sp=0xc8290e9e88
github.com/anacrolix/torrent/dht.NewServer.func1(0xc8200f2040)
    /root/gopath/src/github.com/anacrolix/torrent/dht/dht.go:147 +0x28 fp=0xc8290e9f98 sp=0xc8290e9f60
runtime.goexit()
    /root/src/go.master/src/runtime/asm_amd64.s:1696 +0x1 fp=0xc8290e9fa0 sp=0xc8290e9f98
created by github.com/anacrolix/torrent/dht.NewServer
    /root/gopath/src/github.com/anacrolix/torrent/dht/dht.go:156 +0x271

The bad pointer is at fp-80 or sp+152 in processPacket (the assembly offsets will be slightly different, but I can never remember quite how).

Looking at master (ef098c4), anacrolix/torrent/dht/dht.go:651 isn't an interface conversion (I actually don't see any interface conversions in that function, but maybe something got inlined). What commit of anacrolix/torrent were you running?

If you still have the binary that crashed, can you run `go tool objdump -s Server..processPacket <binary>` and paste the output?

Could be bad stack maps around assertE2T2. @rsc

@aclements
Member

Also, if you still have the binary, could you run

readelf --debug-dump=info <binary> | awk '/Server..processPacket/{S=1} S{print L} {L=$0} S&&/<1>/{exit}'

This will give us the DWARF information for processPacket so we don't have to reverse-engineer where the variables are. :)

@aclements
Member

Could also be another manifestation of the bug fixed by c4092ac.

@anacrolix
Contributor Author

I think I still have the binary, I'll get you that dump soon.

@mikioh mikioh changed the title from "invalid stack pointer" to "runtime: fatal error: invalid stack pointer" Aug 21, 2015
@anacrolix
Contributor Author

@aclements
Member

@anacrolix, thanks for the DWARF dump. I'll need the objdump, too (#12253 (comment)).

@anacrolix
Contributor Author

@aclements
Member

Yep, looks like another instance of the bug from c4092ac.

    dht.go:651  0x6376b6    488d9c2498000000    LEAQ 0x98(SP), BX
    dht.go:651  0x6376be    48895c2418      MOVQ BX, 0x18(SP)
    dht.go:651  0x6376c3    e8e825dfff      CALL runtime.assertE2T2(SB)

0x98(SP) was never zeroed.

@rsc, do you know off the top of your head where this zeroing is missing? If not, I can dig.

@rsc
Contributor

rsc commented Aug 24, 2015

Something like this should help:

f$ git diff
diff --git a/src/cmd/compile/internal/gc/walk.go b/src/cmd/compile/internal/gc/walk.go
index ce73018..c58ef0c 100644
--- a/src/cmd/compile/internal/gc/walk.go
+++ b/src/cmd/compile/internal/gc/walk.go
@@ -3219,10 +3219,16 @@ func walkcompare(np **Node, init **NodeList) {

    if l != nil {
        x := temp(r.Type)
+       if haspointers(r.Type) {
+           a := Nod(OAS, x, nil)
+           typecheck(&a, Etop)
+           init = list(init, a)
+       }
+
        ok := temp(Types[TBOOL])

        // l.(type(r))
-       a := Nod(ODOTTYPE, l, nil)
+       a = Nod(ODOTTYPE, l, nil)

        a.Type = r.Type

f$

@aclements
Member

@anacrolix, can you try applying https://go-review.googlesource.com/13872 to see if that fixes the problem for you?

@gopherbot
Contributor

CL https://golang.org/cl/13872 mentions this issue.

@gopherbot
Contributor

CL https://golang.org/cl/14242 mentions this issue.

aclements added a commit that referenced this issue Sep 8, 2015
…re of interface value

A comparison of the form l == r where l is an interface and r is
concrete performs a type assertion on l to convert it to r's type.
However, the compiler fails to zero the temporary where the result of
the type assertion is written, so if the type is a pointer type and a
stack scan occurs while in the type assertion, it may see an invalid
pointer on the stack.

Fix this by zeroing the temporary. This is equivalent to the fix for
type switches from c4092ac.

Fixes #12253.

Change-Id: Iaf205d456b856c056b317b4e888ce892f0c555b9
Reviewed-on: https://go-review.googlesource.com/13872
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-on: https://go-review.googlesource.com/14242
Reviewed-by: Austin Clements <austin@google.com>
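
For readers who want to see the source-level pattern the commit message describes, the following is a rough sketch. The `node` type and both function names are invented for illustration and are not taken from anacrolix/torrent. It shows an interface-to-concrete comparison next to a hand-written equivalent that uses a type assertion. It does not reproduce the crash, which depended on a stack scan landing inside the assertion in a compiler without the fix.

```go
package main

import "fmt"

// node is a hypothetical pointer-carrying type used only for this sketch.
type node struct{ id int }

// sameNode compares an interface value with a concrete pointer.
// Per the commit message, the compiler handles this by type-asserting
// l to r's type and writing the result into a temporary.
func sameNode(l interface{}, r *node) bool {
	return l == r
}

// sameNodeExplicit spells out roughly the same logic by hand:
// assert l to *node, then compare the pointers.
func sameNodeExplicit(l interface{}, r *node) bool {
	x, ok := l.(*node)
	return ok && x == r
}

func main() {
	n := &node{id: 1}
	fmt.Println(sameNode(n, n), sameNodeExplicit(n, n))                       // same pointer
	fmt.Println(sameNode(&node{id: 1}, n), sameNodeExplicit(&node{id: 1}, n)) // different pointer
	fmt.Println(sameNode("x", n), sameNodeExplicit("x", n))                   // different dynamic type
}
```

Both forms agree on all three inputs. In the hand-written form the temporary `x` is an ordinary Go variable and is always initialized, whereas the bug concerned the compiler-generated temporary in the first form.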
@golang golang locked and limited conversation to collaborators Sep 4, 2016