
runtime: crash on 1.14 with unexpected return pc, fatal error: unknown caller pc #37664

apmckinlay opened this issue Mar 4, 2020 · 17 comments


@apmckinlay apmckinlay commented Mar 4, 2020

What version of Go are you using (go version)?

$ go version
go1.14 windows/amd64

Also happens on darwin

Works with 1.13.8

Does this issue reproduce with the latest release?

Yes, and also with tip as of Mar 4, 2020

What operating system and processor architecture are you using (go env)?

Happens on Windows and Mac OS X (darwin)
I haven't tried Linux

go env output (note: this shows 1.13.8 since it's my "main" installation; I am testing with go1.14 and gotip)
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\Andrew\AppData\Local\go-build
set GOENV=C:\Users\Andrew\AppData\Roaming\go\env
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Users\Andrew\go
set GOPROXY=,direct
set GOROOT=c:\go
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set GOMOD=C:\Dropbox\gsuneido\go.mod
set CGO_CFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\Andrew\AppData\Local\Temp\go-build176488038=/tmp/go-build -gno-record-gcc-switches
GOROOT/bin/go version: go version go1.13.8 windows/amd64
GOROOT/bin/go tool compile -V: compile version go1.13.8
gdb --version: GNU gdb (GDB) 8.1

What did you do?

I built my program with 1.14 and now it crashes. It works fine with 1.13.

There is no cgo or assembler involved.
(There is some cgo in the project, but I'm building without it.)
There is minor (read-only) use of unsafe, but it does not appear to be involved.
It is running a single goroutine, with no concurrency (other than internal Go machinery).
It still crashes with GOGC=off.

It is consistent and repeatable: running the same thing crashes the same way every time.
It crashes the same way on darwin, so presumably it's a cross-platform issue.
However, it is "touchy": making slight changes to what I'm running can eliminate or change the crash.

It appears to be related to panic/defer/recover, possibly something to do with re-panicking the result of recover.
The function doing the panic/defer/recover is recursive, if that makes any difference.
It is possibly related to the defer changes in 1.14.
This code makes heavier than normal use of panic/defer/recover, which may be why I'm running into this and other people are not.
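For illustration, a minimal self-contained sketch of that pattern — a recursive call whose deferred handler re-panics the result of recover (hypothetical names, not the actual gsuneido code):

```go
package main

import "fmt"

// callWithTrace is a hypothetical sketch of the pattern described
// above: a recursive function whose deferred handler recovers and
// re-panics the recovered value, adding context at each level.
func callWithTrace(depth int) {
	defer func() {
		if e := recover(); e != nil {
			// re-panic the result of recover, wrapped with context
			panic(fmt.Sprintf("at depth %d: %v", depth, e))
		}
	}()
	if depth == 0 {
		panic("interpreter error")
	}
	callWithTrace(depth - 1)
}

// run invokes callWithTrace and returns the finally recovered value.
func run() (msg string) {
	defer func() {
		if e := recover(); e != nil {
			msg = fmt.Sprint(e)
		}
	}()
	callWithTrace(2)
	return "no panic"
}

func main() {
	fmt.Println(run()) // at depth 2: at depth 1: at depth 0: interpreter error
}
```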

Unfortunately, it's a large complex system and so far I have not been able to come up with a small Go example that reproduces it. (I could provide the necessary files and configuration if desired.)

I searched the issues but couldn't find anything that looked related.

I assume that nothing I do in normal single-threaded Go code should cause this?

I would welcome any suggestions on how to debug this issue.
e.g. Is there any way to control the compilation of defer handling?

What did you expect to see?

no crash

What did you see instead?


example crash output
runtime: unexpected return pc for (*Thread).interp
called from 0xc000011610
stack: frame={sp:0xc000127208, fp:0xc0001275d0} stack=[0xc000120000,0xc000128000)
000000c000127108:  0000000000618e00  000000000064bec8
000000c000127118:  000000c00017ba40  000000c0001271a0
000000c000127128:  00000000004f0504   000000c000198000
000000c000127138:  00000000000003ff  00000000006a93a0
000000c000127148:  000000c000169a80  0000000000000000
000000c000127158:  000000c0001271f8  000000c0001c8000
000000c000127168:  0000000000000000  000000c00019c020
000000c000127178:  0000000000000000  00000000000003ff
000000c000127188:  0000000000000000  0000000000000000
000000c000127198:  0000000000000000  000000c000127240
000000c0001271a8:  0000000000507414   000000c000198000
000000c0001271b8:  000000c0001c8000  0000000000000000
000000c0001271c8:  0000000000000000  00000000000003ff
000000c0001271d8:  00000000000003ff  000000c000188500
000000c0001271e8:  000000c00019c020  0000000000894b40
000000c0001271f8:  000000c0001275c0  00000000004f5e5a 
000000c000127208: <000000c000198000  00000000000003ff
000000c000127218:  000000c000127610  0000000000000000
000000c000127228:  000000c0001275e8  000000c00019c020
000000c000127238:  0000000000894b40  000000c000127608
000000c000127248:  00000000004f513c   000000c0001c8000
000000c000127258:  000000c000198000  0000000000000000
000000c000127268:  0000000000000000  000000000084e3c0
000000c000127278:  000000c000186b80  000000c0001272a8
000000c000127288:  00000000006a7f00  000000c00017e470
000000c000127298:  000000c000186b80  000000c000169ac0
000000c0001272a8:  000001003e6a7ea0  00000000005e9fe6 
000000c0001272b8:  000000c000127330  0000000000000000
000000c0001272c8:  0000000000000000  00000000006a95e0
000000c0001272d8:  000000c0001c80e0  000000c000127320
000000c0001272e8:  000000000044d4ce   00000000001273a0
000000c0001272f8:  000000c000127338  000000c000127320
000000c000127308:  00000000004eff88   000000000084d780
000000c000127318:  0000000000000012  000000c0001273f0
000000c000127328:  00000000005e9ad4   000000c0000114b1
000000c000127338:  000000000000000b  000000c0001c80e0
000000c000127348:  0000000000000000  000000c00013c180
000000c000127358:  0000000000000077  00000000006a93a0
000000c000127368:  000000c0001c80e0  0000000000000012
000000c000127378:  0000000000000000  00000000006aa060
000000c000127388:  000000000054f95b   0000000000000000
000000c000127398:  0000000000000005  0000000000000000
000000c0001273a8:  757465526b636f6c  0000000000006e72
000000c0001273b8:  000000000051d9b6   0000000000000040
000000c0001273c8:  0000000000000000  0000000000000000
000000c0001273d8:  0000000000669a00  0000000000000009
000000c0001273e8:  000000c000127508  0000000000000000
000000c0001273f8:  00000000004ef890   0000000000000001
000000c000127408:  0000000000000040  000000c0000114b1
000000c000127418:  000000000000000b  00000000006a95e0
000000c000127428:  0000000000010000  0000000000000000
000000c000127438:  0000000000000000  0000000000000000
000000c000127448:  0000000000000000  0000000000000000
000000c000127458:  0000000000000000  000000c0001c80e0
000000c000127468:  000000000040ccd8   000000c00017d540
000000c000127478:  000000c0001274a8  0000000000669910
000000c000127488:  000000c0001274b8  00000000004ef714 
000000c000127498:  000000c000198000  0000000000000069
000000c0001274a8:  00000000006a95e0  000000c0001c80e0
000000c0001274b8:  000000c000127530  0000000000547543 
000000c0001274c8:  000000c000169a80  0000000000010000
000000c0001274d8:  000000c00017e250  0000000000000000
000000c0001274e8:  000000c000198040  000000c000011500
000000c0001274f8:  000000000084e3c0  0000000000000000
000000c000127508:  0000000000010000  0000000000000000
000000c000127518:  0000000000000000  0000000000000000
000000c000127528:  0000000000000000  000000c000198040
000000c000127538:  000000c000011610  000000000051dae0 
000000c000127548:  000000c000127263  000000c000198000
000000c000127558:  0000000000000000  0000000000000000
000000c000127568:  0000000000000000  000000000051da70 
000000c000127578:  000000c000198040  000000c000011500
000000c000127588:  000000000051dae0   000000c0001272ab
000000c000127598:  000000c000198000  000000c000127610
000000c0001275a8:  000000c000198040  000000c000198000
000000c0001275b8:  000000000051da70   000000c000198040
000000c0001275c8: !000000c000011610 >0000000000000009
000000c0001275d8:  000000c000127630  000000c000127558
000000c0001275e8:  000000c000127658  000000c000198040
000000c0001275f8:  000000c000198000  00000000006698a0
000000c000127608:  000000c000127690  00000000004f06d6 
000000c000127618:  000000c000198000  000000c000127658
000000c000127628:  000000c000127650  0000000000000000
000000c000127638:  0000000000000000  0000000000550227 
000000c000127648:  0000000000000001  ffffffffffffffff
000000c000127658:  0000000000000000  00000000006a9be0
000000c000127668:  0000000000000002  0000000000000001
000000c000127678:  0000000000000000  0000000000000000
000000c000127688:  000000c00017b960  000000c000127710
000000c000127698:  00000000004f0504   000000c000198000
000000c0001276a8:  0000000000000400  0000000000411c54 
000000c0001276b8:  000000c0001276f0  0000000028df0dc3
000000c0001276c8:  5bfc77cc605e10fb
fatal error: unknown caller pc

runtime stack:
runtime.throw(0x65e504, 0x11)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:1112 +0x79
runtime.gentraceback(0x4f5d02, 0xc000127208, 0x0, 0xc000056000, 0x0, 0x0, 0x7fffffff, 0xc5feb0, 0x0, 0x0, ...)
C:/Users/Andrew/sdk/gotip/src/runtime/traceback.go:273 +0x1a09
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:719 +0x98
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:370 +0x6b

goroutine 1 [running]:
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:330 fp=0xc0001269d0 sp=0xc0001269c8 pc=0x4603b0
runtime.addOneOpenDeferFrame(0xc000056000, 0x0, 0x0)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:718 +0x82 fp=0xc000126a10 sp=0xc0001269d0 pc=0x435132
panic(0x64cbc0, 0xc000004780)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:969 +0x344 fp=0xc000126ab8 sp=0xc000126a10 pc=0x435a24
(*Thread).interp.func6(0xc000198000, 0xc000198080, 0xc0001270e8, 0xc000126fe8, 0xc0001270c0)
C:/Dropbox/gsuneido/runtime/interp.go:140 +0x33a fp=0xc000126b28 sp=0xc000126ab8 pc=0x51df8a
runtime.call64(0x0, 0x6698a0, 0xc00012c898, 0x2800000028)
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:540 +0x42 fp=0xc000126b78 sp=0xc000126b28 pc=0x460802
runtime.reflectcallSave(0xc000126c98, 0x6698a0, 0xc00012c898, 0xc000000028)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:879 +0x5f fp=0xc000126ba8 sp=0xc000126b78 pc=0x43562f
runtime.runOpenDeferFrame(0xc000056000, 0xc00012c850, 0xc000126ce0)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:853 +0x2c0 fp=0xc000126c38 sp=0xc000126ba8 pc=0x4354f0
panic(0x64cbc0, 0xc000004780)
C:/Users/Andrew/sdk/gotip/src/runtime/panic.go:967 +0x16b fp=0xc000126ce0 sp=0xc000126c38 pc=0x43584b
(*Thread).interp(0xc000198000, 0xc0001270e8, 0xc0001270e0, 0x0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:438 +0x5c64 fp=0xc0001270a8 sp=0xc000126ce0 pc=0x4f65e4
(*Thread).run(0xc000198000, 0x3ff, 0x6a93a0)
C:/Dropbox/gsuneido/runtime/interp.go:67 +0x156 fp=0xc000127130 sp=0xc0001270a8 pc=0x4f06d6
(*Thread).Start(0xc000198000, 0xc0001c8000, 0x0, 0x0, 0x3ff, 0x3ff)
C:/Dropbox/gsuneido/runtime/interp.go:31 +0x204 fp=0xc0001271b0 sp=0xc000127130 pc=0x4f0504
(*SuFunc).Call(0xc0001c8000, 0xc000198000, 0x0, 0x0, 0x84e3c0, 0xc000186b80, 0xc0001272a8)
C:/Dropbox/gsuneido/runtime/sufunc.go:56 +0x2d4 fp=0xc000127250 sp=0xc0001271b0 pc=0x507414
(*Thread).interp(0xc000198000, 0xc000127658, 0xc000127650, 0x0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:450 +0x47bc fp=0xc000127618 sp=0xc000127250 pc=0x4f513c
(*Thread).run(0xc000198000, 0x400, 0x411c54)
C:/Dropbox/gsuneido/runtime/interp.go:67 +0x156 fp=0xc0001276a0 sp=0xc000127618 pc=0x4f06d6
(*Thread).Start(0xc000198000, 0xc0001c80e0, 0x0, 0x0, 0x400, 0x400)
C:/Dropbox/gsuneido/runtime/interp.go:31 +0x204 fp=0xc000127720 sp=0xc0001276a0 pc=0x4f0504
(*SuFunc).Call(0xc0001c80e0, 0xc000198000, 0x0, 0x0, 0x84e3c0, 0x6a9be0, 0x89f41b)
C:/Dropbox/gsuneido/runtime/sufunc.go:56 +0x2d4 fp=0xc0001277c0 sp=0xc000127720 pc=0x507414
(*Thread).interp(0xc000198000, 0xc000127bc8, 0xc000127bc0, 0x0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:450 +0x47bc fp=0xc000127b88 sp=0xc0001277c0 pc=0x4f513c
(*Thread).run(0xc000198000, 0xc000127c80, 0x550835)
C:/Dropbox/gsuneido/runtime/interp.go:67 +0x156 fp=0xc000127c10 sp=0xc000127b88 pc=0x4f06d6
(*Thread).Start(0xc000198000, 0xc0001c81c0, 0x0, 0x0, 0xc0001c81c0, 0x0)
C:/Dropbox/gsuneido/runtime/interp.go:31 +0x204 fp=0xc000127c90 sp=0xc000127c10 pc=0x4f0504
main.eval(0xc00000bc00, 0x17)
C:/Dropbox/gsuneido/gsuneido.go:200 +0x1ff fp=0xc000127d40 sp=0xc000127c90 pc=0x5e972f
C:/Dropbox/gsuneido/gsuneido.go:150 +0x34e fp=0xc000127eb0 sp=0xc000127d40 pc=0x5e926e
C:/Dropbox/gsuneido/gsuneido.go:81 +0x23f fp=0xc000127f88 sp=0xc000127eb0 pc=0x5e89cf
C:/Users/Andrew/sdk/gotip/src/runtime/proc.go:204 +0x212 fp=0xc000127fe0 sp=0xc000127f88 pc=0x438512
C:/Users/Andrew/sdk/gotip/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc000127fe8 sp=0xc000127fe0 pc=0x462531

goroutine 6 [syscall]:
C:/Users/Andrew/sdk/gotip/src/runtime/sigqueue.go:147 +0xa3
C:/Users/Andrew/sdk/gotip/src/os/signal/signal_unix.go:23 +0x29
created by os/signal.Notify.func1
C:/Users/Andrew/sdk/gotip/src/os/signal/signal.go:127 +0x4b

@randall77 randall77 commented Mar 4, 2020

You can try -gcflags=-N, which will turn off the new defer optimizations. It turns off a lot of other stuff too, so it's not a perfect test, but if the problem remains then it wasn't the new defer stuff.


@apmckinlay apmckinlay commented Mar 4, 2020

It still crashes the same way.
So presumably it's not the new defer stuff, but something else that changed in 1.14?
Thanks for the suggestion.

@dmitshur dmitshur added this to the Go1.15 milestone Mar 4, 2020
@dmitshur dmitshur changed the title crash on 1.14 with unexpected return pc, fatal error: unknown caller pc runtime: crash on 1.14 with unexpected return pc, fatal error: unknown caller pc Mar 4, 2020
@dmitshur dmitshur commented Mar 4, 2020

@danscales danscales commented Mar 4, 2020

@apmckinlay I'm happy to work on debugging this if you can create a code example that you are able to share. Even though it is not specifically related to the defer changes, I have also recently worked on the panic/recover implementation. One other change that went into Go 1.14 relating to panic/recover should only affect behavior if you did a panic/recover after initiating a Goexit(). I assume that was not your scenario, but the change could have had some other unintended side effect.

@apmckinlay apmckinlay commented Mar 4, 2020

@danscales Thank you very much! It's open source so no problem sharing.
I will put together instructions/files to recreate the problem.

There shouldn't be an exit involved from my code.

It appears to be a two step process - a first panic/defer/recover works properly, but then a second one crashes. Running the second one by itself is fine. It's as if the first one leaves something behind that affects the second one. The second can be quite simple, but the first has to be more complex to cause the later crash.

Another point that may or may not be relevant is that the panic can come from the same frame as the defer/recover/re-panic (perhaps not typical?).

Call stack depth also appears to be relevant. Possibly stack movement is a factor?
Is there any way to trace stack movement to see when it occurs?
Or a way to set a larger initial default stack size to avoid movement?

@apmckinlay apmckinlay commented Mar 4, 2020

@danscales Here is a set of files that should allow you to recreate the crash.
Instructions in README.txt

@danscales danscales self-assigned this Mar 5, 2020
@danscales danscales commented Mar 5, 2020

@apmckinlay Thanks for setting up the repro case! It actually reproduces on Linux, though I had to fix the sys_nix.go file (syscall.Sysctl no longer exists on Linux).

The bug is actually related to the new open-coded defers and their interaction with panic/recover. Confusingly, 'go build' doesn't recompile all the sub-packages with the -N option unless you do:

go build -gcflags="all=-N"

The bug goes away if you do that, since the problem is a defer in interp.go. As a more targeted workaround, you can put a one-iteration for loop around the defer statement in interp() (no -N option needed) and the problem goes away:

for i := 0; i < 1; i++ {
    defer func() {
        // this is an optimization ...
    }()
}
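A self-contained sketch of this workaround applied to a recover-and-re-panic defer of the kind discussed in this issue (function names are hypothetical):

```go
package main

import "fmt"

// work defers a recover-and-re-panic handler. Wrapping the defer in a
// one-iteration loop prevents the compiler from open-coding it, which
// is the targeted workaround described above.
func work() {
	for i := 0; i < 1; i++ {
		defer func() {
			if e := recover(); e != nil {
				panic(fmt.Sprintf("re-panic: %v", e))
			}
		}()
	}
	panic("original")
}

// runWorkaround calls work and returns whatever finally propagates out.
func runWorkaround() (msg string) {
	defer func() {
		if e := recover(); e != nil {
			msg = fmt.Sprint(e)
		}
	}()
	work()
	return "no panic"
}

func main() {
	fmt.Println(runWorkaround()) // re-panic: original
}
```

The loop makes the defer's execution count non-static, so the compiler falls back to the older heap/stack-allocated defer records for that statement only, avoiding the open-coded defer path that triggers the bug.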

The bug does require several sequences of panics and recovers with a further re-panic, before doing the final recover.

I think that I have the actual fix in the Go runtime, which is fairly simple, but I'm still working to verify it is the full fix, do more testing, etc.

@apmckinlay apmckinlay commented Mar 5, 2020

@danscales That's great, thanks! I'll add the work around so I can move to 1.14

Do you think the fix will get cherry-picked to 1.14.1?

The build issue crossed my mind, but I'm pretty sure I did go build -a -v and saw the package listed so I thought that covered it. Maybe cleaning the build cache would have been safer?

PS. Just listened to the GoTime podcast you were on. Including, coincidentally, the challenge of testing these kinds of changes.

@danscales danscales commented Mar 5, 2020

Currently, it seems like it could be a good option for cherrypicking for 1.14.1, but will have to confirm the fix and check with release folks, etc. I'll update as I learn more.

@dmitshur dmitshur commented Mar 5, 2020

@danscales Once you know more, if you believe this meets the criteria for backporting, feel free to follow the documented backport process to open backport issues. Thanks.

@apmckinlay apmckinlay commented Mar 6, 2020

I added the workaround to the defers that re-panic, but that didn't solve all the problems.
I then added the workaround to a few more defers that call recover, but I still have some lingering issues.
Do I need the workaround on every defer with a recover? Or on every defer?
And would that mean I'd still have issues with defers in the Go runtime?
Or should I just stick with 1.13 until the fix comes out? (hopefully in 1.14.1 rather than 1.15)

Note: compiling with -gcflags="all=-N" does eliminate the problems

@danscales danscales commented Mar 7, 2020

I would have expected that you would only need the workaround for every defer that has a recover and then a possible re-panic, or that is likely to be on the stack while such panic-recover-re-panic sequences are happening. I don't expect you would have any trouble with the defers in the Go runtime (really just in runtime.main, I think, at the very start of the program). However, it may be a little hard to catch all such defers.

It would be helpful if you could try a bit more to apply the workaround to the various defers that seem to be related to the panic-recover-repanic loop, and let me know via this bug whether you are successful (and how many defers you tried fixing, either way).

I have the fix and a sample test that I'm about to put out for review.

@gopherbot gopherbot commented Mar 7, 2020

Change mentions this issue: runtime: fix problem with repeated panic/recover/re-panics and open-coded defers

@apmckinlay apmckinlay commented Mar 9, 2020

@danscales I made another pass over the code and added the workaround to all the defers with recover that I thought might end up on the call stack. (now 14 places) It's tricky because it's a language implementation so the behavior is very dynamic and it's hard to statically determine what might be nested. With the additional workarounds, everything seems to be working fine, although I haven't done extensive testing. I will roll it out to some beta users and see what happens.

@gopherbot gopherbot closed this in fae87a2 Mar 10, 2020
@danscales danscales commented Mar 10, 2020

@gopherbot please consider this for backport to 1.14, it's a regression (and the fix is quite simple).

@gopherbot gopherbot commented Mar 10, 2020

Backport issue(s) opened: #37782 (for 1.14).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master.

@gopherbot gopherbot commented Mar 10, 2020

Change mentions this issue: [release-branch.go1.14] runtime: fix problem with repeated panic/recover/re-panics and open-coded defers

gopherbot pushed a commit that referenced this issue Mar 11, 2020
runtime: fix problem with repeated panic/recover/re-panics and open-coded defers

In the open-code defer implementation, we add defer struct entries to the defer
chain on-the-fly at panic time to represent stack frames that contain open-coded
defers. This allows us to process non-open-coded and open-coded defers in the
correct order. Also, we need somewhere to be able to store the 'started' state of
open-coded defers. However, if a recover succeeds, defers will now be processed
inline again (unless another panic happens). Any defer entry representing a frame
with open-coded defers will become stale once we run the corresponding defers
inline and exit the associated stack frame. So, we need to remove all entries for
open-coded defers at recover time.

The current code was only removing the top-most open-coded defer from the defer
chain during recovery. However, with recursive functions that do repeated
panic-recover-repanic, multiple stale entries can accumulate on the chain. So, we
just adjust the loop to process the entire chain. Since this only happens in the
panic/recover case, it is fine to scan through the entire chain (which should
usually have few elements in it, since most defers are open-coded).

The added test fails with a SEGV without the fix, because it tries to run a stale
open-coded defer entry (and the stack has changed).

Updates #37664.
Fixes #37782.

Change-Id: I8e3da5d610b5e607411451b66881dea887f7484d
Run-TryBot: Dan Scales <>
TryBot-Result: Gobot Gobot <>
Reviewed-by: Keith Randall <>
(cherry picked from commit fae87a2)
Run-TryBot: Dmitri Shuralyov <>
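The repeated panic/recover/re-panic shape that the commit message describes can be sketched as follows. This is a hedged minimal illustration of the pattern, not the actual regression test added by the fix; on a fixed runtime it runs to completion, while the stale open-coded defer entries described above could crash it on an unpatched 1.14.

```go
package main

import "fmt"

const finalDepth = 3

// recurse panics at depth 0; every level's defer recovers and
// re-panics until finalDepth, whose recover succeeds for good.
// Each re-panic unwinds one more frame containing an open-coded
// defer -- the repeated panic/recover/re-panic pattern the fix
// addresses.
func recurse(depth int) {
	defer func() {
		if e := recover(); e != nil {
			if depth == finalDepth {
				return // final recover: stop re-panicking
			}
			panic(e) // re-panic into the caller's frame
		}
	}()
	if depth == 0 {
		panic("bottom")
	}
	recurse(depth - 1)
}

func main() {
	recurse(finalDepth)
	fmt.Println("survived") // reached only if the final recover worked
}
```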