runtime: "morestack on g0" in TestSegv on darwin-amd64 builders #39457

bcmills opened this issue Jun 8, 2020 · 13 comments

Labels: compiler/runtime, NeedsInvestigation, OS-Darwin



bcmills commented Jun 8, 2020


--- FAIL: TestSegv (0.00s)
    --- FAIL: TestSegv/Segv (0.02s)
        crash_test.go:105: /var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/tmp/go-build172134279/testprogcgo.exe SegvInCgo exit status: exit status 2
        crash_cgo_test.go:569: fatal: morestack on g0
            SIGTRAP: trace trap
            PC=0x406b702 m=0 sigcode=1
            goroutine 0 [idle]:
            	/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/asm_amd64.s:860 +0x2
            	/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/asm_amd64.s:416 +0x25
            goroutine 19 [syscall]:
            runtime.cgocall(0x4123600, 0xc00003a7c0, 0x4123600)
            	/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/cgocall.go:133 +0x5b fp=0xc00003a790 sp=0xc00003a758 pc=0x400503b
            	_cgo_gotypes.go:329 +0x45 fp=0xc00003a7c0 sp=0xc00003a790 pc=0x411a2a5
            	/private/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/testdata/testprogcgo/segv.go:46 +0x30 fp=0xc00003a7d8 sp=0xc00003a7c0 pc=0x41224b0
            	/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00003a7e0 sp=0xc00003a7d8 pc=0x406b8e1
            created by main.SegvInCgo
            	/private/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/testdata/testprogcgo/segv.go:43 +0x5c
            goroutine 1 [sleep]:
            	/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/time.go:188 +0xbf
            	/private/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/testdata/testprogcgo/segv.go:55 +0x9c
            	/private/var/folders/kh/5zzynz152r94t18yzstnrwx80000gn/T/workdir-host-darwin-10_15/go/src/runtime/testdata/testprogcgo/main.go:34 +0x1da
            rax    0x17
            rbx    0xc00003a710
            rcx    0x4265d40
            rdx    0x0
            rdi    0x2
            rsi    0xc00003a6b0
            rbp    0xc00003a780
            rsp    0xc00003a738
            r8     0x4265d40
            r9     0x0
            r10    0xc00003a710
            r11    0x202
            r12    0xf1
            r13    0x0
            r14    0x418de44
            r15    0x0
            rip    0x406b702
            rflags 0x202
            cs     0x2b
            fs     0x0
            gs     0x0
        crash_cgo_test.go:571: expected crash from signal
FAIL	runtime	69.144s

CC @aclements @ianlancetaylor @cherrymui

@bcmills added the OS-Darwin and NeedsInvestigation labels Jun 8, 2020
@bcmills added this to the Backlog milestone Jun 8, 2020
@golang golang deleted a comment Jun 10, 2020

On darwin/amd64, to work around a kernel issue we rewrite SI_USER SIGSEGV to kernel-generated. So, in this case, an actual user-sent SIGSEGV will be treated as a kernel-generated signal, causing the runtime to inject a sigpanic. If the signal lands at a bad time, e.g. right in the middle of a stack switch, where the g and the stack don't match, bad things will happen.

I'm not sure what the best solution is. A few possibilities:

  • do nothing (maybe skip/relax the test). It isn't too bad in that it will crash the program anyway (sigpanic will throw for this particular bad address), unless PanicOnFault is set.
  • remove the workaround (at least the sigcode part, we could still change the faulting address). A malformed address will be treated as user-sent SIGSEGV, which will crash the program now. PanicOnFault is still a problem.

Not sure what to do with PanicOnFault. Due to the kernel issue, it seems we cannot distinguish malformed address vs. user-sent SIGSEGV. We have to make both recoverable or non-recoverable...

(The workaround was added for OS X 10.9. The kernel issue seems still there for macOS 10.15...)
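
To make the effect of the workaround concrete, here is a minimal toy model in Go. It is not the runtime's code: the names `siUser`, `siKernel`, `classify`, and the `rewriteUser` flag are hypothetical, and the constant values are placeholders rather than the real darwin sigcode values. It only encodes the behavior described above: a user-sent SIGSEGV normally crashes the program, a kernel-generated one leads to an injected sigpanic, and the rewrite makes the two indistinguishable.

```go
package main

import "fmt"

// Hypothetical stand-ins for sigcode values; the real runtime uses its own
// internal constants, and these numbers are placeholders only.
const (
	siUser   = 0 // signal sent by another process or by kill(2)
	siKernel = 1 // signal that looks kernel-generated (a real fault)
)

type outcome string

const (
	fatalThrow     outcome = "throw (crash the program)"
	injectSigpanic outcome = "inject sigpanic"
)

// classify models how a SIGSEGV is handled given its sigcode.
// With rewriteUser set (the workaround described in the thread), a user-sent
// signal is relabeled as kernel-generated before the decision is made.
func classify(sigcode int, rewriteUser bool) outcome {
	if rewriteUser && sigcode == siUser {
		sigcode = siKernel
	}
	if sigcode == siUser {
		return fatalThrow
	}
	return injectSigpanic
}

func main() {
	fmt.Println("user-sent, no workaround:  ", classify(siUser, false))
	fmt.Println("user-sent, with workaround:", classify(siUser, true))
	fmt.Println("kernel fault:              ", classify(siKernel, true))
}
```

In this model, the second option listed above corresponds to calling `classify` with `rewriteUser` set to false: user-sent signals crash again, but because of the kernel issue a malformed-address fault would presumably arrive looking user-sent too, and would take the same path.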


Another possibility: when switching from user stack to system stack (e.g. in systemstack, asmcgocall, etc.), we always do (1) set user g's g.throwsplit to true, (2) change SP, (3) change the g register to g0. And do it in the opposite order when switching back. This might solve the immediate SIGSEGV-landing-in-stack-switch problem. Not sure if there is any other problem. Seems pretty complicated, though.

@bcmills changed the title from runtime: "morestack on g0" in TestSegv on darwin-amd64-race builder to runtime: "morestack on g0" in TestSegv on darwin-amd64 builders Aug 31, 2020

bcmills commented Aug 31, 2020

Hmm... Why do we ignore user-generated SIGSEGV signals in the first place? If I explicitly sent a program SIGSEGV on the command line, I would generally expect to get a core dump (since that is the SIG_DFL behavior of the signal to begin with).


We don't ignore user-generated SIGSEGV signals. That's the point of the test. I'm not sure what you are saying here.

The test failure logs suggest that the problem is that we somehow think that we are out of stack space while handling a signal. I'm not sure how that could happen.


In my experience, "morestack on g0" usually does not mean we are actually running out of stack space on g0, but that somehow the SP and the G (and so the stack bounds) don't match. My comment above mentioned some possibilities, e.g. a signal landing right in the middle of a stack switch.

As @ianlancetaylor said, we don't ignore user-generated SIGSEGV (we did in the past, but not now). What is special about darwin is that we treat user-generated SIGSEGV (which should crash the runtime) as kernel-generated (which causes a panic), due to a kernel bug. Because of that, we inject a sigpanic call instead of just throwing, and somewhere down the panic path there are non-nosplit functions that check stack bounds. If the G and stack bounds don't match, it could crash like this.


bcmills commented Feb 2, 2022

darwin/amd64 is a first class port, and this test has been failing intermittently on the builder for over a year and a half.

If the behavior covered by this test is important then we really ought to find a solution for it; otherwise, the test should be skipped to reduce noise on the builders.


Change mentions this issue: runtime: skip TestSegv failures with "morestack on g0" on darwin/amd64

gopherbot pushed a commit that referenced this issue Feb 3, 2022
This failure mode has been present since at least 2020-06-08. We have
enough information to diagnose it, and further failures don't seem to
be adding any new information at this point: they can only add noise,
both on the Go project's builders and in users' own modules (for
example, when run as part of 'go test all').

For #39457

Change-Id: I2379631da0c8af69598fa61c0cc5ac0ea6ba8267
Trust: Bryan Mills
Run-TryBot: Bryan Mills
TryBot-Result: Gopher Robot
Reviewed-by: Cherry Mui

evanw commented Feb 18, 2022

I have a user report about this error on a Windows machine with esbuild, which is written in Go but doesn't use cgo: evanw/esbuild#2031. I searched and found this issue and that report seemed potentially related, so I'm posting about it here in case it helps.


@evanw that looks like a different issue. Could you open a new issue? Thanks.

This issue is very specific to darwin (macOS) when a program receives an external SIGSEGV signal (e.g. by kill command or syscall).
