cmd/link: nosplit stack overflow on AIX #29572

theonewolf opened this issue Jan 4, 2019 · 9 comments

@theonewolf theonewolf commented Jan 4, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12beta1 linux/amd64

Installed with go get

Does this issue reproduce with the latest release?

It reproduces with the latest released Beta for 1.12.

What operating system and processor architecture are you using (go env)?

I've put my Linux env below, but this also happens on Windows 10.

go env Output
$ go env
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build101719635=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Created a simple main.go:

package main

func main() {
}

then tried to build:

GOARCH=ppc64 GOOS=aix /home/wolf/sdk/go1.12beta1/bin/go build
# _/home/wolf/testgo-aix
runtime.fatalthrow: nosplit stack overflow
        752     assumed on entry to runtime.sigtrampgo (nosplit)
        528     after runtime.sigtrampgo (nosplit) uses 224
        464     after runtime.sigfwdgo (nosplit) uses 64
        344     after runtime.setsig (nosplit) uses 120
        256     after runtime.sigaction (nosplit) uses 88
        184     after runtime.syscall3 (nosplit) uses 72
        152     after runtime.asmcgocall (nosplit) uses 32
        104     after runtime.badctxt (nosplit) uses 48
        40      after runtime.throw (nosplit) uses 64
        -32     after runtime.fatalthrow (nosplit) uses 72
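The report above is plain subtraction: the linker starts from an assumed budget of 752 bytes and deducts each nosplit frame in the call chain until the remainder goes negative. A minimal sketch that replays those numbers follows; the `frame` type and the `replay` helper are illustrative names and are not part of cmd/link.

```go
package main

import "fmt"

// frame is a hypothetical record of one line of the linker's report.
type frame struct {
	name string
	uses int // bytes of stack consumed by this nosplit function
}

// aixChain holds the frame sizes copied from the trace in this issue.
var aixChain = []frame{
	{"runtime.sigtrampgo", 224},
	{"runtime.sigfwdgo", 64},
	{"runtime.setsig", 120},
	{"runtime.sigaction", 88},
	{"runtime.syscall3", 72},
	{"runtime.asmcgocall", 32},
	{"runtime.badctxt", 48},
	{"runtime.throw", 64},
	{"runtime.fatalthrow", 72},
}

// replay subtracts every frame from the budget and returns what is left
// after the last one. A negative result corresponds to an overflow.
func replay(budget int, chain []frame) int {
	for _, f := range chain {
		budget -= f.uses
		fmt.Printf("%6d after %s (nosplit) uses %d\n", budget, f.name, f.uses)
	}
	return budget
}

func main() {
	if left := replay(752, aixChain); left < 0 {
		fmt.Printf("nosplit stack overflow by %d bytes\n", -left)
	}
}
```

The intermediate values it prints are the same ones listed in the report, ending at -32.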

What did you expect to see?

I expected to see no direct output from the command on stdout and stderr. I expected to see a binary on the file system created for the AIX environment and PowerPC architecture.

What did you see instead?

A runtime traceback.

@ianlancetaylor ianlancetaylor changed the title from "Compiling for AIX Causing Stack Overflow" to "cmd/link: nosplit stack overflow" Jan 4, 2019

@ianlancetaylor ianlancetaylor commented Jan 4, 2019

@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Jan 4, 2019
@ianlancetaylor ianlancetaylor changed the title from "cmd/link: nosplit stack overflow" to "cmd/link: nosplit stack overflow on AIX" Jan 4, 2019

@theonewolf theonewolf commented Jan 4, 2019

@ianlancetaylor I wasn't sure if it was expected to work out of the box in this beta release or not. I'd be happy right now compiling the simplest of programs just to test the functionality out, even with limitations.

@Helflym Helflym commented Jan 7, 2019


This error is due to the fact that StackGuard is increased on AIX:

This change appears in src/cmd/internal/objabi/zbootstrap.go, which is created by cmd/dist according to GOOS and GOARCH. Therefore, on linux/amd64, you will have stackGuardMultiplier = 1 while on aix/ppc64, I have stackGuardMultiplier = 2.

@ianlancetaylor objabi.StackLimit is used in cmd/link even when cross-compilation is requested. Therefore, its value depends on the host machine and not on the target. Is that intended? If not, would it be possible to create a StackLimit value for every GOOS/GOARCH in src/cmd/internal/objabi/zstack.go or something similar?

PS: Reducing stackGuardMultiplier to 1 is really hard, if not impossible, on AIX because of syscalls using asmcgocall.
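To make the host/target mismatch concrete, here is a rough sketch of how the linker's budget could follow from the multiplier. The base value of 880 is the StackGuard figure quoted for linux/amd64 in this thread; the `stackSmall` allowance of 128 and the `stackSystem` value of 0 are assumptions chosen to mirror the layout of src/cmd/internal/objabi/stack.go, and `stackLimit` is a hypothetical helper, not the real API.

```go
package main

import "fmt"

const (
	stackGuardBase = 880 // StackGuard on linux/amd64, as quoted in this thread
	stackSmall     = 128 // assumption: allowance for small frames
	stackSystem    = 0   // assumption: no extra per-OS system stack
)

// stackLimit is a hypothetical helper that returns the number of bytes a
// chain of nosplit functions may use for a given guard multiplier.
func stackLimit(multiplier int) int {
	stackGuard := stackGuardBase*multiplier + stackSystem
	return stackGuard - stackSystem - stackSmall
}

func main() {
	fmt.Println("multiplier 1 (linux/amd64 host):", stackLimit(1))
	fmt.Println("multiplier 2 (aix/ppc64 target):", stackLimit(2))
}
```

Under these assumptions a multiplier of 1 yields 752, the figure the linker reports as "assumed on entry", while a multiplier of 2 would leave 1632 bytes for the same chain.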

@randall77 randall77 commented Jan 7, 2019

stackGuardMultiplier was not intended to depend on the host. It was originally supposed to depend just on -N. I see a dependence on GOOS has been added, but that wasn't done correctly for cross-compilation.

I think a better way to add extra stack space for AIX is to increase _StackSystem in src/runtime/stack.go. We do that already for other os/arch combinations. (It's an additive extra amount, not a multiplicative one.)

@Helflym Helflym commented Jan 7, 2019

I've added this GOOS dependence because I haven't found any other way to change the stacklimit.

Values defined in src/runtime/stack.go aren't used inside cmd/link. Therefore, modifying them won't fix these "nosplit stack overflow" errors, which are triggered by cmd/link according to the StackLimit value in src/cmd/internal/objabi/stack.go.

@randall77 randall77 commented Jan 7, 2019

I see now, the stack overflow is because of the path up to asmcgocall, not the space used by the thing being called. Yes, that won't be fixed by _StackSystem.

It sounds like you would need to make StackGuard and stackGuardMultiplier variables instead of constants in objabi. Then modify them based on the target architecture (perhaps by making them functions that take the target arch as input?). The corresponding changes should be made in runtime/stack.go (but those can still be constants).
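One possible shape for that suggestion, as a sketch only: the multiplier becomes a function of the target OS rather than a constant baked in for the host. The function names, the `noOpt` parameter standing in for -N, and the doubling under -N are all assumptions made for illustration; the value 2 for AIX comes from the earlier comment in this thread. This is not presented as the change that was eventually merged.

```go
package main

import "fmt"

const baseStackGuard = 880 // value quoted for linux/amd64 in this thread

// stackGuardMultiplier is a hypothetical function that picks the multiplier
// from the target OS instead of the host. noOpt models building with -N
// (assumed here to double the guard).
func stackGuardMultiplier(targetGOOS string, noOpt bool) int {
	if noOpt || targetGOOS == "aix" {
		return 2
	}
	return 1
}

// stackGuard is a hypothetical replacement for the StackGuard constant.
func stackGuard(targetGOOS string, noOpt bool) int {
	return baseStackGuard * stackGuardMultiplier(targetGOOS, noOpt)
}

func main() {
	for _, goos := range []string{"linux", "aix"} {
		fmt.Printf("GOOS=%s StackGuard=%d\n", goos, stackGuard(goos, false))
	}
}
```

With a function like this, a linker running on linux/amd64 would still use the larger guard when GOOS=aix is requested, because the decision is made from the target at link time.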

Perhaps we can fix the underlying problem though. Reducing StackGuard space from 880 to 600 on linux/amd64 gets me the corresponding overflow trace:

runtime.sysSigaction: nosplit stack overflow
	464	assumed on entry to runtime.sigtrampgo (nosplit)
	296	after runtime.sigtrampgo (nosplit) uses 168
	288	on entry to runtime.sigfwdgo (nosplit)
	248	after runtime.sigfwdgo (nosplit) uses 40
	240	on entry to runtime.dieFromSignal (nosplit)
	216	after runtime.dieFromSignal (nosplit) uses 24
	208	on entry to runtime.setsig (nosplit)
	144	after runtime.setsig (nosplit) uses 64
	136	on entry to runtime.sigaction (nosplit)
	48	after runtime.sigaction (nosplit) uses 88
	40	on entry to runtime.sysSigaction (nosplit)
	-8	after runtime.sysSigaction (nosplit) uses 48

I see two issues comparing this trace to the ppc64 one posted above. First, the ppc64 frames are generally bigger. That may be a difficult one to fix in general, not sure, but might be worth looking at a couple of the big ones to see if there's anything simple to fix. Second, there's a bunch of frames after sigaction that don't occur on linux (syscall3 + callees). Perhaps those can be wrapped in a systemstack call?
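Both observations can be put into numbers using only the frame sizes from the two traces in this thread. The sketch below compares the functions that appear in both traces and sums the frames that follow sigaction only on AIX; the type and helper names are illustrative.

```go
package main

import "fmt"

// pair holds the frame size of one function in each trace, in bytes.
type pair struct {
	name         string
	ppc64, amd64 int
}

// shared lists the functions present in both traces.
var shared = []pair{
	{"runtime.sigtrampgo", 224, 168},
	{"runtime.sigfwdgo", 64, 40},
	{"runtime.setsig", 120, 64},
	{"runtime.sigaction", 88, 88},
}

// aixTail lists the frames that follow sigaction only in the AIX trace:
// syscall3, asmcgocall, badctxt, throw, fatalthrow.
var aixTail = []int{72, 32, 48, 64, 72}

// extraShared returns how many more bytes the shared functions use on ppc64.
func extraShared() int {
	total := 0
	for _, p := range shared {
		total += p.ppc64 - p.amd64
	}
	return total
}

// tailCost returns the bytes used by the AIX-only frames after sigaction.
func tailCost() int {
	total := 0
	for _, n := range aixTail {
		total += n
	}
	return total
}

func main() {
	fmt.Println("extra bytes in shared frames on ppc64:", extraShared())
	fmt.Println("bytes used by AIX-only frames:", tailCost())
}
```

By this count the shared functions cost 136 bytes more on ppc64 and the AIX-only tail costs 288 bytes, so the tail is the larger of the two contributions in these particular traces.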

@Helflym Helflym commented Jan 8, 2019

The ppc64 frames are always bigger because 32 bytes are reserved for each new frame (cf
A systemstack wrapper won't work either because a syscall cannot switch to g0 inside the signal handler.

All these checks/switches are actually done by asmcgocall. As far as I know, stackcheck is only needed for nosplit functions on a "classic" g stack. Since functions called by asmcgocall always run on g0 or gsignal, should stackcheck stop when asmcgocall is found?

@ianlancetaylor ianlancetaylor commented Jan 8, 2019

In this case the complaint seems to be about the call from asmcgocall to badctxt via gsave<>, which happens before any stack switch is done.

@gopherbot gopherbot commented Jan 9, 2019

Change mentions this issue: cmd: fix stack size when cross-compiling aix/ppc64

@gopherbot gopherbot closed this in 20ac64a Jan 9, 2019
@golang golang locked and limited conversation to collaborators Jan 9, 2020