runtime: crash since Go 1.22.5 on Alpine (newstack at runtime.printlock) #68285

Open
dunglas opened this issue Jul 3, 2024 · 31 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@dunglas
Contributor

dunglas commented Jul 3, 2024

Go version

go version go1.22.5 linux/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/root/.cache/go-build'
GOENV='/config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/root/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/root/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_arm64'
GOVCS=''
GOVERSION='go1.22.5'
GCCGO='gccgo'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/go/src/app/go.mod'
GOWORK=''
CGO_CFLAGS='-DFRANKENPHP_VERSION=v1.2.1 -fstack-protector-strong -fpic -fpie -O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
CGO_CPPFLAGS='-fstack-protector-strong -fpic -fpie -O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-lssl -lcrypto -lreadline -largon2 -lcurl -lonig -lz -Wl,-O1 -pie'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2366048768=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Reproducer:

docker pull dunglas/frankenphp:1-builder-alpine
docker run -it dunglas/frankenphp:1-builder-alpine sh
go test

What did you see happen?

runtime: newstack at runtime.printlock+0x78 sp=0x4000167be0 stack=[0x4000166000, 0x4000168000]
        morebuf={pc:0x407920 sp:0x4000167be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000167be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(runtime: newstack at runtime.printlock+0x78 sp=0x4000365be0 stack=[0x4000364000, 0x4000366000]
        morebuf={pc:0x407920 sp:0x4000365be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000365be0 lr:0x407920 ctxt:0x0}
runtime: newstack at runtime.printlock+0x78 sp=0x4000361be0 stack=[0x4000360000, 0x4000362000]
        morebuf={pc:0x407920 sp:0x4000361be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000361be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(0x4000501808, 0xffff44e1fb80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000361c40 sp=0x4000361be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44e1fc70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000361cc0 sp=0x4000361c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44e1fc70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000361cf0 sp=0x4000361cc0 pc=0x47b1bc
runtime.cgocallback(0x4000361d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000361d20 sp=0x4000361cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000361d30 sp=0x4000361d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000361da8)
        /usr/local/go/src/runtime/cgocall.go:175 +0x70 fp=0x4000361d70 sp=0x4000361d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff43eb29c0)
        _cgo_gotypes.go:791 +0x34 fp=0x4000361da0 sp=0x4000361d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x76ab01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000361e10 sp=0x4000361da0 pc=0x6dd7ccruntime: newstack at runtime.printlock+0x78 sp=0x4000165be0 stack=[0x4000164000, 0x4000166000]
        morebuf={pc:0x407920 sp:0x4000165be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000165be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(0x40001e4808, 0xffff44b07b80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000165c40 sp=0x4000165be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44b07c70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000165cc0 sp=0x4000165c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44b07c70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000165cf0 sp=0x4000165cc0 pc=0x47b1bc
runtime.cgocallback(0x4000165d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000165d20 sp=0x4000165cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000165d30 sp=0x4000165d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000165da8)
        /usr/local/go/src/runtime/cgocall.go:175 +0x70 fp=0x4000165d70 sp=0x4000165d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff4477cb60)
        _cgo_gotypes.go:791 +0x34 fp=0x4000165da0 sp=0x4000165d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x4000165e01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000165e10 sp=0x4000165da0 pc=0x6dd7cc
_cgoexp_a0107ffcccc7_go_execute_script(0x4000165e88?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000165e30 sp=0x4000165e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44b0a5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000165f00 sp=0x4000165e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44b0a5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000165f80 sp=0x4000165f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44b0a5e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000165fb0 sp=0x4000165f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000165fe0 sp=0x4000165fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000165fe0 sp=0x4000165fe0 pc=0x479724
fatal error: runtime: stack split at bad time
runtime: newstack at runtime.printlock+0x78 sp=0x4000459be0 stack=[0x4000458000, 0x400045a000]
        morebuf={pc:0x407920 sp:0x4000459be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000459be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(0x4000400808, 0xffff44eabb80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000459c40 sp=0x4000459be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44eabc70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000459cc0 sp=0x4000459c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44eabc70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000459cf0 sp=0x4000459cc0 pc=0x47b1bc
runtime.cgocallback(0x4000459d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000459d20 sp=0x4000459cf0 pc=0x479630
runtime.systemstack_switch
_cgoexp_a0107ffcccc7_go_execute_script(0x4000494e88?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000361e30 sp=0x4000361e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44e225e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000361f00 sp=0x4000361e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44e225e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000361f80 sp=0x4000361f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44e225e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000361fb0 sp=0x4000361f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000361fe0 sp=0x4000361fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000361fe0 sp=0x4000361fe0 pc=0x479724
fatal error: runtime: stack split at bad time
0x4000500008, 0xffff44bb6b80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000167c40 sp=0x4000167be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44bb6c70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000167cc0 sp=0x4000167c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44bb6c70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000167cf0 sp=0x4000167cc0 pc=0x47b1bc
runtime.cgocallback(0x4000167d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000167d20 sp=0x4000167cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000167d30 sp=0x4000167d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000167da8)
        /usr/local/go/src/runtime/cgocall.go:175 +0x70 fp=0x4000167d70 sp=0x4000167d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff44ee3430)
        _cgo_gotypes.go:791 +0x34 fp=0x4000167da0 sp=0x4000167d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x4000490e01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000167e10 sp=0x4000167da0 pc=0x6dd7cc
runtime.callbackUpdateSystemStack(0x400009a808, 0xffff44b70b80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000365c40 sp=0x4000365be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44b70c70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000365cc0 sp=0x4000365c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44b70c70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000365cf0 sp=0x4000365cc0 pc=0x47b1bc
runtime.cgocallback(0x4000365d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130_cgoexp_a0107ffcccc7_go_execute_script +0xb0 fp=0x4000365d20 sp=0x4000365cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000365d30 sp=0x4000365d20 pc=0x4771f8
(runtime: newstack at runtime.printlock+0x78 sp=0x4000251be0 stack=[0x4000250000, 0x4000252000]
        morebuf={pc:0x407920 sp:0x4000251be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000251be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(0x4000300808, 0xffff44bfcb80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000251c40 sp=0x4000251be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44bfcc70, runtime: newstack at runtime.printlock+0x78 sp=0x4000253be0 stack=[0x4000252000, 0x4000254000]
        morebuf={pc:0x407920 sp:0x4000253be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000253be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(0x4000500808, 0xffff44e88b80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000253c40 sp=0x4000253be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44e88c70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000253cc0 sp=0x4000253c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44e88c70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000253cf0 sp=0x4000253cc0 pc=0x47b1bc
runtime.cgocallback(0x4000253d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000253d20 sp=0x4000253cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000253d30 sp=0x4000253d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000253da8)
        /usr/local/go/src/runtime/cgocall.go:175 +0x70 fp=0x4000253d70 sp=0x4000253d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff43b57b20)
        _cgo_gotypes.go:791 +0x34 fp=0x4000253da0 sp=0x4000253d70 pc=0x6dad64
0x4000490e88github.com/dunglas/frankenphp.go_execute_script(0x40002d0e01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000253e10 sp=0x4000253da0 pc=0x6dd7cc
_cgoexp_a0107ffcccc7_go_execute_script(0x8309c0?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000253e30 sp=0x4000253e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44e8b5e0, runtime: newstack at runtime.printlock+0x78 sp=0x4000255be0 stack=[0x4000254000, 0x4000256000]
        morebuf={pc:0x407920 sp:0x4000255be0 lr:0x0}
        sched={pc:0x43f858 sp:0x4000255be0 lr:0x407920 ctxt:0x0}
runtime.callbackUpdateSystemStack(0x4000480808, 0xffff44bd9b80, 0x0)
        /usr/local/go/src/runtime/cgocall.go:241 +0x90 fp=0x4000255c40 sp=0x4000255be0 pc=0x407920
runtime.cgocallbackg(0x6e1a90, 0xffff44bd9c70, 0x0)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000255cc0 sp=0x4000255c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44bd9c70, 0x0)
        <autogenerated>:1?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000167e30 sp=0x4000167e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44bb95e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000167f00 sp=0x4000167e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44bb95e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000167f80 sp=0x4000167f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44bb95e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000167fb0 sp=0x4000167f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000167fe0 sp=0x4000167fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000167fe0 sp=0x4000167fe0 pc=0x479724
fatal error: runtime: stack split at bad time
()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000459d30 sp=0x4000459d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000459da8)
        /usr/local/go/src/runtime/cgocall.go:1750x0 +0x70 fp=0x4000459d70 sp=0x4000459d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff43bfa450)
        _cgo_gotypes.go:791 +0x34 fp=0x4000459da0 sp=0x4000459d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x40001cfe01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000459e10 sp=0x4000459da0 pc=0x6dd7cc
_cgoexp_a0107ffcccc7_go_execute_script(0x40001cfe88?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000459e30 sp=0x4000459e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44eae5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000459f00 sp=0x4000459e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44eae5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000459f80 sp=0x4000459f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44eae5e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000459fb0 sp=0x4000459f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000459fe0 sp=0x4000459fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000459fe0 sp=0x4000459fe0 pc=0x479724
fatal error: runtime: stack split at bad time
runtime.cgocall(0x739e70, 0x4000365da8)
0x0     /usr/local/go/src/runtime/cgocall.go:175 +0x70)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000253f00 sp=0x4000253e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44e8b5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000253f80 sp=0x4000253f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44e8b5e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000253fb0 sp=0x4000253f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000253fe0 sp=0x4000253fb0 pc=0x479630
 fp=0x4000365d70 sp=0x4000365d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff43f6eb10)
        _cgo_gotypes.go:791 +0x34 fp=0x4000365da0 sp=0x4000365d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x40000dde01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000365e10 sp=0x4000365da0 pc=0x6dd7cc
_cgoexp_a0107ffcccc7_go_execute_script(0x40000dde88?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000365e30 sp=0x4000365e10 pc=0x6e1670
)
        /usr/local/go/src/runtime/cgocall.go:306 +0x68 fp=0x4000251cc0 sp=0x4000251c40 pc=0x407ae8
runtime.cgocallbackg(0x6e1a90, 0xffff44bfcc70, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000251cf0 sp=0x4000251cc0 pc=0x47b1bc
runtime.cgocallback(0x4000251d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000251d20 sp=0x4000251cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000251d30 sp=0x4000251d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000251da8)
        /usr/local/go/src/runtime/cgocall.go:175 +0x70 fp=0x4000251d70 sp=0x4000251d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff44618050)
        _cgo_gotypes.go:791 +0x34 fp=0x4000251da0 sp=0x4000251d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x40002c3e01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000251e10 sp=0x4000251da0 pc=0x6dd7cc
_cgoexp_a0107ffcccc7_go_execute_script(0x40002c3e88?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000251e30 sp=0x4000251e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44bff5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000251f00 sp=0x4000251e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44bff5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000251f80 sp=0x4000251f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44bff5e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000251fb0 sp=0x4000251f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000251fe0 sp=0x4000251fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000251fe0 sp=0x4000251fe0 pc=0x479724
fatal error: runtime: stack split at bad time
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000253fe0 sp=0x4000253fe0 pc=0x479724
fatal error: runtime: stack split at bad time
runtime.cgocallbackg1(0x6e1650, 0xffff44b735e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000365f00 sp=0x4000365e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44b735e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000365f80 sp=0x4000365f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44b735e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000365fb0 sp=0x4000365f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000365fe0 sp=0x4000365fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000365fe0 sp=0x4000365fe0 pc=0x479724
fatal error: runtime: stack split at bad time
 +0x1c fp=0x4000255cf0 sp=0x4000255cc0 pc=0x47b1bc
runtime.cgocallback(0x4000255d68, 0x6dad64, 0x739e70)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000255d20 sp=0x4000255cf0 pc=0x479630
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_arm64.s:200 +0x8 fp=0x4000255d30 sp=0x4000255d20 pc=0x4771f8
runtime.cgocall(0x739e70, 0x4000255da8)
        /usr/local/go/src/runtime/cgocall.go:175 +0x70 fp=0x4000255d70 sp=0x4000255d30 pc=0x407830
github.com/dunglas/frankenphp._Cfunc_frankenphp_execute_script(0xffff446ca1d0)
        _cgo_gotypes.go:791 +0x34 fp=0x4000255da0 sp=0x4000255d70 pc=0x6dad64
github.com/dunglas/frankenphp.go_execute_script(0x4000495e01?)
        /go/src/app/frankenphp.go:511 +0x11c fp=0x4000255e10 sp=0x4000255da0 pc=0x6dd7cc
_cgoexp_a0107ffcccc7_go_execute_script(0x4000495e88?)
        _cgo_gotypes.go:923 +0x20 fp=0x4000255e30 sp=0x4000255e10 pc=0x6e1670
runtime.cgocallbackg1(0x6e1650, 0xffff44bdc5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:420 +0x228 fp=0x4000255f00 sp=0x4000255e30 pc=0x407e68
runtime.cgocallbackg(0x6e1650, 0xffff44bdc5e0, 0x0)
        /usr/local/go/src/runtime/cgocall.go:339 +0x10c fp=0x4000255f80 sp=0x4000255f00 pc=0x407b8c
runtime.cgocallbackg(0x6e1650, 0xffff44bdc5e0, 0x0)
        <autogenerated>:1 +0x1c fp=0x4000255fb0 sp=0x4000255f80 pc=0x47b1bc
runtime.cgocallback(0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/asm_arm64.s:1130 +0xb0 fp=0x4000255fe0 sp=0x4000255fb0 pc=0x479630
runtime.goexit({})
        /usr/local/go/src/runtime/asm_arm64.s:1222 +0x4 fp=0x4000255fe0 sp=0x4000255fe0 pc=0x479724
fatal error: runtime: stack split at bad time

runtime stack:
runtime.throw({0x805cd1?, 0xffff44b07a10?})
        /usr/local/go/src/runtime/panic.go:1023 +0x40 fp=0xffff44b079c0 sp=0xffff44b07990 pc=0x43cf70
runtime.newstack()
        /usr/local/go/src/runtime/stack.go:995 +0x7d0 fp=0xffff44b07b70 sp=0xffff44b079c0 pc=0x45b6d0
runtime.morestack()
        /usr/local/go/src/runtime/asm_arm64.s:341 +0x70 fp=0xffff44b07b70 sp=0xffff44b07b70 pc=0x477380

...

exit status 2

Full stack trace: https://github.com/dunglas/frankenphp/actions/runs/9778553362/job/26995650580?pr=898

What did you expect to see?

No crash.

The test suite wasn't crashing with previous Go versions.
It also doesn't crash on Debian and macOS, only Alpine (musl?) is affected.

The project (FrankenPHP) heavily relies on cgo.
We change the default Alpine stack size with -ldflags "-w -s -extldflags '-Wl,-z,stack-size=0x80000'".
The binary is compressed with UPX.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 3, 2024
@dunglas
Contributor Author

dunglas commented Jul 3, 2024

This problem was likely introduced in 3560cf0 for #67298 (cc @prattmic, @joedian).

@mauri870 mauri870 added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 3, 2024
@ianlancetaylor
Contributor

Do you have any local patches to the Go standard library? I know that some Alpine systems use local patches.

@prattmic
Member

prattmic commented Jul 3, 2024

You are hitting the fatal error at https://cs.opensource.google/go/go/+/refs/tags/go1.22.5:src/runtime/cgocall.go;l=241 due to an apparently bogus SP.

runtime: newstack at runtime.printlock comes from the print also checking SP and getting upset. I think https://go.dev/cl/585817 will fix that so you get the actual error out (it will still crash, but with a more useful message). Perhaps patch that in to get better logs.

From the mention of FrankenPHP, I assume this is just a continuation of #62130? In #62130 (comment) and @dunglas' reply, we established that FrankenPHP changes stacks between a Go-to-C call and subsequent C-to-Go callback on the same thread. Go does not support this, and we never made a concrete decision on whether we should support this or not.

Seems like this should be closed as a duplicate of #62130?

@dunglas
Contributor Author

dunglas commented Jul 3, 2024

@ianlancetaylor we're using the official Docker images for Go (Alpine variants), so I assume the standard library is not patched.

@prattmic this doesn't seem to be the same problem as #62130. #62130 is triggered only in a special case that is not very popular (Fibers, a new PHP feature that switches stacks). Since yesterday, all FrankenPHP installations (including ones not using Fibers) on Alpine have been crashing. That wasn't the case before Go 1.22.5, so I suppose the problem was introduced in this version.

@dunglas
Contributor Author

dunglas commented Jul 3, 2024

We have an easy way to reproduce and debug; feel free to contact me in private (kevin@dunglas.dev, X, etc) if you want assistance setting up a reproducer.

@prattmic
Member

prattmic commented Jul 3, 2024

this doesn't look as the same problem as #62130. #62130 is triggered only in a very special case that is not very popular (Fibers, a new PHP feature that changes stack). Since yesterday, all FrankenPHP installations (including ones not using Fibers) on Alpine are crashing.

Thanks for the clarification, let's keep this separate then.

@prattmic
Member

prattmic commented Jul 3, 2024

cc @golang/runtime @cherrymui

@thanm
Contributor

thanm commented Jul 3, 2024

Not sure if this is actionable / supported, but assigning for the moment to Michael.

@dunglas
Contributor Author

dunglas commented Jul 10, 2024

Here is the stack trace with http://go.dev/cl/585817, as suggested by @prattmic:

M 11 procid 2986 runtime: cgocallback with sp=0x7f43bbdc9ee0 out of bounds [0x7f43bbda6ba0, 0x7f43bbdc6ba0]
fatal error: cgocallback SP out of bounds

runtime stack:
runtime.throw({0xada75f?, 0x0?})
	/root/sdk/gotip/src/runtime/panic.go:1027 +0x48 fp=0xc00003ff98 sp=0xc00003ff68 pc=0x43fca8
runtime.callbackUpdateSystemStack.func1()
	/root/sdk/gotip/src/runtime/cgocall.go:244 +0x10c fp=0xc00003ffe0 sp=0xc00003ff98 pc=0x4095cc
runtime.switchToCrashStack0()
	/root/sdk/gotip/src/runtime/asm_amd64.s:563 +0x32 fp=0xc00003fff0 sp=0xc00003ffe0 pc=0x47c372

goroutine 149 gp=0xc000348380 m=11 mp=0xc00010ca08 [syscall, locked to thread]:
runtime.cgocall(0x9ccc60, 0xc00038fd18)
	/root/sdk/gotip/src/runtime/cgocall.go:158 +0x5b fp=0xc00038fcf0 sp=0xc00038fcb8 pc=0x4092db
github.com/***/frankenphp._Cfunc_frankenphp_execute_script(0x7f43bb2e87a0)
	_cgo_gotypes.go:791 +0x47 fp=0xc00038fd18 sp=0xc00038fcf0 pc=0x914527
github.com/***/frankenphp.go_handle_request()
	/go/src/app/frankenphp.go:491 +0x445 fp=0xc00038fe60 sp=0xc00038fd18 pc=0x918c45
_cgoexp_7180794f8083_go_handle_request(0x7f43bbdcc6f7)
	_cgo_gotypes.go:915 +0x25 fp=0xc00038fe80 sp=0xc00038fe60 pc=0x91ea25
runtime.cgocallbackg1(0x91ea00, 0x7f43bbdcc6f7, 0x0)
	/root/sdk/gotip/src/runtime/cgocall.go:423 +0x298 fp=0xc00038ff40 sp=0xc00038fe80 pc=0x409a18
runtime.cgocallbackg(0x91ea00, 0x7f43bbdcc6f7, 0x0)
	/root/sdk/gotip/src/runtime/cgocall.go:342 +0x11a fp=0xc00038ff90 sp=0xc00038ff40 pc=0x4096fa
runtime.cgocallbackg(0x91ea00, 0x7f43bbdcc6f7, 0x0)
	<autogenerated>:1 +0x29 fp=0xc00038ffb8 sp=0xc00038ff90 pc=0x480e49
runtime.cgocallback(0x0, 0x0, 0x0)
	/root/sdk/gotip/src/runtime/asm_amd64.s:1083 +0xcc fp=0xc00038ffe0 sp=0xc00038ffb8 pc=0x47e04c
runtime.goexit({})
	/root/sdk/gotip/src/runtime/asm_amd64.s:1699 +

...

To be honest, I've no idea what's going on. It may be a bug in our code, but it's surprising that the problem only occurs when using musl. I tested and can confirm that the problem only happens when using musl (both with Alpine and with a static build) and not with glibc.

@dunglas
Contributor Author

dunglas commented Jul 10, 2024

I created a custom Go build that reverts 3560cf0, and this makes our test suite green. So I can confirm that the problem has been introduced by this commit: https://github.com/dunglas/frankenphp/actions/runs/9877453302/job/27279127840?pr=913

Would it be possible to revert it until a better fix is available? I'm not comfortable relying on a custom build, and the latest patch version contains security fixes in net/http that are necessary for our use cases.

@cherrymui
Member

@dunglas thanks for the update! The reason why this is musl specific is probably that the code getting the stack bounds https://source.corp.google.com/h/go/go/+/release-branch.go1.22:src/runtime/cgo/gcc_stack_unix.c;l=24 uses different code paths for glibc and musl. With glibc, we get accurate stack bounds from pthread. With musl, it goes to the fallback path (the #else branch) which just gets an estimate using the current SP and the default stack bounds.

It seems musl also supports pthread_getattr_np. But it doesn't seem easy to test (musl doesn't define a macro like __MUSL__). If you'd like, you could try changing that line to #if 1 and see if that makes any difference.

But there is still an issue on systems where we can't get accurate bounds. It seems this is a C-created thread, which calls into Go, which then calls back into C, which calls back into Go again, which throws. Presumably during the first C->Go call it gets the estimated stack bounds. At the inner C->Go call, the SP is slightly above the estimated upper bound, which is weird. I don't see how this could happen. The inner callback should be at a lower stack address than the outer callback... @dunglas could you confirm whether the code does C->Go->C->Go calls? And does it adjust the stacks in any way, or use non-local control flow like longjmp or setcontext? And would you mind adding e.g. some print debugging to see what the SP is on each cgocallback? Thanks.

Perhaps in the case that we can't get the accurate bounds we don't throw if the SP is out of the estimated bounds? But that can also be tricky as if we update the bounds in the inner callback, when it returns, we may have weird stack bounds for the outer callback...

@cherrymui
Member

Maybe what happens is that

  1. C created thread calls into Go, which calls back to C
  2. it uses some C stack, then calls back into Go, which updates the stack bounds
  3. callback1 returns to C
  4. calls back to Go again (callback2) at a shallower stack

The callback at step 4 is at a shallower stack (higher SP), but it sees the stack bounds from step 2, which is at a lower address, then throws.
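
The hypothesis above can be sketched as a small Go model. This is only an illustration, not the runtime's code: the `estimate` helper, the slack of 1 KiB above the SP, and the 128 KiB assumed size are assumptions chosen so that the result matches the bounds printed in the crash log, and the SP for callback1 is a hypothetical value back-computed from those bounds. Only the bounds and the failing SP (0x7f43bbdc9ee0) come from the trace.

```go
package main

import "fmt"

// bounds models the stack bounds recorded for a C-created thread.
type bounds struct{ lo, hi uint64 }

// Illustrative constants (assumptions, not taken from the runtime source).
const (
	slackAbove  = 1 << 10   // bytes assumed to exist above the observed SP
	assumedSize = 128 << 10 // assumed stack size below the upper bound
)

// estimate guesses bounds from the SP seen at a callback, as a fallback
// when pthread cannot report the real stack.
func estimate(sp uint64) bounds {
	hi := sp + slackAbove
	return bounds{lo: hi - assumedSize, hi: hi}
}

// contains reports whether sp lies inside the recorded bounds.
func (b bounds) contains(sp uint64) bool {
	return sp > b.lo && sp <= b.hi
}

func main() {
	// Step 2: callback1 enters Go deep in the C stack (hypothetical SP).
	spCallback1 := uint64(0x7f43bbdc67a0)
	b := estimate(spCallback1)

	// Step 4: callback2 enters Go from a shallower frame (SP from the trace).
	spCallback2 := uint64(0x7f43bbdc9ee0)

	fmt.Printf("bounds from callback1: [%#x, %#x]\n", b.lo, b.hi)
	fmt.Printf("callback2 sp=%#x in bounds: %v\n", spCallback2, b.contains(spCallback2))
}
```

In this model the second callback fails the check only because the bounds were estimated around a deeper SP and then kept, which is the situation described in steps 2 to 4.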

@dunglas could you confirm that the code does C->Go calls like that? Thanks.

@dunglas
Contributor Author

dunglas commented Jul 10, 2024

@cherrymui thank you very much for investigating!

Yes: FrankenPHP creates a C thread (in C: https://github.com/dunglas/frankenphp/blob/4fab5a3169357f6c6cadee1aa77ff8fa8480419f/frankenphp.c#L810), and the C code calls back to Go (and the Go code then calls the C code, etc).

The C code uses libphp, which relies on longjmp (often). libphp may also use setcontext (for Fibers), but not in our case (that's #62130).

Let me know if you want me to try some patches (I can try to use pthread_getattr_np() tomorrow French time) and I can also help you set up an environment to play with the code (it's easy, everything uses Docker) if it helps.

@dunglas
Contributor Author

dunglas commented Jul 12, 2024

Using pthread_getattr_np() with musl does fix the issue 🎉

Regarding the detection problem, a solution could be to check if the function is available (using a shell script similar to AC_CHECK_FUNCS from autotools) and define a constant at build time if that's the case.

I can work on a patch doing this if you agree on the approach.

@dunglas
Contributor Author

dunglas commented Jul 12, 2024

Here are my experiments so far:

diff --git a/src/make.bash b/src/make.bash
index 76ad51624a..c6c3b87157 100755
--- a/src/make.bash
+++ b/src/make.bash
@@ -138,6 +138,13 @@ if [[ "$(uname -s)" == "GNU/kFreeBSD" ]]; then
 	export CGO_ENABLED=0
 fi
 
+# Check pthread_getattr_np() availability
+if [[ "${CGO_ENABLED}" != "0" ]] && ${CC:-gcc} -S runtime/cgo/testdata/detect_pthread_getattr_np.c -o /dev/null 2> /dev/null; then
+	export CGO_CFLAGS="-DHAVE_PTHREAD_GETATTR_NP ${CGO_CFLAGS}"
+	export CFLAGS="-DHAVE_PTHREAD_GETATTR_NP ${CFLAGS}"
+	echo $CGO_CFLAGS
+fi
+
 # Clean old generated file that will cause problems in the build.
 rm -f ./runtime/runtime_defs.go
 
diff --git a/src/runtime/cgo/gcc_stack_unix.c b/src/runtime/cgo/gcc_stack_unix.c
index 884281dc15..39d3402eca 100644
--- a/src/runtime/cgo/gcc_stack_unix.c
+++ b/src/runtime/cgo/gcc_stack_unix.c
@@ -21,7 +21,7 @@ x_cgo_getstackbound(uintptr bounds[2])
 	// Needed before pthread_getattr_np, too, since before glibc 2.32
 	// it did not call pthread_attr_init in all cases (see #65625).
 	pthread_attr_init(&attr);
-#if defined(__GLIBC__) || (defined(__sun) && !defined(__illumos__))
+#if defined(HAVE_PTHREAD_GETATTR_NP) || defined(__GLIBC__) || (defined(__sun) && !defined(__illumos__))
 	// pthread_getattr_np is a GNU extension supported in glibc.
 	// Solaris is not glibc but does support pthread_getattr_np
 	// (and the fallback doesn't work...). Illumos does not.
diff --git a/src/runtime/cgo/testdata/detect_pthread_getattr_np.c b/src/runtime/cgo/testdata/detect_pthread_getattr_np.c
new file mode 100644
index 0000000000..f15fcd0e25
--- /dev/null
+++ b/src/runtime/cgo/testdata/detect_pthread_getattr_np.c
@@ -0,0 +1,10 @@
+/* Dummy program used to check if pthread_getattr_np() is available. */
+
+#define _GNU_SOURCE
+#include <pthread.h>
+
+int main(void) {
+    pthread_getattr_np(pthread_self(), NULL);
+
+    return 0;
+}

The detection code for pthread_getattr_np() seems to work, but I can't get the CFLAGS passed to the compiler when compiling gcc_stack_unix.c. Would you have any guidance on this (if the approach is sensible), @cherrymui?

@cherrymui
Member

cherrymui commented Jul 12, 2024

Thanks for the update!

I don't think we want to introduce any autoconf-like things into go build, to keep it simple and understandable. So we'd have to use some other mechanism, if possible.

Either way, I think we still want to make the fallback path (if pthread_getattr_np is not available) work better. If we can get the accurate bounds, the SP should always be in bounds. If we can't, perhaps we can be a bit more permissive. Maybe allow the SP to be out of bounds, always update to the new estimated bounds, and restore the old bounds when the callback returns?

longjmp can be tricky. Given that pthread_getattr_np works, I guess it only longjmps within the pthread-allocated stack, i.e. not coroutines, fibers, etc. Maybe it longjmps up-stack? Does it jump over Go frames? That is, C1 calls Go1, which calls C2, and C2 longjmps to C1, so there is no normal return from Go1? That would probably mess up the Go runtime's bookkeeping.
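
The "jump over a frame" scenario can be illustrated in plain C. This is only a sketch: `go1_like_frame` is a hypothetical stand-in for the Go callback, and the two counters stand in for whatever entry/exit bookkeeping a runtime keeps. None of these names come from the Go runtime.

```c
#include <setjmp.h>

/* Hypothetical stand-ins; nothing here is a Go runtime API. */
static jmp_buf c1_env;        /* jump target saved by "C1" */
static int callbacks_entered; /* bumped when the middle frame is entered */
static int callbacks_exited;  /* bumped only on a normal return */

/* "C2": bails out straight to C1 instead of returning to its caller. */
static void c2_bail_out(void) {
    longjmp(c1_env, 1);
}

/* Stand-in for "Go1": does bookkeeping on entry and on exit. */
static void go1_like_frame(void) {
    callbacks_entered++;
    c2_bail_out();
    callbacks_exited++; /* skipped when C2 longjmps over this frame */
}

/* "C1": returns how many frames had their exit bookkeeping skipped. */
int c1_run(void) {
    callbacks_entered = 0;
    callbacks_exited = 0;
    if (setjmp(c1_env) == 0) {
        go1_like_frame();
    }
    return callbacks_entered - callbacks_exited;
}
```

Calling `c1_run()` leaves one entry without a matching exit, which is the kind of imbalance the question above is probing for.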

@dunglas
Contributor Author

dunglas commented Jul 15, 2024

> Does it jump over Go frames? That is, C1 calls Go1, which calls C2, and C2 longjmps to C1, so there is no normal return from Go1?

PHP doesn't manage threads itself, it's up to the webserver to create and manage them. In our case, we use simple ad-hoc code: https://github.com/dunglas/frankenphp/blob/323edefc4b7cb69bcb318745efac4dec9b7ac488/frankenphp.c#L810.

It indeed only longjmps in the pthread-allocated stack. In our case, which is pretty simple, we're sure that the Go code always returns: only input/output is delegated to Go, and the C code blocks while waiting for the Go code to return.

> Maybe allow the SP to be out of bounds, always update to the new estimated bounds, and restore the old bounds when the callback returns?

This indeed seems like a good fix. By the way, @withinboredom instigated #62130, and it appears that having more permissive checks fixes this issue too: dunglas/frankenphp#46 (comment)
Could we relax these checks?

@dunglas
Contributor Author

dunglas commented Jul 15, 2024

What do you think about just skipping the initial bound check in callbackUpdateSystemStack if we're in a thread that isn't managed by Go? This would fix both this issue and #62130, and this looks safe.

@cherrymui
Member

I don't think we can just skip the bounds check. Suppose C1 calls Go1, which calls C2, which calls Go2. If we update the bounds at Go2 (where mp.ncgo > 0), then when Go2 and C2 return, Go1 will see incorrect bounds and may panic. So I suggested we restore the old bounds, i.e. keep a stack of bounds. If cgo calls/callbacks are well nested, this probably will work. If they are not well nested (e.g. longjmp across Go frames), it won't work, but it may already not work well anyway. I'll try implementing this and see.
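
The "stack of bounds" idea can be modelled in a few lines of C. This is a sketch under stated assumptions, not the Go runtime's implementation: `bounds_t`, `callback_enter`, `callback_exit`, `sp_in_bounds`, and the fixed-size array are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of per-thread stack bounds (not runtime types). */
typedef struct { uintptr_t lo, hi; } bounds_t;

#define MAX_DEPTH 16
static bounds_t current;          /* bounds seen by the running callback */
static bounds_t saved[MAX_DEPTH]; /* bounds of the enclosing callbacks */
static int depth;

/* Entering a callback: remember the outer bounds, install fresh ones. */
void callback_enter(bounds_t fresh) {
    assert(depth < MAX_DEPTH);
    saved[depth++] = current;
    current = fresh;
}

/* Leaving a callback: the outer callback gets its old bounds back. */
void callback_exit(void) {
    assert(depth > 0);
    current = saved[--depth];
}

/* Bounds check against whatever bounds are currently installed. */
int sp_in_bounds(uintptr_t sp) {
    return sp > current.lo && sp <= current.hi;
}
```

With well-nested calls (enter Go1, enter Go2, exit Go2, exit Go1), a stack pointer that was valid for Go1 is valid again once Go2 has returned. If an exit is skipped, for example by a longjmp across the frame, the saved entries no longer line up, which matches the caveat above.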

@withinboredom

When I was originally approaching this problem (before I simply deleted the check to see how it crashed -- it didn't), my idea was to "trick" Go into thinking it was a C1' to G1' type call via a stack -- probably a less refined version of what you are thinking. My thinking was that we could basically ignore the original G1 call until we returned back to C2.

chenxiaolong added a commit to chenxiaolong/RSAF that referenced this issue Jul 20, 2024
Since go 1.22.5, this results in a panic. There is an upstream bug
report at [1] for musl, but the same thing happens with bionic. The same
fix of using pthread_getattr_np() on bionic also works, but it's better
to avoid needing folks to recompile the go toolchain.

We were just using these calls as a small hack to pass a list of structs
to the Java side, which we can easily do another way.

[1] golang/go#68285

Signed-off-by: Andrew Gunnerson <accounts+github@chiller3.com>
@chenxiaolong

I encountered this crash with bionic libc on Android as well (calling Java functions over JNI via gomobile). It looks like bionic also supports pthread_getattr_np, at least with API level 34, so I tried adding defined(__BIONIC__) to the #if guard for using pthread_getattr_np and that did the trick.

@cherrymui
Member

cherrymui commented Jul 22, 2024

CL 600296 implements the update+restore logic. Could you check whether this works for your case? Thanks.

@gopherbot
Contributor

Change https://go.dev/cl/600296 mentions this issue: runtime: update and restore g0 stack bounds at cgocallback

@cherrymui
Member

@chenxiaolong thanks. We could add __BIONIC__ to the condition. We don't seem to have a policy for which versions of Android we support. Do we require API level 34 for Android? Is there a way to check the API version with a macro? Or, is pthread_getattr_np supported long enough that we can assume it is available on all Android?

@chenxiaolong

> @chenxiaolong thanks. We could add __BIONIC__ to the condition. We don't seem to have a policy for which versions of Android we support. Do we require API level 34 for Android? Is there a way to check the API version with a macro? Or, is pthread_getattr_np supported long enough that we can assume it is available on all Android?

I took a quick look and I think we should be good with unconditionally enabling it for Android. pthread_getattr_np existed since at least API level 3 (in NDK version r9d, the oldest I could find). If needed, there is a __ANDROID_API__ macro that evaluates to the API version as an integer.
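
For illustration, a guard extended with `__BIONIC__` and the corresponding bounds query might look like the sketch below. It is not the actual `gcc_stack_unix.c` code: `query_stack_bounds` and `USE_PTHREAD_GETATTR_NP` are hypothetical names, and the explicit prototype is an assumption added so the sketch still compiles if `_GNU_SOURCE` takes effect too late.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard: glibc and bionic both provide pthread_getattr_np.
 * On Android, __ANDROID_API__ could refine this if a minimum API level
 * were ever required. */
#if defined(__GLIBC__) || defined(__BIONIC__)
#define USE_PTHREAD_GETATTR_NP 1
/* Redundant when _GNU_SOURCE is in effect; kept so the sketch also
 * compiles if a libc header was already included without it. */
extern int pthread_getattr_np(pthread_t, pthread_attr_t *);
#else
#define USE_PTHREAD_GETATTR_NP 0
#endif

/* Fills *lo and *hi with the calling thread's stack range.
 * Returns 0 on success, -1 if no accurate query is available. */
int query_stack_bounds(uintptr_t *lo, uintptr_t *hi) {
#if USE_PTHREAD_GETATTR_NP
    pthread_attr_t attr;
    void *addr;
    size_t size;
    int rc;

    if (pthread_getattr_np(pthread_self(), &attr) != 0)
        return -1;
    rc = pthread_attr_getstack(&attr, &addr, &size);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;
    *lo = (uintptr_t)addr;        /* lowest address of the stack */
    *hi = (uintptr_t)addr + size; /* one past the highest address */
    return 0;
#else
    (void)lo;
    (void)hi;
    return -1; /* caller must fall back to an estimate */
#endif
}
```

When the function returns 0, the address of any local variable in the calling thread should fall between `lo` and `hi`; when it returns -1, the caller is in the estimate-only situation discussed in this thread.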

@withinboredom

@cherrymui I will give this a test!

@dunglas
Contributor Author

dunglas commented Jul 25, 2024

@cherrymui CL 600296 fixes this issue with FrankenPHP: https://github.com/dunglas/frankenphp/actions/runs/10091715703/job/27903861001?pr=938 🎉

@cavokz

cavokz commented Aug 25, 2024

I think I'm also hitting this on FreeBSD 14.1 (amd64).

I'm in a C->Go->C path that happens right after another apparently stack-hungry C->Go->C path that shares the first C->Go part.

I verified that everything was fine with go1.21.11 and go1.22.4 but broke with go1.21.12 and go1.22.5; it is also broken with go1.23.0 and go1.24-96d8ff00c2.

I applied CL 600296 to go1.23.0 and go1.24-96d8ff00c2, and backported it to go1.21.13 and go1.22.6: everything is fine again.

As a side note, the stack-hungry C->Go->C path (which initializes numpy) triggers fatal: morestack on g0 on Fedora 35 (go1.16.15), Fedora 36 (go1.19.8), Ubuntu 22.04 (go1.18.1), macOS 13 (go1.20.5), macOS 12 (go1.19.2 but only in some configurations: rosetta2 and Python from pyenv). Everything works fine on 30 other Linux/macOS/Windows systems with Go as old as go1.10.4 (Ubuntu 18.04).

So aside from testing the way ahead, I'm searching for a good workaround for older versions. Is there any way to see how far away the stack bounds are?

@inkeliz

inkeliz commented Sep 9, 2024

I think I hit the same issue with Gio on Android, which does a lot of C -> Go and Go -> C calls. It shows "fatal error: runtime: stack split at bad time" since I updated to Go 1.22.6. I rolled back to 1.22.4 and it's working fine.

Since the issue is still open, I'm not sure if CL 600296 landed in any version of Go.

@Wieku

Wieku commented Oct 3, 2024

I'm hitting the same issue on Windows with go1.23.2 and go-gl/glfw in glfw.PollEvents, which calls into C code that can generate events consumed by callbacks in Go. It only happens if my project is built with -buildmode=c-shared, I guess because the call chain is then C -> Go -> C -> Go; building the project as an executable doesn't produce those crashes.

@cherrymui Applying CL 600296 fixes the issue so far.

stoffen pushed a commit to stoffen/go that referenced this issue Oct 3, 2024
Currently, at a cgo callback where there is already a Go frame on
the stack (i.e. C->Go->C->Go), we require that at the inner Go
callback the SP is within the g0's stack bounds set by a previous
callback. This is to prevent the C code from switching stacks while
having a Go frame on the stack, which we don't really support. But
this could also happen when we cannot get accurate stack bounds,
e.g. when pthread_getattr_np is not available. Since the stack
bounds are just estimates based on the current SP, if there are
multiple C->Go callbacks with various stack depth, it is possible
that the SP of a later callback falls out of a previous call's
estimate. This leads to runtime throw in a seemingly reasonable
program.

This CL changes it to save the old g0 stack bounds at cgocallback,
update the bounds, and restore the old bounds at return. So each
callback will get its own stack bounds based on the current SP,
and when it returns, the outer callback has its old stack
bounds restored.

TODO: not sure if this works if inner callback panics and outer
callback recovers. Maybe we need to do something in unwindm?

TODO: do we do this only when we cannot get the accurate bounds,
and still throw if we can? This makes it not throw if the C code
switches stacks. What else could go wrong?

For golang#68285.

Change-Id: I3422badd5ad8ff63e1a733152d05fb7a44d5d435
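
The failure mode described in the commit message above (bounds that are only estimates around the current SP) can be reproduced with simple arithmetic. The sketch below is illustrative only: `stack_bounds`, `estimate_from_sp`, and `within` are hypothetical names, and the window sizes (32 KiB below the SP, 1 KiB above) are assumptions made for the example rather than values confirmed in this thread.

```c
#include <stdint.h>

/* Hypothetical model of an estimated stack range. */
typedef struct { uintptr_t lo, hi; } stack_bounds;

/* Assumed window sizes, for illustration only. */
enum { ASSUMED_BELOW = 32 * 1024, ASSUMED_ABOVE = 1024 };

/* Guess a stack range around the SP observed at a callback. */
stack_bounds estimate_from_sp(uintptr_t sp) {
    stack_bounds b;
    b.lo = sp - ASSUMED_BELOW; /* the stack grows down from sp */
    b.hi = sp + ASSUMED_ABOVE;
    return b;
}

/* Nonzero if sp lies inside the estimated range. */
int within(stack_bounds b, uintptr_t sp) {
    return sp > b.lo && sp <= b.hi;
}
```

A callback entered 16 KiB deeper than the first one still fits the first estimate, but one entered 48 KiB deeper does not, even though the C code never switched stacks. That is the case where the old check would throw.
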
sparrc added a commit to aws/aws-for-fluent-bit that referenced this issue Oct 7, 2024
This reverts commit 7ea28b0.

This commit was originally added to pin the golang build version to
1.20.7 because of an issue in golang related to CGO: golang/go#62130 (comment)

This issue was fixed for most systems, although the issue is still open
because it appears to have opened up a separate issue with MUSL-based
systems (such as alpine): golang/go#68285

Since the fluent-bit container is on a glibc-based system (Amazon
Linux), we should be safe to revert back to building with the latest
version of golang.