runtime: musl libc setxid/setgroups signals clobber stacks / do not use SA_ONSTACK #39857

Open
nmeum opened this issue Jun 25, 2020 · 17 comments
Labels: compiler/runtime (Issues related to the Go compiler and/or runtime.), NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.)

nmeum commented Jun 25, 2020

This is a follow-up to #39343, where I already briefly mentioned this problem. The issue is probably related to musl libc; I can reliably reproduce it on Alpine Linux, which uses musl.

What version of Go are you using (go version)?

$ go version
go version go1.14.3 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/soeren/.cache/go-build"
GOENV="/home/soeren/.config/go/env"
GOEXE=""
GOFLAGS="-buildmode=pie"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/soeren/src/go"
GOPRIVATE=""
GOPROXY="direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build802004994=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Ran TestCrossPackageTests from misc/cgo/test/pkg_test.go:

misc/cgo/test$ go test -run TestCrossPackageTests

What did you expect to see?

A successful test run.

What did you see instead?

An error message:

--- FAIL: TestCrossPackageTests (1.95s)
    pkg_test.go:67: go test: exit status 1
        --- FAIL: Test9400 (0.00s)
            issue9400_linux.go:55: entry 804 of test pattern is wrong; 0x7fc60b3b0cf4 != 0x123456789abcdef
        FAIL
        exit status 1
        FAIL	cgotest	0.005s
FAIL
exit status 1
FAIL	misc/cgo/test	1.958s

cagedmantis changed the title from "TestCrossPackageTests fails on musl (Alpine Linux Edge)" to "cmd/go: TestCrossPackageTests fails on musl (Alpine Linux Edge)" on Jun 29, 2020
cagedmantis added the NeedsInvestigation label on Jun 29, 2020
cagedmantis added this to the Backlog milestone on Jun 29, 2020
jayconrod changed the title from "cmd/go: TestCrossPackageTests fails on musl (Alpine Linux Edge)" to "cmd/cgo: TestCrossPackageTests fails on musl (Alpine Linux Edge)" on Jul 24, 2020

oflebbe commented Aug 8, 2020

I didn't run any of the tests, but read a bunch of comments, since this seems an interesting problem...

As far as I understand, Test9400 checks whether the setxid class of system calls (setuid, setgid, ...) can smash the Go stack. setxid() is implemented by sending a signal to every thread so that all threads change credentials together, as POSIX requires (see the Linux setgid man page).
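
The failure message in the report ("entry 804 of test pattern is wrong; 0x7fc60b3b0cf4 != 0x123456789abcdef") suggests a canary-style check: memory is filled with a known value and inspected after the setxid call. The following is a minimal C sketch of that idea only. The helper names `fill_pattern` and `first_clobbered` are hypothetical, and this is not the code of Test9400, which is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* The pattern value is taken from the failure message in the report. */
#define TEST_PATTERN UINT64_C(0x123456789abcdef)

/* Fill n entries of buf with the known pattern. */
void fill_pattern(uint64_t *buf, size_t n) {
    for (size_t i = 0; i < n; i++)
        buf[i] = TEST_PATTERN;
}

/* Return the index of the first entry that no longer holds the pattern,
 * or -1 if the whole buffer is intact. */
long first_clobbered(const uint64_t *buf, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != TEST_PATTERN)
            return (long)i;
    }
    return -1;
}
```

If a signal handler runs on a stack whose unused part was filled this way, the frames it pushes overwrite the pattern, and a check like `first_clobbered` reports the first damaged entry.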

To keep signals from overrunning a stack, one usually creates an alternate signal stack and passes SA_ONSTACK when installing signal handlers, so that the handlers run on the alternate stack and cannot overrun the original one.

glibc doesn't set the flag, but it installs its signal handler for SIGSETXID at startup, in nptl-init.c. The fix for #9400 can therefore enumerate all installed signal handlers and add the missing SA_ONSTACK flag, which fixes the issue on glibc.

musl doesn't use an alternate stack or SA_ONSTACK for its internal setxid signal handling either. @richfelker confirmed this in a somewhat related issue, #19938 (comment). Unfortunately, the fix for #9400 doesn't help with musl, because musl installs the signal handler dynamically, when setxid is called, in the __synccall() function in src/thread/synccall.c.

I would vote for adding SA_ONSTACK to musl's __synccall implementation.

richfelker commented Aug 8, 2020

There's a thread from 2019 on this topic, "sigaltstack for implementation-internal signals?", that never reached a conclusion. Basically, I'm unclear whether it's arguably conforming for the implementation to use the alternate signal stack for implementation-internal signals, since doing so may have observable side effects on the application even when no signals or signal handlers are set up to run on the alternate stack.

Contributor

ianlancetaylor commented Aug 10, 2020

If musl doesn't use SA_ONSTACK for the signal handler that it installs, that won't work with any program that uses sigaltstack.

I don't see any way to fix this in Go. I don't see what change we could make that would make things work better.

richfelker commented Aug 10, 2020

@ianlancetaylor I don't follow how it "doesn't work with any program that uses sigaltstack". It just doesn't work with any program that has a severely undersized stack, which is a known constraint. But it would be nice to be able to use the alt stack when it's available, assuming it's more likely to always have sufficient space for the signal handler to run.

richfelker commented Aug 10, 2020

Note that I've reopened the topic on the musl list: https://www.openwall.com/lists/musl/2020/08/09/1

oflebbe commented Aug 10, 2020

Hi @richfelker, I tested this: applying a patch like the attached one to musl
PATCH.txt
is sufficient to resolve the problem. It fixes Go and does not harm musl.

If there is an alternate stack, the handler will use it. Go does create an alternate stack.
If there is no alternate stack, the kernel ignores SA_ONSTACK. That's it.

Contributor

ianlancetaylor commented Aug 10, 2020

@richfelker I'm assuming that a program that calls sigaltstack does so for some good reason. I'm not sure why a program that calls sigaltstack would want to receive signals on the normal stack.

richfelker commented Aug 10, 2020

Nobody said anything about wanting to receive signals on the normal stack. From the relevant perspective these aren't signals. They are asynchronous use of the alt signal stack by the implementation in a way the application isn't and can't be aware of.

Contributor

ianlancetaylor commented Aug 10, 2020

That is a valid perspective.

But it is also a valid perspective for a program to say "I am in control of my stack, and do not use my stack for any unexpected purpose. In particular, don't use it to catch signals."

In any case I'm not sure there is anything we can do here in the Go standard library. If musl decides not to change, then as far as I can see code like this can't work on musl. So perhaps we should close this issue.

richfelker commented Aug 11, 2020

@ianlancetaylor: I have been wanting to change this for a while (see the 2019 thread), but I'm making sure we actually consider the consequences of such a change and whether they break anything that someone can reasonably expect to work. (My leaning is that they don't, but I like to explore this kind of thing thoroughly since making hasty decisions has bitten us in the past.) The point of my bringing these things up is not to argue against the change, but to make sure it's well-supported when (technically if, but most likely when) it's made.

Contributor

ianlancetaylor commented Aug 11, 2020

Understood.

(I suppose musl could also change to act as glibc does. Is there an advantage to only installing the signal handler when a relevant libc function is called?)

richfelker commented Aug 11, 2020

Yes, it avoids syscall spam (strace) and wasted time in processes (the vast, vast majority) that don't need the handler. And I don't see how the glibc behavior makes it any easier unless you're poking at implementation internals, which are not a stable interface. The signal numbers used for these internal signals are not a public interface, and they're not even pokable via public interfaces (as far as the public interfaces are concerned, the reserved signal numbers simply are not existent signals). The only way you can poke at them is by making syscalls directly, and this will break if signal handling is ever wrapped (which has been considered at times, but it turned out we could always get by without it).

dmitshur changed the title from "cmd/cgo: TestCrossPackageTests fails on musl (Alpine Linux Edge)" to "misc/cgo/test: TestCrossPackageTests fails on musl (Alpine Linux Edge)" on Jan 26, 2021
jspc added a commit to vinyl-linux/vin-packages-stable that referenced this issue Feb 12, 2021
From: golang/go#39857

This test always fails on musl, because certain things behave differently to glibc. So... sod it, ignore the test

gopherbot commented Jul 29, 2022

Change https://go.dev/cl/419995 mentions this issue: all: disable tests that fail on Alpine

gopherbot pushed a commit that referenced this issue Aug 2, 2022
These changes are enough to pass all.bash using the
disabled linux-amd64-alpine builder via debugnewvm.

For #19938.
For #39857.

Change-Id: I7d160612259c77764b70d429ad94f0864689cdce
Reviewed-on: https://go-review.googlesource.com/c/go/+/419995
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>

prattmic changed the title from "misc/cgo/test: TestCrossPackageTests fails on musl (Alpine Linux Edge)" to "runtime: musl libc setxid/setgroups signals clobber stacks / do not use SA_ONSTACK" on Aug 19, 2022
gopherbot added the compiler/runtime label on Aug 19, 2022

Member

prattmic commented Aug 19, 2022

I rediscovered this problem and posted a summary at #54306 (comment) before finding this issue; I came to more or less the same conclusions as the discussion in this thread.

@richfelker regarding #39857 (comment), did you ever make any progress on the 2019 thread? It is rather unfortunate to have to tell users that using setxid functions with musl+Go is broken and will cause random memory corruption.

gopherbot commented Aug 19, 2022

Change https://go.dev/cl/425001 mentions this issue: misc/cgo/test: disable setgid tests with musl

richfelker commented Aug 19, 2022

Thanks for the ping. I don't think we ever came up with a good reason we can't use the alt stack for this, so I'm inclined to go ahead and switch to using it.

gopherbot pushed a commit that referenced this issue Aug 22, 2022
We don't have a good musl detection mechanism, so we detect Alpine (the
most common user of musl) instead.

For #39857.
For #19938.

Change-Id: I2fa39248682aed75884476374fe2212be4427347
Reviewed-on: https://go-review.googlesource.com/c/go/+/425001
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
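
The commit message says the tests detect Alpine rather than musl itself. How the Go change does this is not shown in this thread; the C sketch below assumes that Alpine can be recognized by the presence of `/etc/alpine-release`. Both that path and the function names are assumptions made for illustration.

```c
#include <unistd.h>

/* Returns 1 if path exists, 0 otherwise. */
int file_exists(const char *path) {
    return access(path, F_OK) == 0;
}

/* Heuristic: treat the system as Alpine if its release file exists.
 * The path is an assumption of this sketch. This does not detect
 * musl on other distributions. */
int looks_like_alpine(void) {
    return file_exists("/etc/alpine-release");
}
```

A test harness could call `looks_like_alpine()` and skip the setgid tests when it returns 1. The limitation named in the commit message remains: musl on a non-Alpine system is not detected.
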
Contributor

fweimer commented Aug 27, 2022

FWIW, glibc has switched on SA_ONSTACK in 2.34, to keep Go working after we removed the old early libpthread initialization code. I'm not aware of any issues caused by SA_ONSTACK.
