cmd/cgo: TestCrossPackageTests fails on musl (Alpine Linux Edge) #39857
Comments
I didn't run any of the tests, but I read a bunch of the comments, since this seems like an interesting problem. As far as I understand: test9400 checks whether the handling of the setxid class of system calls (setuid, setgid, ...) can smash the Go stack, because setxid() is implemented by sending signals to all threads to fulfill POSIX requirements (see the Linux man page for setgid). To keep signals from overrunning a stack, one usually creates an alternate signal stack and passes SA_ONSTACK when installing signal handlers, so that the handlers run on the alternate stack and cannot overrun the original one. glibc doesn't set the flag, but it installs its signal handler for SIGSETXID at startup in nptl-init.c. In PR #9400 it is therefore possible to enumerate all signal handlers and add the missing SA_ONSTACK flag, which fixes the issue on glibc. musl doesn't use an alternate stack or SA_ONSTACK for its internal setxid signal handling either. This is confirmed by @richfelker in a somewhat related issue, #19938 (comment). Unfortunately the fix for #9400 doesn't apply to musl, since musl installs the signal handler dynamically when setxid is called, in the __synccall() function in src/thread/synccall.c. I would vote for adding SA_ONSTACK to musl's __synccall implementation. |
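For readers unfamiliar with the "add a missing SA_ONSTACK flag" idea, here is a minimal sketch of the read-modify-write pattern. It is not the Go runtime's code: the helper names (add_onstack, has_onstack, install_plain_handler) are hypothetical, and it goes through libc's public sigaction(), so it only illustrates the pattern on an ordinary signal such as SIGUSR1. As a later comment in this thread notes, libc's reserved internal signals can only be reached by making system calls directly.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: add SA_ONSTACK to an already-installed handler.
 * Returns 1 if the flag was added, 0 if nothing needed changing,
 * -1 if sigaction() failed. */
int add_onstack(int sig)
{
    struct sigaction sa;

    if (sigaction(sig, NULL, &sa) != 0)
        return -1;                      /* cannot query this signal */
    if (!(sa.sa_flags & SA_SIGINFO) &&
        (sa.sa_handler == SIG_DFL || sa.sa_handler == SIG_IGN))
        return 0;                       /* no handler function installed */
    if (sa.sa_flags & SA_ONSTACK)
        return 0;                       /* already uses the alternate stack */
    sa.sa_flags |= SA_ONSTACK;          /* keep everything else unchanged */
    return sigaction(sig, &sa, NULL) == 0 ? 1 : -1;
}

/* Returns 1 if the disposition for sig has SA_ONSTACK set, 0 if not,
 * -1 on error. */
int has_onstack(int sig)
{
    struct sigaction sa;

    if (sigaction(sig, NULL, &sa) != 0)
        return -1;
    return (sa.sa_flags & SA_ONSTACK) != 0;
}

/* Does nothing; exists only so there is a handler to patch. */
static void noop_handler(int sig) { (void)sig; }

/* Installs a handler without SA_ONSTACK, standing in for a handler that
 * some library installed at startup. Returns 0 on success. */
int install_plain_handler(int sig)
{
    struct sigaction sa;

    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(sig, &sa, NULL);
}
```

Calling install_plain_handler(SIGUSR1) and then add_onstack(SIGUSR1) leaves the same handler in place with SA_ONSTACK set. The approach depends on the handler already being installed at the time of the scan, which is why it does not help when a handler is installed later, on demand.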
There's a thread from 2019 on this topic: sigaltstack for implementation-internal signals? that never reached a conclusion. Basically I'm unclear whether it's arguably conforming for the implementation to use the alternate signal stack for implementation-internal signals, since it may have observable side effects on the application in the absence of any signals/signal-handlers setup to run on the alternate stack. |
If musl doesn't use I don't see any way to fix this in Go. I don't see what change we could make that would make things work better. |
@ianlancetaylor I don't follow how it "doesn't work with any program that uses |
Note that I've reopened the topic on the musl list: https://www.openwall.com/lists/musl/2020/08/09/1 |
Hi @richfelker , I tested: It is sufficient to have a patch like this on musl If there is an alternate stack, it will use it. Go does create an alternate stack. |
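The "if there is an alternate stack, it will use it" behaviour of SA_ONSTACK can be observed from user code. The following is only a sketch under assumptions, not the musl patch mentioned above: deliver() and handler() are hypothetical names, SIGUSR1 stands in for the internal signal, and the 64 KiB stack size is an arbitrary choice. The handler checks whether one of its local variables lies inside the alternate stack.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>

static char *alt_base;                  /* alternate stack memory, or NULL */
static size_t alt_size;
static volatile sig_atomic_t ran_on_alt = -1;

static void handler(int sig)
{
    char probe;                         /* lives on the stack the handler runs on */
    uintptr_t p = (uintptr_t)&probe;

    (void)sig;
    ran_on_alt = alt_base != NULL &&
                 p >= (uintptr_t)alt_base &&
                 p < (uintptr_t)alt_base + alt_size;
}

/* Raises SIGUSR1 with an SA_ONSTACK handler installed.
 * Returns 1 if the handler ran on the alternate stack, 0 if it ran on
 * the normal stack, -1 on error. */
int deliver(int with_altstack)
{
    struct sigaction sa;
    stack_t ss;

    if (with_altstack) {
        alt_size = 64 * 1024;           /* arbitrary size for this sketch */
        alt_base = malloc(alt_size);
        if (alt_base == NULL)
            return -1;
        ss.ss_sp = alt_base;
        ss.ss_size = alt_size;
        ss.ss_flags = 0;
    } else {
        alt_base = NULL;
        ss.ss_sp = NULL;
        ss.ss_size = 0;
        ss.ss_flags = SS_DISABLE;       /* this thread has no alternate stack */
    }
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;           /* use the alternate stack if one exists */
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    ran_on_alt = -1;
    if (raise(SIGUSR1) != 0)            /* handler runs before raise() returns */
        return -1;
    return ran_on_alt;
}
```

With no alternate stack configured, the SA_ONSTACK handler simply runs on the thread's normal stack, so setting the flag changes nothing for programs that never call sigaltstack(). With one configured, the handler runs there, which is the case that matters for a runtime that manages its own small stacks.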
@richfelker I'm assuming that a program that calls |
Nobody said anything about wanting to receive signals on the normal stack. From the relevant perspective these aren't signals. They are asynchronous use of the alt signal stack by the implementation in a way the application isn't and can't be aware of. |
That is a valid perspective. But it is also a valid perspective for a program to say "I am in control of my stack, and do not use my stack for any unexpected purpose. In particular, don't use it to catch signals." In any case I'm not sure there is anything we can do here in the Go standard library. If musl decides not to change, then as far as I can see code like this can't work on musl. So perhaps we should close this issue. |
@ianlancetaylor: I have been wanting to change this for a while (see the 2019 thread), but I'm making sure we actually consider the consequences of such a change and whether they break anything that someone can reasonably expect to work. (My leaning is that they don't, but I like to explore this kind of thing thoroughly since making hasty decisions has bitten us in the past.) The point of my bringing these things up is not to argue against the change, but to make sure it's well-supported when (technically if, but most likely when) it's made. |
Understood. (I suppose musl could also change to act as glibc does. Is there an advantage to only installing the signal handler when a relevant libc function is called?) |
Yes, it avoids syscall spam (strace) and wasted time in processes (the vast, vast majority) that don't need the handler. And I don't see how the glibc behavior makes it any easier unless you're poking at implementation internals which are not a stable interface. The signal numbers used for these internal signals are not a public interface, and they're not even pokable via public interfaces (as far as the public interfaces are concerned, the reserved signal numbers simply are not existent signals). The only way you can poke at them is by making syscalls directly, and this will break if signal handling is ever wrapped (which has been considered at times, but it turned out we could always get by without it). |
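The point that the reserved signal numbers are invisible through public interfaces can be checked with a small probe. This is a sketch under an assumption not stated in the thread: that on Linux with glibc or musl, the signal number just below SIGRTMIN is one of the libc-reserved signals. The function name hidden_from_sigaction is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Returns 1 if libc's sigaction() refuses to report the disposition of
 * sig and fails with EINVAL, 0 otherwise. */
int hidden_from_sigaction(int sig)
{
    struct sigaction sa;

    errno = 0;
    return sigaction(sig, NULL, &sa) == -1 && errno == EINVAL;
}
```

Under the stated assumption, hidden_from_sigaction(SIGRTMIN - 1) reports that the signal is hidden, while an ordinary signal such as SIGUSR1 can be queried normally. Code that wants to inspect or change the handler for such a signal has to bypass libc, which is exactly the dependence on implementation internals described above.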
This is a follow-up to #39343, where I already briefly mentioned this problem. The issue is probably related to musl libc: I can reliably reproduce it on Alpine Linux, which uses musl.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
Started the TestCrossPackageTests from misc/cgo/test/pkg_test.go:
What did you expect to see?
A successful test run.
What did you see instead?
An error message: