-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os: Setenv leads to SIGSEGV with GODEBUG=netdns=cgo, and other uses of cgo #63567
Comments
Nice blog post describing the larger issue on Linux https://rachelbythebay.com/w/2023/10/16/env/ |
I just spent entirely too much time trying to understand more about what glibc is doing. I think I now understand this particular crash:
So, there is another gross workaround possible: call |
I tend to agree that we should do this. It's rather unfortunate given that there is likely a lot of code running afoul of this, but avoiding C |
Change https://go.dev/cl/538675 mentions this issue: |
I made a first draft documentation change in https://go-review.googlesource.com/c/go/+/538675 Realistically: this is a pretty rare problem, and documenting it is probably a fine solution. This is probably going to cost someone else a couple of days of debugging every couple of years, unless a search turns up the Go Setenv documentation or this issue. That probably means there are better ways to spend engineering time. However: I am still unreasonably annoyed that everyone has been aware of this problem for decades, and we can't seem to fix it. I wish there was some way for Go and Rust to somehow share a thread-safe implementation of environment variables and magically get programs to adopt it, to try to route around the C standard, or encourage the C standard to fix this. It seems to me that the Go and Rust standard libraries are probably more motivated to find a solution, since having a "safe" function that can sometimes crash programs is unsatisfying. |
Thanks for the investigation @evanj, and thanks for the link @dilawar! I share @prattmic's sentiment to be more nuanced, given that it is an implement detail on the *NIX systems, thus I kindly suggested that we instead put up a concise comment/warning // When CGO is enabled, Setenv is not thread-safe. |
In pure Go, the environment-map is mutex-protected. The problematic race is fundamentally in libc (and so only appears in Go programs if cgo is enabled). Still, as a special case, I wonder if Go's race detector can sing out if Go code calls |
I came across this issue too, although it manifested differently. Please see PoC: https://go.dev/play/p/epOSRVQ_9Qc - To the least, it must be a little bit more clear as to what the best practice is. |
I believe we are seeing some test flakes due to this problem. The tests are running in a container in kube (linux/amd64). The top of the crash trace looks like this:
The test uses |
@dnephin Sounds like this is likely this problem. If so, you may be able to avoid the flake by setting The other workaround is to call |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?Linux: Reproduced with Ubuntu 20.04.6, Debian 12.
go env
OutputWhat did you do?
A program that uses the
netdns=cgo
resolver with glibc and callsos.Setenv
in other goroutines can crash withSIGSEGV
. The problem is thatgetaddrinfo
in glibc callsgetenv
, because it can be configured with a number of environment variables (see man resolv.conf). The Csetenv
function is documented as not being thread-safe ("MT-Unsafe const:env"). Currently,os.Setenv
calls Csetenv
, so it is violating the specification. When this happens, C calls togetenv
, either explicitly or in the C library itself (e.g.getaddrinfo
,mktime
, others), can crash.musl libc does not have the specific problem with
netdns=cgo
because in general it is much less configurable, so the resolver does not call getenv ([see list of musl environment variables]). However, Go programs using musl that explicitly use Cgo and then directly or indirectly callgetenv
can still crash.Discussion of possible solutions
Since C
getenv
returns achar*
, it can't be made thread-safe. Unfortunately, glibc does not want to provide a thread-safe alternative for accessing environment variables. Different versions of this bug have caught many people over the past few decades. I think this means Go does not have that many options. Here are some ideas:os.Setenv
as being unsafe. It seems to me we should do this immediately. I'm happy to submit a documentation patch. Rust's standard library std::env::set_var function has a nice warning we could borrow from.os.SetenvGoOnly
function, then markos.Setenv
as deprecated. We would probably want to provide a new way to set C environment variables (e.g.cgo.SynchronizeGoEnvVars
?cgo.Setenv
?). This would allow programs to decide if they have a "safe" use of Setenv or not. Programs will sometimes get this wrong (e.g. goroutines can be created frominit()
functions), but at least the functions could be documented appropriately.setenv
fromos.Setenv
. This would almost certainly break some existing Cgo programs. We would also need to provide some way for Go to set C environment variables. This is similar to the previous choice, but would redefineos.Setenv
, rather than creating a new function.Example reproduction
This example program crashes for me when run inside a Docker container. I do not know why I can't get seem to get it to crash outside the container.
To build and run in a docker container:
In case it is helpful, see https://github.com/evanj/cgogetenvcrash for a Git repository with a complete example, as well as a different example that explicitly calls
getenv
using Cgo, and does crash reliabily for me without Docker.What did you expect to see?
A successful program exit.
What did you see instead?
The following segmentation fault:
The gdb backtrace is the following:
The text was updated successfully, but these errors were encountered: