cmd/link: segfault with statically linked binaries on linux #13470

Closed
tamird opened this issue Dec 3, 2015 · 26 comments

@tamird
Contributor

commented Dec 3, 2015

Discovered with @tschottdorf.

package main

import (
    "net"
    "os/user"

    "C"
)

func main() {
    for i := 0; i < 1000; i++ {
        _, _ = net.Dial("tcp", "localhost:1337")
        _, _ = user.Current()
    }
}

Note that the "C" import is required; otherwise the go tool does not build a truly static binary.
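
One way to check whether a build really came out static is to look for a PT_INTERP program header in the resulting ELF file. The following is a minimal sketch, not part of the original report: the function name `isStaticallyLinked` is made up for illustration, and treating "no PT_INTERP header" as "static" is a heuristic assumption.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isStaticallyLinked reports whether the ELF file at path has no PT_INTERP
// program header, i.e. it does not ask for a dynamic loader.
// Heuristic only (assumption): it says nothing about libraries that are
// dlopen'ed at run time, which is exactly what glibc's NSS code does.
func isStaticallyLinked(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// Inspect the given binary, or this program itself when no argument is passed.
	path := os.Args[0]
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	static, err := isStaticallyLinked(path)
	if err != nil {
		fmt.Println("cannot inspect", path+":", err)
		return
	}
	fmt.Println(path, "statically linked:", static)
}
```

Even when this reports true, the glibc linker warnings quoted later in this thread still apply.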

$ go run -ldflags '-extldflags "-static"' main.go
fatal error: unexpected signal during runtime execution
[signal 0xb code=0x1 addr=0xe5 pc=0x7fec267f8a5c]

runtime stack:
runtime.throw(0x660380, 0x2a)
    /usr/local/go/src/runtime/panic.go:527 +0x90
runtime.sigpanic()
    /usr/local/go/src/runtime/sigpanic_unix.go:12 +0x5a

goroutine 1 [syscall, locked to thread]:
runtime.cgocall(0x402620, 0xc82004bd30, 0xc800000000)
    /usr/local/go/src/runtime/cgocall.go:120 +0x11b fp=0xc82004bce0 sp=0xc82004bcb0
os/user._Cfunc_mygetpwuid_r(0x0, 0xc8200172c0, 0x7fec180008c0, 0x400, 0xc82002a0b0, 0x0)
    ??:0 +0x39 fp=0xc82004bd30 sp=0xc82004bce0
os/user.lookupUnix(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/local/go/src/os/user/lookup_unix.go:99 +0x723 fp=0xc82004bea0 sp=0xc82004bd30
os/user.current(0x0, 0x0, 0x0)
    /usr/local/go/src/os/user/lookup_unix.go:39 +0x42 fp=0xc82004bee0 sp=0xc82004bea0
os/user.Current(0x62eba8, 0x0, 0x0)
    /usr/local/go/src/os/user/lookup.go:9 +0x24 fp=0xc82004bf00 sp=0xc82004bee0
main.main()
    /go/src/github.com/cockroachdb/cgo_static_boom/main.go:13 +0x55 fp=0xc82004bf50 sp=0xc82004bf00
runtime.main()
    /usr/local/go/src/runtime/proc.go:111 +0x2b0 fp=0xc82004bfa0 sp=0xc82004bf50
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1696 +0x1 fp=0xc82004bfa8 sp=0xc82004bfa0

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:1696 +0x1
exit status 2

This was discovered in a docker image based on golang:1.5.1, but also tested against go1.5.2 and 606d9a7 (tip at the time of writing), both built from source in the container. The segfault reproduces in all three. The docker image was running in a virtualbox VM.

Output of go env:

$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT=""
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"
@ianlancetaylor

Contributor

commented Dec 3, 2015

I can't reproduce this.

When using glibc, statically linking calls to getpwuid only works if the shared libraries available when the program runs are exactly the ones that were used when it was built. If you build your program with go build "-ldflags=-extldflags=-static -v" you should see, along with other debug output, a warning from the C linker. On my system I see this:
/home/iant/go/src/net/cgo_unix.go:57: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking

If there is any discrepancy there--if, for example, you are building on one system and running on a different one--that could be the cause of your problem.

@ianlancetaylor added this to the Unplanned milestone Dec 3, 2015

@ianlancetaylor changed the title from "syscall: segfault with statically linked binaries on linux" to "cmd/link: segfault with statically linked binaries on linux" Dec 3, 2015

@tamird

Contributor Author

commented Dec 3, 2015

I've updated the description some - the suggested reproduction is now a go run invocation, so the system on which the binary is built and the system on which it is run are the same.

Interestingly, neither of the following invocations produces an error on this machine:

go build -ldflags '-extldflags "-static -v"' main.go
go run -ldflags '-extldflags "-static -v"' main.go
<segfault here>

EDIT: ah, the -v flag is for ldflags, not extldflags. Yes, I see those warnings. Still, segfault with go run.

FWIW, the external linker being used:

ld --version
GNU ld (GNU Binutils for Debian) 2.25
Copyright (C) 2014 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
@tbg

commented Dec 3, 2015

@tamird what's the output with -x? Agreed that it's weird that we're not getting the warning above - I'm used to it, but we never saw it in this example.

@tamird

Contributor Author

commented Dec 3, 2015

Output with -x and -v as suggested above:

$ go run -x -ldflags='-extldflags -static -v' main.go
WORK=/tmp/go-build192324343
mkdir -p $WORK/command-line-arguments/_obj/
mkdir -p $WORK/command-line-arguments/_obj/exe/
cd /go/src/github.com/cockroachdb/cgo_static_boom
CGO_LDFLAGS="-g" "-O2" /usr/local/go/pkg/tool/linux_amd64/cgo -objdir $WORK/command-line-arguments/_obj/ -importpath command-line-arguments -- -I $WORK/command-line-arguments/_obj/ main.go
gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -print-libgcc-file-name
gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -I $WORK/command-line-arguments/_obj/ -g -O2 -o $WORK/command-line-arguments/_obj/_cgo_main.o -c $WORK/command-line-arguments/_obj/_cgo_main.c
gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -I $WORK/command-line-arguments/_obj/ -g -O2 -o $WORK/command-line-arguments/_obj/_cgo_export.o -c $WORK/command-line-arguments/_obj/_cgo_export.c
gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -I $WORK/command-line-arguments/_obj/ -g -O2 -o $WORK/command-line-arguments/_obj/main.cgo2.o -c $WORK/command-line-arguments/_obj/main.cgo2.c
gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -o $WORK/command-line-arguments/_obj/_cgo_.o $WORK/command-line-arguments/_obj/_cgo_main.o $WORK/command-line-arguments/_obj/_cgo_export.o $WORK/command-line-arguments/_obj/main.cgo2.o -g -O2
/usr/local/go/pkg/tool/linux_amd64/cgo -objdir $WORK/command-line-arguments/_obj/ -dynpackage main -dynimport $WORK/command-line-arguments/_obj/_cgo_.o -dynout $WORK/command-line-arguments/_obj/_cgo_import.go
gcc -I . -fPIC -m64 -pthread -fmessage-length=0 -o $WORK/command-line-arguments/_obj/_all.o $WORK/command-line-arguments/_obj/_cgo_export.o $WORK/command-line-arguments/_obj/main.cgo2.o -g -O2 -Wl,-r -nostdlib /usr/lib/gcc/x86_64-linux-gnu/4.9/libgcc.a -Wl,--build-id=none
/usr/local/go/pkg/tool/linux_amd64/compile -o $WORK/command-line-arguments.a -trimpath $WORK -p main -buildid ea5e293f42ddbc025311e27241e4a5de858208fc -D _/go/src/github.com/cockroachdb/cgo_static_boom -I $WORK -pack $WORK/command-line-arguments/_obj/_cgo_gotypes.go $WORK/command-line-arguments/_obj/main.cgo1.go $WORK/command-line-arguments/_obj/_cgo_import.go
pack r $WORK/command-line-arguments.a $WORK/command-line-arguments/_obj/_all.o # internal
cd .
/usr/local/go/pkg/tool/linux_amd64/link -o $WORK/command-line-arguments/_obj/exe/main -L $WORK -w -extld=gcc -buildmode=exe -buildid=ea5e293f42ddbc025311e27241e4a5de858208fc -extldflags -static -v $WORK/command-line-arguments.a
# command-line-arguments
HEADER = -H5 -T0x401000 -D0x0 -R0x1000
searching for runtime.a in $WORK/runtime.a
searching for runtime.a in /usr/local/go/pkg/linux_amd64/runtime.a
 0.00 deadcode
 0.04 pclntab=468931 bytes, funcdata total 97868 bytes
 0.05 dodata
 0.08 reloc
 0.09 asmb
 0.09 codeblk
 0.12 datblk
 0.12 sym
 0.13 symsize = 110616
 0.13 symsize = 111168
 0.13 dwarf
 0.16 headr
host link: "gcc" "-m64" "-gdwarf-2" "-o" "/tmp/go-build192324343/command-line-arguments/_obj/exe/main" "-static" "/tmp/go-link-838518987/000000.o" "/tmp/go-link-838518987/000001.o" "/tmp/go-link-838518987/000002.o" "/tmp/go-link-838518987/000003.o" "/tmp/go-link-838518987/go.o" "-g" "-O2" "-g" "-O2" "-lpthread" "-g" "-O2" "-g" "-O2" "-static"
/tmp/go-link-838518987/000003.o: In function `mygetpwnam_r':
/tmp/workdir/go/src/os/user/lookup_unix.go:33: warning: Using 'getpwnam_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/tmp/go-link-838518987/000003.o: In function `mygetpwuid_r':
/tmp/workdir/go/src/os/user/lookup_unix.go:28: warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/tmp/go-link-838518987/000002.o: In function `_cgo_709c8d94a9f9_C2func_getaddrinfo':
/tmp/workdir/go/src/net/cgo_unix.go:55: warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
 0.23 cpu time
60223 symbols
42348 liveness data
$WORK/command-line-arguments/_obj/exe/main
fatal error: unexpected signal during runtime execution
[signal 0xb code=0x1 addr=0xe5 pc=0x7f1802b13a5c]
...
@tbg

commented Dec 3, 2015

verified on an ec2 instance. Setup go-1.5.2 by copy-pasta from the Dockerfile.

I'll try a source install next, just for kicks.

@ianlancetaylor

Contributor

commented Dec 3, 2015

Do you have LD_LIBRARY_PATH or LD_PRELOAD set in the environment?

@tamird

Contributor Author

commented Dec 3, 2015

No.

$ env
GOLANG_VERSION=1.5.1
HOSTNAME=d1824128699e
TERM=xterm
TSD_GITHUB_TOKEN=
GOLANG_DOWNLOAD_SHA1=46eecd290d8803887dec718c691cc243f2175fe0
PAGER=cat
SKIP_BOOTSTRAP=1
PATH=/go/bin:/usr/local/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/go/src/github.com/cockroachdb/cockroach
SHLVL=1
HOME=/root
GOPATH=/go
CACHE=/buildcache
GOLANG_DOWNLOAD_URL=https://golang.org/dl/go1.5.1.linux-amd64.tar.gz
_=/usr/bin/env
@tbg

commented Dec 3, 2015

ditto for the ec2 instance. Source install behaves the same, fwiw.

@ianlancetaylor

Contributor

commented Dec 3, 2015

I'm out of ideas. What version of glibc are you using? What does /lib/x86_64-linux-gnu/libc.so.6 print?

@tbg

commented Dec 4, 2015

Output below. I can give you root on the AWS box if that helps?

root@ip-172-31-48-75:/go# /lib/x86_64-linux-gnu/libc.so.6
GNU C Library (Ubuntu GLIBC 2.21-0ubuntu4) stable release version 2.21, by
Roland McGrath et al.
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.9.2.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bugs.

@mwhudson

Contributor

commented Dec 6, 2015

I can reproduce this (on Ubuntu wily, which I guess you are too, judging by glibc version).

I poked with gdb a bit.

It's crashing here https://sourceware.org/git/?p=glibc.git;a=blob;f=nis/nss_compat/compat-pwd.c;h=e3e3dbb308c2cca45fa26a2631dd6deaf9ee3efd;hb=4e42b5b8f89f0e288e68be7ad70f9525aebc2cff#l555 because the second time it's called I think __ctype_b_loc (called from inside the expansion of isspace) returns the wrong value. The second time it's called is from a different thread and the value of $fs is different and maybe that's relevant? I don't know what's changed in glibc that might have caused this.

@tbg

commented Dec 7, 2015

The AWS box was indeed Ubuntu Wily. The Docker image uses 2.19-18+deb8u1, but crash and callsite are the same (though I'm not enough of a gdb wizard to peek into isspace() - if you don't mind sharing what I need to follow up with your findings, I'd appreciate it).

this Dockerfile (run using docker build -t gdb . && docker run -ti gdb) could be useful for anybody who can't repro locally. It has gdb and the sources set up.

@tbg

commented Dec 7, 2015

And, FWIW, after running this a bunch of times, it seems there's always at least one round of net.Dial(), user.Current(), net.Dial() before a crash (or more successful iterations), and the crash is always at user.Current(), never net.Dial(). The illegal memory access is always at address 0xe5.

@tbg

commented Dec 7, 2015

The offending instruction is testb $0x20,0x1(%rcx,%rdx,2), and %r{c,d}x are

rcx            0x0  0
rdx            0x72 114

(of course, 0xe5 = 0x1+0x0+0x2*0x72). %rcx is the result of __ctype_b_loc.
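
To spell out that arithmetic: the operand `0x1(%rcx,%rdx,2)` addresses displacement + base + index*scale. A tiny sketch (the helper name `effectiveAddress` is only for illustration) reproduces the faulting address from the register values above:

```go
package main

import "fmt"

// effectiveAddress computes disp + base + index*scale, the address form
// used by an x86-64 operand written as disp(base,index,scale).
func effectiveAddress(disp, base, index, scale uint64) uint64 {
	return disp + base + index*scale
}

func main() {
	// disp=0x1, base=%rcx=0x0, index=%rdx=0x72, scale=2
	addr := effectiveAddress(0x1, 0x0, 0x72, 2)
	fmt.Printf("%#x\n", addr) // prints 0xe5
}
```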

   0x00007ffff5bbca44 <+308>:   callq  0x7ffff5bba3a0 <__ctype_b_loc@plt>
   0x00007ffff5bbca49 <+313>:   mov    (%rax),%rcx
   0x00007ffff5bbca4c <+316>:   jmp    0x7ffff5bbca54 <_nss_compat_getpwuid_r+324>
   0x00007ffff5bbca4e <+318>:   xchg   %ax,%ax
   0x00007ffff5bbca50 <+320>:   add    $0x1,%r15
   0x00007ffff5bbca54 <+324>:   movzbl (%r15),%eax
   0x00007ffff5bbca58 <+328>:   movsbq %al,%rdx
=> 0x00007ffff5bbca5c <+332>:   testb  $0x20,0x1(%rcx,%rdx,2)

So to my untrained eye it looks like __ctype_b_loc returns null, when really it should

[...] return a pointer into an array of characters in the current locale that contains characteristics for each character in the current character set. The array shall contain a total of 384 characters, and can be indexed with any signed or unsigned char (i.e. with an index value between -128 and 255). If the application is multithreaded, the array shall be local to the current thread.

And here's what happens inside (at this point, I think I've caught up with @mwhudson), note the %fs register.

000000000051e340 <__ctype_b_loc>:
  51e340:       48 c7 c0 e0 ff ff ff    mov    $0xffffffffffffffe0,%rax
  51e347:       64 48 03 04 25 00 00    add    %fs:0x0,%rax
  51e34e:       00 00
  51e350:       c3                      retq
  51e351:       66 66 66 66 66 66 2e    data16 data16 data16 data16 data16 nopw %cs:0x0(%rax,%rax,1)
  51e358:       0f 1f 84 00 00 00 00
  51e35f:       00

000000000051e3a0 <__ctype_init>:
  51e3a0:       48 c7 c0 a0 ff ff ff    mov    $0xffffffffffffffa0,%rax
  51e3a7:       64 48 8b 00             mov    %fs:(%rax),%rax
  51e3ab:       48 8b 10                mov    (%rax),%rdx
  51e3ae:       48 8b 4a 40             mov    0x40(%rdx),%rcx
  51e3b2:       48 c7 c2 e0 ff ff ff    mov    $0xffffffffffffffe0,%rdx
  51e3b9:       48 81 c1 00 01 00 00    add    $0x100,%rcx
  51e3c0:       64 48 89 0a             mov    %rcx,%fs:(%rdx)
  51e3c4:       48 8b 10                mov    (%rax),%rdx
  51e3c7:       48 8b 4a 48             mov    0x48(%rdx),%rcx
  51e3cb:       48 c7 c2 d0 ff ff ff    mov    $0xffffffffffffffd0,%rdx
  51e3d2:       48 81 c1 00 02 00 00    add    $0x200,%rcx
  51e3d9:       64 48 89 0a             mov    %rcx,%fs:(%rdx)
  51e3dd:       48 8b 00                mov    (%rax),%rax
  51e3e0:       48 8b 50 58             mov    0x58(%rax),%rdx
  51e3e4:       48 c7 c0 d8 ff ff ff    mov    $0xffffffffffffffd8,%rax
  51e3eb:       48 81 c2 00 02 00 00    add    $0x200,%rdx
  51e3f2:       64 48 89 10             mov    %rdx,%fs:(%rax)
  51e3f6:       c3                      retq
  51e3f7:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
  51e3fe:       00 00

From the looks of it %fs has to do with thread-local storage. It's 0 when not crashing, and 99 before the segfault. Maybe Go accidentally clobbers something there from one thread to the other (if one sets up its TLS, the other doesn't but %fs gets mixed up)? Or it's not even a Go issue?

@mwhudson

Contributor

commented Dec 7, 2015

Yes, I think you've followed the same threads as me :-) %fs is definitely related to thread local storage, but I think you've got the cases flipped around: it seems to me it crashes when $fs is 0 and works when it is 99.

The thing is (AIUI), when cgo is involved, the C library is responsible for setting TLS up, so I don't know what's going on. It might even be a glibc bug I guess.

@ianlancetaylor

Contributor

commented Dec 7, 2015

Can you send me a static binary built on Ubuntu Wily?

@tbg

commented Dec 7, 2015

@mwhudson you're right. After the fatal thread switch, it's zero.

Breakpoint 5, internal_getpwuid_r (ent=<optimized out>, errnop=<optimized out>,
    buflen=<optimized out>, buffer=<optimized out>, result=<optimized out>, uid=<optimized out>)
    at nss_compat/compat-pwd.c:961
961       while (isspace (*p))
(gdb) info register fs rcx rdx
fs             0x63 99
rcx            0x7ffff57449c0   140737311427008
rdx            0xfbada489   4222461065
(gdb) c
Continuing.
7 : net
7 : user
[Switching to Thread 0x7ffff6607700 (LWP 2216)]

Breakpoint 5, internal_getpwuid_r (ent=<optimized out>, errnop=<optimized out>,
    buflen=<optimized out>, buffer=<optimized out>, result=<optimized out>, uid=<optimized out>)
    at nss_compat/compat-pwd.c:961
961       while (isspace (*p))
(gdb) stepi 2
(gdb) info register fs rcx rdx
fs             0x0  0
rcx            0x0  0
rdx            0x72 114
(gdb) disas
Dump of assembler code for function _nss_compat_getpwuid_r:
[...]
   0x00007ffff5bbca54 <+324>:   movzbl (%r15),%eax
   0x00007ffff5bbca58 <+328>:   movsbq %al,%rdx
=> 0x00007ffff5bbca5c <+332>:   testb  $0x20,0x1(%rcx,%rdx,2)
[...]
(gdb) stepi

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff5bbca5c in internal_getpwuid_r (ent=<optimized out>, errnop=<optimized out>,
    buflen=<optimized out>, buffer=<optimized out>, result=<optimized out>, uid=<optimized out>)
    at nss_compat/compat-pwd.c:961
961       while (isspace (*p))
@ianlancetaylor

Contributor

commented Dec 7, 2015

I can't recreate it on my system but I'm fairly certain it's a glibc bug.

The ctype code relies on TLS variables initialized by a call to __ctype_init.

When you call getpwuid_r in a statically linked program, then, depending on the contents of /etc/nsswitch.conf, in some cases the program will dlopen a supporting shared library. Since the main executable is statically linked and has no dynamic symbol table, the supporting shared library can not refer to the same TLS variables. It has its own TLS variables, and when the library is loaded it will call __ctype_init to initialize them.

However, as far as I can tell there is no code to call __ctype_init on any existing threads. If you then call into the shared library on an existing thread, then any references from the shared library to the TLS ctype variables will crash.
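
The failure mode described above can be pictured with a toy model. This is only an illustration of the reasoning, not glibc's actual code: the `thread` type and both functions are invented, and the error return stands in for what is a null dereference and SIGSEGV in the real program.

```go
package main

import (
	"errors"
	"fmt"
)

// thread is a toy stand-in for an OS thread. ctype models the shared
// library's thread-local ctype pointer for that thread.
type thread struct {
	name  string
	ctype *[384]uint16 // nil until initialized for this thread
}

var errUninitialized = errors.New("ctype TLS not initialized on this thread")

// loadLibrary models the dlopen of the supporting library: only the
// thread that triggers the load gets its TLS slot initialized.
func loadLibrary(caller *thread) {
	caller.ctype = new([384]uint16)
}

// callIntoLibrary models a library function that uses isspace.
func callIntoLibrary(t *thread) error {
	if t.ctype == nil {
		return errUninitialized
	}
	return nil
}

func main() {
	loader := &thread{name: "loader"}
	existing := &thread{name: "existing"}

	loadLibrary(loader) // the first lookup happens on this thread

	fmt.Println(loader.name, "->", callIntoLibrary(loader))
	fmt.Println(existing.name, "->", callIntoLibrary(existing))
}
```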

Please try compiling this C program with -static and see what happens. I expect it to crash.

#include <stdio.h>
#include <ctype.h>
#include <sys/types.h>
#include <pwd.h>
#include <pthread.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

static void *thread(void *arg) {
    struct passwd pwd;
    char buf[1024];
    struct passwd *result;
    pthread_mutex_lock(&mutex);
    getpwuid_r(0, &pwd, buf, sizeof buf, &result);
    return NULL;
}

int main() {
    pthread_t tid;
    struct passwd pwd;
    char buf[1024];
    struct passwd *result;
    void *retval;
    pthread_mutex_lock(&mutex);
    pthread_create(&tid, NULL, thread, NULL);
    getpwuid_r(0, &pwd, buf, sizeof buf, &result);
    pthread_mutex_unlock(&mutex);
    pthread_join(tid, &retval);
    return 0;
}
@tbg

commented Dec 7, 2015

dice (@tamird you're missing the -static):

root@d644730ea408:/go/src/github.com/tschottdorf/goplay/issue_13470# gcc -static -pthread -o cboom test.c
/tmp/ccmgi4zj.o: In function `thread':
test.c:(.text+0x3f): warning: Using 'getpwuid_r' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
root@d644730ea408:/go/src/github.com/tschottdorf/goplay/issue_13470# ./cboom
Segmentation fault
@tamird

Contributor Author

commented Dec 7, 2015

Oops, yeah, missed the -static. Also able to repro after adding that.

@ianlancetaylor

Contributor

commented Dec 7, 2015

Thanks, I will open a glibc bug. Can you append the contents of /etc/nsswitch.conf on your system?

@tbg

commented Dec 7, 2015

Thanks for taking this upstream. I did a cursory check of the glibc bug tracker and couldn't find an issue for this (I checked mostly for various combinations of TLS, static, etc).

root@d644730ea408:/go/src/github.com/tschottdorf/goplay/issue_13470# cat /etc/nsswitch.conf
# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         compat
group:          compat
shadow:         compat
gshadow:        files

hosts:          files dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis
@ianlancetaylor

Contributor

commented Dec 7, 2015

Filed as https://sourceware.org/bugzilla/show_bug.cgi?id=19341.

I can see one way to fix this in the Go code: arrange for all calls to getpwuid_r to go through a single goroutine, and have that goroutine call runtime.LockOSThread. That should ensure that that goroutine will always see the correct TLS values.

However, I can't see a compelling reason to penalize all Go programs that use os/user in order to work around a glibc bug that only occurs when linking with -static. Since you want to use -static, I'm going to have to recommend that you use that workaround yourself: make all your calls to os/user.Lookup from a single goroutine that calls runtime.LockOSThread. You can drop that workaround when you get a fixed version of glibc.
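
A minimal sketch of that workaround might look like the following. The helper names `onLockedThread` and `currentUser` are hypothetical, and the sketch funnels `user.Current` (the call from the reproduction) rather than `user.Lookup`; it also assumes nothing else in the program triggers the same library load from another thread first.

```go
package main

import (
	"fmt"
	"os/user"
	"runtime"
)

// lockedThreadCalls carries closures to the one goroutine that is pinned
// to a single OS thread for the lifetime of the program.
var lockedThreadCalls = make(chan func())

func init() {
	go func() {
		// Pin this goroutine to its OS thread so every lookup sees the
		// same thread-local state.
		runtime.LockOSThread()
		for fn := range lockedThreadCalls {
			fn()
		}
	}()
}

// onLockedThread runs fn on the pinned goroutine and waits for it to finish.
func onLockedThread(fn func()) {
	done := make(chan struct{})
	lockedThreadCalls <- func() {
		defer close(done)
		fn()
	}
	<-done
}

// currentUser is a drop-in wrapper that routes user.Current through the
// pinned goroutine.
func currentUser() (*user.User, error) {
	var (
		u   *user.User
		err error
	)
	onLockedThread(func() { u, err = user.Current() })
	return u, err
}

func main() {
	u, err := currentUser()
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("current user:", u.Username)
}
```

Callers serialize on the unbuffered channel, so lookups become sequential; that is the cost mentioned above.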

tamird added a commit to tamird/cockroach that referenced this issue Dec 7, 2015

link against musl instead of glibc in static builds
This works around golang/go#13470 with the
biggest hammer I could find.

tamird added a commit to tamird/cockroach that referenced this issue Dec 9, 2015

Makefile: link against musl instead of glibc on static builds
This works around golang/go#13470 with the
biggest hammer I could find.

Another option is to go the route of dynamic linking, but this has
unfortunate implications for our `c-*` dependencies' builds; for
instance, in a dynamic build, c-rocksdb ends up linking against more
than just glibc, greatly increasing the surface of libraries required
to run cockroach. Having run such a build, the list of dependencies
ends up being quite large:

```
$ ldd ./cockroach
	linux-vdso.so.1 (0x00007fff60bcc000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f60d0fc9000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f60d0dac000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f60d0aa0000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f60d079f000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f60d0589000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f60d01df000)
	/lib64/ld-linux-x86-64.so.2 (0x00005620684d4000)
```

We could vigilantly produce static binaries upstream and lint against
excessive dynamic dependencies, but the musl route is a shorter path
to correctness.

vanadium-bot pushed a commit to vanadium-archive/go.ref that referenced this issue Mar 26, 2016

runtime/internal: Avoid use of os/user to generate default blessing
names.

The use of os/user causes trouble with statically linked binaries
(see golang/go#13470 and
https://sourceware.org/bugzilla/show_bug.cgi?id=19341)

The default blessing name generated doesn't really need os/user so
remove it.

Change-Id: I7105a269f63c855483c0296ac2919a50dff1e7ac
@sokoow

commented Aug 15, 2016

Here's another way that people deal with this: tamird/cockroach@9c93044

@gopherbot

commented Dec 13, 2016

CL https://golang.org/cl/34175 mentions this issue.

gopherbot pushed a commit to golang/oauth2 that referenced this issue Dec 13, 2016

google: prefer os.Getenv("HOME") over os/user.Current() so as to avoid SEGV

Due to an issue in handling thread-local storage, os/user can lead to SEGV
when glibc is statically linked.

So we prefer os.Getenv("HOME") for guessing where the home directory is.

See also: golang/go#13470

Change-Id: I1046ff93a71aa3b11299f7e6cf65ff7b1fb07eb9
Reviewed-on: https://go-review.googlesource.com/34175
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
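
The approach in this commit message can be sketched roughly as follows. This is not the oauth2 code itself: `guessHomeDir` is an illustrative name, and it takes the environment lookup as a parameter (an assumption made here so it can be exercised without touching the process environment).

```go
package main

import (
	"fmt"
	"os"
	"os/user"
)

// guessHomeDir prefers the HOME environment variable and only falls back
// to os/user when HOME is empty.
func guessHomeDir(getenv func(string) string) string {
	if home := getenv("HOME"); home != "" {
		return home
	}
	// Fallback only: this is the call that can crash in a binary that is
	// statically linked against glibc.
	if u, err := user.Current(); err == nil {
		return u.HomeDir
	}
	return ""
}

func main() {
	fmt.Println(guessHomeDir(os.Getenv))
}
```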

AkihiroSuda added a commit to AkihiroSuda/docker that referenced this issue Dec 16, 2016

gcplogs: forcibly set HOME on static UNIX binary
Fix moby#29344

If HOME is not set, the gcplogs logging driver will call os/user.Current() via oauth2/google.
However, in a static binary, os/user.Current() leads to a segfault due to a glibc issue that won't be fixed
in the short term. (golang/go#13470, https://sourceware.org/bugzilla/show_bug.cgi?id=19341)
So we forcibly set HOME to avoid the call to os/user.Current().

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
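
In spirit, "forcibly set HOME" could look like the sketch below. It is not the Docker change itself: `ensureHome` and the `"/"` fallback are assumptions for illustration, and a real program would have to pick the fallback without calling os/user.

```go
package main

import (
	"fmt"
	"os"
)

// ensureHome sets HOME to fallback when it is unset or empty, and returns
// the value that is in effect afterwards.
func ensureHome(fallback string) (string, error) {
	if home := os.Getenv("HOME"); home != "" {
		return home, nil
	}
	if err := os.Setenv("HOME", fallback); err != nil {
		return "", err
	}
	return fallback, nil
}

func main() {
	// "/" is a placeholder fallback chosen for this sketch.
	home, err := ensureHome("/")
	if err != nil {
		fmt.Println("could not set HOME:", err)
		return
	}
	fmt.Println("HOME is", home)
}
```

Running this early in startup means later code that checks HOME first never reaches the os/user fallback.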

janeczku added a commit to janeczku/external-dns that referenced this issue Mar 10, 2017

Build static binary with Go internal linker
Static linking with gcc may cause segfaults at runtime on some systems
(see golang/go#13470)

@geek1011 referenced this issue Oct 18, 2017: Seg Fault redux? #2728 (Closed)

srust added a commit to srust/moby that referenced this issue Nov 30, 2017

gcplogs: forcibly set HOME on static UNIX binary
Fix moby#29344

If HOME is not set, the gcplogs logging driver will call os/user.Current() via oauth2/google.
However, in a static binary, os/user.Current() leads to a segfault due to a glibc issue that won't be fixed
in the short term. (golang/go#13470, https://sourceware.org/bugzilla/show_bug.cgi?id=19341)
So we forcibly set HOME to avoid the call to os/user.Current().

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>

stevvooe added a commit to containerd/ttrpc that referenced this issue Dec 1, 2017

ttrpc: use os.Getuid/os.Getgid directly
Because of issues with glibc, using the `os/user` package can cause a segfault when
calling `user.Current()`. Neither the Go maintainers nor the glibc developers
could be bothered to fix it, so we have to work around it by calling the
uid and gid functions directly. This is probably better because we don't
actually use much of the data provided in the `user.User` struct.

This required some refactoring to have better control over when the uid
and gid are resolved. Rather than checking the current user on every
connection, we now resolve it once at initialization. To test that this
provided an improvement in performance, a benchmark was added.
Unfortunately, this exposed a regression in the performance of unix
sockets in Go when `(*UnixConn).File` is called. The underlying culprit
of this performance regression is still at large.

The following open issues describe the underlying problem in more
detail:

golang/go#13470
https://sourceware.org/bugzilla/show_bug.cgi?id=19341

In better news, I now have an entire herd of shaved yaks.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
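
The idea in this commit message, resolving the uid and gid once with `os.Getuid` and `os.Getgid` instead of going through `os/user`, can be sketched like this. The `processIDs` type and the `sameUser` helper are hypothetical and are not taken from ttrpc.

```go
package main

import (
	"fmt"
	"os"
)

// processIDs holds the numeric identity of a process.
type processIDs struct {
	UID, GID int
}

// current is resolved once at package initialization with direct
// syscall wrappers, so no os/user lookup is involved.
var current = processIDs{UID: os.Getuid(), GID: os.Getgid()}

// sameUser reports whether a peer's credentials match this process.
func sameUser(uid, gid int) bool {
	return processIDs{UID: uid, GID: gid} == current
}

func main() {
	fmt.Printf("uid=%d gid=%d\n", current.UID, current.GID)
	fmt.Println("self check:", sameUser(current.UID, current.GID))
}
```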

@golang golang locked and limited conversation to collaborators Dec 13, 2017
