runtime: SIGSEGV when running dockerd on RISCV64 #46225

Closed
ACov96 opened this issue May 18, 2021 · 4 comments

@ACov96 ACov96 commented May 18, 2021

Currently trying to run Docker on a BeagleV Starlight and am running into a fatal error when running the Docker daemon:

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]

What version of Go are you using (go version)?

$ go version
go version devel go1.17-ce92a2023c Sat May 15 02:39:08 2021 +0000 linux/riscv64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="riscv64"
GOBIN=""
GOCACHE="/home/riscv/.cache/go-build"
GOENV="/home/riscv/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="riscv64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/riscv/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/riscv/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_riscv64"
GOVCS=""
GOVERSION="devel go1.17-ce92a2023c Sat May 15 02:39:08 2021 +0000"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2240106131=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Attempting to run Docker, following the steps from this repo. Specifically, I do the following to reproduce:

  1. Build Go. Instructions here, all tests passed.
  2. Download .deb package, install with sudo apt install ./docker-*.deb
  3. The image I'm using ships a glibc version that is incompatible with the Docker build, so rebuild glibc 2.32:
$ mkdir -p projects/glibc-local && cd projects/glibc-local
$ wget http://mirrors.syringanetworks.net/gnu/libc/glibc-2.32.tar.gz
$ tar xzvf glibc-2.32.tar.gz
$ mkdir out && cd out
$ ../glibc-2.32/configure --prefix=$(pwd)
$ make -j2 && make install
  4. Run dockerd, setting LD_LIBRARY_PATH to link against local glibc.
$ LD_LIBRARY_PATH=/home/riscv/projects/glibc-local/out dockerd

What did you expect to see?

Expected normal Docker daemon output with no segmentation violation.

What did you see instead?

Full output:
INFO[2021-05-18T02:42:29.191930066Z] Starting up                                                                                                                                             
INFO[2021-05-18T02:42:29.212059345Z] libcontainerd: started new containerd process  pid=76622                                                                                                
INFO[2021-05-18T02:42:29.213144348Z] parsed scheme: "unix"                         module=grpc                                                                                               
INFO[2021-05-18T02:42:29.213751092Z] scheme "unix" not registered, fallback to default scheme  module=grpc                                                                                   
INFO[2021-05-18T02:42:29.214542003Z] ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}  module=grpc                
INFO[2021-05-18T02:42:29.215217070Z] ClientConn switching balancer to "pick_first"  module=grpc                                                                                              
fatal error: unexpected signal during runtime execution                                                                                                                                      
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x0]                                                                                                                            
                                                                                                                                                                                             
runtime stack:                                                                                                                                                                               
runtime: unexpected return pc for runtime.sigpanic called from 0x3fd24af30a                                                                                                                  
stack: frame={sp:0x3fff8c5140, fp:0x3fff8c5170} stack=[0x3fff0c6290,0x3fff8c5300)                                                                                                            
0000003fff8c5040:  0000000000000000  0000000000081a14 <runtime.fatalthrow.func1+92>                                                                                                          
0000003fff8c5050:  000000000004f93c <runtime.throw+100>  0000003fff8c5118                                                                                                                    
0000003fff8c5060:  0000000000000000  0000000001dcb1a0                                                                                                                                        
0000003fff8c5070:  010000000004fd24  0000000000000004                                                                                                                                        
0000003fff8c5080:  000000000000001f  0000000000000000                                                                                                                                        
0000003fff8c5090:  0000000000000000  0000000000000001                                                                                                                                        
0000003fff8c50a0:  000000000124efe6  000000000004faec <runtime.fatalthrow+76>                                                                                                                
0000003fff8c50b0:  0000000001dcb1a0  000000000004f93c <runtime.throw+100>                                                                                                                    
0000003fff8c50c0:  0000003fff8c5118  0000000000051280 <runtime.printunlock+96>                                                                                                               
0000003fff8c50d0:  0000003fff8c5118  000000000004f93c <runtime.throw+100>                                                                                                                    
0000003fff8c50e0:  0000000001dcb1a0  000000000004f93c <runtime.throw+100>                                                                                                                    
0000003fff8c50f0:  0000003fff8c50f8  00000000000819b8 <runtime.fatalthrow.func1+0>                                                                                                           
0000003fff8c5100:  0000000001dcb1a0  000000000004f93c <runtime.throw+100>                                                                                                                    
0000003fff8c5110:  0000003fff8c5118  0000000000069360 <runtime.sigpanic+832>                                                                                                                 
0000003fff8c5120:  0000003fff8c5128  0000000000081930 <runtime.throw.func1+0>                                                                                                                
0000003fff8c5130:  0000000001266108  000000000000002a                                                                                                                                        
0000003fff8c5140: <0000003fd24af30a  0000000001266108                                                                                                                                        
0000003fff8c5150:  000000000000002a  0000000001dbe940                                                                                                                                        
0000003fff8c5160:  0000003fd251d4dc  000000000121d380                                                                                                                                        
0000003fff8c5170: >0000003fd24af30a  0000003fff8c5270                                                                                                                                        
0000003fff8c5180:  0000000000000000  00000000013efb06                                                                                                                                        
0000003fff8c5190:  0000000000000fea  0000000000000001                                                                                                                                        
0000003fff8c51a0:  0000003fff8c5300  0000003fff8c5270                                                                                                                                        
0000003fff8c51b0:  0000003fd244750e  0000000000000fea 
0000003fff8c51c0:  0000000000000030  0000000000000010 
0000003fff8c51d0:  0000000000f0c5ea  0000000000000000 
0000003fff8c51e0:  0000000000000000  0000003fff8c5270 
0000003fff8c51f0:  0000003fd2448d32  0000000000000000 
0000003fff8c5200:  0000000000000018  0000000000000000 
0000003fff8c5210:  0000000000f0c5ea  00000000000000c8 
0000003fff8c5220:  00000000000876d4 <runtime.asmcgocall+84>  0000000001dcb1a0 
0000003fff8c5230:  00000000000000c8  0000000000056614 <runtime.newm1+172> 
0000003fff8c5240:  000000000005649c <runtime.newm+172>  0000000000f0c5d8 
0000003fff8c5250:  0000003fff8c5270  0000000000055124 <runtime.mstart1+140> 
0000003fff8c5260:  0000000000fbd3a0  00000000012d11a0 
runtime.throw(0x1266108, 0x2a)
        /usr/local/go/src/runtime/panic.go:1117 +0x64
runtime: unexpected return pc for runtime.sigpanic called from 0x3fd24af30a
stack: frame={sp:0x3fff8c5140, fp:0x3fff8c5170} stack=[0x3fff0c6290,0x3fff8c5300)
0000003fff8c5040:  0000000000000000  0000000000081a14 <runtime.fatalthrow.func1+92> 
0000003fff8c5050:  000000000004f93c <runtime.throw+100>  0000003fff8c5118 
0000003fff8c5060:  0000000000000000  0000000001dcb1a0 
0000003fff8c5070:  010000000004fd24  0000000000000004 
0000003fff8c5080:  000000000000001f  0000000000000000 
0000003fff8c5090:  0000000000000000  0000000000000001 
0000003fff8c50a0:  000000000124efe6  000000000004faec <runtime.fatalthrow+76> 
0000003fff8c50b0:  0000000001dcb1a0  000000000004f93c <runtime.throw+100> 
0000003fff8c50c0:  0000003fff8c5118  0000000000051280 <runtime.printunlock+96> 
0000003fff8c50d0:  0000003fff8c5118  000000000004f93c <runtime.throw+100> 
0000003fff8c50e0:  0000000001dcb1a0  000000000004f93c <runtime.throw+100> 
0000003fff8c50f0:  0000003fff8c50f8  00000000000819b8 <runtime.fatalthrow.func1+0> 
0000003fff8c5100:  0000000001dcb1a0  000000000004f93c <runtime.throw+100> 
0000003fff8c5110:  0000003fff8c5118  0000000000069360 <runtime.sigpanic+832> 
0000003fff8c5120:  0000003fff8c5128  0000000000081930 <runtime.throw.func1+0> 
0000003fff8c5130:  0000000001266108  000000000000002a 
0000003fff8c5140: <0000003fd24af30a  0000000001266108 
0000003fff8c5150:  000000000000002a  0000000001dbe940 
0000003fff8c5160:  0000003fd251d4dc  000000000121d380 
0000003fff8c5170: >0000003fd24af30a  0000003fff8c5270 
0000003fff8c5180:  0000000000000000  00000000013efb06 
0000003fff8c5190:  0000000000000fea  0000000000000001 
0000003fff8c51a0:  0000003fff8c5300  0000003fff8c5270 
0000003fff8c51b0:  0000003fd244750e  0000000000000fea 
0000003fff8c51c0:  0000000000000030  0000000000000010 
0000003fff8c51d0:  0000000000f0c5ea  0000000000000000 
0000003fff8c51e0:  0000000000000000  0000003fff8c5270 
0000003fff8c51f0:  0000003fd2448d32  0000000000000000 
0000003fff8c5200:  0000000000000018  0000000000000000 
0000003fff8c5210:  0000000000f0c5ea  00000000000000c8 
0000003fff8c5220:  00000000000876d4 <runtime.asmcgocall+84>  0000000001dcb1a0 
0000003fff8c5230:  00000000000000c8  0000000000056614 <runtime.newm1+172> 
0000003fff8c5240:  000000000005649c <runtime.newm+172>  0000000000f0c5d8 
0000003fff8c5250:  0000003fff8c5270  0000000000055124 <runtime.mstart1+140> 
0000003fff8c5260:  0000000000fbd3a0  00000000012d11a0 
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_unix.go:718 +0x340

goroutine 1 [running]:
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_riscv64.s:90 +0x8 fp=0x3fa805c788 sp=0x3fa805c780 pc=0x87410
runtime.main()
        /usr/local/go/src/runtime/proc.go:144 +0x8c fp=0x3fa805c7d8 sp=0x3fa805c788 pc=0x51fac
runtime.goexit()
        /usr/local/go/src/runtime/asm_riscv64.s:517 +0x4 fp=0x3fa805c7d8 sp=0x3fa805c7d8 pc=0x8944c
ERRO[2021-05-18T02:42:29.361538395Z] containerd did not exit successfully          error="exit status 2" module=libcontainerd
WARN[2021-05-18T02:42:30.218719924Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
WARN[2021-05-18T02:42:32.883909668Z] grpc: addrConn.createTransport failed to connect to {unix:///var/run/docker/containerd/containerd.sock  <nil> 0 <nil>}. Err :connection error: desc = "transport: error while dialing: dial unix:///var/run/docker/containerd/containerd.sock: timeout". Reconnecting...  module=grpc
^C

Please let me know if you think I should also open an issue with the Docker team, although this looks to be a Golang issue if I am reading the errors correctly.

@seankhliao seankhliao changed the title SIGSEGV when running dockerd on RISCV64 runtime: SIGSEGV when running dockerd on RISCV64 May 18, 2021

@seankhliao seankhliao commented May 18, 2021

@mknyszek mknyszek commented May 18, 2021

If we can trust that hex dump, my first thought is that the problem has something to do with your libc. Looking at the hex dump, there's an asmcgocall in there, and I guess Docker must use cgo on Linux (or maybe it's a RISC-V thing?), so we're calling into libc to create a new thread (newm in the hex dump) and taking the cgo path (see newm1, where we have a direct call to asmcgocall). In there, something goes wrong, and we have a nil/NULL dereference. We end up in a Go panic because we're calling cgo from the runtime (and we don't seem to do any special signal masking like we do for the usual cgo calls), so when the signal lands on that thread, the runtime can't actually unwind through the call stack (indicating the signal triggered while in a C stack frame) and you get a hex dump instead.

But, at first glance, I don't really know why there's a nil pointer dereference as a result of all this. Does it work if you use a different glibc version?

@prattmic prattmic commented May 18, 2021

To add to what Michael said, you are building a custom version of glibc and specifying it at runtime with LD_LIBRARY_PATH. This means that the dynamic linker specified by the dockerd binary will come from the older system version of glibc, and it will load the newer version of glibc.

Is such a version mismatch between the dynamic linker and libc allowed? Does this have the same result if you use ld.so from your custom build of glibc as well?
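One way to check which dynamic linker a binary asks for is to read its PT_INTERP program header, which LD_LIBRARY_PATH does not change. Below is a minimal sketch using Go's standard debug/elf package; the `interpreter` helper is a hypothetical name, and the path to inspect (for example the installed dockerd) is supplied by the caller rather than assumed here.

```go
package main

import (
	"bytes"
	"debug/elf"
	"fmt"
	"io"
	"os"
)

// interpreter returns the dynamic linker path stored in the PT_INTERP
// program header of the ELF file at path. It returns "" when the file
// has no PT_INTERP entry (for example, a statically linked binary).
func interpreter(path string) (string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type != elf.PT_INTERP {
			continue
		}
		data, err := io.ReadAll(p.Open())
		if err != nil {
			return "", err
		}
		// The segment holds a NUL-terminated path.
		return string(bytes.TrimRight(data, "\x00")), nil
	}
	return "", nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: interp <elf-binary>")
		return
	}
	interp, err := interpreter(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s requests dynamic linker %q\n", os.Args[1], interp)
}
```

If the reported path points at the system ld.so while LD_LIBRARY_PATH points at a separately built libc.so.6, the two come from different glibc builds, which is the mismatch being asked about.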

@ACov96 ACov96 commented May 19, 2021

> Does it work if you use a different glibc version?

No, the binary I'm using complains about a mismatch in versions. It throws the error:

containerd: /lib/riscv64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by containerd)

Because the system version is not glibc 2.32 (system version is 2.31), I thought I could build it and link against a separate build of glibc using the LD_LIBRARY_PATH variable.
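The symbol versions a binary needs can also be listed up front, before the loader complains. This is a hedged sketch using Go's standard debug/elf package; `glibcVersions`, `lessVersion` and `parseVersion` are hypothetical helpers, and the approach only sees versions attached to imported dynamic symbols.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"sort"
	"strconv"
	"strings"
)

// parseVersion turns "GLIBC_2.32" into []int{2, 32}.
// Non-numeric parts (e.g. "GLIBC_PRIVATE") are treated as 0.
func parseVersion(v string) []int {
	var nums []int
	for _, part := range strings.Split(strings.TrimPrefix(v, "GLIBC_"), ".") {
		n, _ := strconv.Atoi(part)
		nums = append(nums, n)
	}
	return nums
}

// lessVersion reports whether version a sorts before version b,
// comparing each dotted component numerically.
func lessVersion(a, b string) bool {
	x, y := parseVersion(a), parseVersion(b)
	for i := 0; i < len(x) && i < len(y); i++ {
		if x[i] != y[i] {
			return x[i] < y[i]
		}
	}
	return len(x) < len(y)
}

// glibcVersions returns the distinct GLIBC_* symbol versions imported
// by the ELF file at path, sorted from oldest to newest.
func glibcVersions(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	syms, err := f.ImportedSymbols()
	if err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	var out []string
	for _, s := range syms {
		if strings.HasPrefix(s.Version, "GLIBC_") && !seen[s.Version] {
			seen[s.Version] = true
			out = append(out, s.Version)
		}
	}
	sort.Slice(out, func(i, j int) bool { return lessVersion(out[i], out[j]) })
	return out, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: glibcvers <elf-binary>")
		return
	}
	versions, err := glibcVersions(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, v := range versions {
		fmt.Println(v)
	}
}
```

The last line printed is the newest glibc version the binary requires; if it is newer than the system glibc, the loader reports an error like the one above.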

> Is such a version mismatch between the dynamic linker and libc allowed?

Right, I don't think version mismatches are allowed, but I forgot that dockerd will use the dynamic linker that was specified when it was compiled and won't just pick up the one in the standalone build after setting LD_LIBRARY_PATH. The binary itself would need to be patched. The better solution would be to build the entire Docker stack and link it against the system glibc instead, but some others and I were running into issues with the build system for that project. I think I'll open an issue there next.

Regardless, I think the first issue I need to solve is getting the glibc versions all synced up, which I don't think requires keeping this issue open. Thank you all for the help!

@ACov96 ACov96 closed this May 19, 2021