Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: async preemption support on ARM64 breaks ARM64 on QEMU #36981

Open
kortschak opened this issue Feb 2, 2020 · 9 comments
Open

runtime: async preemption support on ARM64 breaks ARM64 on QEMU #36981

kortschak opened this issue Feb 2, 2020 · 9 comments

Comments

@kortschak
Copy link
Contributor

@kortschak kortschak commented Feb 2, 2020

The change in 1b0b980 breaks QEMU support for Go programs, resulting in a variety of crashes ranging from segmentation faults, illegal instructions and scheduler panics (holding locks).

This is partially recovered by the change in 73d20f8 which prevents the crashes, but results in extremely poor performance; processes appear only able to use about 10-15% of a core compared to ~100% prior to 1b0b980.

What version of Go are you using (go version)?

$ go version
go version devel +1b0b980904 Thu Nov 7 19:18:12 2019 +0000 linux/amd64

Does this issue reproduce with the latest release?

No

What operating system and processor architecture are you using (go env)?

Natively on amd64/linux, but qemu-user with GOARCH=arm64.

go env Output
$ go env
GO111MODULE="on"
GOARCH="arm64"
GOBIN="/home/user/bin"
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/user/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/user/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build167327579=/tmp/go-build -gno-record-gcc-switches"
~ $ lsb_release -d
Description:	Ubuntu 18.04.3 LTS
~ $ dpkg -l qemu-user qemu-system-common qemu-system-arm 
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                          Version             Architecture        Description
+++-=============================-===================-===================-===============================================================
ii  qemu-system-arm               1:2.11+dfsg-1ubuntu amd64               QEMU full system emulation binaries (arm)
ii  qemu-system-common            1:2.11+dfsg-1ubuntu amd64               QEMU full system emulation binaries (common files)
ii  qemu-user                     1:2.11+dfsg-1ubuntu amd64               QEMU user mode emulation binaries

What did you do?

In a checkout of gonum.org/v1/gonum/lapack/gonum, run GOARCH=arm64 go test -run Dgeev

I have tried to make a smaller reproducer, but have not managed yet.

What did you expect to see?

Other expected package failures.

What did you see instead?

A variety of crashes.

@josharian

This comment has been minimized.

Copy link
Contributor

@josharian josharian commented Feb 2, 2020

cc @aclements @cherrymui @ianlancetaylor

Is this something we need to fix for 1.14, since it is a regression?

Tentatively marking release blocker.

@josharian josharian added this to the Go1.14 milestone Feb 2, 2020
@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Feb 3, 2020

See #15050 (comment) and #1508 (comment).

QEMU has historically been fairly buggy for multi-threaded programs, so a regression here is not a big surprise.

Per the porting policy, supported platforms need to have a developer maintaining the port for that platform, and a builder running tests. If folks are relying on QEMU being able to run Go programs, then someone needs to volunteer to diagnose the QEMU bugs, and to either fix them in QEMU or actively maintain workarounds on the Go side.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Feb 3, 2020

@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Feb 3, 2020

We have a very minimal "keep arm/arm64 qemu-user working with go binaries" test that I committed at 5095e5d (it only runs a hello world). But in general yeah, I don't think anyone has ever made a serious effort to keep qemu-user working and performant; so it shouldn't be expect to necessarily work well.

@bcmills

This comment has been minimized.

Copy link
Member

@bcmills bcmills commented Feb 3, 2020

@ALTree, we intentionally deleted that test (in CL 206137; see #35457) because it was flaky and nobody had volunteered to support the configuration it was testing.

@ALTree

This comment has been minimized.

Copy link
Member

@ALTree ALTree commented Feb 3, 2020

@bcmills Ah, funny. I didn't have any memory of the deletion, even though you cc'ed me on the CL. So I guess qemu-user really shouldn't be expected to work smoothly.

@cherrymui

This comment has been minimized.

Copy link
Contributor

@cherrymui cherrymui commented Feb 3, 2020

As @bcmills said, QEMU user mode is not actually supported. I think this is a QEMU bug in handling async signals, especially with signal context modifications. And it's not only ARM64, but also other architectures. For reference, for AMD64, qemu-x86_64 doesn't work since at least Go 1.6 (due to other reasons, not async signal).

$ go1.6 build hello.go
$ qemu-x86_64 ./hello
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Feb 3, 2020

Does it work on qemu-arm64 if you set GODEBUG=asyncpreemptoff=1?

@kortschak

This comment has been minimized.

Copy link
Contributor Author

@kortschak kortschak commented Feb 3, 2020

Yes, it runs very happily with that variable set.

@toothrot toothrot modified the milestones: Go1.14, Go1.15 Feb 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
7 participants
You can’t perform that action at this time.