runtime: a program built with a too-high GOAMD64 value will dump core on startup #49586

Closed
siebenmann opened this issue Nov 15, 2021 · 4 comments
Labels
okay-after-beta1 release-blocker

@siebenmann siebenmann commented Nov 15, 2021

What version of Go are you using (go version)?

$ go version
go version devel go1.18-5337e53dfa Sun Nov 14 17:38:42 2021 +0000 linux/amd64

Does this issue reproduce with the latest release?

No, since Go 1.17 doesn't support GOAMD64.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/u/cks/.cache/go-build"
GOENV="/u/cks/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/h/281/cks/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/h/281/cks/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/h/281/cks/src/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/h/281/cks/src/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="devel go1.18-5337e53dfa Sun Nov 14 17:38:42 2021 +0000"
GCCGO="gccgo"
GOAMD64="v3"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/tmp/t/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build4250807351=/tmp/go-build -gno-record-gcc-switches"

What did you do?

If I build a Go program (even the playground's "Hello world") with a GOAMD64 value that's too high for the machine I run it on, the program dumps core on startup. This happens even for GOAMD64 values that work on the build machine: I can compile with GOAMD64=v3 on a machine where the binary runs, copy it to a machine with an older CPU, and it dumps core on startup instead of running. On the original test machine, GOAMD64=v4 produces a binary that doesn't run even on that machine itself.
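
To judge in advance which GOAMD64 value a given Linux machine can run, one option is to compare the "flags" line of /proc/cpuinfo against the feature sets of the x86-64 microarchitecture levels. The following is a minimal sketch under stated assumptions: the flag lists reflect my reading of the x86-64 psABI levels using Linux's spellings, the names levelFlags, highestLevel and readFlags are made up for this illustration, and the Go runtime's own startup check may test a different set of bits.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// levelFlags lists, per level, the flags that level adds on top of the
// previous one, spelled as in /proc/cpuinfo (SSE3 is "pni", LZCNT is "abm").
// Assumption: this mirrors the x86-64 psABI levels, not the runtime's check.
var levelFlags = [][]string{
	{"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"},       // v2
	{"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}, // v3
	{"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"},             // v4
}

// highestLevel returns the highest level (1 to 4) for which all flags of
// that level and of every lower level are present.
func highestLevel(flags map[string]bool) int {
	level := 1
	for _, required := range levelFlags {
		for _, f := range required {
			if !flags[f] {
				return level
			}
		}
		level++
	}
	return level
}

// readFlags parses the first "flags" line of a cpuinfo-style file.
func readFlags(path string) (map[string]bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	flags := make(map[string]bool)
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 2)
		if len(parts) == 2 && strings.TrimSpace(parts[0]) == "flags" {
			for _, name := range strings.Fields(parts[1]) {
				flags[name] = true
			}
			return flags, nil
		}
	}
	return flags, sc.Err()
}

func main() {
	flags, err := readFlags("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read CPU flags:", err)
		return
	}
	fmt.Printf("highest level this CPU appears to support: v%d\n", highestLevel(flags))
}
```

A binary built with a GOAMD64 value above the level printed on the target machine would be expected to hit the failure described here.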

Gdb says that this is happening in:

(gdb) where
#0  0x000000000045c6c0 in runtime.write (fd=2, p=0x4b1680 <bad_cpu_msg>, n=84, ~r0=<optimized out>) at <autogenerated>:1
#1  0x0000000000457bc4 in runtime.rt0_go () at /h/281/cks/src/go/src/runtime/asm_amd64.s:194
[...]

Delve agrees with this trace and shows the relevant runtime source as:

(dlv) bt
0  0x000000000045c6c0 in runtime.write
   at :0
1  0x0000000000457bc4 in runtime.rt0_go
   at /h/281/cks/src/go/src/runtime/asm_amd64.s:194
(dlv) up
Stopped at: 0x45c6c0
=>   1: no source available
Frame 1: /h/281/cks/src/go/src/runtime/asm_amd64.s:194 (PC: 457bc4)
   189: bad_cpu: // show that the program requires a certain microarchitecture level.
   190:         MOVQ    $2, 0(SP)
   191:         MOVQ    $bad_cpu_msg<>(SB), AX
   192:         MOVQ    AX, 8(SP)
   193:         MOVQ    $84, 16(SP)
=> 194:         CALL    runtime·write(SB)
   195:         MOVQ    $1, 0(SP)
   196:         CALL    runtime·exit(SB)
   197:         CALL    runtime·abort(SB)
   198: #endif
   199:

Delve further reports the crash (dis)assembly and fault address as:

(dlv) disassemble
TEXT runtime.write(SB) <autogenerated>
        <autogenerated>:1       0x45c6a0        4883ec20                sub rsp, 0x20
        <autogenerated>:1       0x45c6a4        48896c2418              mov qword ptr [rsp+0x18], rbp
        <autogenerated>:1       0x45c6a9        488d6c2418              lea rbp, ptr [rsp+0x18]
        <autogenerated>:1       0x45c6ae        488b442428              mov rax, qword ptr [rsp+0x28]
        <autogenerated>:1       0x45c6b3        488b5c2430              mov rbx, qword ptr [rsp+0x30]
        .:0                     0x45c6b8        8b4c2438                mov ecx, dword ptr [rsp+0x38]
        .:0                     0x45c6bc        450f57ff                xorps xmm15, xmm15
=>      .:0                     0x45c6c0        644c8b3425f8ffffff      mov r14, qword ptr fs:[0xfffffff8]
        .:0                     0x45c6c9        e85207ffff              call $runtime.write
        .:0                     0x45c6ce        89442440                mov dword ptr [rsp+0x40], eax
        .:0                     0x45c6d2        488b6c2418              mov rbp, qword ptr [rsp+0x18]
        .:0                     0x45c6d7        4883c420                add rsp, 0x20
        .:0                     0x45c6db        c3                      ret
(dlv) regs
    Rip = 0x000000000045c6c0
    Rsp = 0x00007ffcd8aa16b8
    Rax = 0x0000000000000002
    Rbx = 0x00000000004b1680
    Rcx = 0x0000000000000054
    Rdx = 0x0000000000000000
    Rsi = 0x00007ffcd8aa1718
    Rdi = 0x000000000051e540
    Rbp = 0x00007ffcd8aa16d0
     R8 = 0x0000000000000000
     R9 = 0x0000000000000000
    R10 = 0x0000000000000000
    R11 = 0x0000000000000000
    R12 = 0x0000000000000000
    R13 = 0x0000000000000000
    R14 = 0x0000000000000000
    R15 = 0x0000000000000000
 Rflags = 0x0000000000010206    [PF IF IOPL=0 RF]
     Es = 0x0000000000000000
     Cs = 0x0000000000000033
     Ss = 0x000000000000002b
     Ds = 0x0000000000000000
     Fs = 0x0000000000000000
     Gs = 0x0000000000000000
Fs_base = 0x0000000000000000
Gs_base = 0x0000000000000000
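
The trace above does not print the faulting address itself, but it can be worked out from the values shown. The marked instruction loads from fs:[0xfffffff8]; in 64-bit mode a 32-bit displacement is sign-extended, so this is an offset of -8 from the FS base, and Fs_base is 0. The sketch below only redoes that arithmetic; effectiveAddr and hypotheticalBase are names invented for this illustration.

```go
package main

import "fmt"

// effectiveAddr models how x86-64 forms the address for an instruction such
// as "mov r14, fs:[disp32]": the 32-bit displacement is sign-extended to
// 64 bits and added to the segment base, wrapping modulo 2^64.
func effectiveAddr(segBase uint64, disp32 uint32) uint64 {
	return segBase + uint64(int64(int32(disp32)))
}

func main() {
	// Values from the trace: Fs_base = 0, displacement = 0xfffffff8 (-8).
	fmt.Printf("%#x\n", effectiveAddr(0, 0xfffffff8)) // 0xfffffffffffffff8

	// hypotheticalBase is an illustrative TLS base, not taken from the issue.
	const hypotheticalBase = 0x7f0000001000
	fmt.Printf("%#x\n", effectiveAddr(hypotheticalBase, 0xfffffff8)) // 0x7f0000000ff8
}
```

With a zero FS base the load targets the top of the address space, which a user process cannot read, which is consistent with the core dump; with a valid base the same instruction reads the slot 8 bytes below it.
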
Contributor

@randall77 randall77 commented Nov 15, 2021

This is a tricky one.

The rt0_go code is trying to call runtime.write. It's actually calling runtime.write.abi0. That is a wrapper that calls into runtime.write using the internal ABI. Part of that wrapper sets up r14 by loading it from TLS. But the TLS pointer has not been set up yet, so boom.

How did this used to work? The previous code called runtime.exit.abi0 for these kinds of errors, but that is implemented in assembly, so no wrapper needed.

I don't think we can call into Go code at this point in startup. We could delay the report until after settls runs, maybe?
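
The ordering problem can be illustrated with a toy model in ordinary Go. This is not runtime code: startup, settls, writeWrapper and boot are hypothetical stand-ins, the message text is a placeholder, and a returned error stands in for the real fault. The sketch only shows why running the CPU check after the TLS slot is set up lets the report go through.

```go
package main

import (
	"errors"
	"fmt"
)

// startup is a toy stand-in for early process state. tls models the
// thread-local slot holding the current g; it is nil until settls runs.
type startup struct {
	tls *string
}

var errNoTLS = errors.New("load from TLS before settls")

// settls models the step that makes the TLS slot valid.
func (s *startup) settls() {
	g := "g0"
	s.tls = &g
}

// writeWrapper models an ABI wrapper: it loads g from TLS first and only
// then calls the real write. The real program faults where this returns
// errNoTLS.
func (s *startup) writeWrapper(msg string) error {
	if s.tls == nil {
		return errNoTLS
	}
	fmt.Println(msg)
	return nil
}

// boot runs the CPU check either before or after settls and describes the
// outcome.
func boot(cpuOK, checkAfterTLS bool) string {
	s := &startup{}
	check := func() string {
		if cpuOK {
			return ""
		}
		// Placeholder text, not the runtime's actual message.
		if err := s.writeWrapper("placeholder: CPU lacks required features"); err != nil {
			return "crashed: " + err.Error()
		}
		return "reported bad CPU and exited"
	}
	if !checkAfterTLS {
		if r := check(); r != "" {
			return r
		}
	}
	s.settls()
	if checkAfterTLS {
		if r := check(); r != "" {
			return r
		}
	}
	return "started normally"
}

func main() {
	fmt.Println("check before settls:", boot(false, false))
	fmt.Println("check after settls: ", boot(false, true))
}
```

In this model a supported CPU starts normally in either order; only the failure path depends on whether the TLS slot is valid when the wrapper runs.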

For some reason it works fine on Darwin - something is different enough about TLS that it survives the bogus load. And nothing in runtime.write happens to use the G register for anything. At least I think so; there's a lot of strangeness across platforms here.

@martisch @vpachkov

@randall77 randall77 added this to the Go1.18 milestone Nov 15, 2021
@randall77 randall77 added release-blocker okay-after-beta1 labels Nov 15, 2021
@mknyszek mknyszek changed the title A program built with a too-high GOAMD64 value will dump core on startup runtime: a program built with a too-high GOAMD64 value will dump core on startup Nov 15, 2021

@gopherbot gopherbot commented Nov 16, 2021

Change https://golang.org/cl/364174 mentions this issue: runtime: check GOAMD64 compatibility after setting up TLS

Contributor

@vpachkov vpachkov commented Nov 16, 2021

Seems like https://github.com/golang/go/blob/master/src/runtime/asm_386.s#L128 is another place that may be facing the same issue.

Contributor

@randall77 randall77 commented Nov 16, 2021

Possibly, yes. I think there is less of an issue because 386 isn't using the register ABI yet, so there's no explicit use of TLS to set up the G register. But the stack check in the preamble of write might be confused.
Also, the processors that don't support the 386 minimum are very old at this point.
