cmd/link: Trampoline insertion breaks DWARF Line Program Table output on Darwin/ARM64 #54320

Closed
jquirke opened this issue Aug 6, 2022 · 9 comments
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin

@jquirke
Contributor

jquirke commented Aug 6, 2022

What version of Go are you using (go version)?

$ go version
go version go1.19 darwin/arm64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/qjeremy/Library/Caches/go-build"
GOENV="/Users/qjeremy/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/qjeremy/go-code/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/qjeremy/go-code"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.19"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/qjeremy/go-code/src/code.uber.internal/go.mod"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/0h/nzm2hwq95tq8d9mpsqmr7c6w0000gn/T/go-build3074789463=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

  1. On Darwin ARM64, create a simple hello world:
package main

import "fmt"

func main() {fmt.Printf("Hello world\n")}
  2. Build it with trampolines forced on. In practice, ARM64 needs trampolines only in larger binaries, for jumps of 128MB or more:

$ go build -ldflags '-debugtramp=2'

  3. Confirm that it runs:
$ ./hello
Hello world
  4. Debug it with dlv 1.9.0 and set a breakpoint on main.main.

What did you expect to see?

(dlv) b main.main
Breakpoint 1 set at 0x100ee1500 for main.main() ./hello.go:5
(dlv) c
> main.main() ./hello.go:5 (hits goroutine(1):1 total:1) (PC: 0x100ee1500)
Warning: debugging optimized function
     1:	package main
     2:
     3:	import "fmt"
     4:
=>   5:	func main() {fmt.Printf("Hello world\n")}
(dlv) disass
TEXT main.main(SB) /Users/qjeremy/go-code/src/code.uber.internal/marketplace/driver-pricing/hello/hello.go
	hello.go:5	0x100ee14f0	900b40f9	MOVD 16(R28), R16
	hello.go:5	0x100ee14f4	f1030091	MOVD RSP, R17
	hello.go:5	0x100ee14f8	3f0210eb	CMP R16, R17
	hello.go:5	0x100ee14fc	69020054	BLS 19(PC)
=>	hello.go:5	0x100ee1500*	fe0f1bf8	MOVD.W R30, -80(RSP)
	hello.go:5	0x100ee1504	fd831ff8	MOVD R29, -8(RSP)

What did you see instead?

Line information is removed.

(dlv) b main.main
Breakpoint 1 set at 0x104462270 for main.main() :0
(dlv) c
Stopped at: 0x104462270
=>   1:	no source available
(dlv) disass
TEXT main.main(SB)
	.:0	0x104462260	900b40f9	MOVD 16(R28), R16
	.:0	0x104462264	f1030091	MOVD RSP, R17
	.:0	0x104462268	3f0210eb	CMP R16, R17
	.:0	0x10446226c	69020054	BLS 19(PC)
=>	.:0	0x104462270*	fe0f1bf8	MOVD.W R30, -80(RSP)
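
The missing line information can also be checked without the debugger. The following is a minimal sketch, not part of the original report, that uses Go's standard debug/macho and debug/dwarf packages to look up the line-table row for the entry PC of main.main. It assumes the binary embeds its DWARF data and that the subprogram DIE is named main.main and carries a low PC attribute. The helper names findLine and report are hypothetical. How exactly the broken case surfaces (an unknown-PC error, or a row without a usable file and line) is not stated in the issue, so the sketch handles both.

```go
package main

import (
	"debug/dwarf"
	"debug/macho"
	"errors"
	"fmt"
	"os"
)

// findLine looks up the line-table row covering the low PC of the
// subprogram named fn. It returns dwarf.ErrUnknownPC when no row covers it.
func findLine(d *dwarf.Data, fn string) (dwarf.LineEntry, error) {
	var row dwarf.LineEntry
	var cu *dwarf.Entry // most recently seen compile unit
	r := d.Reader()
	for {
		e, err := r.Next()
		if err != nil {
			return row, err
		}
		if e == nil {
			return row, fmt.Errorf("no subprogram named %q", fn)
		}
		switch e.Tag {
		case dwarf.TagCompileUnit:
			cu = e
		case dwarf.TagSubprogram:
			name, _ := e.Val(dwarf.AttrName).(string)
			pc, ok := e.Val(dwarf.AttrLowpc).(uint64)
			if name != fn || !ok || cu == nil {
				continue
			}
			lr, err := d.LineReader(cu)
			if err != nil {
				return row, err
			}
			if lr == nil { // compile unit has no line table at all
				return row, dwarf.ErrUnknownPC
			}
			err = lr.SeekPC(pc, &row)
			return row, err
		}
	}
}

// report opens a Mach-O binary and prints the source position of fn, if any.
func report(path, fn string) error {
	f, err := macho.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	d, err := f.DWARF()
	if err != nil {
		return err
	}
	row, err := findLine(d, fn)
	if errors.Is(err, dwarf.ErrUnknownPC) {
		fmt.Printf("%s: no line-table row\n", fn)
		return nil
	}
	if err != nil {
		return err
	}
	file := "?"
	if row.File != nil {
		file = row.File.Name
	}
	fmt.Printf("%s: %s:%d\n", fn, file, row.Line)
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: linecheck <mach-o binary>")
		return
	}
	if err := report(os.Args[1], "main.main"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Running it against the binary built with and without -debugtramp=2 should make the difference visible outside of dlv.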

Analysis

When the linker inserts trampolines, it converts the affected internal symbols into external symbols (via cloneToExternal).

The DWARF generation code (e.g. in writelines) skips external symbols, under the explicitly stated assumption that they never have auxsyms. That assumption does not hold for symbols that were cloned for trampolines.

Indeed, this proof-of-concept change appears to fix the problem:

master...jquirke:go:Linker_DWARF
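
To illustrate the failure mode, here is a toy model in Go. It is not the linker's code: the sym type, its fields, and both functions are hypothetical stand-ins, and the example symbol names are made up. It only shows why filtering on the "external" flag drops line data for a symbol that was cloned but still carries aux data.

```go
package main

import "fmt"

// sym is a hypothetical, simplified stand-in for a linker symbol.
type sym struct {
	name     string
	external bool     // set once the symbol has been cloned to an external symbol
	lineAux  []string // stand-in for the DWARF line aux data; empty if none
}

// linesSkippingExternal mimics the behaviour described above: any external
// symbol is skipped on the assumption that it has no aux data.
func linesSkippingExternal(fns []sym) []string {
	var out []string
	for _, s := range fns {
		if s.external {
			continue
		}
		out = append(out, s.lineAux...)
	}
	return out
}

// linesFromAux drops the external check and emits whatever aux data exists.
func linesFromAux(fns []sym) []string {
	var out []string
	for _, s := range fns {
		out = append(out, s.lineAux...)
	}
	return out
}

func main() {
	fns := []sym{
		{name: "pkg.helper", lineAux: []string{"helper.go:10"}},
		// main.main was cloned to an external symbol but kept its aux data.
		{name: "main.main", external: true, lineAux: []string{"hello.go:5"}},
	}
	fmt.Println(linesSkippingExternal(fns)) // [helper.go:10]
	fmt.Println(linesFromAux(fns))          // [helper.go:10 hello.go:5]
}
```

In the model, the first function loses the hello.go:5 entry, which corresponds to the ".:0" output seen in dlv.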

Comments

Although the repro forces trampolines, many ARM64 binaries at Uber require relocations of more than +/- 124MB and are thus not debuggable.
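
For context on where such limits come from: an ARM64 B/BL instruction encodes a signed 26-bit offset counted in 4-byte words, so a direct branch reaches at most 128MiB in either direction. The sketch below models only that architectural limit. The function name inDirectRange is hypothetical, and the 124MB figure quoted above is the reporter's number, which this sketch does not attempt to reproduce.

```go
package main

import "fmt"

// branchRange is the reach of an ARM64 B/BL instruction in bytes:
// a signed 26-bit word offset, i.e. [-2^27, 2^27) bytes.
const branchRange = 1 << 27

// inDirectRange reports whether a B/BL located at address from can reach
// address to without going through a trampoline.
func inDirectRange(from, to int64) bool {
	d := to - from
	return d >= -branchRange && d < branchRange
}

func main() {
	const mib = 1 << 20
	fmt.Println(inDirectRange(0, 100*mib))      // true
	fmt.Println(inDirectRange(0, 128*mib))      // false
	fmt.Println(inDirectRange(200*mib, 72*mib)) // true (exactly -128MiB)
}
```

A call whose target falls outside this range has to be routed through a trampoline, which is the path that triggers the bug described here.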

@jquirke jquirke changed the title from "affected/package:" to "[Linker] Trampoline insertion breaks DWARF Line Program Table output on Darwin/ARM64" Aug 6, 2022
@jquirke jquirke changed the title from "[Linker] Trampoline insertion breaks DWARF Line Program Table output on Darwin/ARM64" to "cmd/link: Trampoline insertion breaks DWARF Line Program Table output on Darwin/ARM64" Aug 6, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 6, 2022
@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 8, 2022
@thanm
Contributor

thanm commented Aug 8, 2022

Your patch looks good to me. Looks like the PR is blocked due to CLA, but I'm happy to review when it arrives in Gerrit.

@cherrymui
Member

cherrymui commented Aug 8, 2022

I think you can just call (*Loader).auxs ( https://cs.opensource.google/go/go/+/master:src/cmd/link/internal/loader/loader.go;l=1854 ), and drop the IsExternal condition.

Also, see https://go.dev/doc/contribute for how to contribute to Go. We cannot do code review on GitHub. Thanks.

@cherrymui cherrymui added this to the Backlog milestone Aug 8, 2022
@gopherbot

gopherbot commented Aug 8, 2022

Change https://go.dev/cl/422154 mentions this issue: cmd/link: fix broken DWARF LPT on trampoline architectures

jquirke added a commit to jquirke/go that referenced this issue Aug 9, 2022
When trampolines are needed (e.g. Darwin ARM64), the DWARF LPT (Line
Program Table - see DWARF section 6.1) generation fails because the
replacement symbols are marked as external symbols and skipped during
the DWARF LPT generation phase.

This PR is still subject to extensive testing but has rectified the
issue for the very large binaries we link at Uber.

Fixes golang#54320
jquirke added a commit to jquirke/go that referenced this issue Aug 9, 2022
When trampolines are needed (e.g. Darwin ARM64), the DWARF LPT (Line
Program Table - see DWARF section 6.1) generation fails because the
replacement symbols are marked as external symbols and skipped during
the DWARF LPT generation phase.

Fixes golang#54320
@dmitshur dmitshur modified the milestones: Backlog, Go1.20 Aug 12, 2022
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin arch-arm64 and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Aug 12, 2022
@cagedmantis
Contributor

cagedmantis commented Aug 17, 2022

It seems we also need a backport issue for 1.18. Please close the backport issue if that isn't the case.

@cagedmantis
Contributor

cagedmantis commented Aug 17, 2022

@gopherbot please open a backport to 1.18.

@gopherbot

gopherbot commented Aug 17, 2022

Backport issue(s) opened: #54502 (for 1.18).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@lizthegrey

lizthegrey commented Aug 17, 2022

is this darwin/arm64 only, or linux/arm64 too?

@cherrymui
Member

cherrymui commented Aug 17, 2022

Technically it includes linux/arm64 as well. But generally one would not build programs with the -debugtramp=2 flag. In the default setting, trampolines may still be used if the program is very large, and such programs may be affected.

This bug does not affect the correctness of the program, only debug info generation.

@jquirke
Contributor Author

jquirke commented Aug 17, 2022

This affects all trampoline architectures.

Internally at Uber, we are seeing many SWEs on Apple M1 setups who cannot practically debug (step through line by line) statically linked Go programs larger than 124MB, which is the trampoline threshold for Darwin/ARM64.

The -debugtramp=2 flag is just a trivial way to reproduce this. We see real-world binaries, linked without special flags, that require linker trampolines and are affected.
