cmd/link: function "decodetypeGcmask" read gcdata from shared object file may not correct when use lld link in linux/arm64 #69466

Open
And-ZJ opened this issue Sep 14, 2024 · 7 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@And-ZJ
Contributor

And-ZJ commented Sep 14, 2024

Go version

go version go1.23.1 linux/arm64

Output of go env in your module/workspace:

GO111MODULE='auto'
GOARCH='arm64'
GOBIN=''
GOCACHE='/usr1/GOCACHE'
GOENV='/root/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/usr1/GOPATH/pkg/mod'
GOOS='linux'
GOPATH='/usr1/GOPATH'
GOROOT='/usr1/GoRelease/go1.23.1'
GOSUMDB='off'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr1/GoRelease/go1.23.1/pkg/tool/linux_arm64'
GOVCS=''
GOVERSION='go1.23.1'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/root/.config/go/telemetry'
GCCGO='gccgo'
GOARM64='v8.0'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD=''
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/usr1/tmp/go-build503891466=/tmp/go-build -gno-record-gcc-switches'

What did you do?

temp.go file:

package main

import (
	"errors"
	"runtime"
)

var err error

func main() {
	err = errors.New("111")
	runtime.GC()
	runtime.GC()
	runtime.GC()

	if err == nil {
		panic("err == nil")
	}
	if s := err.Error(); s != "111" {
		println(err)
		println(s)
		panic("s!=111")
	}
}

Build and run:

LLVM=/your_llvm_path
CC=$LLVM/bin/clang CXX=$LLVM/bin/clang++ go install -a -ldflags="-extldflags=-fuse-ld=lld" -buildmode=shared runtime sync/atomic
CC=$LLVM/bin/clang CXX=$LLVM/bin/clang++ go build -a -ldflags="-extldflags=-fuse-ld=lld" -linkshared temp.go
GODEBUG=clobberfree=1 ./temp

What did you see happen?

(0xaaaacd007078,0x400011c010)

panic: s!=111

goroutine 1 [running]:
main.main()
	/usr1/temp/temp.go:22 +0x14c

or

(0xaaaae0687078,0x4000196010)
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x2 addr=0x4000400000 pc=0xffff93014fe8]

goroutine 1 gp=0x40000041c0 m=5 mp=0x4000100008 [running]:
runtime.throw({0xffff92f353d7?, 0x0?})
	/usr1/GoRelease/go1.23.1/src/runtime/panic.go:1067 +0x38 fp=0x40000be5c0 sp=0x40000be590 pc=0xffff92fcade8
runtime.sigpanic()
	/usr1/GoRelease/go1.23.1/src/runtime/signal_unix.go:884 +0x378 fp=0x40000be620 sp=0x40000be5c0 pc=0xffff92fee308
runtime.memmove()
	/usr1/GoRelease/go1.23.1/src/runtime/memmove_arm64.s:159 +0x168 fp=0x40000be630 sp=0x40000be630 pc=0xffff93014fe8
runtime.recordForPanic({0x4000180000, 0x3915fe458886ea, 0x3915fe458886ea})
	/usr1/GoRelease/go1.23.1/src/runtime/print.go:45 +0x124 fp=0x40000be670 sp=0x40000be630 pc=0xffff92fce5f4
runtime.gwrite({0x4000180000, 0x3915fe458886ea, 0x1?})
	/usr1/GoRelease/go1.23.1/src/runtime/print.go:89 +0x2c fp=0x40000be6b0 sp=0x40000be670 pc=0xffff92fce76c
runtime.printstring({0x4000180000?, 0x4000196010?})
	/usr1/GoRelease/go1.23.1/src/runtime/print.go:246 +0x4c fp=0x40000be700 sp=0x40000be6b0 pc=0xffff92fcef3c
main.main()
	/usr1/temp/temp.go:21 +0x130 fp=0x40000be740 sp=0x40000be700 pc=0xaaaae0673a30
runtime.main()
	/usr1/GoRelease/go1.23.1/src/runtime/proc.go:272 +0x2e4 fp=0x40000be7d0 sp=0x40000be740 pc=0xffff92fcf694
runtime.goexit({})
	/usr1/GoRelease/go1.23.1/src/runtime/asm_arm64.s:1223 +0x4 fp=0x40000be7d0 sp=0x40000be7d0 pc=0xffff93013f24

What did you expect to see?

No crash.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Sep 14, 2024
@And-ZJ
Contributor Author

And-ZJ commented Sep 14, 2024

By the way, I also tested with commit f117d1c9b5951ab2456c1e512ac0423fcf3d7ada on the master branch.

When I run this command: CC=$LLVM/bin/clang CXX=$LLVM/bin/clang++ go build -a -ldflags="-extldflags=-fuse-ld=lld" -linkshared temp.go

it hangs, probably in an infinite loop. A partial backtrace:

#0  runtime.growslice (oldPtr=0x406c90c300, newLen=33, oldCap=32, num=1, et=0x6505c0, ~r0=...) at /usr1/go/src/runtime/slice.go:177
#1  0x00000000004743c4 in cmd/go/internal/load.PackageList.func1 (p=0x400036ac08) at /usr1/go/src/cmd/go/internal/load/pkg.go:2684
#2  0x000000000047437c in cmd/go/internal/load.PackageList.func1 (p=0x400007ec08) at /usr1/go/src/cmd/go/internal/load/pkg.go:2682
#3  0x000000000047437c in cmd/go/internal/load.PackageList.func1 (p=0x400007e008) at /usr1/go/src/cmd/go/internal/load/pkg.go:2682
#4  0x000000000047437c in cmd/go/internal/load.PackageList.func1 (p=0x400036b808) at /usr1/go/src/cmd/go/internal/load/pkg.go:2682
#5  0x000000000047437c in cmd/go/internal/load.PackageList.func1 (p=0x4000373808) at /usr1/go/src/cmd/go/internal/load/pkg.go:2682
#6  0x000000000047427c in cmd/go/internal/load.PackageList (roots=..., ~r0=...) at /usr1/go/src/cmd/go/internal/load/pkg.go:2687
#7  0x0000000000476620 in cmd/go/internal/load.setToolFlags (pkgs=...) at /usr1/go/src/cmd/go/internal/load/pkg.go:3118
#8  0x0000000000474b0c in cmd/go/internal/load.LoadPackageWithFlags (path=..., srcDir=..., stk=<optimized out>, importPos=..., mode=<optimized out>, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/load/pkg.go:2748
#9  0x00000000004c59a8 in cmd/go/internal/work.readpkglist (shlibpath=..., pkgs=...) at /usr1/go/src/cmd/go/internal/work/action.go:411
#10 0x00000000004c91a4 in cmd/go/internal/work.(*Builder).linkSharedAction.func1 (~r0=<optimized out>) at /usr1/go/src/cmd/go/internal/work/action.go:895
#11 0x00000000004c5f00 in cmd/go/internal/work.(*Builder).cacheAction (b=0x40001a0360, mode=..., p=0x0, f={void (cmd/go/internal/work.Action *)} 0x403dfb7028, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:424
#12 0x00000000004c8a88 in cmd/go/internal/work.(*Builder).linkSharedAction (b=0x40001a0360, mode=2, depMode=2, shlib=..., a1=<optimized out>, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:891
#13 0x00000000004c80c0 in cmd/go/internal/work.(*Builder).addTransitiveLinkDeps (b=0x40001a0360, a=0x406c8b78c0, a1=0x406c8b7760, shlib=...)
    at /usr1/go/src/cmd/go/internal/work/action.go:840
#14 0x00000000004c95b0 in cmd/go/internal/work.(*Builder).linkSharedAction.func1 (~r0=<optimized out>) at /usr1/go/src/cmd/go/internal/work/action.go:966
#15 0x00000000004c5f00 in cmd/go/internal/work.(*Builder).cacheAction (b=0x40001a0360, mode=..., p=0x0, f={void (cmd/go/internal/work.Action *)} 0x403dfb7498, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:424
#16 0x00000000004c8a88 in cmd/go/internal/work.(*Builder).linkSharedAction (b=0x40001a0360, mode=2, depMode=2, shlib=..., a1=<optimized out>, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:891
#17 0x00000000004c80c0 in cmd/go/internal/work.(*Builder).addTransitiveLinkDeps (b=0x40001a0360, a=0x406c8b7080, a1=0x406c8b6f20, shlib=...)
    at /usr1/go/src/cmd/go/internal/work/action.go:840
#18 0x00000000004c95b0 in cmd/go/internal/work.(*Builder).linkSharedAction.func1 (~r0=<optimized out>) at /usr1/go/src/cmd/go/internal/work/action.go:966
#19 0x00000000004c5f00 in cmd/go/internal/work.(*Builder).cacheAction (b=0x40001a0360, mode=..., p=0x0, f={void (cmd/go/internal/work.Action *)} 0x403dfb7908, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:424
#20 0x00000000004c8a88 in cmd/go/internal/work.(*Builder).linkSharedAction (b=0x40001a0360, mode=2, depMode=2, shlib=..., a1=<optimized out>, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:891
#21 0x00000000004c80c0 in cmd/go/internal/work.(*Builder).addTransitiveLinkDeps (b=0x40001a0360, a=0x406c7a6dc0, a1=0x406c7a6c60, shlib=...)
    at /usr1/go/src/cmd/go/internal/work/action.go:840
#22 0x00000000004c95b0 in cmd/go/internal/work.(*Builder).linkSharedAction.func1 (~r0=<optimized out>) at /usr1/go/src/cmd/go/internal/work/action.go:966
#23 0x00000000004c5f00 in cmd/go/internal/work.(*Builder).cacheAction (b=0x40001a0360, mode=..., p=0x0, f={void (cmd/go/internal/work.Action *)} 0x403dfb7d78, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:424
#24 0x00000000004c8a88 in cmd/go/internal/work.(*Builder).linkSharedAction (b=0x40001a0360, mode=2, depMode=2, shlib=..., a1=<optimized out>, ~r0=<optimized out>)
    at /usr1/go/src/cmd/go/internal/work/action.go:891
#25 0x00000000004c80c0 in cmd/go/internal/work.(*Builder).addTransitiveLinkDeps (b=0x40001a0360, a=0x406c7a6580, a1=0x406c7a6420, shlib=...)
    at /usr1/go/src/cmd/go/internal/work/action.go:840

@And-ZJ
Contributor Author

And-ZJ commented Sep 14, 2024

If I use the gold linker, it works correctly.

CC=$LLVM/bin/clang CXX=$LLVM/bin/clang++ go install -a -ldflags="-extldflags=-fuse-ld=gold" -buildmode=shared runtime sync/atomic
CC=$LLVM/bin/clang CXX=$LLVM/bin/clang++ go build -a -ldflags="-extldflags=-fuse-ld=gold" -linkshared temp.go
GODEBUG=clobberfree=1 ./temp

While building temp.go, the decodetypeGcmask function reads the type:error symbol data from libruntime,sync-atomic.so, treats the value stored in the GCData field as an address, and then reads the gcmask data from that address.

addr := decodetypeGcprogShlib(ctxt, symData)

Check the data of the type:error symbol in libruntime,sync-atomic.so:

$ nm $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so | grep  "type:error"
00000000003166c0 D type:error

$ objdump -s --start-address=0x3166c0  --stop-address=0x3166f0   $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so
objdump: Warning: Corrupt unit length (0x40000000) found in section .debug_info

/usr1/GoRelease/go1.23.1/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so:     file format elf64-littleaarch64

Contents of section .data.rel.ro:
 3166c0 10000000 00000000 10000000 00000000  ................
 3166d0 35ff0330 07080814 100b3400 00000000  5..0......4.....
 3166e0 38862800 00000000 810b0000 206e0100  8.(......... n..

$ objdump -s --start-address=0x288638  --stop-address=0x288648   $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so

Contents of section .rodata:
 288638 02000000 00000000 38000000 00000000  ........8.......

$ readelf -r $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so | grep 3166e0
0000003166e0  000000000403 R_AARCH64_RELATIV                    288638

The address of the GCData field is 0x3166e0, and the stored data is 0x288638. The gcmask obtained from the address 0x288638 is 0x02. This result is correct.

The relocation information (in the .rela.dyn section) also shows that 0x288638 is the correct address.
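The read described above can be reproduced by hand. The following is a minimal sketch, not the linker's actual code: the name gcdataField and the constant gcdataOffset are hypothetical, and the offset 0x20 is simply inferred from the dump (0x3166e0 - 0x3166c0). It decodes the GCData slot directly from the bytes that objdump printed for type:error in the gold-linked library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// gcdataOffset is the byte offset of the GCData slot inside the type
// descriptor. Assumption: taken from the dump above (0x3166e0 - 0x3166c0).
const gcdataOffset = 0x20

// gcdataField returns the raw little-endian 64-bit value stored in the
// GCData slot. It looks only at the section bytes and ignores relocations.
func gcdataField(typeBytes []byte) uint64 {
	return binary.LittleEndian.Uint64(typeBytes[gcdataOffset : gcdataOffset+8])
}

func main() {
	// Bytes of type:error copied from the gold-linked objdump output.
	goldTypeError := []byte{
		0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x35, 0xff, 0x03, 0x30, 0x07, 0x08, 0x08, 0x14, 0x10, 0x0b, 0x34, 0x00, 0x00, 0x00, 0x00, 0x00,
		0x38, 0x86, 0x28, 0x00, 0x00, 0x00, 0x00, 0x00, 0x81, 0x0b, 0x00, 0x00, 0x20, 0x6e, 0x01, 0x00,
	}
	fmt.Printf("GCData field: %#x\n", gcdataField(goldTypeError))
}
```

With the gold output the decoded value is 0x288638, which matches the relocation addend, so following it to .rodata yields the expected gcmask byte 0x02.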

@And-ZJ
Contributor Author

And-ZJ commented Sep 14, 2024

However, with the lld linker, the data look like this:

$ nm $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so | grep  "type:error"
000000000032aec0 D type:error

$ objdump -s --start-address=0x32aec0  --stop-address=0x32aef0   $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so

Contents of section .data.rel.ro:
 32aec0 10000000 00000000 10000000 00000000  ................
 32aed0 35ff0330 07080814 00000000 00000000  5..0............
 32aee0 08000000 00000000 810b0000 206e0100  ............ n..

$ readelf -r $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so | grep 32aee0
00000032aee0  000000000403 R_AARCH64_RELATIV                    1bdf38

$ objdump -s --start-address=0x1bdf38  --stop-address=0x1bdf48   $GOROOT/pkg/linux_arm64_dynlink/libruntime,sync-atomic.so

Contents of section .rodata:
 1bdf38 02000000 00000000 38000000 00000000  ........8.......

The address of the GCData field is 0x32aee0, but the value stored there is 0x08. 0x08 is not a valid address, so the gcmask data read through it is wrong.

The correct address, 0x1bdf38, appears only in the relocation information.

This appears to be a difference between the lld and gold linkers. I haven't found a document or specification saying that, in the RELA format, the value of the relocation's r_addend field must also be stored at the location being relocated. (I checked documents like this and page 1-21.)

So I think this may be undefined behavior, and the Go linker may depend on it.

So maybe we should modify the code to read the correct address from the relocation information?
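As a rough illustration of that idea (a sketch only, not a proposed patch to cmd/link), the helper below scans raw .rela.dyn contents for an R_AARCH64_RELATIVE entry that targets the GCData slot and returns its addend. The function name relativeAddend is hypothetical, and the single relocation entry is synthesized from the readelf line above rather than read from a real file.

```go
package main

import (
	"bytes"
	"debug/elf"
	"encoding/binary"
	"fmt"
)

// relativeAddend scans raw little-endian Elf64_Rela entries (the contents of
// a .rela.dyn section) for an R_AARCH64_RELATIVE relocation whose offset is
// fieldAddr, and returns that relocation's addend.
func relativeAddend(relaDyn []byte, fieldAddr uint64) (uint64, bool) {
	r := bytes.NewReader(relaDyn)
	for r.Len() > 0 {
		var rela elf.Rela64
		if err := binary.Read(r, binary.LittleEndian, &rela); err != nil {
			return 0, false // truncated entry
		}
		isRelative := elf.R_AARCH64(elf.R_TYPE64(rela.Info)) == elf.R_AARCH64_RELATIVE
		if rela.Off == fieldAddr && isRelative {
			return uint64(rela.Addend), true
		}
	}
	return 0, false
}

func main() {
	// One entry mirroring the readelf line for the lld-linked library:
	// offset 0x32aee0, info 0x403 (R_AARCH64_RELATIVE), addend 0x1bdf38.
	var buf bytes.Buffer
	entry := elf.Rela64{Off: 0x32aee0, Info: 0x403, Addend: 0x1bdf38}
	if err := binary.Write(&buf, binary.LittleEndian, entry); err != nil {
		panic(err)
	}

	if addr, ok := relativeAddend(buf.Bytes(), 0x32aee0); ok {
		fmt.Printf("gcdata address from relocation: %#x\n", addr)
	}
}
```

A lookup like this gives the same answer for both linkers, because it never depends on what bytes were left in the relocated slot. A linear scan is used here only for brevity.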

@And-ZJ And-ZJ changed the title cmd/link: function "decodetypeGcmask" read gcdata from shread object file may not correct when use lld link in linux/arm64 cmd/link: function "decodetypeGcmask" read gcdata from shared object file may not correct when use lld link in linux/arm64 Sep 14, 2024
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 16, 2024
@cagedmantis cagedmantis added this to the Backlog milestone Sep 16, 2024
@randall77
Contributor

Unfortunately I cannot reproduce this. What version of the llvm tools are you using?
I'm using Debian clang version 16.0.6 (26).

@And-ZJ
Contributor Author

And-ZJ commented Sep 28, 2024

Sorry, in the tests above my lld version was 15.0.4.

I did some more testing:


I downloaded clang+llvm-16.0.6-aarch64-linux-gnu.tar.xz from https://github.com/llvm/llvm-project/releases/tag/llvmorg-16.0.6

The problem can be reproduced in my ARM64 environment with Go 1.23.1 and LLVM 16.0.6.

My ARM64 machine runs Linux kernel 5.10.0.

However, the following warning is reported during the test:

/usr1/clang+llvm-16.0.6-aarch64-linux-gnu/bin/clang: /usr/lib64/libtinfo.so.6: no version information available (required by /usr1/clang+llvm-16.0.6-aarch64-linux-gnu/bin/clang)

It still generates a binary, so I ignored the warning.


I downloaded clang+llvm-15.0.4-x86_64-linux-gnu-rhel-8.4.tar.xz from https://github.com/llvm/llvm-project/releases/tag/llvmorg-15.0.4

The problem can also be reproduced in my x86_64 environment with Go 1.23.1 and LLVM 15.0.4.
My x86_64 machine runs Linux kernel 4.18.0.

@And-ZJ
Contributor Author

And-ZJ commented Sep 28, 2024

In addition, I ran a simple C test to compare lld and gold (using clang+llvm-16.0.6-aarch64-linux-gnu.tar.xz).

The code is example.c:

#include <stdio.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};
struct Node prev = {0xcc, NULL, NULL};
struct Node next = {0xbb, NULL, NULL};
struct Node head = {0xaa, &next, &prev};

Use gold:

$ export PATH="$LLVM/bin:$PATH"
$ clang -fuse-ld=gold  -shared -o libexample.so example.c && objdump -s -j.data libexample.so && readelf -r libexample.so

libexample.so:     file format elf64-littleaarch64

Contents of section .data:
 20010 10000200 00000000 cc000000 00000000  ................
 20020 00000000 00000000 00000000 00000000  ................
 20030 bb000000 00000000 00000000 00000000  ................
 20040 00000000 00000000 aa000000 00000000  ................
 20050 30000200 00000000 18000200 00000000  0...............
 
// ignore some items

000000020058  000800000101 R_AARCH64_ABS64   0000000000020018 prev + 0
000000020050  000a00000101 R_AARCH64_ABS64   0000000000020030 next + 0

You can see the values 0x20030 and 0x20018 stored at addresses 0x20050 and 0x20058 in the .data section.


Use lld:

$ export PATH="$LLVM/bin:$PATH"
$ clang -fuse-ld=lld  -shared -o libexample.so example.c && objdump -s -j.data libexample.so && readelf -r libexample.so

libexample.so:     file format elf64-littleaarch64

Contents of section .data:
 30890 00000000 00000000 cc000000 00000000  ................
 308a0 00000000 00000000 00000000 00000000  ................
 308b0 bb000000 00000000 00000000 00000000  ................
 308c0 00000000 00000000 aa000000 00000000  ................
 308d0 00000000 00000000 00000000 00000000  ................
 
// ignore some items

0000000308d8  000500000101 R_AARCH64_ABS64   0000000000030898 prev + 0
0000000308d0  000600000101 R_AARCH64_ABS64   00000000000308b0 next + 0

You can see that the data at addresses 0x308d0 and 0x308d8 in the .data section are zero.

Of course, in this test of the C code, the relocation type here is R_AARCH64_ABS64, not R_AARCH64_RELATIVE as above.
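To make the comparison concrete, here is a small Go model of how a RELA-style relocation is resolved at load time. It is a simplified sketch under stated assumptions, not real loader code: applyRela and loadBase are hypothetical names, and the load address is made up. The point it illustrates is that the slot is overwritten with a value computed from the relocation entry, so the bytes the static linker left there (an address for gold, zero for lld) are not consulted.

```go
package main

import "fmt"

// applyRela models resolving a RELA-style absolute relocation in this sketch:
// the slot is overwritten with loadBase + symValue + addend. The previous
// contents of the slot are never read.
func applyRela(slot *uint64, loadBase, symValue uint64, addend int64) {
	*slot = loadBase + symValue + uint64(addend)
}

func main() {
	const loadBase = 0x7f0000000000 // hypothetical load address

	goldSlot := uint64(0x20030) // gold pre-filled head.next with the address of next
	lldSlot := uint64(0)        // lld left the slot zeroed

	applyRela(&goldSlot, loadBase, 0x20030, 0) // R_AARCH64_ABS64 next + 0 (gold layout)
	applyRela(&lldSlot, loadBase, 0x308b0, 0)  // R_AARCH64_ABS64 next + 0 (lld layout)

	fmt.Printf("gold: %#x\nlld:  %#x\n", goldSlot, lldSlot)
}
```

In this model both libraries end up with head.next pointing at the loaded address of next, so the difference is visible only to a tool that reads the file contents without applying relocations.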
