Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid relocation in ubsan_signals_standalone.cpp on FreeBSD/aarch64 #63418

Closed
VoxSciurorum opened this issue Jun 20, 2023 · 16 comments · Fixed by llvm/llvm-project-release-prs#632

Comments

@VoxSciurorum
Copy link

VoxSciurorum commented Jun 20, 2023

My build of LLVM main on FreeBSD AArch64 fails:

[ 20%] Linking CXX shared library /usr/obj/llvm/llvm-17/lib/clang/17/lib/aarch64-unknown-freebsd13.2/libclang_rt.ubsan_standalone.so
ld: error: CMakeFiles/RTUbsan_standalone.aarch64.dir/ubsan_signals_standalone.cpp.o:(function __ubsan::InitializeDeadlySignals(): .text._ZN7__ubsan23InitializeDeadlySignalsEv+0x78): improper alignment for relocation R_AARCH64_LD64_GOT_LO12_NC: 0x5E73C is not aligned to 8 bytes

The problem is apparently this line of assembly:

        ldr     x2, [x2, :got_lo12:sigaction]

This is an 8 byte load and the relocation expects it to be aligned to 8 bytes. Earlier in the file I see in the .text segment

.set sigaction, __interceptor_trampoline_sigaction
        .globl  __interceptor_trampoline_sigaction
        .type   __interceptor_trampoline_sigaction,@function
__interceptor_trampoline_sigaction:
        .cfi_startproc
        b       __interceptor_sigaction
        .cfi_endproc
.Ltmp1: 
        .size   __interceptor_trampoline_sigaction, .Ltmp1-__interceptor_trampoline_sigaction   

There is no explicit alignment and the function happens to land at an odd multiple of 4 bytes.

Compile the attached file with

clang++ --target=aarch64-unknown-freebsd13.2 -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -O3 -std=c++17 -march=armv8-a -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fno-stack-protector -fno-sanitize=safe-stack -fvisibility=hidden -fno-lto -gline-tables-only -Wno-gnu -Wno-variadic-macros -Wno-c99-extensions -nostdinc++ -fno-rtti -S ubsan_signals_standalone.ii

ubsan_signals_standalone.txt

@VoxSciurorum
Copy link
Author

This failure was introduced by commit 74b0ac5.

@brooksdavis brooksdavis added this to the LLVM 17.0.X Release milestone Aug 7, 2023
@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 7, 2023

This is an 8 byte load and the relocation expects it to be aligned to 8 bytes. Earlier in the file I see in the .text segment

But it's using :got_lo12: not :lo12:. Somehow the GOT entry is only 4-byte aligned, which is strange, because that's linker-generated.

freebsd-git pushed a commit to freebsd/freebsd-ports that referenced this issue Aug 9, 2023
For changes see:
	https://discourse.llvm.org/t/llvm-17-0-0-rc2-released/72658

COMPILER_RT is currently disabled for aarch64 as it is broken due to
llvm/llvm-project#63418.  Hopefully a fix will
be found before the full release or I'll attempt use a smaller hammer
that only disables ubsan.
@tru
Copy link
Collaborator

tru commented Aug 21, 2023

Is this something that still should be fixed for 17.x?

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2023

Ideally, if it's still broken, but it's not clear from the error what's broken, because it doesn't make sense...

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2023

Reproducible with cmake -G Ninja ../runtimes -DLLVM_ENABLE_RUNTIMES="compiler-rt" on ref13-aarch64.freebsd.org running 13.2-STABLE building release/17.x, including using LLD built from that branch.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2023

The problem is a result of dodginess in the AArch64 assembler, shared with GNU as (either overfitting LLVM IAS to GNU as or both repeating the same mistake). The short summary is that:

    adrp x0, :got:sym
    ldr x0, [x0, :got_lo12:sym]

will use a section symbol for the GOT relocations against sym if sym is a local symbol. But this doesn't work if the addend is non-zero when linking with LLD, and from skimming GNU ld I don't think it works there either. Rather than being G(GDAT(S+A)) (as specified by aaelf64, but differing from aaelf32 where it's GOT(S)+A-GOT) LLD gives G(GDAT(S))+A and so end up trying to offset from the GOT slot in question. Sometimes that happens to land you at a multiple of 8 (if the offset is), and you'll end up reading some garbage from wherever you happen to point, and sometimes that will not be a multiple of 8 (as seen here where the addend is 4) and give you a linker error. GNU ld instead seems to just ignore the addend in my testing.

This situation seems like a complete mess in the AArch64 ABI; technically the assembler isn't at fault, since it's requesting something that the ABI says should work, but neither toolchain's linker implementation of that ABI does that. I'd be inclined to say that the AArch64 ABI should be amended to document the actual linker behaviour, possibly reserving a non-zero addend completely rather than specifying it as either behaviour (that way linkers can give an error on this kind of input), and altering both assemblers to not exploit this part of the ABI and instead always use actual symbols for GOT relocations.

https://godbolt.org/z/Goh4farKf shows the relocation produced.

For a simple test program to run and examine the linker behaviour compile the following as a PIE:

#include <stdio.h>

#define	STR(x)	#x
#define	XSTR(x)	STR(x)

extern int x, y;

#define	OFFSET	4

__asm__(
"	.section	.data, \"aw\", %progbits\n"
"	.global	x\n"
"	.zero	" XSTR(OFFSET) "\n"
"x:\n"
"y:\n"
"	.4byte	42\n"
);

int
main(int argc, char **argv)
{
	printf("&x: %p\n", &x);
	printf("&y: %p\n", &y);
	printf("x: %d\n", x);
	printf("y: %d\n", y);
	return (0);
}

This models what's going on in compiler-rt, and shows that y's address is always the start of the section with GNU ld, but doesn't link with LLD, unless you make OFFSET a multiple of 8, at which point it reads garbage (for non-zero OFFSET). For example, binutils 2.41 on Debian sid as of today-ish gives:

(sid_arm64-dchroot)jrtc27@amdahl:~$ gcc -o test test.c -fPIE -pie
(sid_arm64-dchroot)jrtc27@amdahl:~$ ./test
&x: 0xaaaab92d003c
&y: 0xaaaab92d0038
x: 42
y: 0

whilst LLD 15.0.7 on FreeBSD 13.2-STABLE gives:

[jrtc27@ref13-aarch64 ~]$ clang -o test test.c -fPIE -pie
[jrtc27@ref13-aarch64 ~]$ ./test
&x: 0x3ba4b8ba0d88
&y: 0x0
x: 42
Segmentation fault (core dumped)

TL;DR: Everything is broken in different ways, nothing agrees and it all goes pear-shaped when symbol-declaring assembly is mixed with C. Someone from Arm needs to decide what they want to do about it. For compiler-rt, the workaround would be to make these symbols global or something that means the assembler can't turn them into being section-relative. Attempting to actually use this on Linux with GNU ld should be totally hosed, too, you just silently get a garbage binary out of GNU ld.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2023

Tagging @smithp35 for the broken ABI implementations and/or unintended spec

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 21, 2023

@llvm/issue-subscribers-lld-elf

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 21, 2023

@llvm/issue-subscribers-backend-aarch64

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2023

Attempting to actually use this on Linux with GNU ld should be totally hosed, too, you just silently get a garbage binary out of GNU ld.

Ah, no, FreeBSD and NetBSD are uniquely affected because __ASM_WEAK_WRAPPER is a no-op there but expands to .weak sym on other OSes. The .weak directive stops it being a local symbol. Perhaps the right workaround is to use .global on FreeBSD and NetBSD rather than nothing.

@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 21, 2023

Indeed:

diff --git a/compiler-rt/lib/interception/interception.h b/compiler-rt/lib/interception/interception.h
index 078d33b61..069f73d27 100644
--- a/compiler-rt/lib/interception/interception.h
+++ b/compiler-rt/lib/interception/interception.h
@@ -181,7 +181,7 @@ const interpose_substitution substitution_##func_name[]             \
 // FreeBSD's dynamic linker (incompliantly) gives non-weak symbols higher
 // priority than weak ones so weak aliases won't work for indirect calls
 // in position-independent (-fPIC / -fPIE) mode.
-#   define __ASM_WEAK_WRAPPER(func)
+#   define __ASM_WEAK_WRAPPER(func) ".globl " #func "\n"
 #  else
 #   define __ASM_WEAK_WRAPPER(func) ".weak " #func "\n"
 #  endif  // SANITIZER_FREEBSD || SANITIZER_NETBSD

works, and is probably the right thing to be doing anyway regardless of this toolchain/spec bug.

@smithp35
Copy link
Collaborator

@jrtc27 thanks for the tag. I've raised an ABI issue ARM-software/abi-aa#217 , feel free to add any additional comments/clarifications to that. It may be that this turns out to be a pair of toolchain bugs, but will need to discuss with the GNU folks to see what is possible in GNU ld.

@MaskRay
Copy link
Member

MaskRay commented Aug 22, 2023

Ah, no, FreeBSD and NetBSD are uniquely affected because __ASM_WEAK_WRAPPER is a no-op there but expands to .weak sym on other OSes. The .weak directive stops it being a local symbol. Perhaps the right workaround is to use .global on FreeBSD and NetBSD rather than nothing.

FreeBSD hasn't migrated to the compliant behavior (post glibc 2.2) LD_DYNAMIC_WEAK=0 yet: https://reviews.freebsd.org/D26352.

Once it does, we can change __ASM_WEAK_WRAPPER to be similar to Linux and remove other deviance.

-# define __ASM_WEAK_WRAPPER(func)
+# define __ASM_WEAK_WRAPPER(func) ".globl " #func "\n"

This change looks good to me. GNU assembler and LLVM integrated assembler don't convert a GOT generating relocation against non-local symbol to the section symbol.

@jrtc27 thanks for the tag. I've raised an ABI issue ARM-software/abi-aa#217 , feel free to add any additional comments/clarifications to that. It may be that this turns out to be a pair of toolchain bugs, but will need to discuss with the GNU folks to see what is possible in GNU ld.

Thanks for filing ARM-software/abi-aa#217 .
Whatever the ABI decision is, I think it will make sense to change the assembler to suppress local symbol to STT_SECTION conversion, so that we will get:

        .data
        .globl  x
        .zero   4
x:
y:
        .long   42

        .text
        adrp    x1, :got:y
        ldr     x1, [x1, :got_lo12:x]   // got for x
        adrp    x2, :got:y
        ldr     x2, [x2, :got_lo12:y]   // got for y

Then, the ABI difference largely doesn't affect the code above.

In the integrated assembler, ELFObjectWriter::shouldRelocateWithSymbol suppresses local symbol to section symbol conversion
if MCSymbolRefExpr::Kind uses a GOT relocation related variant kind.
However, AArch64 uses MCValue::RefKind instead of MCSymbolRefExpr::Kind to track the GOT relocation, therefore the issue stayed unknown for a long time.

(
On x86-64, there is even a trick that works just because GOT entries are related to S instead of S+A.

movl x@GOTPCREL+4(%rip) // R_X86_64_GOTPCRELX with an addend of 0 instead of -4

https://sourceware.org/bugzilla/show_bug.cgi?id=26939
)

@jrtc27 jrtc27 closed this as completed in 7e1afab Aug 22, 2023
@jrtc27
Copy link
Collaborator

jrtc27 commented Aug 22, 2023

/cherry-pick 7e1afab

@llvmbot
Copy link
Collaborator

llvmbot commented Aug 22, 2023

/branch llvm/llvm-project-release-prs/issue63418

llvmbot pushed a commit to llvm/llvm-project-release-prs that referenced this issue Aug 22, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm/llvm-project#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552

(cherry picked from commit 7e1afab1b1821550c5f8d0d6a50636236fa02e2c)
@EugeneZelenko EugeneZelenko reopened this Aug 22, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Aug 22, 2023

/pull-request llvm/llvm-project-release-prs#632

MaskRay added a commit to MaskRay/llvm-project that referenced this issue Aug 23, 2023
…relocations

Assemblers change certain relocations referencing a local symbol to
reference the section symbol instead. This conversion is disabled for
many conditions (`shouldRelocateWithSymbol`), e.g. TLS symbol, for most
targets (including AArch32, x86, PowerPC, and RISC-V) GOT-generating
relocations.

However, AArch64 encodes the GOT-generating intent in MCValue::RefKind
instead of MCSymbolRef::Kind (see commit
0999cbd (2014)), therefore not affected
by the code `case MCSymbolRefExpr::VK_GOT:`. Therefore, GOT relocations
referencing two local symbols may share the same GOT entry after linking
(GNU ld, ld.lld), which is not expected:
```
ldr     x1, [x1, :got_lo12:x]  // converted to .data+0
ldr     x1, [x1, :got_lo12:y]  // converted to .data+4

.data
// .globl x, y  would suppress STT_SECTION conversion
x:
.zero 4
y:
.long 42
```

This patch changes AArch64 to suppress local symbol to STT_SECTION
conversion for GOT relocations, matching most other targets. x and y
will use different GOT entries, which IMO is the most sensable behavior.

With this change, the ABI decision on ARM-software/abi-aa#217
will only affect relocations explicitly referencing STT_SECTION symbols, e.g.
```
ldr     x1, [x1, :got_lo12:(.data+0)]
ldr     x1, [x1, :got_lo12:(.data+4)]
// I consider this unreasonable uses
```

IMO all reasonable use cases are unaffected.

Link: llvm#63418
GNU assembler PR: https://sourceware.org/bugzilla/show_bug.cgi?id=30788

Differential Revision: https://reviews.llvm.org/D158577
tru pushed a commit to llvm/llvm-project-release-prs that referenced this issue Aug 25, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm/llvm-project#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552

(cherry picked from commit 7e1afab1b1821550c5f8d0d6a50636236fa02e2c)
MaskRay added a commit that referenced this issue Aug 29, 2023
…relocations

Assemblers change certain relocations referencing a local symbol to
reference the section symbol instead. This conversion is disabled for
many conditions (`shouldRelocateWithSymbol`), e.g. TLS symbol, for most
targets (including AArch32, x86, PowerPC, and RISC-V) GOT-generating
relocations.

However, AArch64 encodes the GOT-generating intent in MCValue::RefKind
instead of MCSymbolRef::Kind (see commit
0999cbd (2014)), therefore not affected
by the code `case MCSymbolRefExpr::VK_GOT:`. As GNU ld and ld.lld
create GOT entries based on the symbol, ignoring addend, the two ldr
instructions will share the same GOT entry, which is not expected:
```
ldr     x1, [x1, :got_lo12:x]  // converted to .data+0
ldr     x1, [x1, :got_lo12:y]  // converted to .data+4

.data
// .globl x, y  would suppress STT_SECTION conversion
x:
.zero 4
y:
.long 42
```

This patch changes AArch64 to suppress local symbol to STT_SECTION
conversion for GOT relocations, matching most other targets. x and y
will use different GOT entries, which IMO is the most sensable behavior.

With this change, the ABI decision on ARM-software/abi-aa#217
will only affect relocations explicitly referencing STT_SECTION symbols, e.g.
```
ldr     x1, [x1, :got_lo12:(.data+0)]
ldr     x1, [x1, :got_lo12:(.data+4)]
// I consider this unreasonable uses
```

IMO all reasonable use cases are unaffected.

Link: #63418
GNU assembler PR: https://sourceware.org/bugzilla/show_bug.cgi?id=30788

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D158577
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 2, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 2, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 2, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 3, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 3, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 6, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
razmser pushed a commit to SuduIDE/llvm-project that referenced this issue Oct 11, 2023
On FreeBSD and NetBSD we don't use .weak due to differing semantics.
Currently we end up using no directive, which gives a local symbol,
whereas the closer thing to a weak symbol would be a global one. In
particular, both GNU and LLVM toolchains cannot handle a GOT-indirect
reference to a local symbol at a non-zero offset within a section on
AArch64 (see ARM-software/abi-aa#217), and so
interceptors do not work on FreeBSD/arm64, failing to link with LLD.
Switching to .globl both works around this bug and more closely aligns
such non-weak platforms with weak ones.

Fixes llvm#63418

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158552
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

8 participants