Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lld/AArch64: handle more relocation addends #87328

Merged
merged 3 commits into from
Apr 11, 2024
Merged

Conversation

artagnon
Copy link
Contributor

@artagnon artagnon commented Apr 2, 2024

The function getImplicitAddend() is incomplete, as it is possible to cook up object files with other relocation addends. Although using llvm-mc or the clang integrated assembler does not produce such object files, a proprietary assembler known as armasm can:

https://developer.arm.com/documentation/101754/0622/armasm-Legacy-Assembler-Reference

armasm is in a frozen state, but it is still actively used in a lot of legacy codebases as the directives, macros and operators are very different from the clang integrated assembler. This makes porting a lot of legacy code from armasm syntax impractical for a lot of projects. Some internal testing of projects using open-source clang and lld fell over at link time when legacy armasm objects were included in the link.

The goal of this patch is to enable people with legacy armasm objects to be able to use lld as the linker. Sadly armasm uses SHT_REL format relocations for AArch64 rather than SHT_RELA, which causes lld to reject the objects. As a frozen project we know the small number of relocations that the assembler officially supports and won't include (outside the equivalent of the .reloc directive which I think we can rule out of scope as that is not commonly used).

The benefit to lld is that it will ease migration from a proprietary to an open-source toolchain. The drawback is the implementation of a small number of SHT_REL relocations. Although this patch doesn't aim to comprehensively cover all possible relocation addends, it does extend lld to work with the relocation addends that armasm produces, using the canonical aaelf64 document as a reference:

https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 2, 2024

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Ramkumar Ramachandra (artagnon)

Changes

The function getImplicitAddend() is woefully incomplete, as it is possible to cook up object files with ABS16, PREL16, ABS32, CALL26, and JUMP26 relocation addends. Although using llvm-mc does not produce such object files, other assemblers theoretically could. Although this patch doesn't aim to comprehensively cover all possible relocation addends, it does extend lld to work with the five mentioned relocation addends, using the canonical aaelf64 document as a reference:

https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst


Full diff: https://github.com/llvm/llvm-project/pull/87328.diff

6 Files Affected:

  • (modified) lld/ELF/Arch/AArch64.cpp (+7)
  • (added) lld/test/ELF/aarch64-reloc-abs16-addend.test (+38)
  • (added) lld/test/ELF/aarch64-reloc-abs32-addend.test (+38)
  • (added) lld/test/ELF/aarch64-reloc-call26-addend.test (+39)
  • (added) lld/test/ELF/aarch64-reloc-jump26-addend.test (+39)
  • (added) lld/test/ELF/aarch64-reloc-prel16-addend.test (+36)
diff --git a/lld/ELF/Arch/AArch64.cpp b/lld/ELF/Arch/AArch64.cpp
index 30ccd68f7b7506..0786b15d47c86d 100644
--- a/lld/ELF/Arch/AArch64.cpp
+++ b/lld/ELF/Arch/AArch64.cpp
@@ -217,6 +217,10 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_GLOB_DAT:
   case R_AARCH64_JUMP_SLOT:
     return 0;
+  case R_AARCH64_ABS16:
+  case R_AARCH64_PREL16:
+    return SignExtend64<16>(read16(buf));
+  case R_AARCH64_ABS32:
   case R_AARCH64_PREL32:
     return SignExtend64<32>(read32(buf));
   case R_AARCH64_ABS64:
@@ -225,6 +229,9 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_IRELATIVE:
   case R_AARCH64_TLS_TPREL64:
     return read64(buf);
+  case R_AARCH64_JUMP26:
+  case R_AARCH64_CALL26:
+    return read32(buf) & 0x0FFFFFFC >> 2;
   default:
     internalLinkerError(getErrorLocation(buf),
                         "cannot read addend for relocation " + toString(type));
diff --git a/lld/test/ELF/aarch64-reloc-abs16-addend.test b/lld/test/ELF/aarch64-reloc-abs16-addend.test
new file mode 100644
index 00000000000000..48023a970ac33e
--- /dev/null
+++ b/lld/test/ELF/aarch64-reloc-abs16-addend.test
@@ -0,0 +1,38 @@
+# REQUIRES: aarch64
+# RUN: yaml2obj %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-objdump -t %t.out | FileCheck %s
+
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000000000210120 g     F bar	0000000000000000 _start
+---
+!ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data: ELFDATA2LSB
+  Type: ET_REL
+  Machine: EM_AARCH64
+Sections:
+  - Name: bar
+    Type: SHT_PROGBITS
+    Flags: [SHF_ALLOC, SHF_EXECINSTR]
+  - Name: .debug_frame
+    Type: SHT_PROGBITS
+  - Name: .rel.debug_frame
+    Type: SHT_REL
+    Link: .symtab
+    Info: .debug_frame
+    Relocations:
+      - Symbol: .debug_frame
+        Type: R_AARCH64_ABS16
+Symbols:
+  - Name: .debug_frame
+    Type: STT_SECTION
+    Section: .debug_frame
+  - Name: bar
+    Type: STT_SECTION
+    Section: bar
+  - Name: _start
+    Type: STT_FUNC
+    Section: bar
+    Binding: STB_GLOBAL
diff --git a/lld/test/ELF/aarch64-reloc-abs32-addend.test b/lld/test/ELF/aarch64-reloc-abs32-addend.test
new file mode 100644
index 00000000000000..21627bb83a029a
--- /dev/null
+++ b/lld/test/ELF/aarch64-reloc-abs32-addend.test
@@ -0,0 +1,38 @@
+# REQUIRES: aarch64
+# RUN: yaml2obj %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-objdump -t %t.out | FileCheck %s
+
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000000000210120 g     F bar	0000000000000000 _start
+---
+!ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data: ELFDATA2LSB
+  Type: ET_REL
+  Machine: EM_AARCH64
+Sections:
+  - Name: bar
+    Type: SHT_PROGBITS
+    Flags: [SHF_ALLOC, SHF_EXECINSTR]
+  - Name: .debug_frame
+    Type: SHT_PROGBITS
+  - Name: .rel.debug_frame
+    Type: SHT_REL
+    Link: .symtab
+    Info: .debug_frame
+    Relocations:
+      - Symbol: .debug_frame
+        Type: R_AARCH64_ABS32
+Symbols:
+  - Name: .debug_frame
+    Type: STT_SECTION
+    Section: .debug_frame
+  - Name: bar
+    Type: STT_SECTION
+    Section: bar
+  - Name: _start
+    Type: STT_FUNC
+    Section: bar
+    Binding: STB_GLOBAL
diff --git a/lld/test/ELF/aarch64-reloc-call26-addend.test b/lld/test/ELF/aarch64-reloc-call26-addend.test
new file mode 100644
index 00000000000000..94f4f13511bcfa
--- /dev/null
+++ b/lld/test/ELF/aarch64-reloc-call26-addend.test
@@ -0,0 +1,39 @@
+# REQUIRES: aarch64
+# RUN: yaml2obj %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-objdump -t %t.out | FileCheck %s
+
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000000000210120 g     F foo	0000000000000000 _start
+---
+!ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data: ELFDATA2LSB
+  Type: ET_REL
+  Machine: EM_AARCH64
+Sections:
+  - Name: bar
+    Type: SHT_PROGBITS
+    Flags: [SHF_ALLOC, SHF_EXECINSTR]
+  - Name: foo
+    Type: SHT_PROGBITS
+    Flags: [SHF_ALLOC, SHF_EXECINSTR]
+  - Name: .relfoo
+    Type: SHT_REL
+    Link: .symtab
+    Info: foo
+    Relocations:
+      - Symbol: bar
+        Type: R_AARCH64_CALL26
+Symbols:
+  - Name: bar
+    Type: STT_SECTION
+    Section: bar
+  - Name: foo
+    Type: STT_SECTION
+    Section: bar
+  - Name: _start
+    Type: STT_FUNC
+    Section: foo
+    Binding: STB_GLOBAL
diff --git a/lld/test/ELF/aarch64-reloc-jump26-addend.test b/lld/test/ELF/aarch64-reloc-jump26-addend.test
new file mode 100644
index 00000000000000..ae1a13786e5603
--- /dev/null
+++ b/lld/test/ELF/aarch64-reloc-jump26-addend.test
@@ -0,0 +1,39 @@
+# REQUIRES: aarch64
+
+# RUN: yaml2obj %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-objdump -t %t.out | FileCheck %s
+
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000000000210120 g     F foo	0000000000000000 _start
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_REL
+  Machine:         EM_AARCH64
+Sections:
+  - Name:            bar
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+  - Name:            foo
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+  - Name:            .relfoo
+    Type:            SHT_REL
+    Link:            .symtab
+    Info:            foo
+    Relocations:
+      - Symbol:        bar
+        Type:          R_AARCH64_JUMP26
+Symbols:
+  - Name:            bar
+    Type:            STT_SECTION
+    Section:         bar
+  - Name:            foo
+    Type:            STT_SECTION
+    Section:         bar
+  - Name:            _start
+    Type:            STT_FUNC
+    Section:         foo
+    Binding:         STB_GLOBAL
diff --git a/lld/test/ELF/aarch64-reloc-prel16-addend.test b/lld/test/ELF/aarch64-reloc-prel16-addend.test
new file mode 100644
index 00000000000000..e129268276541a
--- /dev/null
+++ b/lld/test/ELF/aarch64-reloc-prel16-addend.test
@@ -0,0 +1,36 @@
+# REQUIRES: aarch64
+# RUN: yaml2obj %s -o %t.o
+# RUN: ld.lld %t.o -o %t.out
+# RUN: llvm-objdump -t %t.out | FileCheck %s
+
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000000000210158 g     F .text	0000000000000000 _start
+---
+!ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data: ELFDATA2LSB
+  Type: ET_REL
+  Machine: EM_AARCH64
+Sections:
+  - Name: .text
+    Type: SHT_PROGBITS
+    Flags: [SHF_ALLOC, SHF_EXECINSTR]
+  - Name: .data
+    Type: SHT_PROGBITS
+    Flags: [SHF_WRITE, SHF_ALLOC]
+  - Name: .rel.text
+    Type: SHT_REL
+    Info: .text
+    Relocations:
+      - Symbol: .data
+        Offset: 4
+        Type: R_AARCH64_PREL16
+Symbols:
+  - Name: .data
+    Type: STT_SECTION
+    Section: .data
+  - Name: _start
+    Type: STT_FUNC
+    Section: .text
+    Binding: STB_GLOBAL

@MaskRay
Copy link
Member

MaskRay commented Apr 3, 2024

Although using llvm-mc does not produce such object files, other assemblers theoretically could.

I'm intrigued by this statement. Which assembler do you use?

The Binutils status is that REL support is incomplete for AArch64 (and many other ports): https://sourceware.org/pipermail/binutils/2024-March/133216.html

If you prefer REL for its smaller size, you might find CREL useful.

I sincerely hope that folks' interest will help us get CREL standardized at generic-abi and move forward with the LLVM implementation.

CREL supports the implicit addend mode, but implementing --emit-relocs/-r requires some work, so I may not want to add assembler support for implicit addends.

@artagnon
Copy link
Contributor Author

artagnon commented Apr 3, 2024

I'm intrigued by this statement. Which assembler do you use?

We have a proprietary in-house assembler that we used to distribute with an older version of our compiler toolchain. While all the information on CRELs is quite useful, and may be worth pursuing, it unfortunately doesn't fix our case: we have a lot of object files built using the (deprecated) in-house assembler lying around, and we're trying to get lld to link them.

The motivation for my patch would be to migrate people away from deprecated assemblers gradually, and not to support constructs in lld to minimize code size.

@smithp35
Copy link
Collaborator

smithp35 commented Apr 3, 2024

To add a bit more context. The proprietary assembler is armasm https://developer.arm.com/documentation/101754/0622/armasm-Legacy-Assembler-Reference

As implied by the documentation title it is in a frozen state, but it is still actively used in a lot of legacy code-bases as the directives, macros and operators are very different from the clang integrated assembler. This makes porting a lot of legacy code from armasm syntax impractical for a lot of projects. Some internal testing of projects using open-source clang and lld fell over at link time when legacy armasm objects were included in the link.

Our goal is to enable people with legacy armasm objects to be able to use lld as the linker. Sadly armasm uses SHT_REL format relocations for AArch64 rather than SHT_RELA, which causes lld to reject the objects. As a frozen project we know the small number of relocations that the assembler officially supports and won't include (outside the equivalent of the .reloc directive which I think we can rule out of scope as that is not commonly used).

The benefit to lld (or the llvm ecosystem) is that it will ease migration from a proprietary to an open-source toolchain. The drawback is the implementation of a small number of SHT_REL relocations.

lld/test/ELF/aarch64-reloc-abs16-addend.test Outdated Show resolved Hide resolved
lld/test/ELF/aarch64-reloc-abs16-addend.test Outdated Show resolved Hide resolved
lld/test/ELF/aarch64-reloc-abs32-addend.test Outdated Show resolved Hide resolved
@MaskRay
Copy link
Member

MaskRay commented Apr 4, 2024

To add a bit more context. The proprietary assembler is armasm developer.arm.com/documentation/101754/0622/armasm-Legacy-Assembler-Reference

As implied by the documentation title it is in a frozen state, but it is still actively used in a lot of legacy code-bases as the directives, macros and operators are very different from the clang integrated assembler. This makes porting a lot of legacy code from armasm syntax impractical for a lot of projects. Some internal testing of projects using open-source clang and lld fell over at link time when legacy armasm objects were included in the link.

Our goal is to enable people with legacy armasm objects to be able to use lld as the linker. Sadly armasm uses SHT_REL format relocations for AArch64 rather than SHT_RELA, which causes lld to reject the objects. As a frozen project we know the small number of relocations that the assembler officially supports and won't include (outside the equivalent of the .reloc directive which I think we can rule out of scope as that is not commonly used).

The benefit to lld (or the llvm ecosystem) is that it will ease migration from a proprietary to an open-source toolchain. The drawback is the implementation of a small number of SHT_REL relocations.

Thanks for the context. Since the comment is public information, it seems useful to modify the patch description to include the context.

@artagnon
Copy link
Contributor Author

artagnon commented Apr 4, 2024

Thanks for the review @MaskRay. I've fixed all the issues, and added all the addends possibly reachable from armasm.

lld/test/ELF/aarch64-reloc-abs-addend.test Outdated Show resolved Hide resolved
lld/test/ELF/aarch64-reloc-abs-addend.test Outdated Show resolved Hide resolved
MaskRay added a commit to MaskRay/llvm-project that referenced this pull request Apr 5, 2024
@MaskRay
Copy link
Member

MaskRay commented Apr 5, 2024

I've created #87733 as base tests. You can change this PR to show the llvm-objdump -s %t | FileCheck %s difference.

lld/ELF/Arch/AArch64.cpp Outdated Show resolved Hide resolved
MaskRay added a commit that referenced this pull request Apr 5, 2024
to support certain static relocations in the REL format. See #87328 for
the armasm need.

Note: `R_AARCH64_{ABS64,PREL32,PREL64}` have been implemented by https://reviews.llvm.org/D120535

Pull Request: #87733
The function getImplicitAddend() is incomplete, as it is possible to
cook up object files with other relocation addends. Although using
llvm-mc or the clang integrated assembler does not produce such object
files, a proprietary assembler known as armasm can:

  https://developer.arm.com/documentation/101754/0622/armasm-Legacy-Assembler-Reference

armasm is in a frozen state, but it is still actively used in a lot of
legacy codebases as the directives, macros and operators are very
different from the clang integrated assembler. This makes porting a lot
of legacy code from armasm syntax impractical for a lot of projects.
Some internal testing of projects using open-source clang and lld fell
over at link time when legacy armasm objects were included in the link.

The goal of this patch is to enable people with legacy armasm objects to
be able to use lld as the linker. Sadly armasm uses SHT_REL format
relocations for AArch64 rather than SHT_RELA, which causes lld to reject
the objects. As a frozen project we know the small number of relocations
that the assembler officially supports and won't include (outside the
equivalent of the .reloc directive which I think we can rule out of
scope as that is not commonly used).

The benefit to lld is that it will ease migration from a proprietary to
an open-source toolchain. The drawback is the implementation of a small
number of SHT_REL relocations. Although this patch doesn't aim to
comprehensively cover all possible relocation addends, it does extend
lld to work with the relocation addends that armasm produces, using the
canonical aaelf64 document as a reference:

  https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst
@artagnon
Copy link
Contributor Author

artagnon commented Apr 5, 2024

Thanks a lot for the expertly-written test @MaskRay! I've now rebased, and the final patch is up along with all the corrections.

@artagnon artagnon requested a review from MaskRay April 6, 2024 11:36
@artagnon artagnon requested a review from MaskRay April 10, 2024 12:29
@artagnon artagnon merged commit cd14e71 into llvm:main Apr 11, 2024
4 checks passed
@artagnon artagnon deleted the lld-addend branch April 11, 2024 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants