Skip to content

[BOLT] Fix atypical TLSDESC to IE relaxation#197477

Open
awshkulkar wants to merge 2 commits into
llvm:mainfrom
awshkulkar:plt-syms
Open

[BOLT] Fix atypical TLSDESC to IE relaxation#197477
awshkulkar wants to merge 2 commits into
llvm:mainfrom
awshkulkar:plt-syms

Conversation

@awshkulkar
Copy link
Copy Markdown
Contributor

When mold transforms TLSDESC to IE, its ADRP lands on slot 3 (add) with stale reloc R_AARCH64_TLSDESC_ADD_LO12. This requires to be handled in Symbolizer, ADRP needs to have right PAGE of GOT slot where IE offset is stored.

LLD does not have this issue since the ADRP lands at R_AARCH64_TLSDESC_ADR_PAGE21 which gets correctly handled with __BOLT_got_zero symbol by happenstance since Rel.Value for this reloc is actually GOT slot address.

A TLSDESC stub looks like this:
adrp x0, :tlsdesc:var ; R_AARCH64_TLSDESC_ADR_PAGE21
ldr x1, [x0, :tlsdesc:var] ; R_AARCH64_TLSDESC_LD64_LO12
add x0, x0, :tlsdesc:var ; R_AARCH64_TLSDESC_ADD_LO12
blr x1 ; R_AARCH64_TLSDESC_CALL

LLD creates TLSDESC to IE relaxation as
Slot 1: TLSDESC_ADR_PAGE21 → ADRP x0, :gottprel:
Slot 2: TLSDESC_LD64_LO12 → LDR x0, [x0, :gottprel_lo12:]
Slot 3: TLSDESC_ADD_LO12 → NOP
Slot 4: TLSDESC_CALL → NOP

But mold does it as:
Slot 1: TLSDESC_ADR_PAGE21 → NOP
Slot 2: TLSDESC_LD64_LO12 → NOP
Slot 3: TLSDESC_ADD_LO12 → ADRP x0, :gottprel: <-- Crash
Slot 4: TLSDESC_CALL → LDR x0, [x0, ...]

Fix by returning nullopt from adjustRelocation() for TLSDESC_ADD_LO12 on ADRP, then symbolizing the ADRP in tryAddingSymbolicOperand() to have the right PC-relative offset symbol (+offset) of the GOT slot.

Refer to #197274 for details.

@llvmorg-github-actions
Copy link
Copy Markdown

@llvm/pr-subscribers-bolt

Author: Hemant Kulkarni (awshkulkar)

Changes

When mold transforms TLSDESC to IE, its ADRP lands on slot 3 (add) with stale reloc R_AARCH64_TLSDESC_ADD_LO12. This requires to be handled in Symbolizer, ADRP needs to have right PAGE of GOT slot where IE offset is stored.

LLD does not have this issue since the ADRP lands at R_AARCH64_TLSDESC_ADR_PAGE21 which gets correctly handled with __BOLT_got_zero symbol by happenstance since Rel.Value for this reloc is actually GOT slot address.

A TLSDESC stub looks like this:
adrp x0, :tlsdesc:var ; R_AARCH64_TLSDESC_ADR_PAGE21
ldr x1, [x0, :tlsdesc:var] ; R_AARCH64_TLSDESC_LD64_LO12
add x0, x0, :tlsdesc:var ; R_AARCH64_TLSDESC_ADD_LO12
blr x1 ; R_AARCH64_TLSDESC_CALL

LLD creates TLSDESC to IE relaxation as
Slot 1: TLSDESC_ADR_PAGE21 → ADRP x0, :gottprel:
Slot 2: TLSDESC_LD64_LO12 → LDR x0, [x0, :gottprel_lo12:]
Slot 3: TLSDESC_ADD_LO12 → NOP
Slot 4: TLSDESC_CALL → NOP

But mold does it as:
Slot 1: TLSDESC_ADR_PAGE21 → NOP
Slot 2: TLSDESC_LD64_LO12 → NOP
Slot 3: TLSDESC_ADD_LO12 → ADRP x0, :gottprel: <-- Crash
Slot 4: TLSDESC_CALL → LDR x0, [x0, ...]

Fix by returning nullopt from adjustRelocation() for TLSDESC_ADD_LO12 on ADRP, then symbolizing the ADRP in tryAddingSymbolicOperand() to have the right PC-relative offset symbol (+offset) of the GOT slot.

Refer to #197274 for details.


Full diff: https://github.com/llvm/llvm-project/pull/197477.diff

2 Files Affected:

  • (modified) bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp (+13)
  • (added) bolt/test/AArch64/tls-desc-ie-relaxation.s (+49)
diff --git a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
index 7bbfb1429e37b..39c2b3a2437da 100644
--- a/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCSymbolizer.cpp
@@ -52,6 +52,14 @@ bool AArch64MCSymbolizer::tryAddingSymbolicOperand(
       return true;
     }
 
+    if (BC.MIB->isADRP(Inst)) {
+      uint64_t TargetPage = (InstAddress & ~0xFFFULL) + (Value << 12);
+      auto [Sym, Offset] =
+          BC.handleAddressRef(TargetPage, Function, true);
+      addOperand(Sym, Offset, 0);
+      return true;
+    }
+
     LLVM_DEBUG(dbgs() << "BOLT-DEBUG: ignoring relocation at 0x"
                       << Twine::utohexstr(InstAddress) << '\n');
   }
@@ -112,6 +120,11 @@ AArch64MCSymbolizer::adjustRelocation(const Relocation &Rel,
     }
   }
 
+  // Handle TLSDESC -> IE relaxation where the linker places ADRP at the
+  // ADD slot (e.g. mold). The TLSDESC_ADD_LO12 relocation is stale.
+  if (BC.MIB->isADRP(Inst) && Rel.Type == ELF::R_AARCH64_TLSDESC_ADD_LO12)
+    return std::nullopt;
+
   if (!Relocation::isGOT(Rel.Type))
     return Rel;
 
diff --git a/bolt/test/AArch64/tls-desc-ie-relaxation.s b/bolt/test/AArch64/tls-desc-ie-relaxation.s
new file mode 100644
index 0000000000000..d475e930a7442
--- /dev/null
+++ b/bolt/test/AArch64/tls-desc-ie-relaxation.s
@@ -0,0 +1,49 @@
+# REQUIRES: system-linux
+#
+# Check that BOLT handles TLSDESC -> IE relaxation with stale
+# R_AARCH64_TLSDESC_ADD_LO12 on an ADRP (mold linker layout), while
+# not regressing the normal GOTTPREL ADRP handling (lld layout).
+#
+# Build with correct GOTTPREL relocations, then swap the first
+# GOTTPREL_PAGE21 to TLSDESC_ADD_LO12 to simulate mold's stale reloc.
+#
+# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-linux %s -o %t.o
+# RUN: ld.lld --emit-relocs -shared %t.o -o %t.so
+# RUN: obj2yaml %t.so | sed '0,/R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21/{s/R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21/R_AARCH64_TLSDESC_ADD_LO12/}' |\
+# RUN: sed '0,/R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21/{s/R_AARCH64_TLSIE_ADR_GOTTPREL_PAGE21/R_AARCH64_TLSDESC_ADR_PAGE21/}' | yaml2obj -o %t.mold.so
+# RUN: llvm-bolt %t.mold.so -o %t.bolt 2>&1 | FileCheck %s
+# RUN: llvm-objdump -d --section=.text %t.bolt | FileCheck %s --check-prefix=DISASM
+
+# CHECK-NOT: BOLT-ERROR
+
+# Both ADRP instructions must target the GOT page, not page 0
+# DISASM:      <_start>:
+# DISASM-NEXT: adrp x0, 0x{{[1-9][0-9a-f]*}} <tls_var+0x{{[0-9a-f]+}}>
+# DISASM:      adrp x0, 0x{{[1-9][0-9a-f]*}} <tls_var+0x{{[0-9a-f]+}}>
+
+  .text
+  .globl _start
+  .type _start, %function
+_start:
+  // First TLS access - will become mold-style (TLSDESC_ADD_LO12 on ADRP)
+  adrp x0, :gottprel:tls_var
+  ldr  x0, [x0, :gottprel_lo12:tls_var]
+  mrs  x8, tpidr_el0
+  add  x19, x8, x0
+
+  // Second TLS access - stays as lld-style (GOTTPREL on ADRP)
+  adrp x0, :gottprel:tls_var
+  ldr  x0, [x0, :gottprel_lo12:tls_var]
+  mrs  x8, tpidr_el0
+  add  x20, x8, x0
+
+  ret
+  .size _start, .-_start
+
+  .section .tbss,"awT",@nobits
+  .globl tls_var
+  .type tls_var, @tls_object
+tls_var:
+  .word 0
+  .size tls_var, 4
+

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 13, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 13, 2026

🐧 Linux x64 Test Results

  • 663 tests passed
  • 57 tests skipped

✅ The build succeeded and all tests passed.

@awshkulkar awshkulkar force-pushed the plt-syms branch 2 times, most recently from 5ccccff to fda52cb Compare May 13, 2026 20:24
When mold transforms TLSDESC to IE, its ADRP lands
on slot 3 (add) with stale reloc  R_AARCH64_TLSDESC_ADD_LO12.
This requires to be handled in Symbolizer, ADRP needs to have
right PAGE of GOT slot where IE offset is stored.

LLD does not have this issue since the ADRP lands at
R_AARCH64_TLSDESC_ADR_PAGE21 which gets correctly handled with
__BOLT_got_zero symbol by happenstance since Rel.Value for this
reloc  is actually GOT slot address.

A TLSDESC stub looks like this:
  adrp  x0, :tlsdesc:var        ; R_AARCH64_TLSDESC_ADR_PAGE21
  ldr   x1, [x0, :tlsdesc:var]  ; R_AARCH64_TLSDESC_LD64_LO12
  add   x0, x0, :tlsdesc:var    ; R_AARCH64_TLSDESC_ADD_LO12
  blr   x1                      ; R_AARCH64_TLSDESC_CALL

LLD creates TLSDESC to IE relaxation as
  Slot 1: TLSDESC_ADR_PAGE21  → ADRP x0, :gottprel:
  Slot 2: TLSDESC_LD64_LO12   → LDR  x0, [x0, :gottprel_lo12:]
  Slot 3: TLSDESC_ADD_LO12    → NOP
  Slot 4: TLSDESC_CALL         → NOP

But mold does it as:
  Slot 1: TLSDESC_ADR_PAGE21  → NOP
  Slot 2: TLSDESC_LD64_LO12   → NOP
  Slot 3: TLSDESC_ADD_LO12    → ADRP x0, :gottprel:  <-- Crash
  Slot 4: TLSDESC_CALL         → LDR  x0, [x0, ...]

Fix by returning nullopt from adjustRelocation() for TLSDESC_ADD_LO12 on
ADRP, then symbolizing the ADRP in tryAddingSymbolicOperand() to have
the right PC-relative offset symbol (+offset) of the GOT slot.
@yavtuk
Copy link
Copy Markdown
Contributor

yavtuk commented May 25, 2026

@ilinpv lets see how to do it better here

@ilinpv
Copy link
Copy Markdown
Contributor

ilinpv commented May 28, 2026

BOLT workaround looks reasonable to me, but ideally mold should not emit misleading relocation metadata in the first place.
R_AARCH64_TLSDESC_ADD_LO12 appears to be specifically an ADD-immediate relocation, while an ADRP instruction should use a page-based relocation such as R_AARCH64_TLSDESC_ADR_PAGE21:
https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst
So if the relaxed sequence rewrites the instruction into an ADRP, keeping R_AARCH64_TLSDESC_ADD_LO12 attached to it looks semantically questionable and can confuse post-link tools.
Drawing @rui314 attention here for the mold perspective.

@awshkulkar
Copy link
Copy Markdown
Contributor Author

awshkulkar commented May 28, 2026

BOLT workaround looks reasonable to me, but ideally mold should not emit misleading relocation metadata in the first place. R_AARCH64_TLSDESC_ADD_LO12 appears to be specifically an ADD-immediate relocation, while an ADRP instruction should use a page-based relocation such as R_AARCH64_TLSDESC_ADR_PAGE21: https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst So if the relaxed sequence rewrites the instruction into an ADRP, keeping R_AARCH64_TLSDESC_ADD_LO12 attached to it looks semantically questionable and can confuse post-link tools. Drawing @rui314 attention here for the mold perspective.

I think problem here is that (irrespective of linker) emit-relocs is always going to have older TLS relocs that were emitted by the compiler/assembler for TLSDESC. Since there is no strict requirement in ABI on how the transformed layout should look like i.e. where nop and new instruction are placed in a 4 wide instruction space, we have this mismatched gap. As I see both mold and LLD are emitting code that is correct at runtime. So we can either force everyone to have same style of transformation through AArch64 ELF ABI or we should be flexible to account all possible stale relocs that can happen in this transformation.

@yavtuk
Copy link
Copy Markdown
Contributor

yavtuk commented May 29, 2026

BOLT workaround looks reasonable to me, but ideally mold should not emit misleading relocation metadata in the first place. R_AARCH64_TLSDESC_ADD_LO12 appears to be specifically an ADD-immediate relocation, while an ADRP instruction should use a page-based relocation such as R_AARCH64_TLSDESC_ADR_PAGE21: https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst So if the relaxed sequence rewrites the instruction into an ADRP, keeping R_AARCH64_TLSDESC_ADD_LO12 attached to it looks semantically questionable and can confuse post-link tools. Drawing @rui314 attention here for the mold perspective.

I think problem here is that (irrespective of linker) emit-relocs is always going to have older TLS relocs that were emitted by the compiler/assembler for TLSDESC. Since there is no strict requirement in ABI on how the transformed layout should look like i.e. where nop and new instruction are placed in a 4 wide instruction space, we have this mismatched gap. As I see both mold and LLD are emitting code that is correct at runtime. So we can either force everyone to have same style of transformation through AArch64 ELF ABI or we should be flexible to account all possible stale relocs that can happen in this transformation.

flexibility related to parse adrp+add+nop+nop or nop+nop+adrp+add can be, but here in the patch we check that adrp instruction has Rel.Type == ELF::R_AARCH64_TLSDESC_ADD_LO12 relocation. From time to time, bugs related to relocation errors occur. If we take this workaround, how do we verify that it’s not a bug?! From my point of view it easy to add workaround to bolt and handle it but the right fix should be on the mold linker side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants