@haberman

Currently, if multiple external weak symbols are defined at the same address in an object file (e.g., by using the .set assembler directive to alias them to a single weak variable), ld64.lld treats them as a single unit. When any one of these symbols is overridden by a strong definition, all of the original weak symbols resolve to the strong definition.

This patch changes the behavior in transplantSymbolsAtOffset. When a weak symbol is being replaced by a strong one, only non-external (local) symbols at the same offset are moved to the new symbol's section. Other external symbols are no longer transplanted.

This allows each external weak symbol to be overridden independently. This behavior is consistent with Apple's ld-classic, but diverges from ld-prime in one case, as noted on #167262 (this discrepancy has recently been reported to Apple).
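
The filtering rule described above can be modeled outside of lld with a small, self-contained C++ sketch. `Sym`, `Section`, and `transplantAtOffset` are simplified stand-ins invented for illustration, not lld's actual `Defined`, `InputSection`, or `transplantSymbolsAtOffset`; unwind-entry handling and insertion order are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a defined symbol.
struct Sym {
  std::string name;
  unsigned value; // offset within the owning section
  bool external;  // true for external (global) symbols
};

// Hypothetical, simplified stand-in for an input section.
struct Section {
  std::vector<Sym *> symbols;
};

// Models the patched rule: when the weak symbol `skip` at `fromOff` is
// replaced by a strong definition living in `to` at `toOff`, only
// non-external symbols at `fromOff` follow it. Returns how many moved.
std::size_t transplantAtOffset(Section &from, Section &to, unsigned fromOff,
                               unsigned toOff, const Sym *skip) {
  std::vector<Sym *> kept;
  std::size_t moved = 0;
  for (Sym *s : from.symbols) {
    if (s->value != fromOff) { // unrelated offset: stays where it is
      kept.push_back(s);
      continue;
    }
    if (s == skip) // the overridden weak symbol leaves `from`
      continue;
    if (s->external) { // other external aliases stay independent
      kept.push_back(s);
      continue;
    }
    s->value = toOff; // local symbol follows the strong definition
    to.symbols.push_back(s);
    ++moved;
  }
  from.symbols = std::move(kept);
  return moved;
}
```

In this model, calling `transplantAtOffset` with `_weak_a` as `skip` moves only a local symbol at that offset. `_placeholder_int` and `_weak_b` remain in the original section at their original offset, so `_weak_b` can still be overridden (or not) on its own.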

Backward Compatibility

This change alters linker behavior for a specific scenario. The creation of multiple external weak symbols aliased to the same address via assembler directives is primarily an advanced technique. It's unlikely that existing builds rely on the current behavior of all aliases being overridden together.

If there are concerns, this could be put behind a linker option, but the new default seems more correct and less surprising, and it is consistent with ld-classic.

Testing

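The new lit test lld/test/MachO/weak-alias-override.s covers four cases (no overrides, overriding only _weak_a, overriding only _weak_b, and overriding both) and checks the resulting symbol addresses with llvm-nm.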

Fixes #167262

@llvmbot

llvmbot commented Nov 13, 2025

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-macho

Author: Joshua Haberman (haberman)

Full diff: https://github.com/llvm/llvm-project/pull/167825.diff

2 Files Affected:

  • (modified) lld/MachO/SymbolTable.cpp (+22-14)
  • (added) lld/test/MachO/weak-alias-override.s (+97)
diff --git a/lld/MachO/SymbolTable.cpp b/lld/MachO/SymbolTable.cpp
index baddddcb76fbf..edf6b1804dc5d 100644
--- a/lld/MachO/SymbolTable.cpp
+++ b/lld/MachO/SymbolTable.cpp
@@ -80,20 +80,28 @@ static void transplantSymbolsAtOffset(InputSection *fromIsec,
     auto *d = cast<Defined>(s);
     if (d->value != fromOff)
       return false;
-    if (d != skip) {
-      // This repeated insertion will be quadratic unless insertIt is the end
-      // iterator. However, that is typically the case for files that have
-      // .subsections_via_symbols set.
-      insertIt = toIsec->symbols.insert(insertIt, d);
-      d->originalIsec = toIsec;
-      d->value = toOff;
-      // We don't want to have more than one unwindEntry at a given address, so
-      // drop the redundant ones. We We can safely drop the unwindEntries of
-      // the symbols in fromIsec since we will be adding another unwindEntry as
-      // we finish parsing toIsec's file. (We can assume that toIsec has its
-      // own unwindEntry because of the ODR.)
-      d->originalUnwindEntry = nullptr;
-    }
+
+    if (d == skip)
+      return true;
+
+    // Do not transplant other external symbols.
+    // Treat them as independent entities, even if at the same offset.
+    if (d->isExternal())
+      return false;
+
+    // This repeated insertion will be quadratic unless insertIt is the end
+    // iterator. However, that is typically the case for files that have
+    // .subsections_via_symbols set.
+    insertIt = toIsec->symbols.insert(insertIt, d);
+    d->originalIsec = toIsec;
+    d->value = toOff;
+
+    // We don't want to have more than one unwindEntry at a given address, so
+    // drop the redundant ones. We can safely drop the unwindEntries of
+    // the symbols in fromIsec since we will be adding another unwindEntry as
+    // we finish parsing toIsec's file. (We can assume that toIsec has its
+    // own unwindEntry because of the ODR.)
+    d->originalUnwindEntry = nullptr;
     return true;
   });
 }
diff --git a/lld/test/MachO/weak-alias-override.s b/lld/test/MachO/weak-alias-override.s
new file mode 100644
index 0000000000000..f56bfa5c34d95
--- /dev/null
+++ b/lld/test/MachO/weak-alias-override.s
@@ -0,0 +1,97 @@
+# REQUIRES: x86
+# RUN: rm -rf %t; split-file %s %t
+# RUN: mkdir -p %t/bin
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos -o %t/weak.o %t/weak.s
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos -o %t/strong_a.o %t/strong_a.s
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos -o %t/strong_b.o %t/strong_b.s
+
+# --- Test Case 1: No overrides
+# RUN: %lld %t/weak.o -o %t/bin/alone -e _s
+# RUN: llvm-nm -am %t/bin/alone | FileCheck --check-prefix=NM_ALONE %s
+
+# NM_ALONE:            [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_ALONE:            [[P_ADDR]] (__TEXT,__const) weak external _weak_a
+# NM_ALONE:            [[P_ADDR]] (__TEXT,__const) weak external _weak_b
+
+# --- Test Case 2: Override weak_a
+# RUN: %lld %t/weak.o %t/strong_a.o -o %t/bin/with_a -e _s
+# RUN: llvm-nm -am %t/bin/with_a | FileCheck --check-prefix=NM_WITH_A %s
+# RUN: llvm-nm -am %t/bin/with_a | FileCheck --check-prefix=NM_WITH_A_BAD %s
+
+# NM_WITH_A:           [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_WITH_A:           [[A_ADDR:[0-9a-f]+]] (__TEXT,__const) external _strong_a
+# NM_WITH_A:           [[A_ADDR]]           (__TEXT,__const) external _weak_a
+# NM_WITH_A:           [[P_ADDR]]           (__TEXT,__const) weak external _weak_b
+
+# --- Addresses of _placeholder_int and _strong_a must not match.
+# NM_WITH_A_BAD:       [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_WITH_A_BAD-NOT:   [[P_ADDR]]           (__TEXT,__const) external _strong_a
+
+# --- Test Case 3: Override weak_b
+# RUN: %lld %t/weak.o %t/strong_b.o -o %t/bin/with_b -e _s
+# RUN: llvm-nm -am %t/bin/with_b | FileCheck --check-prefix=NM_WITH_B %s
+# RUN: llvm-nm -am %t/bin/with_b | FileCheck --check-prefix=NM_WITH_B_BAD %s
+
+# NM_WITH_B:           [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_WITH_B:           [[B_ADDR:[0-9a-f]+]] (__TEXT,__const) external _strong_b
+# NM_WITH_B:           [[P_ADDR]]           (__TEXT,__const) weak external _weak_a
+# NM_WITH_B:           [[B_ADDR]]           (__TEXT,__const) external _weak_b
+
+# --- Addresses of _placeholder_int and _strong_b must not match.
+# NM_WITH_B_BAD:       [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_WITH_B_BAD-NOT:   [[P_ADDR]]           (__TEXT,__const) external _strong_b
+
+# --- Test Case 4: Override weak_a and weak_b
+# RUN: %lld %t/weak.o %t/strong_a.o %t/strong_b.o -o %t/bin/with_ab -e _s
+# RUN: llvm-nm -am %t/bin/with_ab | FileCheck --check-prefix=NM_WITH_AB %s
+# RUN: llvm-nm -am %t/bin/with_ab | FileCheck --check-prefix=NM_WITH_AB_BAD %s
+
+# NM_WITH_AB:          [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_WITH_AB:          [[A_ADDR:[0-9a-f]+]] (__TEXT,__const) external _strong_a
+# NM_WITH_AB:          [[B_ADDR:[0-9a-f]+]] (__TEXT,__const) external _strong_b
+# NM_WITH_AB:          [[A_ADDR]]           (__TEXT,__const) external _weak_a
+# NM_WITH_AB:          [[B_ADDR]]           (__TEXT,__const) external _weak_b
+
+# --- Addresses of _placeholder_int, _strong_a, and _strong_b must all be distinct
+# NM_WITH_AB_BAD:      [[P_ADDR:[0-9a-f]+]] (__TEXT,__const) weak external _placeholder_int
+# NM_WITH_AB_BAD-NOT:  [[P_ADDR]]           (__TEXT,__const) external _strong_a
+# NM_WITH_AB_BAD-NOT:  [[P_ADDR]]           (__TEXT,__const) external _strong_b
+
+#--- weak.s
+.section __TEXT,__const
+.globl _placeholder_int
+.weak_definition _placeholder_int
+_placeholder_int:
+ .long 0
+
+.globl _weak_a
+.set _weak_a, _placeholder_int
+.weak_definition _weak_a
+
+.globl _weak_b
+.set _weak_b, _placeholder_int
+.weak_definition _weak_b
+
+.globl _s
+_s:
+ .quad _weak_a
+ .quad _weak_b
+
+#--- strong_a.s
+.section __TEXT,__const
+.globl _strong_a
+_strong_a:
+ .long 1
+
+.globl _weak_a
+_weak_a = _strong_a
+
+#--- strong_b.s
+.section __TEXT,__const
+.globl _strong_b
+_strong_b:
+ .long 2
+
+.globl _weak_b
+_weak_b = _strong_b

Currently, if multiple external weak symbols are defined at the same
address in an object file (e.g., by using the .set assembler directive
to alias them to a single weak variable), ld64.lld treats them as a single
unit. When any one of these symbols is overridden by a strong definition,
all of the original weak symbols resolve to the strong definition.

This patch changes the behavior in `transplantSymbolsAtOffset`. When a
weak symbol is being replaced by a strong one, only non-external (local)
symbols at the same offset are moved to the new symbol's section. Other
*external* symbols are no longer transplanted.

This allows each external weak symbol to be overridden independently.
This behavior is consistent with ld-classic, but diverges from ld-prime.

A new test case, weak-alias-override.s, is added.

Fixes llvm#167262