Skip to content

Conversation

ellishg
Copy link
Contributor

@ellishg ellishg commented Sep 29, 2025

Use llvm::Align instead of directly storing the shift amount for clarity. Also remove the DeduplicatedCStringSection::StringOffset in favor of simply storing the uint64_t offset since trailingZeros is not used outside of finalizeContents(). These two changes allow us to refactor finalizeContents().

No function change intended.

Depends on #161241.

@llvmbot
Copy link
Member

llvmbot commented Sep 29, 2025

@llvm/pr-subscribers-lld-macho

@llvm/pr-subscribers-lld

Author: Ellis Hoag (ellishg)

Changes

Use llvm::Align instead of directly storing the shift amount for clarity. Also remove the DeduplicatedCStringSection::StringOffset in favor of simply storing the uint64_t offset since trailingZeros is not used outside of finalizeContents(). These two changes allow us to refactor finalizeContents().

No function change intended.

Depends on #161241.


Full diff: https://github.com/llvm/llvm-project/pull/161253.diff

2 Files Affected:

  • (modified) lld/MachO/SyntheticSections.cpp (+26-38)
  • (modified) lld/MachO/SyntheticSections.h (+2-10)
diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
index 5645d8a05a28f..38386c107fea0 100644
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -848,8 +848,7 @@ void ObjCSelRefsHelper::initialize() {
 void ObjCSelRefsHelper::cleanup() { methnameToSelref.clear(); }
 
 ConcatInputSection *ObjCSelRefsHelper::makeSelRef(StringRef methname) {
-  auto methnameOffset =
-      in.objcMethnameSection->getStringOffset(methname).outSecOff;
+  auto methnameOffset = in.objcMethnameSection->getStringOffset(methname);
 
   size_t wordSize = target->wordSize;
   uint8_t *selrefData = bAlloc().Allocate<uint8_t>(wordSize);
@@ -1722,13 +1721,12 @@ void CStringSection::writeTo(uint8_t *buf) const {
 // and don't need this alignment. They will be emitted at some arbitrary address
 // `A`, but ld64 will treat them as being 16-byte aligned with an offset of
 // `16 % A`.
-static uint8_t getStringPieceAlignment(const CStringInputSection *isec,
-                                       const StringPiece &piece) {
-  return llvm::countr_zero(isec->align | piece.inSecOff);
+static Align getStringPieceAlignment(const CStringInputSection *isec,
+                                     const StringPiece &piece) {
+  return llvm::Align(1ULL << llvm::countr_zero(isec->align | piece.inSecOff));
 }
 
 void CStringSection::finalizeContents() {
-  uint64_t offset = 0;
   // TODO: Call buildCStringPriorities() to support cstring ordering when
   // deduplication is off, although this may negatively impact build
   // performance.
@@ -1736,30 +1734,27 @@ void CStringSection::finalizeContents() {
     for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
       if (!piece.live)
         continue;
-      uint32_t pieceAlign = 1 << getStringPieceAlignment(isec, piece);
-      offset = alignToPowerOf2(offset, pieceAlign);
-      piece.outSecOff = offset;
-      isec->isFinal = true;
+      piece.outSecOff = alignTo(size, getStringPieceAlignment(isec, piece));
       StringRef string = isec->getStringRef(i);
-      offset += string.size() + 1; // account for null terminator
+      size = piece.outSecOff + string.size() + 1; // account for null terminator
     }
+    isec->isFinal = true;
   }
-  size = offset;
 }
 
 void DeduplicatedCStringSection::finalizeContents() {
   // Find the largest alignment required for each string.
+  DenseMap<CachedHashStringRef, Align> strToAlignment;
   for (const CStringInputSection *isec : inputs) {
     for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
       if (!piece.live)
         continue;
       auto s = isec->getCachedHashStringRef(i);
       assert(isec->align != 0);
-      uint8_t trailingZeros = getStringPieceAlignment(isec, piece);
-      auto it = stringOffsetMap.insert(
-          std::make_pair(s, StringOffset(trailingZeros)));
-      if (!it.second && it.first->second.trailingZeros < trailingZeros)
-        it.first->second.trailingZeros = trailingZeros;
+      auto align = getStringPieceAlignment(isec, piece);
+      auto [it, wasInserted] = strToAlignment.try_emplace(s, align);
+      if (!wasInserted && it->second < align)
+        it->second = align;
     }
   }
 
@@ -1769,38 +1764,31 @@ void DeduplicatedCStringSection::finalizeContents() {
   for (auto &[isec, i] : priorityBuilder.buildCStringPriorities(inputs)) {
     auto &piece = isec->pieces[i];
     auto s = isec->getCachedHashStringRef(i);
-    auto it = stringOffsetMap.find(s);
-    assert(it != stringOffsetMap.end());
-    lld::macho::DeduplicatedCStringSection::StringOffset &offsetInfo =
-        it->second;
-    if (offsetInfo.outSecOff == UINT64_MAX) {
-      offsetInfo.outSecOff =
-          alignToPowerOf2(size, 1ULL << offsetInfo.trailingZeros);
-      size = offsetInfo.outSecOff + s.size() + 1; // account for null terminator
+    auto [it, wasInserted] = stringOffsetMap.try_emplace(s, /*placeholder*/ 0);
+    if (wasInserted) {
+      // Avoid computing the offset until we are sure we will need to
+      uint64_t offset = alignTo(size, strToAlignment.at(s));
+      it->second = offset;
+      size = offset + s.size() + 1; // account for null terminator
     }
-    piece.outSecOff = offsetInfo.outSecOff;
+    // If the string was already in stringOffsetMap, it is a duplicate and we
+    // only need to assign the offset.
+    piece.outSecOff = it->second;
   }
   for (CStringInputSection *isec : inputs)
     isec->isFinal = true;
 }
 
 void DeduplicatedCStringSection::writeTo(uint8_t *buf) const {
-  for (const auto &p : stringOffsetMap) {
-    StringRef data = p.first.val();
-    uint64_t off = p.second.outSecOff;
-    if (!data.empty())
-      memcpy(buf + off, data.data(), data.size());
-  }
+  for (const auto &[s, outSecOff] : stringOffsetMap)
+    if (s.size())
+      memcpy(buf + outSecOff, s.data(), s.size());
 }
 
-DeduplicatedCStringSection::StringOffset
-DeduplicatedCStringSection::getStringOffset(StringRef str) const {
+uint64_t DeduplicatedCStringSection::getStringOffset(StringRef str) const {
   // StringPiece uses 31 bits to store the hashes, so we replicate that
   uint32_t hash = xxh3_64bits(str) & 0x7fffffff;
-  auto offset = stringOffsetMap.find(CachedHashStringRef(str, hash));
-  assert(offset != stringOffsetMap.end() &&
-         "Looked-up strings should always exist in section");
-  return offset->second;
+  return stringOffsetMap.at(CachedHashStringRef(str, hash));
 }
 
 // This section is actually emitted as __TEXT,__const by ld64, but clang may
diff --git a/lld/MachO/SyntheticSections.h b/lld/MachO/SyntheticSections.h
index 1abf3c210a64e..a37dd66107ee7 100644
--- a/lld/MachO/SyntheticSections.h
+++ b/lld/MachO/SyntheticSections.h
@@ -571,18 +571,10 @@ class DeduplicatedCStringSection final : public CStringSection {
   uint64_t getSize() const override { return size; }
   void finalizeContents() override;
   void writeTo(uint8_t *buf) const override;
-
-  struct StringOffset {
-    uint8_t trailingZeros;
-    uint64_t outSecOff = UINT64_MAX;
-
-    explicit StringOffset(uint8_t zeros) : trailingZeros(zeros) {}
-  };
-
-  StringOffset getStringOffset(StringRef str) const;
+  uint64_t getStringOffset(StringRef str) const;
 
 private:
-  llvm::DenseMap<llvm::CachedHashStringRef, StringOffset> stringOffsetMap;
+  llvm::DenseMap<llvm::CachedHashStringRef, uint64_t> stringOffsetMap;
   size_t size = 0;
 };
 

Copy link
Contributor

@kazutakahirata kazutakahirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

Base automatically changed from users/ellishg/lld-count-zero-helper to main September 30, 2025 16:44
@ellishg ellishg force-pushed the users/ellishg/lld-align branch from 4e0bde2 to 72bc6a8 Compare September 30, 2025 16:49
@ellishg ellishg enabled auto-merge (squash) September 30, 2025 16:50
@ellishg ellishg merged commit dd43a79 into main Sep 30, 2025
9 checks passed
@ellishg ellishg deleted the users/ellishg/lld-align branch September 30, 2025 16:57
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
Use `llvm::Align` instead of directly storing the shift amount for
clarity. Also remove the `DeduplicatedCStringSection::StringOffset` in
favor of simply storing the `uint64_t` offset since `trailingZeros` is
not used outside of `finalizeContents()`. These two changes allow us to
refactor `finalizeContents()`.


No function change intended.

Depends on llvm#161241.
ellishg added a commit that referenced this pull request Oct 3, 2025
Add the flag `--tail-merge-strings` to enable tail merging of cstrings.
For example, if we have strings `mystring\0` and `ring\0`, we could
place `mystring\0` at address `0x1000` and `ring\0` at address `0x1004`
and have them share the same underlying data.

It turns out that many ObjC method names can be tail merged. For
example, `error:` and `doFoo:error:`. On a large iOS binary, we saw
nearly a 15% size improvement in the `__TEXT__objc_methname` section and
negligible impact on link time.
```
$ bloaty --domain=vm merged.o.stripped -- base.o.stripped
     VM SIZE
 --------------
   +95% +5.85Ki    [__TEXT]
  -2.4%  -239Ki    __TEXT,__cstring
 -14.5%  -710Ki    __TEXT,__objc_methname
  -1.0%  -944Ki    TOTAL
```

Tail merging for MachO was originally removed in
7c269db.
The previous implementation used `StringTableBuilder`, but that was
removed in
4308f03
to ensure deduplicated strings are aligned correctly. This
implementation ensures that tail merged strings are also aligned
correctly.

Special thanks to nocchijiang for pointing this out in
#158720 (comment).

Depends on #161253.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 3, 2025
Add the flag `--tail-merge-strings` to enable tail merging of cstrings.
For example, if we have strings `mystring\0` and `ring\0`, we could
place `mystring\0` at address `0x1000` and `ring\0` at address `0x1004`
and have them share the same underlying data.

It turns out that many ObjC method names can be tail merged. For
example, `error:` and `doFoo:error:`. On a large iOS binary, we saw
nearly a 15% size improvement in the `__TEXT__objc_methname` section and
negligible impact on link time.
```
$ bloaty --domain=vm merged.o.stripped -- base.o.stripped
     VM SIZE
 --------------
   +95% +5.85Ki    [__TEXT]
  -2.4%  -239Ki    __TEXT,__cstring
 -14.5%  -710Ki    __TEXT,__objc_methname
  -1.0%  -944Ki    TOTAL
```

Tail merging for MachO was originally removed in
llvm/llvm-project@7c269db.
The previous implementation used `StringTableBuilder`, but that was
removed in
llvm/llvm-project@4308f03
to ensure deduplicated strings are aligned correctly. This
implementation ensures that tail merged strings are also aligned
correctly.

Special thanks to nocchijiang for pointing this out in
llvm/llvm-project#158720 (comment).

Depends on llvm/llvm-project#161253.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants