Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BOLT] Add support for BOLT-reserved space in a binary #90300

Merged
merged 2 commits into from
Apr 29, 2024

Conversation

maksfb
Copy link
Contributor

@maksfb maksfb commented Apr 26, 2024

Allow the user to allocate space in a binary that could be used by BOLT for allocating new sections. The reservation is specified by two special symbols recognizable by BOLT: _bolt_reserved{start,end}.

The reserved space will be useful for optimizing the Linux kernel where we cannot allocate a new executable segment. However, the support is not limited to kernel binaries as some user-space application may find it useful too.

Allow the user to allocate space in a binary that could be used by BOLT
for allocating new sections. The reservation is specified by two special
symbols recognizable by BOLT: __bolt_reserved_{start,end}.

The reserved space will be useful for optimizing the Linux kernel where
we cannot allocate a new executable segment. However, the support is not
limited to kernel binaries as some user-space application may find it
useful too.
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 26, 2024

@llvm/pr-subscribers-bolt

Author: Maksim Panchenko (maksfb)

Changes

Allow the user to allocate space in a binary that could be used by BOLT for allocating new sections. The reservation is specified by two special symbols recognizable by BOLT: _bolt_reserved{start,end}.

The reserved space will be useful for optimizing the Linux kernel where we cannot allocate a new executable segment. However, the support is not limited to kernel binaries as some user-space application may find it useful too.


Full diff: https://github.com/llvm/llvm-project/pull/90300.diff

2 Files Affected:

  • (modified) bolt/include/bolt/Rewrite/RewriteInstance.h (+4)
  • (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+64-22)
diff --git a/bolt/include/bolt/Rewrite/RewriteInstance.h b/bolt/include/bolt/Rewrite/RewriteInstance.h
index f4bffba96b1d4e..21d66a709364c4 100644
--- a/bolt/include/bolt/Rewrite/RewriteInstance.h
+++ b/bolt/include/bolt/Rewrite/RewriteInstance.h
@@ -422,6 +422,10 @@ class RewriteInstance {
   /// Section name used for extra BOLT code in addition to .text.
   static StringRef getBOLTTextSectionName() { return ".bolt.text"; }
 
+  /// Symbol markers for BOLT reserved area.
+  static StringRef getBOLTReservedStart() { return "__bolt_reserved_start"; }
+  static StringRef getBOLTReservedEnd() { return "__bolt_reserved_end"; }
+
   /// Common section names.
   static StringRef getEHFrameSectionName() { return ".eh_frame"; }
   static StringRef getEHFrameHdrSectionName() { return ".eh_frame_hdr"; }
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 065260936e70a5..72ebf57a0e737a 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -1062,6 +1062,11 @@ void RewriteInstance::discoverFileObjects() {
       continue;
     }
 
+    if (SymName == getBOLTReservedStart() || SymName == getBOLTReservedEnd()) {
+      registerName(SymbolSize);
+      continue;
+    }
+
     LLVM_DEBUG(dbgs() << "BOLT-DEBUG: considering symbol " << UniqueName
                       << " for function\n");
 
@@ -3526,6 +3531,24 @@ void RewriteInstance::updateMetadata() {
 void RewriteInstance::mapFileSections(BOLTLinker::SectionMapper MapSection) {
   BC->deregisterUnusedSections();
 
+  // Check if the input has a space reserved for BOLT.
+  BinaryData *StartBD = BC->getBinaryDataByName(getBOLTReservedStart());
+  BinaryData *EndBD = BC->getBinaryDataByName(getBOLTReservedEnd());
+  if (StartBD) {
+    if (!EndBD) {
+      BC->errs() << "BOLT-ERROR: " << getBOLTReservedEnd() << " is missing\n";
+      exit(1);
+    }
+
+    PHDRTableOffset = 0;
+    PHDRTableAddress = 0;
+    NextAvailableAddress = StartBD->getAddress();
+    NewTextSegmentAddress = 0;
+    NewTextSegmentOffset = 0;
+    BC->outs()
+        << "BOLT-INFO: using reserved space for allocating new sections\n";
+  }
+
   // If no new .eh_frame was written, remove relocated original .eh_frame.
   BinarySection *RelocatedEHFrameSection =
       getSection(".relocated" + getEHFrameSectionName());
@@ -3545,6 +3568,18 @@ void RewriteInstance::mapFileSections(BOLTLinker::SectionMapper MapSection) {
 
   // Map the rest of the sections.
   mapAllocatableSections(MapSection);
+
+  if (StartBD) {
+    const uint64_t ReservedSpace = EndBD->getAddress() - StartBD->getAddress();
+    const uint64_t AllocatedSize = NextAvailableAddress - StartBD->getAddress();
+    if (ReservedSpace < AllocatedSize) {
+      BC->errs() << "BOLT-ERROR: reserved space (" << ReservedSpace << " byte"
+                 << (ReservedSpace == 1 ? "" : "s")
+                 << ") is smaller than required for new allocations ("
+                 << AllocatedSize << " bytes)\n";
+      exit(1);
+    }
+  }
 }
 
 std::vector<BinarySection *> RewriteInstance::getCodeSections() {
@@ -3786,7 +3821,7 @@ void RewriteInstance::mapCodeSections(BOLTLinker::SectionMapper MapSection) {
   // Add the new text section aggregating all existing code sections.
   // This is pseudo-section that serves a purpose of creating a corresponding
   // entry in section header table.
-  int64_t NewTextSectionSize =
+  const uint64_t NewTextSectionSize =
       NextAvailableAddress - NewTextSectionStartAddress;
   if (NewTextSectionSize) {
     const unsigned Flags = BinarySection::getFlags(/*IsReadOnly=*/true,
@@ -3869,7 +3904,7 @@ void RewriteInstance::mapAllocatableSections(
       if (PHDRTableAddress) {
         // Segment size includes the size of the PHDR area.
         NewTextSegmentSize = NextAvailableAddress - PHDRTableAddress;
-      } else {
+      } else if (NewTextSegmentAddress) {
         // Existing PHDR table would be updated.
         NewTextSegmentSize = NextAvailableAddress - NewTextSegmentAddress;
       }
@@ -3908,7 +3943,7 @@ void RewriteInstance::patchELFPHDRTable() {
     assert(!PHDRTableAddress && "unexpected address for program header table");
     PHDRTableOffset = Obj.getHeader().e_phoff;
     if (NewWritableSegmentSize) {
-      BC->errs() << "Unable to add writable segment with UseGnuStack option\n";
+      BC->errs() << "BOLT-ERROR: unable to add writable segment\n";
       exit(1);
     }
   }
@@ -3918,7 +3953,7 @@ void RewriteInstance::patchELFPHDRTable() {
   if (!NewWritableSegmentSize) {
     if (PHDRTableAddress)
       NewTextSegmentSize = NextAvailableAddress - PHDRTableAddress;
-    else
+    else if (NewTextSegmentAddress)
       NewTextSegmentSize = NextAvailableAddress - NewTextSegmentAddress;
   } else {
     NewWritableSegmentSize = NextAvailableAddress - NewWritableSegmentAddress;
@@ -3952,8 +3987,10 @@ void RewriteInstance::patchELFPHDRTable() {
   };
 
   auto writeNewSegmentPhdrs = [&]() {
-    ELF64LE::Phdr NewTextPhdr = createNewTextPhdr();
-    OS.write(reinterpret_cast<const char *>(&NewTextPhdr), sizeof(NewTextPhdr));
+    if (PHDRTableAddress || NewTextSegmentSize) {
+      ELF64LE::Phdr NewPhdr = createNewTextPhdr();
+      OS.write(reinterpret_cast<const char *>(&NewPhdr), sizeof(NewPhdr));
+    }
 
     if (NewWritableSegmentSize) {
       ELF64LEPhdrTy NewPhdr;
@@ -4051,9 +4088,8 @@ void RewriteInstance::rewriteNoteSections() {
   const ELFFile<ELF64LE> &Obj = ELF64LEFile->getELFFile();
   raw_fd_ostream &OS = Out->os();
 
-  uint64_t NextAvailableOffset = getFileOffsetForAddress(NextAvailableAddress);
-  assert(NextAvailableOffset >= FirstNonAllocatableOffset &&
-         "next available offset calculation failure");
+  uint64_t NextAvailableOffset = std::max(
+      getFileOffsetForAddress(NextAvailableAddress), FirstNonAllocatableOffset);
   OS.seek(NextAvailableOffset);
 
   // Copy over non-allocatable section contents and update file offsets.
@@ -4792,7 +4828,7 @@ void RewriteInstance::updateELFSymbolTable(
       ++NumHotDataSymsUpdated;
     }
 
-    if (*SymbolName == "_end")
+    if (*SymbolName == "_end" && NextAvailableAddress > Symbol.st_value)
       updateSymbolValue(*SymbolName, NextAvailableAddress);
 
     if (IsDynSym)
@@ -4906,13 +4942,6 @@ void RewriteInstance::patchELFSymTabs(ELFObjectFile<ELFT> *File) {
   std::vector<uint32_t> NewSectionIndex;
   getOutputSections(File, NewSectionIndex);
 
-  // Set pointer at the end of the output file, so we can pwrite old symbol
-  // tables if we need to.
-  uint64_t NextAvailableOffset = getFileOffsetForAddress(NextAvailableAddress);
-  assert(NextAvailableOffset >= FirstNonAllocatableOffset &&
-         "next available offset calculation failure");
-  Out->os().seek(NextAvailableOffset);
-
   // Update dynamic symbol table.
   const ELFShdrTy *DynSymSection = nullptr;
   for (const ELFShdrTy &Section : cantFail(Obj.sections())) {
@@ -4924,6 +4953,10 @@ void RewriteInstance::patchELFSymTabs(ELFObjectFile<ELFT> *File) {
   assert((DynSymSection || BC->IsStaticExecutable) &&
          "dynamic symbol table expected");
   if (DynSymSection) {
+    // Set pointer to the end of the section, so we can use pwrite to update
+    // the dynamic symbol table.
+    Out->os().seek(DynSymSection->sh_offset + DynSymSection->sh_size);
+
     updateELFSymbolTable(
         File,
         /*IsDynSym=*/true,
@@ -5477,10 +5510,10 @@ void RewriteInstance::rewriteFile() {
   auto Streamer = BC->createStreamer(OS);
   // Make sure output stream has enough reserved space, otherwise
   // pwrite() will fail.
-  uint64_t Offset = OS.seek(getFileOffsetForAddress(NextAvailableAddress));
-  (void)Offset;
-  assert(Offset == getFileOffsetForAddress(NextAvailableAddress) &&
-         "error resizing output file");
+  uint64_t Offset = std::max(getFileOffsetForAddress(NextAvailableAddress),
+                             FirstNonAllocatableOffset);
+  Offset = OS.seek(Offset);
+  assert((Offset != (uint64_t)-1) && "Error resizing output file");
 
   // Overwrite functions with fixed output address. This is mostly used by
   // non-relocation mode, with one exception: injected functions are covered
@@ -5712,7 +5745,7 @@ void RewriteInstance::writeEHFrameHeader() {
   std::vector<char> NewEHFrameHdr = CFIRdWrt->generateEHFrameHeader(
       RelocatedEHFrame, NewEHFrame, EHFrameHdrOutputAddress, FailedAddresses);
 
-  assert(Out->os().tell() == EHFrameHdrFileOffset && "offset mismatch");
+  Out->os().seek(EHFrameHdrFileOffset);
   Out->os().write(NewEHFrameHdr.data(), NewEHFrameHdr.size());
 
   const unsigned Flags = BinarySection::getFlags(/*IsReadOnly=*/true,
@@ -5732,6 +5765,15 @@ void RewriteInstance::writeEHFrameHeader() {
 
   NextAvailableAddress += EHFrameHdrSec.getOutputSize();
 
+  if (const BinaryData *ReservedEnd =
+          BC->getBinaryDataByName(getBOLTReservedEnd())) {
+    if (NextAvailableAddress > ReservedEnd->getAddress()) {
+      BC->errs() << "BOLT-ERROR: unable to fit " << getEHFrameHdrSectionName()
+                 << " into reserved space\n";
+      exit(1);
+    }
+  }
+
   // Merge new .eh_frame with the relocated original so that gdb can locate all
   // FDEs.
   if (RelocatedEHFrameSection) {

@aaupov
Copy link
Contributor

aaupov commented Apr 27, 2024

Awesome. That's the space for spilling/filling rax around xbegin/xabort :)

@maksfb
Copy link
Contributor Author

maksfb commented Apr 27, 2024

Awesome. That's the space for spilling/filling rax around xbegin/xabort :)

Only if you make it writable :)

Copy link
Member

@dcci dcci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I just have one suggestion.

bolt/lib/Rewrite/RewriteInstance.cpp Outdated Show resolved Hide resolved
@maksfb maksfb merged commit 3a0d894 into llvm:main Apr 29, 2024
4 checks passed
maksfb added a commit to maksfb/llvm-project that referenced this pull request Apr 30, 2024
We use pwrite() in RewriteInstance to update contents of existing
sections. pwrite() requires file position to be set past the written
offset which we guarantee at the start of rewriteFile(). Then we had an
implicit assumption in patchBuildID() that the file position will be set
again in patchELFSymTabs() after being reset in patchELFPHDRTable().
That assumption was broken in llvm#90300. The fix is to save and restore
file position in patchELFPHDRTable(). Then we don't have to update it
again in patchELFSymTabs().
maksfb added a commit that referenced this pull request Apr 30, 2024
We use pwrite() in RewriteInstance to update contents of existing
sections. pwrite() requires file position to be set past the written
offset which we guarantee at the start of rewriteFile(). Then we had an
implicit assumption in patchBuildID() that the file position will be set
again in patchELFSymTabs() after being reset in patchELFPHDRTable().
That assumption was broken in #90300. The fix is to save and restore
file position in patchELFPHDRTable(). Then we don't have to update it
again in patchELFSymTabs().
@maksfb maksfb deleted the gh-bolt-reserved branch April 30, 2024 21:33
maksfb added a commit that referenced this pull request May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants