Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dsymutil][DWARFLinker] Refactor handling mergeable libraries. #80615

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

avl-llvm
Copy link
Collaborator

@avl-llvm avl-llvm commented Feb 4, 2024

This patch refactors working with mergeable libraries. Instead of storing relocations pointing to address attributes of mergeable libraries, this patch links address attributes through the new debug map. This solution significantly simplifies attributes handling(no need to track attribute offsets), minimizes size of dSYM bundle and run-time memory requirements:

  1. Link mergeable library as usual. Do not create new set of relocations.
  2. When linked mergeable library is used as part of another library - read address attribute, link it through the new debug map, store linked value.

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 4, 2024

@llvm/pr-subscribers-debuginfo

Author: Alexey Lapshin (avl-llvm)

Changes

This patch refactors working with mergeable libraries. Instead of storing relocations pointing to address attributes of mergeable libraries, this patch links address attributes through the new debug map. This solution significantly simplifies attributes handling(no need to track attribute offsets), minimizes size of dSYM bundle and run-time memory requirements:

  1. Link mergeable library as usual. Do not create new set of relocations.
  2. When linked mergeable library is used as part of another library - read address attribute, link it through the new debug map, store linked value.

Patch is 44.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80615.diff

12 Files Affected:

  • (modified) llvm/include/llvm/DWARFLinker/AddressesMap.h (+2-21)
  • (modified) llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp (+3-30)
  • (modified) llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp (+2-2)
  • (modified) llvm/tools/dsymutil/CMakeLists.txt (-1)
  • (modified) llvm/tools/dsymutil/DebugMap.cpp (+2-6)
  • (modified) llvm/tools/dsymutil/DebugMap.h (+27-10)
  • (modified) llvm/tools/dsymutil/DwarfLinkerForBinary.cpp (+88-138)
  • (modified) llvm/tools/dsymutil/DwarfLinkerForBinary.h (+38-82)
  • (modified) llvm/tools/dsymutil/MachODebugMapParser.cpp (-18)
  • (removed) llvm/tools/dsymutil/RelocationMap.cpp (-92)
  • (removed) llvm/tools/dsymutil/RelocationMap.h (-160)
  • (modified) llvm/tools/llvm-dwarfutil/DebugInfoLinker.cpp (+1-11)
diff --git a/llvm/include/llvm/DWARFLinker/AddressesMap.h b/llvm/include/llvm/DWARFLinker/AddressesMap.h
index a232aafadc5ce..a521cc808c278 100644
--- a/llvm/include/llvm/DWARFLinker/AddressesMap.h
+++ b/llvm/include/llvm/DWARFLinker/AddressesMap.h
@@ -31,9 +31,8 @@ class AddressesMap {
 public:
   virtual ~AddressesMap() = default;
 
-  /// Checks that there are valid relocations in the .debug_info
-  /// section.
-  virtual bool hasValidRelocs() = 0;
+  /// Check if we can quickly detect that there is no any live debug info.
+  virtual bool hasLiveDebugInfo() = 0;
 
   /// Checks that the specified DWARF expression operand \p Op references live
   /// code section and returns the relocation adjustment value (to get the
@@ -65,24 +64,6 @@ class AddressesMap {
   virtual bool applyValidRelocs(MutableArrayRef<char> Data, uint64_t BaseOffset,
                                 bool IsLittleEndian) = 0;
 
-  /// Check if the linker needs to gather and save relocation info.
-  virtual bool needToSaveValidRelocs() = 0;
-
-  /// Update and save relocation values to be serialized
-  virtual void updateAndSaveValidRelocs(bool IsDWARF5,
-                                        uint64_t OriginalUnitOffset,
-                                        int64_t LinkedOffset,
-                                        uint64_t StartOffset,
-                                        uint64_t EndOffset) = 0;
-
-  /// Update the valid relocations that used OriginalUnitOffset as the compile
-  /// unit offset, and update their values to reflect OutputUnitOffset.
-  virtual void updateRelocationsWithUnitOffset(uint64_t OriginalUnitOffset,
-                                               uint64_t OutputUnitOffset) = 0;
-
-  /// Erases all data.
-  virtual void clear() = 0;
-
   /// This function checks whether variable has DWARF expression containing
   /// operation referencing live address(f.e. DW_OP_addr, DW_OP_addrx...).
   /// \returns first is true if the expression has an operation referencing an
diff --git a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
index 4f5a4e2ffc702..3d3a9c4cd527c 100644
--- a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
+++ b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
@@ -1653,12 +1653,6 @@ shouldSkipAttribute(bool Update,
   }
 }
 
-struct AttributeLinkedOffsetFixup {
-  int64_t LinkedOffsetFixupVal;
-  uint64_t InputAttrStartOffset;
-  uint64_t InputAttrEndOffset;
-};
-
 DIE *DWARFLinker::DIECloner::cloneDIE(const DWARFDie &InputDIE,
                                       const DWARFFile &File, CompileUnit &Unit,
                                       int64_t PCOffset, uint32_t OutOffset,
@@ -1744,7 +1738,6 @@ DIE *DWARFLinker::DIECloner::cloneDIE(const DWARFDie &InputDIE,
 
   std::optional<StringRef> LibraryInstallName =
       ObjFile.Addresses->getLibraryInstallName();
-  SmallVector<AttributeLinkedOffsetFixup> AttributesFixups;
   for (const auto &AttrSpec : Abbrev->attributes()) {
     if (shouldSkipAttribute(Update, AttrSpec, Flags & TF_SkipPC)) {
       DWARFFormValue::skipValue(AttrSpec.Form, Data, &Offset,
@@ -1752,22 +1745,14 @@ DIE *DWARFLinker::DIECloner::cloneDIE(const DWARFDie &InputDIE,
       continue;
     }
 
-    AttributeLinkedOffsetFixup CurAttrFixup;
-    CurAttrFixup.InputAttrStartOffset = InputDIE.getOffset() + Offset;
-    CurAttrFixup.LinkedOffsetFixupVal =
-        Unit.getStartOffset() + OutOffset - CurAttrFixup.InputAttrStartOffset;
-
     DWARFFormValue Val = AttrSpec.getFormValue();
     uint64_t AttrSize = Offset;
     Val.extractValue(Data, &Offset, U.getFormParams(), &U);
-    CurAttrFixup.InputAttrEndOffset = InputDIE.getOffset() + Offset;
     AttrSize = Offset - AttrSize;
 
     uint64_t FinalAttrSize =
         cloneAttribute(*Die, InputDIE, File, Unit, Val, AttrSpec, AttrSize,
                        AttrInfo, IsLittleEndian);
-    if (FinalAttrSize != 0 && ObjFile.Addresses->needToSaveValidRelocs())
-      AttributesFixups.push_back(CurAttrFixup);
 
     OutOffset += FinalAttrSize;
   }
@@ -1860,15 +1845,6 @@ DIE *DWARFLinker::DIECloner::cloneDIE(const DWARFDie &InputDIE,
   // Add the size of the abbreviation number to the output offset.
   OutOffset += AbbrevNumberSize;
 
-  // Update fixups with the size of the abbreviation number
-  for (AttributeLinkedOffsetFixup &F : AttributesFixups)
-    F.LinkedOffsetFixupVal += AbbrevNumberSize;
-
-  for (AttributeLinkedOffsetFixup &F : AttributesFixups)
-    ObjFile.Addresses->updateAndSaveValidRelocs(
-        Unit.getOrigUnit().getVersion() >= 5, Unit.getOrigUnit().getOffset(),
-        F.LinkedOffsetFixupVal, F.InputAttrStartOffset, F.InputAttrEndOffset);
-
   if (!HasChildren) {
     // Update our size.
     Die->setSize(OutOffset - Die->getOffset());
@@ -2718,12 +2694,9 @@ Error DWARFLinker::link() {
     if (Options.VerifyInputDWARF)
       verifyInput(OptContext.File);
 
-    // Look for relocations that correspond to address map entries.
-
-    // there was findvalidrelocations previously ... probably we need to gather
-    // info here
+    // Check if we can skip linking.
     if (LLVM_LIKELY(!Options.Update) &&
-        !OptContext.File.Addresses->hasValidRelocs()) {
+        !OptContext.File.Addresses->hasLiveDebugInfo()) {
       if (Options.Verbose)
         outs() << "No valid relocations found. Skipping.\n";
 
@@ -2853,7 +2826,7 @@ Error DWARFLinker::link() {
     // The calls to applyValidRelocs inside cloneDIE will walk the reloc
     // array again (in the same way findValidRelocsInDebugInfo() did). We
     // need to reset the NextValidReloc index to the beginning.
-    if (OptContext.File.Addresses->hasValidRelocs() ||
+    if (OptContext.File.Addresses->hasLiveDebugInfo() ||
         LLVM_UNLIKELY(Options.Update)) {
       SizeByObject[OptContext.File.FileName].Input =
           getDebugInfoSize(*OptContext.File.Dwarf);
diff --git a/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp b/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
index a052969e74c0f..930eb768e6d3e 100644
--- a/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
+++ b/llvm/lib/DWARFLinker/Parallel/DWARFLinkerImpl.cpp
@@ -463,7 +463,7 @@ Error DWARFLinkerImpl::LinkContext::link(TypeUnit *ArtificialTypeUnit) {
   // Check for live relocations. If there is no any live relocation then we
   // can skip entire object file.
   if (!GlobalData.getOptions().UpdateIndexTablesOnly &&
-      !InputDWARFFile.Addresses->hasValidRelocs()) {
+      !InputDWARFFile.Addresses->hasLiveDebugInfo()) {
     if (GlobalData.getOptions().Verbose)
       outs() << "No valid relocations found. Skipping.\n";
     return Error::success();
@@ -670,7 +670,7 @@ void DWARFLinkerImpl::LinkContext::linkSingleCompileUnit(
           // Clone input compile unit.
           if (CU.isClangModule() ||
               GlobalData.getOptions().UpdateIndexTablesOnly ||
-              CU.getContaingFile().Addresses->hasValidRelocs()) {
+              CU.getContaingFile().Addresses->hasLiveDebugInfo()) {
             if (Error Err = CU.cloneAndEmit(GlobalData.getTargetTriple(),
                                             ArtificialTypeUnit))
               return std::move(Err);
diff --git a/llvm/tools/dsymutil/CMakeLists.txt b/llvm/tools/dsymutil/CMakeLists.txt
index 53f882e90b4e9..ac338839fcfc0 100644
--- a/llvm/tools/dsymutil/CMakeLists.txt
+++ b/llvm/tools/dsymutil/CMakeLists.txt
@@ -31,7 +31,6 @@ add_llvm_tool(dsymutil
   MachODebugMapParser.cpp
   MachOUtils.cpp
   Reproducer.cpp
-  RelocationMap.cpp
   SymbolMap.cpp
 
   DEPENDS
diff --git a/llvm/tools/dsymutil/DebugMap.cpp b/llvm/tools/dsymutil/DebugMap.cpp
index 8724b70422f32..a69f0c919db3b 100644
--- a/llvm/tools/dsymutil/DebugMap.cpp
+++ b/llvm/tools/dsymutil/DebugMap.cpp
@@ -58,10 +58,6 @@ bool DebugMapObject::addSymbol(StringRef Name,
   return InsertResult.second;
 }
 
-void DebugMapObject::setRelocationMap(dsymutil::RelocationMap &RM) {
-  RelocMap.emplace(RM);
-}
-
 void DebugMapObject::setInstallName(StringRef IN) { InstallName.emplace(IN); }
 
 void DebugMapObject::print(raw_ostream &OS) const {
@@ -169,8 +165,8 @@ struct MappingTraits<dsymutil::DebugMapObject>::YamlDMO {
   std::vector<dsymutil::DebugMapObject::YAMLSymbolMapping> Entries;
 };
 
-void MappingTraits<std::pair<std::string, SymbolMapping>>::mapping(
-    IO &io, std::pair<std::string, SymbolMapping> &s) {
+void MappingTraits<std::pair<std::string, DebugMapObject::SymbolMapping>>::
+    mapping(IO &io, std::pair<std::string, DebugMapObject::SymbolMapping> &s) {
   io.mapRequired("sym", s.first);
   io.mapOptional("objAddr", s.second.ObjectAddress);
   io.mapRequired("binAddr", s.second.BinaryAddress);
diff --git a/llvm/tools/dsymutil/DebugMap.h b/llvm/tools/dsymutil/DebugMap.h
index 9c3a698fa1191..9ef7fdf16cb1e 100644
--- a/llvm/tools/dsymutil/DebugMap.h
+++ b/llvm/tools/dsymutil/DebugMap.h
@@ -21,7 +21,6 @@
 #ifndef LLVM_TOOLS_DSYMUTIL_DEBUGMAP_H
 #define LLVM_TOOLS_DSYMUTIL_DEBUGMAP_H
 
-#include "RelocationMap.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringMap.h"
 #include "llvm/ADT/StringRef.h"
@@ -135,6 +134,22 @@ class DebugMap {
 /// linked binary for all the linked atoms in this object file.
 class DebugMapObject {
 public:
+  struct SymbolMapping {
+    std::optional<yaml::Hex64> ObjectAddress;
+    yaml::Hex64 BinaryAddress;
+    yaml::Hex32 Size;
+
+    SymbolMapping(std::optional<uint64_t> ObjectAddr, uint64_t BinaryAddress,
+                  uint32_t Size)
+        : BinaryAddress(BinaryAddress), Size(Size) {
+      if (ObjectAddr)
+        ObjectAddress = *ObjectAddr;
+    }
+
+    /// For YAML IO support
+    SymbolMapping() = default;
+  };
+
   using YAMLSymbolMapping = std::pair<std::string, SymbolMapping>;
   using DebugMapEntry = StringMapEntry<SymbolMapping>;
 
@@ -167,11 +182,6 @@ class DebugMapObject {
   }
   const std::vector<std::string> &getWarnings() const { return Warnings; }
 
-  const std::optional<RelocationMap> &getRelocationMap() const {
-    return RelocMap;
-  }
-  void setRelocationMap(dsymutil::RelocationMap &RM);
-
   const std::optional<std::string> &getInstallName() const {
     return InstallName;
   }
@@ -191,11 +201,10 @@ class DebugMapObject {
 
   std::string Filename;
   sys::TimePoint<std::chrono::seconds> Timestamp;
-  StringMap<struct SymbolMapping> Symbols;
+  StringMap<SymbolMapping> Symbols;
   DenseMap<uint64_t, DebugMapEntry *> AddressToMapping;
   uint8_t Type;
 
-  std::optional<RelocationMap> RelocMap;
   std::optional<std::string> InstallName;
 
   std::vector<std::string> Warnings;
@@ -223,8 +232,10 @@ namespace yaml {
 
 using namespace llvm::dsymutil;
 
-template <> struct MappingTraits<std::pair<std::string, SymbolMapping>> {
-  static void mapping(IO &io, std::pair<std::string, SymbolMapping> &s);
+template <>
+struct MappingTraits<std::pair<std::string, DebugMapObject::SymbolMapping>> {
+  static void mapping(IO &io,
+                      std::pair<std::string, DebugMapObject::SymbolMapping> &s);
   static const bool flow = true;
 };
 
@@ -233,6 +244,12 @@ template <> struct MappingTraits<dsymutil::DebugMapObject> {
   static void mapping(IO &io, dsymutil::DebugMapObject &DMO);
 };
 
+template <> struct ScalarTraits<Triple> {
+  static void output(const Triple &val, void *, raw_ostream &out);
+  static StringRef input(StringRef scalar, void *, Triple &value);
+  static QuotingType mustQuote(StringRef) { return QuotingType::Single; }
+};
+
 template <>
 struct SequenceTraits<std::vector<std::unique_ptr<dsymutil::DebugMapObject>>> {
   static size_t
diff --git a/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp b/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
index 5ae5ecd556adb..69ebf3221585e 100644
--- a/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
+++ b/llvm/tools/dsymutil/DwarfLinkerForBinary.cpp
@@ -188,42 +188,6 @@ static Error remarksErrorHandler(const DebugMapObject &DMO,
 
   return createFileError(FE->getFileName(), std::move(NewE));
 }
-Error DwarfLinkerForBinary::emitRelocations(
-    const DebugMap &DM, std::vector<ObjectWithRelocMap> &ObjectsForLinking) {
-  // Return early if the "Resources" directory is not being written to.
-  if (!Options.ResourceDir)
-    return Error::success();
-
-  RelocationMap RM(DM.getTriple(), DM.getBinaryPath());
-  for (auto &Obj : ObjectsForLinking) {
-    if (!Obj.OutRelocs->isInitialized())
-      continue;
-    Obj.OutRelocs->addValidRelocs(RM);
-  }
-
-  SmallString<128> InputPath;
-  SmallString<128> Path;
-  // Create the "Relocations" directory in the "Resources" directory, and
-  // create an architecture-specific directory in the "Relocations" directory.
-  StringRef ArchName = Triple::getArchName(RM.getTriple().getArch(),
-                                           RM.getTriple().getSubArch());
-  sys::path::append(Path, *Options.ResourceDir, "Relocations", ArchName);
-  if (std::error_code EC = sys::fs::create_directories(Path.str(), true,
-                                                       sys::fs::perms::all_all))
-    return errorCodeToError(EC);
-
-  // Append the file name.
-  sys::path::append(Path, sys::path::filename(DM.getBinaryPath()));
-  Path.append(".yml");
-
-  std::error_code EC;
-  raw_fd_ostream OS(Path.str(), EC, sys::fs::OF_Text);
-  if (EC)
-    return errorCodeToError(EC);
-
-  RM.print(OS);
-  return Error::success();
-}
 
 static Error emitRemarks(const LinkOptions &Options, StringRef BinaryPath,
                          StringRef ArchName, const remarks::RemarkLinker &RL) {
@@ -263,10 +227,10 @@ static Error emitRemarks(const LinkOptions &Options, StringRef BinaryPath,
   return Error::success();
 }
 
-ErrorOr<std::unique_ptr<DWARFFile>> DwarfLinkerForBinary::loadObject(
-    const DebugMapObject &Obj, const DebugMap &DebugMap,
-    remarks::RemarkLinker &RL,
-    std::shared_ptr<DwarfLinkerForBinaryRelocationMap> DLBRM) {
+ErrorOr<std::unique_ptr<DWARFFile>>
+DwarfLinkerForBinary::loadObject(const DebugMapObject &Obj,
+                                 const DebugMap &DebugMap,
+                                 remarks::RemarkLinker &RL) {
   auto ErrorOrObj = loadObject(Obj, DebugMap.getTriple());
   std::unique_ptr<DWARFFile> Res;
 
@@ -284,10 +248,9 @@ ErrorOr<std::unique_ptr<DWARFFile>> DwarfLinkerForBinary::loadObject(
             reportWarning(Info.message());
           });
         });
-    DLBRM->init(*Context);
     Res = std::make_unique<DWARFFile>(
         Obj.getObjectFilename(), std::move(Context),
-        std::make_unique<AddressManager>(*this, *ErrorOrObj, Obj, DLBRM),
+        std::make_unique<AddressManager>(*this, *ErrorOrObj, Obj),
         [&](StringRef FileName) { BinHolder.eraseObjectEntry(FileName); });
 
     Error E = RL.link(*ErrorOrObj);
@@ -630,7 +593,7 @@ template <typename Linker>
 bool DwarfLinkerForBinary::linkImpl(
     const DebugMap &Map, typename Linker::OutputFileType ObjectType) {
 
-  std::vector<ObjectWithRelocMap> ObjectsForLinking;
+  std::vector<std::unique_ptr<DWARFFile>> ObjectsForLinking;
 
   DebugMap DebugMap(Map.getTriple(), Map.getBinaryPath());
 
@@ -703,11 +666,10 @@ bool DwarfLinkerForBinary::linkImpl(
     auto &Obj = DebugMap.addDebugMapObject(
         Path, sys::TimePoint<std::chrono::seconds>(), MachO::N_OSO);
 
-    auto DLBRelocMap = std::make_shared<DwarfLinkerForBinaryRelocationMap>();
     if (ErrorOr<std::unique_ptr<DWARFFile>> ErrorOrObj =
-            loadObject(Obj, DebugMap, RL, DLBRelocMap)) {
-      ObjectsForLinking.emplace_back(std::move(*ErrorOrObj), DLBRelocMap);
-      return *ObjectsForLinking.back().Object;
+            loadObject(Obj, DebugMap, RL)) {
+      ObjectsForLinking.emplace_back(std::move(*ErrorOrObj));
+      return *ObjectsForLinking.back();
     } else {
       // Try and emit more helpful warnings by applying some heuristics.
       StringRef ObjFile = ContainerName;
@@ -818,18 +780,15 @@ bool DwarfLinkerForBinary::linkImpl(
       continue;
     }
 
-    auto DLBRelocMap = std::make_shared<DwarfLinkerForBinaryRelocationMap>();
     if (ErrorOr<std::unique_ptr<DWARFFile>> ErrorOrObj =
-            loadObject(*Obj, Map, RL, DLBRelocMap)) {
-      ObjectsForLinking.emplace_back(std::move(*ErrorOrObj), DLBRelocMap);
-      GeneralLinker->addObjectFile(*ObjectsForLinking.back().Object, Loader,
+            loadObject(*Obj, Map, RL)) {
+      ObjectsForLinking.emplace_back(std::move(*ErrorOrObj));
+      GeneralLinker->addObjectFile(*ObjectsForLinking.back(), Loader,
                                    OnCUDieLoaded);
     } else {
-      ObjectsForLinking.push_back(
-          {std::make_unique<DWARFFile>(Obj->getObjectFilename(), nullptr,
-                                       nullptr),
-           DLBRelocMap});
-      GeneralLinker->addObjectFile(*ObjectsForLinking.back().Object);
+      ObjectsForLinking.push_back({std::make_unique<DWARFFile>(
+          Obj->getObjectFilename(), nullptr, nullptr)});
+      GeneralLinker->addObjectFile(*ObjectsForLinking.back());
     }
   }
 
@@ -854,9 +813,6 @@ bool DwarfLinkerForBinary::linkImpl(
   if (Options.NoOutput)
     return true;
 
-  if (Error E = emitRelocations(Map, ObjectsForLinking))
-    return error(toString(std::move(E)));
-
   if (Options.ResourceDir && !ParseableSwiftInterfaces.empty()) {
     StringRef ArchName = Triple::getArchTypeName(Map.getTriple().getArch());
     if (auto E = copySwiftInterfaces(ArchName))
@@ -941,14 +897,12 @@ void DwarfLinkerForBinary::AddressManager::findValidRelocsMachO(
         continue;
       }
       if (const auto *Mapping = DMO.lookupSymbol(*SymbolName))
-        ValidRelocs.emplace_back(Offset64, RelocSize, Addend, Mapping->getKey(),
-                                 Mapping->getValue());
+        ValidRelocs.emplace_back(Offset64, RelocSize, Addend, Mapping);
     } else if (const auto *Mapping = DMO.lookupObjectAddress(SymAddress)) {
       // Do not store the addend. The addend was the address of the symbol in
       // the object file, the address in the binary that is stored in the debug
       // map doesn't need to be offset.
-      ValidRelocs.emplace_back(Offset64, RelocSize, SymOffset,
-                               Mapping->getKey(), Mapping->getValue());
+      ValidRelocs.emplace_back(Offset64, RelocSize, SymOffset, Mapping);
     }
   }
 }
@@ -1002,13 +956,18 @@ bool DwarfLinkerForBinary::AddressManager::findValidRelocsInDebugSections(
   return FoundValidRelocs;
 }
 
-std::vector<ValidReloc> DwarfLinkerForBinary::AddressManager::getRelocations(
-    const std::vector<ValidReloc> &Relocs, uint64_t StartPos, uint64_t EndPos) {
-  std::vector<ValidReloc> Res;
-
-  auto CurReloc = partition_point(Relocs, [StartPos](const ValidReloc &Reloc) {
-    return (uint64_t)Reloc.Offset < StartPos;
-  });
+std::vector<DwarfLinkerForBinary::AddressManager::ValidReloc>
+DwarfLinkerForBinary::AddressManager::getRelocations(
+    const std::vector<DwarfLinkerForBinary::AddressManager::ValidReloc> &Relocs,
+    uint64_t StartPos, uint64_t EndPos) {
+  std::vector<DwarfLinkerForBinary::AddressManager::ValidReloc> Res;
+
+  auto CurReloc = partition_point(
+      Relocs,
+      [StartPos](
+          const DwarfLinkerForBinary::AddressManager::ValidReloc &Reloc) {
+        return (uint64_t)Reloc.Offset < StartPos;
+      });
 
   while (CurReloc != Relocs.end() && CurReloc->Offset >= StartPos &&
          (uint64_t)CurReloc->Offset < EndPos) {
@@ -1019,22 +978,46 @@ std::vector<ValidReloc> DwarfLinkerForBinary::AddressManager::getRelocations(
   return Res;
 }
 
-void DwarfLinkerForBinary::AddressManager::printReloc(const ValidReloc &Reloc) {
-  const auto &Mapping = Reloc.SymbolMapping;
-  const uint64_t ObjectAddress = Mapping.ObjectAddress
-                                     ? uint64_t(*Mapping.ObjectAddress)
+void DwarfLinkerForBinary::AddressManager::printReloc(
+    const DwarfLinkerForBinary::AddressManager::ValidReloc &Reloc) {
+  printMapping(Reloc.Mapping->getKey(), Reloc.Mapping->getValue());
+}
+
+void DwarfLinkerForBinary::AddressManager::printMapping(
+    StringRef SymName, const DebugMapObject::SymbolMapping &SymMapping) {
+  const uint64_t ObjectAddress = SymMapping.ObjectAddress
+                                     ? uint64_t(*SymMapping.ObjectAddress)
                                      : std::numeric_limits<uint64_t>::max();
 
-  outs() << "Found valid debug map entry: " << Reloc.SymbolName << "\t"
+  outs() << "Found valid debug map entry: " << SymName << "\t"
          << format("0x%016" PRIx64 " => 0x%016" PRIx...
[truncated]

@avl-llvm
Copy link
Collaborator Author

avl-llvm commented Feb 4, 2024

cc author of initial patch who added support of mergeable libraries https://reviews.llvm.org/D158124 @Alpha-10000

@avl-llvm
Copy link
Collaborator Author

friendly ping.

1 similar comment
@avl-llvm
Copy link
Collaborator Author

friendly ping.

@JDevlieghere
Copy link
Member

I was waiting for @Alpha-10000 to take a look. I've pinged him offline.

@Alpha-10000
Copy link
Contributor

Alpha-10000 commented Mar 20, 2024

I did not see this PR, sorry for the delayed response.
Generally looks good!
I verified that the attached relink.tar.gz produces a debuggable executable.
(make mrproper && make all-relink && make strip && lldb)

A few minor outstanding things:

  • The tests need to be updated to reflect the changes, to at the very list stop referencing relocation maps.
  • Similarly to [dsymutil] Fix spurious warnings in MachODebugMapParser #78794, could you please further update switchToNewLibDebugMapObject so that addCommonSymbols/resetParserState is at the beginning of the function and that duplicate N_LIB entry names are tracked? To void time spent handling errors and emitting warnings.

Otherwise, the rest is fine by me 👍

@fredriss
Copy link
Collaborator

I don't think this new approach covers all cases. The initial implementation of llvm-dsymutil didn't care about relocations and would just go through the Dwarf and patch things that looked like they needed patching like the low_pc of functions. I discovered that these are however not the only places where relocations occur. Global variables have relocations too, and weirdly I've seen relocations in the middle of a complex Dwarf location expression. Falling back on actually relocation processing was the way we handled these cases.

@Alpha-10000
Copy link
Contributor

Additionally, there is on outstanding thing, regarding compatibility with linking dSYM produced before the changes.
I attached a fairly straightforward reproducer in relink.tar.gz.
Replace the dsymutil paths with your local ones, and run make mrproper && make all-relink && make mix && make strip && lldb.
You can see how some external data has an incorrect type:

   8   	{
   9   	  int result = foo();
-> 10  	  printf("FOO %d\n", result + baz - altfoo());
   11  	  display();
   12  	  return 0;
   13  	}
Target 0: (relink) stopped.
(lldb) p baz
(void *) 0x0000000000000003 // <- should be int

@avl-llvm avl-llvm force-pushed the avl-llvm/backport-mergeable-libs3 branch from 78d230b to 4ad5fb9 Compare March 24, 2024 16:45
Copy link

✅ With the latest revision this PR passed the Python code formatter.

Copy link

✅ With the latest revision this PR passed the C/C++ code formatter.

@avl-llvm
Copy link
Collaborator Author

I did not see this PR, sorry for the delayed response. Generally looks good! I verified that the attached relink.tar.gz produces a debuggable executable. (make mrproper && make all-relink && make strip && lldb)

A few minor outstanding things:

* The tests need to be updated to reflect the changes, to at the very list stop referencing relocation maps.

* Similarly to [[dsymutil] Fix spurious warnings in MachODebugMapParser #78794](https://github.com/llvm/llvm-project/pull/78794), could you please further update `switchToNewLibDebugMapObject` so that `addCommonSymbols/resetParserState`  is at the beginning of the function and that duplicate N_LIB entry names are tracked? To void time spent handling errors and emitting warnings.

Otherwise, the rest is fine by me 👍

Thank you for the comments! I updated patch for switchToNewLibDebugMapObject(moved addCommonSymbols/resetParserState and check for duplicate N_LIB entry names).

I did not understand what should be done for the tests. Could you point for the relocation map reference which should be deleted, please?

@avl-llvm
Copy link
Collaborator Author

I don't think this new approach covers all cases. The initial implementation of llvm-dsymutil didn't care about relocations and would just go through the Dwarf and patch things that looked like they needed patching like the low_pc of functions. I discovered that these are however not the only places where relocations occur. Global variables have relocations too, and weirdly I've seen relocations in the middle of a complex Dwarf location expression. Falling back on actually relocation processing was the way we handled these cases.

My understanding is that this new patch does not broke this behavior. Let`s see how it work for global variables:

  1. file foo.o has following definition for some global variable:
   DW_TAG_variable
     DW_AT_name var1
     DW_AT_location  "DW_OP_addr 0x1000"  <<< object file contains a relocation for DW_OP_addr value.

  1. when dsymutil links foo.o into foo.dylib it links relocation through the debug map. The foo.dylib
    will contain following:
   DW_TAG_variable
     DW_AT_name var1
     DW_AT_location  "DW_OP_addr 0x2000"  <<< dylib contains absolute linked value.

  1. when dsymutil links another file referencing foo.dylib it reads absolute value 0x2000 and
    links it through the new debug map. So the final file contains:
   DW_TAG_variable
     DW_AT_name var1
     DW_AT_location  "DW_OP_addr 0x3000"  <<< final file contains absolute relinked value.

Are there cases which would be handled incorrectly using this scheme?

@avl-llvm
Copy link
Collaborator Author

Additionally, there is on outstanding thing, regarding compatibility with linking dSYM produced before the changes. I attached a fairly straightforward reproducer in relink.tar.gz. Replace the dsymutil paths with your local ones, and run make mrproper && make all-relink && make mix && make strip && lldb. You can see how some external data has an incorrect type:

   8   	{
   9   	  int result = foo();
-> 10  	  printf("FOO %d\n", result + baz - altfoo());
   11  	  display();
   12  	  return 0;
   13  	}
Target 0: (relink) stopped.
(lldb) p baz
(void *) 0x0000000000000003 // <- should be int

Thank you for the testcase. I do not have a machine with XCode15 installed to check it right now. I would search for the machine and update the patch.

@Alpha-10000
Copy link
Contributor

Alpha-10000 commented Mar 28, 2024

I did not understand what should be done for the tests. Could you point for the relocation map reference which should be deleted, please?

All the .dSYM files under test/tools/dsymutil/Inputs/, with the Contents/Resources/Relocations/ directory

@avl-llvm avl-llvm force-pushed the avl-llvm/backport-mergeable-libs3 branch from 4ad5fb9 to 8ca3625 Compare March 28, 2024 20:24
@avl-llvm
Copy link
Collaborator Author

I did not understand what should be done for the tests. Could you point for the relocation map reference which should be deleted, please?

All the .dSYM files under test/tools/dsymutil/Inputs/, with the Contents/Resources/Relocations/ directory

Oh I see. Thanks. Removed Contents/Resources/Relocations.

This patch refactors working with mergeable libraries.
Instead of storing relocations pointing to address attributes
of mergeable libraries, this patch links address attributes
through the new debug map. This solution significantly simplifies
attributes handling(no need to track attribute offsets), minimizes
size of dSYM bundle and run-time memory requirements:

1. Link mergeable library as usual. Do not create new set of relocations.
2. When linked mergeable library is used as part of another library -
   read address attribute, link it through the new debug map,
   store linked value.
@avl-llvm avl-llvm force-pushed the avl-llvm/backport-mergeable-libs3 branch from 8ca3625 to a0de793 Compare March 28, 2024 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants