Skip to content

Conversation

JDevlieghere
Copy link
Member

Generally, separate debug info files are not object files. The exception
are dSYMs which are Mach-O and often contain enough information to be
used as a module.

This patch allows users to add dSYM symbol files as target modules to
support use cases where the executable isn't available. This scenario is
common when doing core file debugging and Jason has a follow up patch
that allows core file debugging without just the dSYM.

rdar://117773589

@llvmbot
Copy link
Member

llvmbot commented Jan 12, 2024

@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Changes

Generally, separate debug info files are not object files. The exception
are dSYMs which are Mach-O and often contain enough information to be
used as a module.

This patch allows users to add dSYM symbol files as target modules to
support use cases where the executable isn't available. This scenario is
common when doing core file debugging and Jason has a follow up patch
that allows core file debugging without just the dSYM.

rdar://117773589


Full diff: https://github.com/llvm/llvm-project/pull/77966.diff

8 Files Affected:

  • (modified) lldb/include/lldb/Symbol/ObjectFile.h (+4)
  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (+6)
  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h (+2)
  • (modified) lldb/source/Symbol/ObjectFile.cpp (+33)
  • (modified) lldb/source/Target/Target.cpp (+3-30)
  • (added) lldb/test/API/macosx/dsym-module/Makefile (+3)
  • (added) lldb/test/API/macosx/dsym-module/TestTargetModuleAddDsym.py (+23)
  • (added) lldb/test/API/macosx/dsym-module/main.c (+5)
diff --git a/lldb/include/lldb/Symbol/ObjectFile.h b/lldb/include/lldb/Symbol/ObjectFile.h
index 6348d8103f85de..fb946ee71873de 100644
--- a/lldb/include/lldb/Symbol/ObjectFile.h
+++ b/lldb/include/lldb/Symbol/ObjectFile.h
@@ -698,6 +698,10 @@ class ObjectFile : public std::enable_shared_from_this<ObjectFile>,
   /// file to be complete.
   virtual bool CanTrustAddressRanges() { return false; }
 
+  /// Can this object file serve as a valid target module. If a Status is passed
+  /// and the answer is no, it will be populated with the reason.
+  virtual bool CanBeTarget(Status *status = nullptr);
+
   static lldb::SymbolType GetSymbolTypeFromName(
       llvm::StringRef name,
       lldb::SymbolType symbol_type_hint = lldb::eSymbolTypeUndefined);
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
index d7a2846200fcaf..d623476c4c7243 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -6128,6 +6128,12 @@ Section *ObjectFileMachO::GetMachHeaderSection() {
   return nullptr;
 }
 
+bool ObjectFileMachO::CanBeTarget(Status *status ) {
+  if (m_header.filetype == MH_DSYM)
+    return true;
+  return ObjectFile::CanBeTarget(status);
+}
+
 bool ObjectFileMachO::SectionIsLoadable(const Section *section) {
   if (!section)
     return false;
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
index 0a47f3a7dd1861..7a2cda2b83ec21 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
@@ -160,6 +160,8 @@ class ObjectFileMachO : public lldb_private::ObjectFile {
 
   lldb_private::Section *GetMachHeaderSection();
 
+  virtual bool CanBeTarget(lldb_private::Status *status = nullptr) override;
+
   // PluginInterface protocol
   llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }
 
diff --git a/lldb/source/Symbol/ObjectFile.cpp b/lldb/source/Symbol/ObjectFile.cpp
index d890ad92e83122..06457be6b606d6 100644
--- a/lldb/source/Symbol/ObjectFile.cpp
+++ b/lldb/source/Symbol/ObjectFile.cpp
@@ -764,6 +764,39 @@ uint32_t ObjectFile::GetCacheHash() {
   return *m_cache_hash;
 }
 
+bool ObjectFile::CanBeTarget(Status *status) {
+  switch (GetType()) {
+  case ObjectFile::eTypeCoreFile:      /// A core file that has a checkpoint of
+                                       /// a program's execution state
+  case ObjectFile::eTypeExecutable:    /// A normal executable
+  case ObjectFile::eTypeDynamicLinker: /// The platform's dynamic linker
+                                       /// executable
+  case ObjectFile::eTypeObjectFile:    /// An intermediate object file
+  case ObjectFile::eTypeSharedLibrary: /// A shared library that can be
+                                       /// used during execution
+    return true;
+  case ObjectFile::eTypeDebugInfo: /// An object file that contains only
+                                   /// debug information
+    if (status)
+      status->SetErrorString("debug info files aren't valid target "
+                             "modules, please specify an executable");
+    return false;
+  case ObjectFile::eTypeStubLibrary: /// A library that can be linked
+                                     /// against but not used for
+                                     /// execution
+    if (status)
+      status->SetErrorString("stub libraries aren't valid target "
+                             "modules, please specify an executable");
+    return false;
+  default:
+    if (status)
+      status->SetErrorString(
+          "unsupported file type, please specify an executable");
+    return false;
+  }
+  llvm_unreachable("Fully covered switch above!");
+}
+
 namespace llvm {
 namespace json {
 
diff --git a/lldb/source/Target/Target.cpp b/lldb/source/Target/Target.cpp
index 302c2bad7021b9..e5e8658163fdba 100644
--- a/lldb/source/Target/Target.cpp
+++ b/lldb/source/Target/Target.cpp
@@ -2254,37 +2254,10 @@ ModuleSP Target::GetOrCreateModule(const ModuleSpec &module_spec, bool notify,
     // there wasn't an equivalent module in the list already, and if there was,
     // let's remove it.
     if (module_sp) {
-      ObjectFile *objfile = module_sp->GetObjectFile();
-      if (objfile) {
-        switch (objfile->GetType()) {
-        case ObjectFile::eTypeCoreFile: /// A core file that has a checkpoint of
-                                        /// a program's execution state
-        case ObjectFile::eTypeExecutable:    /// A normal executable
-        case ObjectFile::eTypeDynamicLinker: /// The platform's dynamic linker
-                                             /// executable
-        case ObjectFile::eTypeObjectFile:    /// An intermediate object file
-        case ObjectFile::eTypeSharedLibrary: /// A shared library that can be
-                                             /// used during execution
-          break;
-        case ObjectFile::eTypeDebugInfo: /// An object file that contains only
-                                         /// debug information
-          if (error_ptr)
-            error_ptr->SetErrorString("debug info files aren't valid target "
-                                      "modules, please specify an executable");
-          return ModuleSP();
-        case ObjectFile::eTypeStubLibrary: /// A library that can be linked
-                                           /// against but not used for
-                                           /// execution
-          if (error_ptr)
-            error_ptr->SetErrorString("stub libraries aren't valid target "
-                                      "modules, please specify an executable");
-          return ModuleSP();
-        default:
-          if (error_ptr)
-            error_ptr->SetErrorString(
-                "unsupported file type, please specify an executable");
+      if (ObjectFile *objfile = module_sp->GetObjectFile()) {
+        if (!objfile->CanBeTarget(error_ptr))
           return ModuleSP();
-        }
+
         // GetSharedModule is not guaranteed to find the old shared module, for
         // instance in the common case where you pass in the UUID, it is only
         // going to find the one module matching the UUID.  In fact, it has no
diff --git a/lldb/test/API/macosx/dsym-module/Makefile b/lldb/test/API/macosx/dsym-module/Makefile
new file mode 100644
index 00000000000000..99998b20bcb050
--- /dev/null
+++ b/lldb/test/API/macosx/dsym-module/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/macosx/dsym-module/TestTargetModuleAddDsym.py b/lldb/test/API/macosx/dsym-module/TestTargetModuleAddDsym.py
new file mode 100644
index 00000000000000..05509a34e9f936
--- /dev/null
+++ b/lldb/test/API/macosx/dsym-module/TestTargetModuleAddDsym.py
@@ -0,0 +1,23 @@
+import os
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+
+
+@skipUnlessDarwin
+class TargetModuleAddDsymTest(TestBase):
+
+    @no_debug_info_test
+    def test_target_module_add(self):
+        """Test that you can add a dSYM as a module."""
+        self.build(debug_info="dsym")
+
+        exe_path = self.getBuildArtifact("a.out")
+        dsym_path = exe_path + ".dSYM"
+        sym_path = os.path.join(dsym_path, 'Contents','Resources','DWARF','a.out')
+
+        exe = self.getBuildArtifact("a.out")
+        self.dbg.CreateTarget(exe)
+
+        self.runCmd("target module add %s" % sym_path)
+        self.expect("image list", substrs=[sym_path])
+
diff --git a/lldb/test/API/macosx/dsym-module/main.c b/lldb/test/API/macosx/dsym-module/main.c
new file mode 100644
index 00000000000000..b5c9a5152d61d9
--- /dev/null
+++ b/lldb/test/API/macosx/dsym-module/main.c
@@ -0,0 +1,5 @@
+#include <stdio.h>
+
+int main(int argc, char const *argv[]) {
+  return argc + 1;
+}

Copy link

github-actions bot commented Jan 12, 2024

✅ With the latest revision this PR passed the Python code formatter.

Copy link

github-actions bot commented Jan 12, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff dd5ce4572fb90323799f1bdf585c01d08613e277 761f4c18a13d0e50f570bf1d3a8d79ce8b26ddc0 -- lldb/test/API/macosx/dsym-module/main.c lldb/include/lldb/Symbol/ObjectFile.h lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h lldb/source/Symbol/ObjectFile.cpp lldb/source/Target/Target.cpp
View the diff from clang-format here.
diff --git a/lldb/test/API/macosx/dsym-module/main.c b/lldb/test/API/macosx/dsym-module/main.c
index b5c9a5152d..9cbfa1591e 100644
--- a/lldb/test/API/macosx/dsym-module/main.c
+++ b/lldb/test/API/macosx/dsym-module/main.c
@@ -1,5 +1,3 @@
 #include <stdio.h>
 
-int main(int argc, char const *argv[]) {
-  return argc + 1;
-}
+int main(int argc, char const *argv[]) { return argc + 1; }

Generally, separate debug info files are not object files. The exception
are dSYMs which are Mach-O and often contain enough information to be
used as a module.

This patch allows users to add dSYM symbol files as target modules to
support use cases where the executable isn't available. This scenario is
common when doing core file debugging and Jason has a follow up patch
that allows core file debugging without just the dSYM.

rdar://117773589
Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a module is debug info only, like a dSYM file, and then you try to target module add it to a target, it would be nice to add it as the symbol file for an already existing module. The test below, I wouldn't want a target with two things representing a.out, which is what is happening. This can cause lookups to return less valuable results because we will always match a.out first, which has no debug info.

Would it be better to be modify "target symbols add" to allow it to add the symbol file as a module if it returns true from a call to "ObjectFile::CanBeTargetModule(...)" but only if there are no matching modules to add the symbol file to?

sym_path = os.path.join(dsym_path, "Contents", "Resources", "DWARF", "a.out")

exe = self.getBuildArtifact("a.out")
self.dbg.CreateTarget(exe)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates a target with a.out, then adds another module which represents the exact same with from withing a.out.dSYM. If you build a target like this, and then load the sections in both, then lookups will always find the a.out first when resolving addresses.

Copy link
Member Author

@JDevlieghere JDevlieghere Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the test isn't meant to be particularly useful, it's mostly here to prove that the functionality works. The reason I use the same dSYM is so I don't have to mess with the triple to create the target. I could have added it to an empty target but this was just easier.

@clayborg
Copy link
Collaborator

If a module is debug info only, like a dSYM file, and then you try to target module add it to a target, it would be nice to add it as the symbol file for an already existing module. The test below, I wouldn't want a target with two things representing a.out, which is what is happening. This can cause lookups to return less valuable results because we will always match a.out first, which has no debug info.

Would it be better to be modify "target symbols add" to allow it to add the symbol file as a module if it returns true from a call to "ObjectFile::CanBeTargetModule(...)" but only if there are no matching modules to add the symbol file to?

@JDevlieghere Any comments on the above comment?

@JDevlieghere
Copy link
Member Author

If a module is debug info only, like a dSYM file, and then you try to target module add it to a target, it would be nice to add it as the symbol file for an already existing module. The test below, I wouldn't want a target with two things representing a.out, which is what is happening. This can cause lookups to return less valuable results because we will always match a.out first, which has no debug info.
Would it be better to be modify "target symbols add" to allow it to add the symbol file as a module if it returns true from a call to "ObjectFile::CanBeTargetModule(...)" but only if there are no matching modules to add the symbol file to?

@JDevlieghere Any comments on the above comment?

Your suggestion is to try target symbols add <dSYM> first when someone does target module add <dSYM> so that if a target module already exists, it's getting added there, as is the case in the test. If it doesn't, we do the same thing the test I added does (which I'll modify to not match the main module). I think that makes sense and can prevents users from accidentally doing the wrong thing. I'll see how easy/hard it would be to make that work at the command level. 👍

@clayborg
Copy link
Collaborator

clayborg commented Jan 16, 2024

If a module is debug info only, like a dSYM file, and then you try to target module add it to a target, it would be nice to add it as the symbol file for an already existing module. The test below, I wouldn't want a target with two things representing a.out, which is what is happening. This can cause lookups to return less valuable results because we will always match a.out first, which has no debug info.
Would it be better to be modify "target symbols add" to allow it to add the symbol file as a module if it returns true from a call to "ObjectFile::CanBeTargetModule(...)" but only if there are no matching modules to add the symbol file to?

@JDevlieghere Any comments on the above comment?

Your suggestion is to try target symbols add <dSYM> first when someone does target module add <dSYM> so that if a target module already exists, it's getting added there, as is the case in the test. If it doesn't, we do the same thing the test I added does (which I'll modify to not match the main module). I think that makes sense and can prevents users from accidentally doing the wrong thing. I'll see how easy/hard it would be to make that work at the command level. 👍

Yeah, in target module add we could test if the object file returns eTypeDebugInfo as the type, then if so we can add it as the symbol file for an object file, but it there isn't any match, fall back to just adding it as a module if the object file returns true from ObjectFile::CanBeTargetModule().

The other way to do this is to do this work in target symbols add only and not allow target module add to work with symbol files. Where symbols add would try to add the symbols, but if it fails, then add it as a module if the object file returns true from ObjectFile::CanBeTargetModule().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants