Skip to content

Conversation

Nerixyz
Copy link
Contributor

@Nerixyz Nerixyz commented Aug 18, 2025

This PR implements SymbolFileNativePDB::AddSymbols which adds public symbols to the symbol table.

These symbols are found in the publics stream. It contains mangled names coupled with addresses. Addresses are a pair of (segment, offset).
If I understood correctly, then the segment is the section ID from the COFF header. Sections are already constructed using this 1-based index (MS docs). This allows us to use section_list->FindSectionByID.

@Nerixyz Nerixyz requested a review from JDevlieghere as a code owner August 18, 2025 14:07
@llvmbot llvmbot added the lldb label Aug 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Aug 18, 2025

@llvm/pr-subscribers-lldb

Author: nerix (Nerixyz)

Changes

This PR implements SymbolFileNativePDB::AddSymbols which adds public symbols to the symbol table.

These symbols are found in the publics stream. It contains mangled names coupled with addresses. Addresses are a pair of (segment, offset).
If I understood correctly, then the segment is the section ID from the COFF header. Sections are already constructed using this 1-based index (MS docs). This allows us to use section_list->FindSectionByID.


Full diff: https://github.com/llvm/llvm-project/pull/154121.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp (+42-1)
  • (added) lldb/test/Shell/SymbolFile/NativePDB/symtab.cpp (+59)
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index 112eb06e462fc..ed466d8295f65 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -1054,7 +1054,48 @@ lldb::LanguageType SymbolFileNativePDB::ParseLanguage(CompileUnit &comp_unit) {
   return TranslateLanguage(item->m_compile_opts->getLanguage());
 }
 
-void SymbolFileNativePDB::AddSymbols(Symtab &symtab) {}
+void SymbolFileNativePDB::AddSymbols(Symtab &symtab) {
+  std::set<lldb::addr_t> existing_addresses;
+  for (size_t i = 0; i < symtab.GetNumSymbols(); i++)
+    existing_addresses.insert(symtab.SymbolAtIndex(i)->GetFileAddress());
+
+  auto *section_list = m_objfile_sp->GetSectionList();
+  if (!section_list)
+    return;
+
+  for (auto pid : m_index->publics().getPublicsTable()) {
+    PdbGlobalSymId global{pid, true};
+    CVSymbol sym = m_index->ReadSymbolRecord(global);
+    auto kind = sym.kind();
+    if (kind != S_PUB32)
+      continue;
+    PublicSym32 pub =
+        llvm::cantFail(SymbolDeserializer::deserializeAs<PublicSym32>(sym));
+
+    auto section_sp = section_list->FindSectionByID(pub.Segment);
+    if (!section_sp)
+      continue;
+
+    lldb::SymbolType type = eSymbolTypeData;
+    if ((pub.Flags & PublicSymFlags::Function) != PublicSymFlags::None ||
+        (pub.Flags & PublicSymFlags::Code) != PublicSymFlags::None)
+      type = eSymbolTypeCode;
+
+    symtab.AddSymbol(Symbol(/*symID=*/pid,
+                            /*name=*/pub.Name,
+                            /*type=*/type,
+                            /*external=*/true,
+                            /*is_debug=*/false,
+                            /*is_trampoline=*/false,
+                            /*is_artificial=*/false,
+                            /*section_sp=*/section_sp,
+                            /*value=*/pub.Offset,
+                            /*size=*/0,
+                            /*size_is_valid=*/false,
+                            /*contains_linker_annotations=*/false,
+                            /*flags=*/0));
+  }
+}
 
 size_t SymbolFileNativePDB::ParseFunctions(CompileUnit &comp_unit) {
   std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
diff --git a/lldb/test/Shell/SymbolFile/NativePDB/symtab.cpp b/lldb/test/Shell/SymbolFile/NativePDB/symtab.cpp
new file mode 100644
index 0000000000000..81d643d9572d8
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/NativePDB/symtab.cpp
@@ -0,0 +1,59 @@
+// REQUIRES: x86
+
+// Test symtab reading
+// RUN: %build --compiler=clang-cl --arch=64 --nodefaultlib -o %t.exe -- %s
+// RUN: env LLDB_USE_NATIVE_PDB_READER=1 lldb-test symtab %t.exe --find-symbols-by-regex=".*" | FileCheck %s
+// RUN: env LLDB_USE_NATIVE_PDB_READER=0 lldb-test symtab %t.exe --find-symbols-by-regex=".*" | FileCheck %s
+
+struct A {
+  void something() {}
+};
+
+namespace ns {
+template <typename T> struct B {
+  struct C {
+    static int static_fn() { return 1; }
+  };
+
+  int b_func() const { return 3; }
+};
+
+struct Dyn {
+  virtual ~Dyn() = default;
+};
+
+int a_function() { return 1; }
+} // namespace ns
+
+void *operator new(unsigned long long n) { return nullptr; }
+void operator delete(void *p, unsigned long long i) {}
+
+A global_a;
+ns::B<long long>::C global_c;
+int global_int;
+
+int main(int argc, char **argv) {
+  A a;
+  a.something();
+  ns::B<int>::C::static_fn();
+  ns::B<bool>::C::static_fn();
+  ns::B<short> b;
+  ns::Dyn dyn;
+  return ns::a_function() + b.b_func();
+}
+
+// CHECK-DAG: Code {{.*}} main
+// CHECK-DAG: Code {{.*}} ?b_func@?$B@F@ns@@QEBAHXZ
+// CHECK-DAG: Code {{.*}} ?something@A@@QEAAXXZ
+// CHECK-DAG: Code {{.*}} ??_GDyn@ns@@UEAAPEAXI@Z
+// CHECK-DAG: Code {{.*}} ??2@YAPEAX_K@Z
+// CHECK-DAG: Code {{.*}} ??3@YAXPEAX_K@Z
+// CHECK-DAG: Code {{.*}} ?static_fn@C@?$B@H@ns@@SAHXZ
+// CHECK-DAG: Code {{.*}} ?a_function@ns@@YAHXZ
+// CHECK-DAG: Code {{.*}} ?static_fn@C@?$B@_N@ns@@SAHXZ
+// CHECK-DAG: Code {{.*}} ??1Dyn@ns@@UEAA@XZ
+// CHECK-DAG: Code {{.*}} ??0Dyn@ns@@QEAA@XZ
+// CHECK-DAG: Data {{.*}} ?global_int@@3HA
+// CHECK-DAG: Data {{.*}} ??_7Dyn@ns@@6B@
+// CHECK-DAG: Data {{.*}} ?global_a@@3UA@@A
+// CHECK-DAG: Data {{.*}} ?global_c@@3UC@?$B@_J@ns@@A

/*name=*/pub.Name,
/*type=*/type,
/*external=*/true,
/*is_debug=*/false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this should be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably? I'm not sure what this flag signifies ("non-zero if this symbol is debug information in a symbol"), so I passed false like in SymbolFilePDB. If this means "this symbol comes from debug info", then it should be true in both cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear what the intention behind this flag is and how it supposed to be used. I can't seem to find a case where this is actually true.

It's part of the SB API so we could either start using it the way you've described it here or we could remove it and always return false at the SB level.

@jimingham @clayborg Any chance you remember the meaning of this flag?

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to align with how the PDB plugin does it, so LGTM

@Nerixyz Nerixyz requested a review from ZequanWu September 9, 2025 19:53
@Nerixyz
Copy link
Contributor Author

Nerixyz commented Sep 9, 2025

Seems to align with how the PDB plugin does it

The only difference is that PDB doesn't set the debug flag. I think the debug flag is more correct here, since the symbols we're adding aren't exposed in the binary but only in the debug info.

@Nerixyz Nerixyz merged commit 6578772 into llvm:main Sep 11, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants