Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Object][Wasm] Move WasmSymbolInfo directly into WasmSymbol (NFC) #80219

Merged
merged 5 commits into from
Feb 2, 2024

Conversation

dschuff
Copy link
Member

@dschuff dschuff commented Jan 31, 2024

Move the WasmSymbolInfos from their own vector on the WasmLinkingData directly into the WasmSymbol object. Removing the const-ref to an external object allows the vector of WasmSymbols to be safely expanded/reallocated; generating symbol info from the name section will require this, as the numbers of function and data segment names are stored separately.

This is a step toward generating symbol information from name sections for #76107

Copy link

github-actions bot commented Jan 31, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@dschuff
Copy link
Member Author

dschuff commented Feb 1, 2024

@sbc100 this is a prototype PR to move the WasmSymbolInfos from their own vector on the WasmLinkingData directly into the WasmSymbol object. Removing the const-ref to an external object allows the vector of WasmSymbols to be safely expanded/reallocated (which we'd need to do if we want to synthesize symbols from export and name sections together).
It can maybe be further refined, but mostly I'm wondering if you recall why the organization was this way in the first place. The WasymSymbolInfos never seem to be accessed other than via the WasmSymbol itself, so this ended up being a very minimal change.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 1, 2024

@sbc100 this is a prototype PR to move the WasmSymbolInfos from their own vector on the WasmLinkingData directly into the WasmSymbol object. Removing the const-ref to an external object allows the vector of WasmSymbols to be safely expanded/reallocated (which we'd need to do if we want to synthesize symbols from export and name sections together). It can maybe be further refined, but mostly I'm wondering if you recall why the organization was this way in the first place. The WasymSymbolInfos never seem to be accessed other than via the WasmSymbol itself, so this ended up being a very minimal change.

I think it as just so that the WasmLinkingData struct could contain all the stuff in the linking metadata section.. trying to keep the stuf in Wasm.h as direct a mapping to the binary file as possible. I guess that has not proved very useful so this change seems reasonable. Do we need WasmLinkingData at all anymore?

@dschuff
Copy link
Member Author

dschuff commented Feb 1, 2024

We just discussed live and decided that we don't need to combine symbol info from the export and name sections, that the names from the name section can take precedence. However I realized that we will still need to fix how these are stored because the name section does not declare the total number of names up front (instead it has subsections for each name kind). So assuming we want symbols for both function and data section names, we can't allocate the vectors up front. So we'll either need to change the data representation as this PR does, or do a preparse step to skip ahead in the section and find the total name count before reading the names.

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 1, 2024

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-llvm-binary-utilities
@llvm/pr-subscribers-backend-webassembly

@llvm/pr-subscribers-objectyaml

Author: Derek Schuff (dschuff)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/80219.diff

5 Files Affected:

  • (modified) lld/wasm/InputFiles.cpp (+10-10)
  • (modified) llvm/include/llvm/BinaryFormat/Wasm.h (+5-1)
  • (modified) llvm/include/llvm/Object/Wasm.h (+3-1)
  • (modified) llvm/lib/Object/WasmObjectFile.cpp (+2-7)
  • (modified) llvm/tools/obj2yaml/wasm2yaml.cpp (+2-1)
diff --git a/lld/wasm/InputFiles.cpp b/lld/wasm/InputFiles.cpp
index f43c39b218787..60ee60d9a3aa9 100644
--- a/lld/wasm/InputFiles.cpp
+++ b/lld/wasm/InputFiles.cpp
@@ -322,20 +322,20 @@ void ObjFile::addLegacyIndirectFunctionTableIfNeeded(
     return;
   }
 
-  auto *info = make<WasmSymbolInfo>();
-  info->Name = tableImport->Field;
-  info->Kind = WASM_SYMBOL_TYPE_TABLE;
-  info->ImportModule = tableImport->Module;
-  info->ImportName = tableImport->Field;
-  info->Flags = WASM_SYMBOL_UNDEFINED;
-  info->Flags |= WASM_SYMBOL_NO_STRIP;
-  info->ElementIndex = 0;
-  LLVM_DEBUG(dbgs() << "Synthesizing symbol for table import: " << info->Name
+  WasmSymbolInfo info;
+  info.Name = tableImport->Field;
+  info.Kind = WASM_SYMBOL_TYPE_TABLE;
+  info.ImportModule = tableImport->Module;
+  info.ImportName = tableImport->Field;
+  info.Flags = WASM_SYMBOL_UNDEFINED;
+  info.Flags |= WASM_SYMBOL_NO_STRIP;
+  info.ElementIndex = 0;
+  LLVM_DEBUG(dbgs() << "Synthesizing symbol for table import: " << info.Name
                     << "\n");
   const WasmGlobalType *globalType = nullptr;
   const WasmSignature *signature = nullptr;
   auto *wasmSym =
-      make<WasmSymbol>(*info, globalType, &tableImport->Table, signature);
+      make<WasmSymbol>(info, globalType, &tableImport->Table, signature);
   Symbol *sym = createUndefined(*wasmSym, false);
   // We're only sure it's a TableSymbol if the createUndefined succeeded.
   if (errorCount())
diff --git a/llvm/include/llvm/BinaryFormat/Wasm.h b/llvm/include/llvm/BinaryFormat/Wasm.h
index cd13b7014bfe2..aec6ea0b75779 100644
--- a/llvm/include/llvm/BinaryFormat/Wasm.h
+++ b/llvm/include/llvm/BinaryFormat/Wasm.h
@@ -467,11 +467,15 @@ struct WasmDebugName {
   StringRef Name;
 };
 
+// Info from the linking metadata section of a wasm object file.
 struct WasmLinkingData {
   uint32_t Version;
   std::vector<WasmInitFunc> InitFunctions;
   std::vector<StringRef> Comdats;
-  std::vector<WasmSymbolInfo> SymbolTable;
+  // The linking section also contains a symbol table. This info (represented
+  // in a WasmSymbolInfo struct) is stored inside the WasmSymbol object instead
+  // of in this structure; this allows vectors of WasmSymbols and
+  // WasmLinkingDatas to be reallocated.
 };
 
 struct WasmSignature {
diff --git a/llvm/include/llvm/Object/Wasm.h b/llvm/include/llvm/Object/Wasm.h
index 927dce882f6ae..13d9a17e24c3d 100644
--- a/llvm/include/llvm/Object/Wasm.h
+++ b/llvm/include/llvm/Object/Wasm.h
@@ -43,7 +43,9 @@ class WasmSymbol {
     assert(!Signature || Signature->Kind != wasm::WasmSignature::Placeholder);
   }
 
-  const wasm::WasmSymbolInfo &Info;
+  // Symbol info as represented in the symbol's 'syminfo' entry of an object
+  // file's symbol table.
+  wasm::WasmSymbolInfo Info;
   const wasm::WasmGlobalType *GlobalType;
   const wasm::WasmTableType *TableType;
   const wasm::WasmSignature *Signature;
diff --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index 953e7c70b5f6e..7e10bd8071aee 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -642,9 +642,7 @@ Error WasmObjectFile::parseLinkingSectionSymtab(ReadContext &Ctx) {
   uint32_t Count = readVaruint32(Ctx);
   // Clear out any symbol information that was derived from the exports
   // section.
-  LinkingData.SymbolTable.clear();
   Symbols.clear();
-  LinkingData.SymbolTable.reserve(Count);
   Symbols.reserve(Count);
   StringSet<> SymbolNames;
 
@@ -844,8 +842,7 @@ Error WasmObjectFile::parseLinkingSectionSymtab(ReadContext &Ctx) {
       return make_error<GenericBinaryError>("duplicate symbol name " +
                                                 Twine(Info.Name),
                                             object_error::parse_failed);
-    LinkingData.SymbolTable.emplace_back(Info);
-    Symbols.emplace_back(LinkingData.SymbolTable.back(), GlobalType, TableType,
+    Symbols.emplace_back(Info, GlobalType, TableType,
                          Signature);
     LLVM_DEBUG(dbgs() << "Adding symbol: " << Symbols.back() << "\n");
   }
@@ -1390,7 +1387,6 @@ Error WasmObjectFile::parseGlobalSection(ReadContext &Ctx) {
 Error WasmObjectFile::parseExportSection(ReadContext &Ctx) {
   uint32_t Count = readVaruint32(Ctx);
   Exports.reserve(Count);
-  LinkingData.SymbolTable.reserve(Count);
   Symbols.reserve(Count);
   for (uint32_t I = 0; I < Count; I++) {
     wasm::WasmExport Ex;
@@ -1455,8 +1451,7 @@ Error WasmObjectFile::parseExportSection(ReadContext &Ctx) {
     }
     Exports.push_back(Ex);
     if (Ex.Kind != wasm::WASM_EXTERNAL_MEMORY) {
-      LinkingData.SymbolTable.emplace_back(Info);
-      Symbols.emplace_back(LinkingData.SymbolTable.back(), GlobalType,
+      Symbols.emplace_back(Info, GlobalType,
                            TableType, Signature);
       LLVM_DEBUG(dbgs() << "Adding symbol: " << Symbols.back() << "\n");
     }
diff --git a/llvm/tools/obj2yaml/wasm2yaml.cpp b/llvm/tools/obj2yaml/wasm2yaml.cpp
index c15d64cef8d65..397b1fd0ab81f 100644
--- a/llvm/tools/obj2yaml/wasm2yaml.cpp
+++ b/llvm/tools/obj2yaml/wasm2yaml.cpp
@@ -124,7 +124,8 @@ WasmDumper::dumpCustomSection(const WasmSection &WasmSec) {
     }
 
     uint32_t SymbolIndex = 0;
-    for (const wasm::WasmSymbolInfo &Symbol : Obj.linkingData().SymbolTable) {
+    for (const object::SymbolRef &Sym : Obj.symbols()) {
+      const wasm::WasmSymbolInfo& Symbol = Obj.getWasmSymbol(Sym).Info;
       WasmYAML::SymbolInfo Info;
       Info.Index = SymbolIndex++;
       Info.Kind = static_cast<uint32_t>(Symbol.Kind);

@dschuff
Copy link
Member Author

dschuff commented Feb 1, 2024

Here's a slightly cleaned-up version; WDYT?

@dschuff dschuff requested a review from sbc100 February 1, 2024 23:09
info.ImportModule = tableImport->Module;
info.ImportName = tableImport->Field;
info.Flags = WASM_SYMBOL_UNDEFINED;
info.Flags |= WASM_SYMBOL_NO_STRIP;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combine the above two lines?

@dschuff dschuff merged commit ef1f999 into llvm:main Feb 2, 2024
3 of 4 checks passed
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
…vm#80219)

Move the WasmSymbolInfos from their own vector on the WasmLinkingData
directly into the WasmSymbol object. Removing the const-ref to an
external object allows the vector of WasmSymbols to be safely
expanded/reallocated; generating symbol info from the name section will
require this, as the numbers of function and data segment names are
stored separately.

This is a step toward generating symbol information from name sections
for llvm#76107
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants