Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Object][Wasm] Generate symbol info from name section names #81063

Merged
merged 10 commits into from
Feb 8, 2024

Conversation

dschuff
Copy link
Member

@dschuff dschuff commented Feb 8, 2024

Currently symbol info is generated from a linking section or from export names. This PR generates
symbols in a WasmObjectFile from the name section as well, which allows tools like objdump and
nm to show useful information for more linked binaryes. There are some limitations: most notably
that we don't assume any particular ABI, so we don't get detailed information about data symbols
if the segments are merged (which is the default).

Covers most of the desired functionality from #76107

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 8, 2024

@llvm/pr-subscribers-backend-webassembly

@llvm/pr-subscribers-llvm-binary-utilities

Author: Derek Schuff (dschuff)

Changes

Currently symbol info is generated from a linking section or from export names. This PR generates
symbols in a WasmObjectFile from the name section as well, which allows tools like objdump and
nm to show useful information for more linked binaryes. There are some limitations: most notably
that we don't assume any particular ABI, so we don't get detailed information about data symbols
if the segments are merged (which is the default).


Full diff: https://github.com/llvm/llvm-project/pull/81063.diff

2 Files Affected:

  • (modified) llvm/lib/Object/WasmObjectFile.cpp (+45-4)
  • (added) llvm/test/tools/llvm-objdump/wasm/linked-symbol-table-namesec.yaml (+87)
diff --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index 8c1bbe9c264593..ea1715446d9d90 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -508,10 +508,17 @@ Error WasmObjectFile::parseNameSection(ReadContext &Ctx) {
   llvm::DenseSet<uint64_t> SeenGlobals;
   llvm::DenseSet<uint64_t> SeenSegments;
 
+  // If there is symbol info from the export section, this info will supersede
+  // it, but not info from a linking section
+  if (!HasLinkingSection) {
+    Symbols.clear();
+  }
+
   while (Ctx.Ptr < Ctx.End) {
     uint8_t Type = readUint8(Ctx);
     uint32_t Size = readVaruint32(Ctx);
     const uint8_t *SubSectionEnd = Ctx.Ptr + Size;
+
     switch (Type) {
     case wasm::WASM_NAMES_FUNCTION:
     case wasm::WASM_NAMES_GLOBAL:
@@ -521,6 +528,16 @@ Error WasmObjectFile::parseNameSection(ReadContext &Ctx) {
         uint32_t Index = readVaruint32(Ctx);
         StringRef Name = readString(Ctx);
         wasm::NameType nameType = wasm::NameType::FUNCTION;
+        wasm::WasmSymbolInfo Info{Name,
+                                  /*Kind */ wasm::WASM_SYMBOL_TYPE_FUNCTION,
+                                  /* Flags */ 0,
+                                  /* ImportModule */ std::nullopt,
+                                  /* ImportName */ std::nullopt,
+                                  /* ExportName */ std::nullopt,
+                                  {/* ElementIndex */ Index}};
+        const wasm::WasmSignature *Signature = nullptr;
+        const wasm::WasmGlobalType *GlobalType = nullptr;
+        const wasm::WasmTableType *TableType = nullptr;
         if (Type == wasm::WASM_NAMES_FUNCTION) {
           if (!SeenFunctions.insert(Index).second)
             return make_error<GenericBinaryError>(
@@ -529,26 +546,50 @@ Error WasmObjectFile::parseNameSection(ReadContext &Ctx) {
             return make_error<GenericBinaryError>("invalid function name entry",
                                                   object_error::parse_failed);
 
-          if (isDefinedFunctionIndex(Index))
-            getDefinedFunction(Index).DebugName = Name;
+          if (isDefinedFunctionIndex(Index)) {
+            wasm::WasmFunction &F = getDefinedFunction(Index);
+            F.DebugName = Name;
+            Signature = &Signatures[F.SigIndex];
+            if (F.ExportName) {
+              Info.ExportName = F.ExportName;
+              Info.Flags |= wasm::WASM_SYMBOL_BINDING_GLOBAL;
+            } else {
+              Info.Flags |= wasm::WASM_SYMBOL_BINDING_LOCAL;
+            }
+          } else {
+            Info.Flags |= wasm::WASM_SYMBOL_UNDEFINED;
+          }
         } else if (Type == wasm::WASM_NAMES_GLOBAL) {
-          nameType = wasm::NameType::GLOBAL;
           if (!SeenGlobals.insert(Index).second)
             return make_error<GenericBinaryError>("global named more than once",
                                                   object_error::parse_failed);
           if (!isValidGlobalIndex(Index) || Name.empty())
             return make_error<GenericBinaryError>("invalid global name entry",
                                                   object_error::parse_failed);
+          nameType = wasm::NameType::GLOBAL;
+          Info.Kind = wasm::WASM_SYMBOL_TYPE_GLOBAL;
+          if (isDefinedGlobalIndex(Index)) {
+            GlobalType = &getDefinedGlobal(Index).Type;
+          } else {
+            Info.Flags |= wasm::WASM_SYMBOL_UNDEFINED;
+          }
         } else {
-          nameType = wasm::NameType::DATA_SEGMENT;
           if (!SeenSegments.insert(Index).second)
             return make_error<GenericBinaryError>(
                 "segment named more than once", object_error::parse_failed);
           if (Index > DataSegments.size())
             return make_error<GenericBinaryError>("invalid data segment name entry",
                                                   object_error::parse_failed);
+          nameType = wasm::NameType::DATA_SEGMENT;
+          Info.Kind = wasm::WASM_SYMBOL_TYPE_DATA;
+          Info.Flags |= wasm::WASM_SYMBOL_BINDING_LOCAL;
+          assert(Index < DataSegments.size());
+          Info.DataRef = wasm::WasmDataReference{
+              Index, 0, DataSegments[Index].Data.Content.size()};
         }
         DebugNames.push_back(wasm::WasmDebugName{nameType, Index, Name});
+        if (!HasLinkingSection)
+          Symbols.emplace_back(Info, GlobalType, TableType, Signature);
       }
       break;
     }
diff --git a/llvm/test/tools/llvm-objdump/wasm/linked-symbol-table-namesec.yaml b/llvm/test/tools/llvm-objdump/wasm/linked-symbol-table-namesec.yaml
new file mode 100644
index 00000000000000..622a6060b902a3
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/wasm/linked-symbol-table-namesec.yaml
@@ -0,0 +1,87 @@
+# RUN: yaml2obj %s -o %t.wasm
+# RUN: llvm-objdump -t %t.wasm | FileCheck %s
+#
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 00000000   F *UND* my_func_import_name
+# CHECK-NEXT: 00000083 g F CODE my_func_export_name
+# CHECK-NEXT: 00000086 l F CODE my_func_local_name
+# CHECK-NEXT: 00000000    *UND* my_global_import_name
+# CHECK-NEXT: 00000001 g  GLOBAL my_global_export_name
+# CHECK-NEXT: 00000000 l O DATA my_datasegment_name
+
+--- !WASM
+FileHeader:
+  Version:         0x1
+Sections:
+  - Type:            TYPE
+    Signatures:
+      - Index:           0
+        ParamTypes:      []
+        ReturnTypes:     []
+  - Type:            IMPORT
+    Imports:
+      - Module:          env
+        Field:           foo
+        Kind:            FUNCTION
+        SigIndex:        0
+      - Module:          env
+        Field:           bar
+        Kind:            GLOBAL
+        GlobalType:      I32
+        GlobalMutable:   true
+      - Module:          env
+        Field:           memory
+        Kind:            MEMORY
+        Memory:
+          Minimum:         0x1
+  - Type:            FUNCTION
+    FunctionTypes:   [ 0, 0 ]
+  - Type:            GLOBAL
+    Globals:
+      - Index:           1
+        Mutable:         false
+        Type:            I32
+        InitExpr:
+          Opcode:          I32_CONST
+          Value:           42
+  - Type:            EXPORT
+    Exports:
+      - Name:            my_func_export
+        Kind:            FUNCTION
+        Index:           1
+      - Name:            my_global_export
+        Kind:            GLOBAL
+        Index:           1
+  - Type:            CODE
+    Functions:
+      - Index:           1
+        Locals:
+        Body:            00
+      - Index:           2
+        Locals:
+        Body:            00
+  - Type:            DATA
+    Segments:
+      - SectionOffset:   0
+        InitFlags:       0
+        Offset:
+          Opcode:          I32_CONST
+          Value:           0
+        Content:         'abcd1234'
+  - Type:            CUSTOM
+    Name:            name
+    FunctionNames:
+     - Index:        0
+       Name:         my_func_import_name
+     - Index:        1
+       Name:         my_func_export_name
+     - Index:        2
+       Name:         my_func_local_name
+    GlobalNames:
+     - Index:        0
+       Name:         my_global_import_name
+     - Index:        1
+       Name:         my_global_export_name
+    DataSegmentNames:
+     - Index:        0
+       Name:         my_datasegment_name

GlobalType = &getDefinedGlobal(Index).Type;
} else {
Info.Flags |= wasm::WASM_SYMBOL_UNDEFINED;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem here is that we can't distinguish between symbols of type GLOBAL and symbols of type DATA. In dynamic linking we exports data symbols as wasm globals. I guess its not huge deal if nm gets this wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the text above about limitations kind of gets at that. You'll just see one DATA symbol per segment, and one GLOBAL symbol per global but they won't be linked.
We could try to be more sophisticated. I guess at least for dylibs we'll know it's a dylib before we parse the name section, right (because there's a custom section before the known sections)? So we could try to get it right in that case.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % comments

@@ -0,0 +1,87 @@
# RUN: yaml2obj %s -o %t.wasm
# RUN: llvm-objdump -t %t.wasm | FileCheck %s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a test for nm too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? Either objdump or nm is sufficient to cover this code, since they both just dump info from the Object, and (hopefully) we have other tests for nm-specific things. I was thinking about updating nm to show symbol sizes for functions (currently we just do it for data symbols) so we could use a name section to test that and get some overlapping coverage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an nm test to also test that the linking section overrides the name section. I'll move both of the tests to test/Object after #81072 lands

@dschuff
Copy link
Member Author

dschuff commented Feb 8, 2024

This PR is now no longer dependent on #81072, PTAL. I'll test it with emscripten before I land it this time :)

@dschuff dschuff merged commit 5818572 into llvm:main Feb 8, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants