[llvm-ifs] Handle more e_machine values for --target #128559

Merged: frobtech merged 2 commits into llvm:main on Mar 26, 2025

@frobtech frobtech commented Feb 24, 2025

This adds ELF::convertTripleArchTypeToEMachine and uses it in
llvm-ifs. It handles many more Triple::ArchType values than the
old code, though not all since I couldn't quickly discern
what all the mappings are.

@frobtech frobtech requested a review from zeroomega February 24, 2025 19:41
@frobtech frobtech marked this pull request as ready for review February 24, 2025 19:41
llvmbot commented Feb 24, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: Roland McGrath (frobtech)

Changes

This adds ELF::convertTripleArchTypeToEMachine and uses it in
llvm-ifs. It handles many more Triple::ArchType values than the
old code, though not all since I couldn't quickly discern
what all the mappings are.


Full diff: https://github.com/llvm/llvm-project/pull/128559.diff

4 Files Affected:

  • (modified) llvm/include/llvm/BinaryFormat/ELF.h (+4)
  • (modified) llvm/lib/BinaryFormat/ELF.cpp (+65)
  • (modified) llvm/lib/InterfaceStub/IFSHandler.cpp (+4-13)
  • (modified) llvm/test/tools/llvm-ifs/write-stub.test (+11-8)
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index 37eab89e706db..1f3cea4bd1ae6 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -20,6 +20,7 @@
 #define LLVM_BINARYFORMAT_ELF_H
 
 #include "llvm/ADT/StringRef.h"
+#include "llvm/TargetParser/Triple.h"
 #include <cstdint>
 #include <cstring>
 #include <type_traits>
@@ -2018,6 +2019,9 @@ uint16_t convertArchNameToEMachine(StringRef Arch);
 /// Convert an ELF's e_machine value into an architecture name.
 StringRef convertEMachineToArchName(uint16_t EMachine);
 
+// Convert a triple's architecture to ELF's e_machine value.
+uint16_t convertTripleArchTypeToEMachine(Triple::ArchType ArchType);
+
 // Convert a lowercase string identifier into an OSABI value.
 uint8_t convertNameToOSABI(StringRef Name);
 
diff --git a/llvm/lib/BinaryFormat/ELF.cpp b/llvm/lib/BinaryFormat/ELF.cpp
index 9878f5769087e..0ad1a09429e7c 100644
--- a/llvm/lib/BinaryFormat/ELF.cpp
+++ b/llvm/lib/BinaryFormat/ELF.cpp
@@ -200,6 +200,71 @@ uint16_t ELF::convertArchNameToEMachine(StringRef Arch) {
       .Default(EM_NONE);
 }
 
+uint16_t ELF::convertTripleArchTypeToEMachine(Triple::ArchType ArchType) {
+  switch (ArchType) {
+  case Triple::UnknownArch:
+  default:
+    return EM_NONE;
+
+  case Triple::arm:
+  case Triple::armeb:
+  case Triple::thumb:
+  case Triple::thumbeb:
+    return EM_ARM;
+  case Triple::aarch64:
+  case Triple::aarch64_be:
+  case Triple::aarch64_32:
+    return EM_AARCH64;
+  case Triple::arc:
+    return EM_ARC;
+  case Triple::avr:
+    return EM_AVR;
+  case Triple::bpfel:
+  case Triple::bpfeb:
+    return EM_BPF;
+  case Triple::csky:
+    return EM_CSKY;
+  case Triple::hexagon:
+    return EM_HEXAGON;
+  case Triple::loongarch32:
+  case Triple::loongarch64:
+    return EM_LOONGARCH;
+  case Triple::m68k:
+    return EM_68K;
+  case Triple::mips:
+  case Triple::mipsel:
+  case Triple::mips64:
+  case Triple::mips64el:
+    return EM_MIPS;
+  case Triple::msp430:
+    return EM_MSP430;
+  case Triple::ppc:
+  case Triple::ppcle:
+    return EM_PPC;
+  case Triple::ppc64:
+  case Triple::ppc64le:
+    return EM_PPC64;
+  case Triple::riscv32:
+  case Triple::riscv64:
+    return EM_RISCV;
+  case Triple::sparc:
+  case Triple::sparcel:
+    return EM_SPARC;
+  case Triple::sparcv9:
+    return EM_SPARCV9;
+  case Triple::systemz:
+    return EM_S390;
+  case Triple::x86:
+    return EM_386;
+  case Triple::x86_64:
+    return EM_X86_64;
+  case Triple::xcore:
+    return EM_XCORE;
+  case Triple::xtensa:
+    return EM_XTENSA;
+  }
+}
+
 /// Convert an ELF's e_machine value into an architecture name.
 StringRef ELF::convertEMachineToArchName(uint16_t EMachine) {
   switch (EMachine) {
diff --git a/llvm/lib/InterfaceStub/IFSHandler.cpp b/llvm/lib/InterfaceStub/IFSHandler.cpp
index e80a59a572d88..39dc46601b2a0 100644
--- a/llvm/lib/InterfaceStub/IFSHandler.cpp
+++ b/llvm/lib/InterfaceStub/IFSHandler.cpp
@@ -304,19 +304,10 @@ Error ifs::validateIFSTarget(IFSStub &Stub, bool ParseTriple) {
 IFSTarget ifs::parseTriple(StringRef TripleStr) {
   Triple IFSTriple(TripleStr);
   IFSTarget RetTarget;
-  // TODO: Implement a Triple Arch enum to e_machine map.
-  switch (IFSTriple.getArch()) {
-  case Triple::ArchType::aarch64:
-    RetTarget.Arch = (IFSArch)ELF::EM_AARCH64;
-    break;
-  case Triple::ArchType::x86_64:
-    RetTarget.Arch = (IFSArch)ELF::EM_X86_64;
-    break;
-  case Triple::ArchType::riscv64:
-    RetTarget.Arch = (IFSArch)ELF::EM_RISCV;
-    break;
-  default:
-    RetTarget.Arch = (IFSArch)ELF::EM_NONE;
+  IFSArch TripleArch =
+      ELF::convertTripleArchTypeToEMachine(IFSTriple.getArch());
+  if (TripleArch != ELF::EM_NONE) {
+    RetTarget.Arch = TripleArch;
   }
   RetTarget.Endianness = IFSTriple.isLittleEndian() ? IFSEndiannessType::Little
                                                     : IFSEndiannessType::Big;
diff --git a/llvm/test/tools/llvm-ifs/write-stub.test b/llvm/test/tools/llvm-ifs/write-stub.test
index 44c194b77abc7..64459f4ef4fb6 100644
--- a/llvm/test/tools/llvm-ifs/write-stub.test
+++ b/llvm/test/tools/llvm-ifs/write-stub.test
@@ -1,19 +1,19 @@
 ## Test writing stub elf with minimal sections.
 
 # RUN: llvm-ifs --output-elf=%t.elf32l --arch=x86_64 --bitwidth=32 --endianness=little %s
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32l | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="LittleEndian (0x1)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32l | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="LittleEndian (0x1)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0 -DMACHINE="EM_X86_64 (0x3E)"
 
 # RUN: llvm-ifs --output-elf=%t.elf32b --arch=x86_64 --bitwidth=32 --endianness=big %s
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32b | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="BigEndian (0x2)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32b | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="BigEndian (0x2)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0 -DMACHINE="EM_X86_64 (0x3E)"
 
 # RUN: llvm-ifs --output-elf=%t.elf64l --arch=x86_64 --bitwidth=64 --endianness=little %s
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"
 
 # RUN: llvm-ifs --output-elf=%t.elf64l --target=x86_64-linux-gnu %s
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"
 
 # RUN: llvm-ifs --output-elf=%t.elf64b --arch=x86_64 --bitwidth=64 --endianness=big %s
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64b | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="BigEndian (0x2)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64b | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="BigEndian (0x2)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"
 
 # RUN: not llvm-ifs --output-elf=%t --arch=x86_64 --bitwidth=64 --endianness=big --target=x86_64-linux-gnu %s 2>&1 | FileCheck %s --check-prefix=TRIPLEERR
 
@@ -33,13 +33,16 @@
 
 # RUN: llvm-ifs --output-ifs=%t.ifs --output-elf=%t.elf --target=x86_64-linux-gnu %s
 # RUN: llvm-ifs --output-elf=%t.elf2 --target=x86_64-linux-gnu %t.ifs
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
-# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf2 | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf2 | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"
 
 # RUN: llvm-ifs --output-elf=- --target=riscv64-linux-gnu %s | llvm-readelf -h - | FileCheck %s --check-prefix=MACHINE
 
 # RUN: not llvm-ifs --output-elf=- --arch=riscv64 --endianness=little --bitwidth=64 %s 2>&1 | FileCheck %s -DMSG=riscv64 --check-prefix=ARCHERR
 
+# RUN: llvm-ifs --output-elf=%t.armel --target=armv7-linux-gnueabihf %s
+# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.armel | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="LittleEndian (0x1)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0 -DMACHINE="EM_ARM (0x28)"
+
 --- !ifs-v1
 IfsVersion: 3.0
 NeededLibs:
@@ -61,7 +64,7 @@ Symbols:
 # CHECK-NEXT:     Unused: (00 00 00 00 00 00 00)
 # CHECK-NEXT:   }
 # CHECK-NEXT:   Type: SharedObject (0x3)
-# CHECK-NEXT:   Machine: EM_X86_64 (0x3E)
+# CHECK-NEXT:   Machine: [[MACHINE]]
 # CHECK-NEXT:   Version: 1
 # CHECK-NEXT:   Entry: 0x0
 # CHECK:        Flags [ (0x0)
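
One detail of the IFSHandler.cpp hunk is easy to miss: the old switch always assigned `RetTarget.Arch`, storing `EM_NONE` for unrecognized architectures, while the new code assigns it only when the lookup yields something other than `EM_NONE`. The following is a minimal standalone sketch of that pattern, not the LLVM code itself. The names `Arch`, `Target`, `toEMachine`, and `parse` are hypothetical stand-ins, and modelling the field as `std::optional` is an assumption made for illustration. The numeric constants are the standard ELF `e_machine` values.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for Triple::ArchType and IFSTarget.
enum class Arch { Unknown, X86_64, Arm, Riscv64 };

// Standard ELF e_machine values.
constexpr uint16_t EM_NONE = 0, EM_ARM = 40, EM_X86_64 = 62, EM_RISCV = 243;

// Map an architecture to e_machine; anything unlisted becomes EM_NONE.
uint16_t toEMachine(Arch A) {
  switch (A) {
  case Arch::X86_64:
    return EM_X86_64;
  case Arch::Arm:
    return EM_ARM;
  case Arch::Riscv64:
    return EM_RISCV;
  default:
    return EM_NONE;
  }
}

struct Target {
  std::optional<uint16_t> Machine; // assumed optional for this sketch
};

// Mirrors the shape of the new parseTriple logic: leave the field unset
// rather than storing EM_NONE when no mapping is known.
Target parse(Arch A) {
  Target T;
  uint16_t M = toEMachine(A);
  if (M != EM_NONE)
    T.Machine = M;
  return T;
}
```

With this shape, a later validation step can tell "no architecture given" apart from "architecture given as EM_NONE", which the old always-assign code could not.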

@frobtech frobtech requested a review from petrhosek February 24, 2025 20:28
switch (ArchType) {
case Triple::UnknownArch:
default:
return EM_NONE;
Reviewer (Member):

Can we leave default out? I think that we want a covered switch?

frobtech (Contributor, Author):

I don't know that every ArchType has an e_machine mapping. If they do, I don't know what they all are.
I covered all the ones where I could figure out the mapping quickly myself. If you know more mappings then we can add them here now. But handling most rather than very few is strictly an improvement over the status quo, so I would like not to have it held up because most is not all.
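
The trade-off discussed in this exchange can be sketched in a few lines. This is an illustrative standalone example, not code from the PR. The enum `Arch` and both functions are hypothetical, and `Wasm32` is included only as an example of an enumerator that is assumed here to have no `e_machine` value. With a `default:` label the compiler stays silent when a new enumerator is added. With a covered switch, compilers that implement `-Wswitch` (GCC and Clang do) report any enumerator that is not listed.

```cpp
#include <cstdint>

// Hypothetical miniature of Triple::ArchType.
enum class Arch { Unknown, X86_64, Riscv64, Wasm32 };

constexpr uint16_t EM_NONE = 0, EM_X86_64 = 62, EM_RISCV = 243;

// Style used in the PR: `default` absorbs every unlisted enumerator, so
// adding a new Arch value compiles silently and maps to EM_NONE.
uint16_t withDefault(Arch A) {
  switch (A) {
  case Arch::X86_64:
    return EM_X86_64;
  case Arch::Riscv64:
    return EM_RISCV;
  default:
    return EM_NONE;
  }
}

// Covered style suggested in review: every enumerator is listed and there
// is no `default`, so -Wswitch flags a newly added, unhandled enumerator.
uint16_t covered(Arch A) {
  switch (A) {
  case Arch::X86_64:
    return EM_X86_64;
  case Arch::Riscv64:
    return EM_RISCV;
  case Arch::Unknown:
  case Arch::Wasm32: // assumed to have no ELF e_machine in this sketch
    return EM_NONE;
  }
  return EM_NONE; // only reached for out-of-range enum values
}
```

The covered form needs an explicit decision for every enumerator, which is the cost the author points to: it only works once someone knows, for each architecture, whether an `e_machine` value exists.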

@@ -2018,6 +2019,9 @@ uint16_t convertArchNameToEMachine(StringRef Arch);
/// Convert an ELF's e_machine value into an architecture name.
StringRef convertEMachineToArchName(uint16_t EMachine);

// Convert a triple's architecture to ELF's e_machine value.
uint16_t convertTripleArchTypeToEMachine(Triple::ArchType ArchType);
Reviewer (Member):

I wonder if there is a good place to share this - feels like something we would need in the MC layer for the object file emission.

frobtech (Contributor, Author):

There is ELFObjectFile::getArch() going in the other direction, with pretty minimal coverage.
AFAICT the actual emission is set by each MCELFObjectTargetWriter subclass's constructor.
I don't really see any straightforward way to share a single source of truth there.
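
One reason a single shared table is awkward is that the mapping in this PR is many-to-one, so the reverse direction needs extra inputs. The sketch below is a hypothetical standalone illustration, not the logic of `ELFObjectFile::getArch()`. The enum and both functions are invented for the example, and using an endianness flag as the disambiguator is an assumption. The constants are the standard ELF `e_machine` values.

```cpp
#include <cstdint>

// Hypothetical miniature of Triple::ArchType.
enum class Arch { Unknown, arm, armeb, thumb, aarch64, aarch64_be };

constexpr uint16_t EM_NONE = 0, EM_ARM = 40, EM_AARCH64 = 183;

// Forward direction, as in the PR: several architectures share a value.
uint16_t toEMachine(Arch A) {
  switch (A) {
  case Arch::arm:
  case Arch::armeb:
  case Arch::thumb:
    return EM_ARM;
  case Arch::aarch64:
  case Arch::aarch64_be:
    return EM_AARCH64;
  default:
    return EM_NONE;
  }
}

// Reverse direction: e_machine alone is not enough, so this sketch takes
// the byte order as a second input. Even then `thumb` is not recoverable.
Arch fromEMachine(uint16_t Machine, bool LittleEndian) {
  switch (Machine) {
  case EM_ARM:
    return LittleEndian ? Arch::arm : Arch::armeb;
  case EM_AARCH64:
    return LittleEndian ? Arch::aarch64 : Arch::aarch64_be;
  default:
    return Arch::Unknown;
  }
}
```

A round trip through both functions maps `thumb` to `arm`, which shows that the two directions are not inverses of each other and would not fall out of one table without extra columns.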


@zeroomega zeroomega left a comment

LGTM

@frobtech frobtech merged commit 25f4f0a into llvm:main Mar 26, 2025
11 checks passed
@frobtech frobtech deleted the p/llvm-ifs-arch branch March 26, 2025 20:21