Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm-ifs] Handle more e_machine values for --target #128559

Merged
merged 2 commits into from
Mar 26, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions llvm/include/llvm/BinaryFormat/ELF.h
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@
#define LLVM_BINARYFORMAT_ELF_H

#include "llvm/ADT/StringRef.h"
#include "llvm/TargetParser/Triple.h"
#include <cstdint>
#include <cstring>
#include <type_traits>
@@ -2018,6 +2019,9 @@ uint16_t convertArchNameToEMachine(StringRef Arch);
/// Convert an ELF's e_machine value into an architecture name.
StringRef convertEMachineToArchName(uint16_t EMachine);

// Convert a triple's architecture to ELF's e_machine value.
uint16_t convertTripleArchTypeToEMachine(Triple::ArchType ArchType);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there is a good place to share this - feels like something we would need in the MC layer for the object file emission.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is ELFObjectFile::getArch() going in the other direction, with pretty minimal coverage.
AFAICT the actual emission is set by each MCELFObjectTargetWriter subclass's constructor.
I don't really see any straightforward way to share a single source of truth there.


// Convert a lowercase string identifier into an OSABI value.
uint8_t convertNameToOSABI(StringRef Name);

65 changes: 65 additions & 0 deletions llvm/lib/BinaryFormat/ELF.cpp
Original file line number Diff line number Diff line change
@@ -200,6 +200,71 @@ uint16_t ELF::convertArchNameToEMachine(StringRef Arch) {
.Default(EM_NONE);
}

uint16_t ELF::convertTripleArchTypeToEMachine(Triple::ArchType ArchType) {
switch (ArchType) {
case Triple::UnknownArch:
default:
return EM_NONE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave default out? I think that we want a covered switch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that every ArchType has an e_machine mapping. If they do, I don't know what they all are.
I covered all the ones where I could figure out the mapping quickly myself. If you know more mappings then we can add them here now. But handling most rather than very few is strictly an improvement over the status quo, so I would like not to have it held up because most is not all.


case Triple::arm:
case Triple::armeb:
case Triple::thumb:
case Triple::thumbeb:
return EM_ARM;
case Triple::aarch64:
case Triple::aarch64_be:
case Triple::aarch64_32:
return EM_AARCH64;
case Triple::arc:
return EM_ARC;
case Triple::avr:
return EM_AVR;
case Triple::bpfel:
case Triple::bpfeb:
return EM_BPF;
case Triple::csky:
return EM_CSKY;
case Triple::hexagon:
return EM_HEXAGON;
case Triple::loongarch32:
case Triple::loongarch64:
return EM_LOONGARCH;
case Triple::m68k:
return EM_68K;
case Triple::mips:
case Triple::mipsel:
case Triple::mips64:
case Triple::mips64el:
return EM_MIPS;
case Triple::msp430:
return EM_MSP430;
case Triple::ppc:
case Triple::ppcle:
return EM_PPC;
case Triple::ppc64:
case Triple::ppc64le:
return EM_PPC;
case Triple::riscv32:
case Triple::riscv64:
return EM_RISCV;
case Triple::sparc:
case Triple::sparcel:
return EM_SPARC;
case Triple::sparcv9:
return EM_SPARCV9;
case Triple::systemz:
return EM_S390;
case Triple::x86:
return EM_386;
case Triple::x86_64:
return EM_X86_64;
case Triple::xcore:
return EM_XCORE;
case Triple::xtensa:
return EM_XTENSA;
}
}

/// Convert an ELF's e_machine value into an architecture name.
StringRef ELF::convertEMachineToArchName(uint16_t EMachine) {
switch (EMachine) {
17 changes: 4 additions & 13 deletions llvm/lib/InterfaceStub/IFSHandler.cpp
Original file line number Diff line number Diff line change
@@ -304,19 +304,10 @@ Error ifs::validateIFSTarget(IFSStub &Stub, bool ParseTriple) {
IFSTarget ifs::parseTriple(StringRef TripleStr) {
Triple IFSTriple(TripleStr);
IFSTarget RetTarget;
// TODO: Implement a Triple Arch enum to e_machine map.
switch (IFSTriple.getArch()) {
case Triple::ArchType::aarch64:
RetTarget.Arch = (IFSArch)ELF::EM_AARCH64;
break;
case Triple::ArchType::x86_64:
RetTarget.Arch = (IFSArch)ELF::EM_X86_64;
break;
case Triple::ArchType::riscv64:
RetTarget.Arch = (IFSArch)ELF::EM_RISCV;
break;
default:
RetTarget.Arch = (IFSArch)ELF::EM_NONE;
IFSArch TripleArch =
ELF::convertTripleArchTypeToEMachine(IFSTriple.getArch());
if (TripleArch != ELF::EM_NONE) {
RetTarget.Arch = TripleArch;
}
RetTarget.Endianness = IFSTriple.isLittleEndian() ? IFSEndiannessType::Little
: IFSEndiannessType::Big;
19 changes: 11 additions & 8 deletions llvm/test/tools/llvm-ifs/write-stub.test
Original file line number Diff line number Diff line change
@@ -1,19 +1,19 @@
## Test writing stub elf with minimal sections.

# RUN: llvm-ifs --output-elf=%t.elf32l --arch=x86_64 --bitwidth=32 --endianness=little %s
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32l | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="LittleEndian (0x1)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32l | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="LittleEndian (0x1)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0 -DMACHINE="EM_X86_64 (0x3E)"

# RUN: llvm-ifs --output-elf=%t.elf32b --arch=x86_64 --bitwidth=32 --endianness=big %s
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32b | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="BigEndian (0x2)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf32b | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="BigEndian (0x2)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0 -DMACHINE="EM_X86_64 (0x3E)"

# RUN: llvm-ifs --output-elf=%t.elf64l --arch=x86_64 --bitwidth=64 --endianness=little %s
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"

# RUN: llvm-ifs --output-elf=%t.elf64l --target=x86_64-linux-gnu %s
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64l | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"

# RUN: llvm-ifs --output-elf=%t.elf64b --arch=x86_64 --bitwidth=64 --endianness=big %s
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64b | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="BigEndian (0x2)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf64b | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="BigEndian (0x2)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"

# RUN: not llvm-ifs --output-elf=%t --arch=x86_64 --bitwidth=64 --endianness=big --target=x86_64-linux-gnu %s 2>&1 | FileCheck %s --check-prefix=TRIPLEERR

@@ -33,13 +33,16 @@

# RUN: llvm-ifs --output-ifs=%t.ifs --output-elf=%t.elf --target=x86_64-linux-gnu %s
# RUN: llvm-ifs --output-elf=%t.elf2 --target=x86_64-linux-gnu %t.ifs
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf2 | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.elf2 | FileCheck %s -DCLASS="64-bit (0x2)" -DDE="LittleEndian (0x1)" -DHS=64 -DPHES=56 -DSHES=64 -DDYNSYMAL=8 -DDYNSYMES=24 -DDYNAMICAL=8 -DDYNAMICES=16 -DDYNTABZ=000000000 -DMACHINE="EM_X86_64 (0x3E)"

# RUN: llvm-ifs --output-elf=- --target=riscv64-linux-gnu %s | llvm-readelf -h - | FileCheck %s --check-prefix=MACHINE

# RUN: not llvm-ifs --output-elf=- --arch=riscv64 --endianness=little --bitwidth=64 %s 2>&1 | FileCheck %s -DMSG=riscv64 --check-prefix=ARCHERR

# RUN: llvm-ifs --output-elf=%t.armel --target=armv7-linux-gnueabihf %s
# RUN: llvm-readobj -h -S --string-dump .dynstr --string-dump .shstrtab --dyn-symbols --dynamic-table %t.armel | FileCheck %s -DCLASS="32-bit (0x1)" -DDE="LittleEndian (0x1)" -DHS=52 -DPHES=32 -DSHES=40 -DDYNSYMAL=4 -DDYNSYMES=16 -DDYNAMICAL=4 -DDYNAMICES=8 -DDYNTABZ=0 -DMACHINE="EM_ARM (0x28)"

--- !ifs-v1
IfsVersion: 3.0
NeededLibs:
@@ -61,7 +64,7 @@ Symbols:
# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
# CHECK-NEXT: }
# CHECK-NEXT: Type: SharedObject (0x3)
# CHECK-NEXT: Machine: EM_X86_64 (0x3E)
# CHECK-NEXT: Machine: [[MACHINE]]
# CHECK-NEXT: Version: 1
# CHECK-NEXT: Entry: 0x0
# CHECK: Flags [ (0x0)