[BOLT] Add minimal RISC-V 64-bit support
Just enough features are implemented to process a simple "hello world"
executable and produce something that still runs (including libc calls).
This was mainly a matter of implementing support for various
relocations. Currently, the following are handled:

- R_RISCV_JAL
- R_RISCV_CALL
- R_RISCV_CALL_PLT
- R_RISCV_BRANCH
- R_RISCV_RVC_BRANCH
- R_RISCV_RVC_JUMP
- R_RISCV_GOT_HI20
- R_RISCV_PCREL_HI20
- R_RISCV_PCREL_LO12_I
- R_RISCV_RELAX
- R_RISCV_NONE

Executables linked with linker relaxation will probably fail to be
processed. BOLT relocates .text to a high address while leaving .plt at
its original (low) address. As a result, the offset of a PC-relative PLT
call that was relaxed to a JAL no longer fits in the JAL's J-immediate.
This will be addressed in a later patch.
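
The range limit behind this failure can be sketched as follows. This is
an illustration only, not code from the patch: the helper names are
hypothetical, and the only assumption is the standard RISC-V JAL
encoding (a 21-bit signed offset that is a multiple of 2).

```cpp
#include <cassert>
#include <cstdint>

// JAL encodes a 21-bit signed, 2-byte-aligned offset, so a target is
// reachable only if the offset lies in [-2^20, 2^20 - 2].
constexpr bool fitsInJImmediate(int64_t Offset) {
  return Offset >= -(int64_t{1} << 20) && Offset < (int64_t{1} << 20) &&
         (Offset & 1) == 0;
}

// Hypothetical helper: PC-relative offset from a call site to its target.
constexpr int64_t pcRelOffset(uint64_t CallSite, uint64_t Target) {
  return static_cast<int64_t>(Target - CallSite);
}

// Hypothetical check combining the two: can a JAL at CallSite reach Target?
inline bool jalCanReach(uint64_t CallSite, uint64_t Target) {
  assert((CallSite & 1) == 0 && "RISC-V instructions are 2-byte aligned");
  return fitsInJImmediate(pcRelOffset(CallSite, Target));
}
```

With made-up addresses, a call site at 0x10400 can reach a PLT entry at
0x10200, but the same call moved to 0x800000 cannot, which is the
situation described above.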

Changes to the BOLT core are relatively minor. Two things were tricky to
implement and needed slightly larger changes. I'll explain those below.

The R_RISCV_CALL(_PLT) relocation is put on the first instruction of an
AUIPC/JALR pair; the second does not get any relocation (unlike other
PCREL pairs). This causes issues with the combination of the way BOLT
processes binaries and the way the RISC-V MC-layer handles relocations:
- BOLT reassembles instructions one by one and since the JALR doesn't
  have a relocation, it simply gets copied without modification;
- Even though the MC-layer handles R_RISCV_CALL properly (adjusts both
  the AUIPC and the JALR), it assumes the immediates of both
  instructions are 0 (to be able to or-in a new value). This will most
  likely not be the case for the JALR that got copied over.

To handle this difficulty without resorting to RISC-V-specific hacks in
the BOLT core, a new binary pass was added that searches for
AUIPC/JALR pairs and zeroes out the immediate of the JALR.
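
Why a stale JALR immediate breaks the fixup can be sketched at the level
of raw 32-bit encodings. This is an illustration under stated
assumptions, not the pass itself (which operates on MCInst objects): the
names below are hypothetical, and only the standard U-type and I-type
immediate layouts are assumed.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t AuipcImmMask = 0xFFFFF000u; // U-immediate: bits 31:12
constexpr uint32_t JalrImmMask = 0xFFF00000u;  // I-immediate: bits 31:20

// Clear the JALR immediate so a new value can be or-ed in later.
constexpr uint32_t zeroJalrImm(uint32_t Jalr) { return Jalr & ~JalrImmMask; }

// Hypothetical container for the two instruction words of a call.
struct CallPair {
  uint32_t Auipc;
  uint32_t Jalr;
};

// Or a 32-bit PC-relative offset into the pair. Like the fixup described
// above, this is only correct if both immediates are zero beforehand.
inline CallPair orInCallOffset(CallPair P, int32_t Offset) {
  assert((P.Auipc & AuipcImmMask) == 0 && (P.Jalr & JalrImmMask) == 0);
  uint32_t U = static_cast<uint32_t>(Offset);
  // Adding 0x800 compensates for the sign extension of the low 12 bits.
  uint32_t Hi20 = (U + 0x800u) & AuipcImmMask;
  uint32_t Lo12 = U & 0xFFFu;
  return {P.Auipc | Hi20, P.Jalr | (Lo12 << 20)};
}
```

For example, a copied `jalr ra, 0x123(ra)` (0x123080E7) must first be
reduced to 0x000080E7 by zeroJalrImm; or-ing a new low part into the
unmodified word would mix old and new immediate bits.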

A second difficulty was supporting ABS symbols. As far as I can tell,
ABS symbols were not handled at all, causing __global_pointer$ to break.
RewriteInstance::analyzeRelocation was updated to handle these
generically.

Tests are provided for all supported relocations. Note that in order to
test the correct handling of PLT entries, an ELF file produced by GCC
had to be used. While I tried to strip the YAML representation, it's
still quite large. Any suggestions on how to improve this would be
appreciated.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D145687
mtvec committed Jun 16, 2023
1 parent f9f92f1 commit f873029
Showing 26 changed files with 1,740 additions and 11 deletions.
2 changes: 1 addition & 1 deletion bolt/CMakeLists.txt
@@ -9,7 +9,7 @@ list(INSERT CMAKE_MODULE_PATH 0 "${BOLT_SOURCE_DIR}/cmake/modules")

# Determine default set of targets to build -- the intersection of
# those BOLT supports and those LLVM is targeting.
-set(BOLT_TARGETS_TO_BUILD_all "AArch64;X86")
+set(BOLT_TARGETS_TO_BUILD_all "AArch64;X86;RISCV")
set(BOLT_TARGETS_TO_BUILD_default)
foreach (tgt ${BOLT_TARGETS_TO_BUILD_all})
if (tgt IN_LIST LLVM_TARGETS_TO_BUILD)
4 changes: 4 additions & 0 deletions bolt/CODE_OWNERS.TXT
@@ -20,3 +20,7 @@ D: DWARF support
N: Vladislav Khmelevsky
E: och95@yandex.ru
D: AArch64 backend

N: Job Noorman
E: jnoorman@igalia.com
D: RISC-V backend
2 changes: 2 additions & 0 deletions bolt/include/bolt/Core/BinaryContext.h
@@ -722,6 +722,8 @@ class BinaryContext {
TheTriple->getArch() == llvm::Triple::x86_64;
}

bool isRISCV() const { return TheTriple->getArch() == llvm::Triple::riscv64; }

// AArch64-specific functions to check if symbol is used to delimit
// code/data in .text. Code is marked by $x, data by $d.
MarkerSymType getMarkerType(const SymbolRef &Symbol) const;
10 changes: 10 additions & 0 deletions bolt/include/bolt/Core/MCPlusBuilder.h
@@ -633,6 +633,12 @@ class MCPlusBuilder {
return false;
}

/// Returns true if First/Second is a AUIPC/JALR call pair.
virtual bool isRISCVCall(const MCInst &First, const MCInst &Second) const {
llvm_unreachable("not implemented");
return false;
}

/// If non-zero, this is used to fill the executable space with instructions
/// that will trap. Defaults to 0.
virtual unsigned getTrapFillValue() const { return 0; }
@@ -2032,6 +2038,10 @@ MCPlusBuilder *createAArch64MCPlusBuilder(const MCInstrAnalysis *,
const MCInstrInfo *,
const MCRegisterInfo *);

MCPlusBuilder *createRISCVMCPlusBuilder(const MCInstrAnalysis *,
const MCInstrInfo *,
const MCRegisterInfo *);

} // namespace bolt
} // namespace llvm

38 changes: 38 additions & 0 deletions bolt/include/bolt/Passes/FixRISCVCallsPass.h
@@ -0,0 +1,38 @@
//===- bolt/Passes/FixRISCVCallsPass.h --------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file declares the FixRISCVCallsPass class, which sets the JALR immediate
// to 0 for AUIPC/JALR pairs with a R_RISCV_CALL(_PLT) relocation. This is
// necessary since MC expects it to be zero in order to or-in fixups.
//===----------------------------------------------------------------------===//

#ifndef BOLT_PASSES_FIXRISCVCALLSPASS_H
#define BOLT_PASSES_FIXRISCVCALLSPASS_H

#include "bolt/Passes/BinaryPasses.h"

namespace llvm {
namespace bolt {

class FixRISCVCallsPass : public BinaryFunctionPass {
void runOnFunction(BinaryFunction &Function);

public:
explicit FixRISCVCallsPass(const cl::opt<bool> &PrintPass)
: BinaryFunctionPass(PrintPass) {}

const char *getName() const override { return "fix-riscv-calls"; }

/// Pass entry point
void runOnFunctions(BinaryContext &BC) override;
};

} // namespace bolt
} // namespace llvm

#endif
9 changes: 9 additions & 0 deletions bolt/include/bolt/Rewrite/RewriteInstance.h
@@ -279,6 +279,9 @@ class RewriteInstance {
/// is the expected .plt \p Section entry function size.
void disassemblePLTSectionX86(BinarySection &Section, uint64_t EntrySize);

/// Disassemble riscv-specific .plt \p Section auxiliary function
void disassemblePLTSectionRISCV(BinarySection &Section);

/// ELF-specific part. TODO: refactor into new class.
#define ELF_FUNCTION(TYPE, FUNC) \
template <typename ELFT> TYPE FUNC(object::ELFObjectFile<ELFT> *Obj); \
@@ -536,6 +539,9 @@ class RewriteInstance {
const PLTSectionInfo AArch64_PLTSections[3] = {
{".plt"}, {".iplt"}, {nullptr}};

/// RISCV PLT sections.
const PLTSectionInfo RISCV_PLTSections[3] = {{".plt"}, {nullptr}};

/// Return PLT information for a section with \p SectionName or nullptr
/// if the section is not PLT.
const PLTSectionInfo *getPLTSectionInfo(StringRef SectionName) {
@@ -549,6 +555,9 @@ class RewriteInstance {
case Triple::aarch64:
PLTSI = AArch64_PLTSections;
break;
case Triple::riscv64:
PLTSI = RISCV_PLTSections;
break;
}
for (; PLTSI && PLTSI->Name; ++PLTSI)
if (SectionName == PLTSI->Name)
5 changes: 5 additions & 0 deletions bolt/lib/Core/BinaryContext.cpp
@@ -128,6 +128,11 @@ BinaryContext::createBinaryContext(const ObjectFile *File, bool IsPIC,
ArchName = "aarch64";
FeaturesStr = "+all";
break;
case llvm::Triple::riscv64:
ArchName = "riscv64";
// RV64GC
FeaturesStr = "+m,+a,+f,+d,+zicsr,+zifencei,+c";
break;
default:
return createStringError(std::errc::not_supported,
"BOLT-ERROR: Unrecognized machine in ELF file");
4 changes: 2 additions & 2 deletions bolt/lib/Core/BinaryFunction.cpp
@@ -1323,7 +1323,7 @@ bool BinaryFunction::disassemble() {
if (BC.isAArch64())
handleAArch64IndirectCall(Instruction, Offset);
}
-} else if (BC.isAArch64()) {
+} else if (BC.isAArch64() || BC.isRISCV()) {
// Check if there's a relocation associated with this instruction.
bool UsedReloc = false;
for (auto Itr = Relocations.lower_bound(Offset),
@@ -1343,7 +1343,7 @@ bool BinaryFunction::disassemble() {
UsedReloc = true;
}

-if (MIB->hasPCRelOperand(Instruction) && !UsedReloc)
+if (!BC.isRISCV() && MIB->hasPCRelOperand(Instruction) && !UsedReloc)
handlePCRelOperand(Instruction, AbsoluteInstrAddr, Size);
}
