Make ISA definitions compile-time defined. (#315)
Changed the ISA code structure to use template classes for instruction information. This information is now defined at compile time instead of dynamically at runtime. With this change, all information about an instruction (opcode parts, fields, bit ranges, etc.) is initialized using template parameters.

## Issues Solved

This almost fully solves #306 by using template parameters to define instruction set information at compile time instead of at runtime. The only remaining task is to group instruction sets at compile time instead of dynamically. As of this change, instruction sets are created at runtime in the `enableInstructions()` function, depending on which extensions are enabled. To solve this, each ISA extension could have a struct that uses template parameters to define the extension's instructions at compile time. These structs could then be combined at runtime depending on which extensions are enabled. I have not tested this yet, but I think it would work and would be an improvement over defining the enabled instructions in `enableInstructions()`.
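The per-extension struct idea could look roughly like the sketch below. This is untested against the Ripes code base; `Extension`, `ExtI`, `ExtM`, `appendIfEnabled()`, `enabledMnemonics()`, and the placeholder instruction types are hypothetical names used only for illustration, and C++17 is assumed.

```c++
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical instruction types; each exposes its mnemonic at compile time.
struct Add { static constexpr std::string_view name = "add"; };
struct Sub { static constexpr std::string_view name = "sub"; };
struct Mul { static constexpr std::string_view name = "mul"; };

// Hypothetical extension descriptor: the instruction list is a template
// parameter pack, so the contents of each extension are fixed at compile time.
template <typename... Instrs>
struct Extension {
  static constexpr std::size_t size = sizeof...(Instrs);
  static constexpr std::array<std::string_view, size> names = {Instrs::name...};
};

using ExtI = Extension<Add, Sub>; // stand-in for a base instruction set
using ExtM = Extension<Mul>;      // stand-in for an optional extension

// Runtime step: append an extension's instructions only if it is enabled.
template <typename Ext>
void appendIfEnabled(bool enabled, std::vector<std::string_view> &out) {
  if (enabled)
    out.insert(out.end(), Ext::names.begin(), Ext::names.end());
}

// Combines the compile-time extension structs based on a runtime flag.
inline std::vector<std::string_view> enabledMnemonics(bool hasM) {
  std::vector<std::string_view> out;
  appendIfEnabled<ExtI>(true, out); // the base set is always present
  assert(!out.empty());
  appendIfEnabled<ExtM>(hasM, out);
  return out;
}
```

Only the choice of which extensions to combine stays at runtime; what each extension contains is known to the compiler.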

This almost solves #303. ISA-specific information such as instruction encodings and register information has been moved to the ISA library. The only ISA-specific code that remains outside it is the RV32 and RV64 assembler classes and the RISC-V relocation functions. The assembler classes are mostly boilerplate code that enables certain instructions based on which extensions are enabled. These assemblers should be replaced by a single assembler class that takes an ISA ID as input. The extensions should be detected (and instructions enabled) only in the ISA library. The relocation functions are short and should be moved to the ISA library as well.

Similarly, this almost solves #307. The main problem here is decoupling the ISA/assembler libraries from VSRTL and Qt. Only a few functions are coupled to VSRTL, but there are many dependencies on Qt classes. These problems need to be solved before the ISA/assembler libraries can be extracted.

#311 has been solved. Register widths do not need to be scattered across many template parameters, and removing them cleans up the code base without affecting performance. This change moves register widths into two locations: a function called `bits()` in all `ISAInfo` subclasses (of which there is one per ISA), and the `N` template parameter of the `BitRange` class. Although `N` keeps the register width as a template parameter, it is far less scattered throughout the code base because each ISA defines its own `BitRange` alias that fixes `N` to the ISA's register width. For example, in the RISC-V C extension file:

```c++
template <unsigned start, unsigned stop>
using BitRange = Ripes::BitRange<start, stop, 16>;
```
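To show how such an alias keeps `N` out of every use site, here is a minimal sketch. `Sketch::BitRange` is a simplified stand-in and not the actual `Ripes::BitRange`; its members (`width`, `mask`, `decode()`) and the `RVC::Funct3` alias are assumptions made for illustration.

```c++
#include <cassert>
#include <cstdint>

namespace Sketch {
// Hypothetical stand-in for Ripes::BitRange: bits [start, stop] of an N-bit
// word. The real class may differ in both members and semantics.
template <unsigned start, unsigned stop, unsigned N = 32>
struct BitRange {
  static_assert(N <= 32, "this sketch only handles words up to 32 bits");
  static_assert(start <= stop, "start must not exceed stop");
  static_assert(stop < N, "range must fit inside the N-bit word");

  static constexpr unsigned width = stop - start + 1;
  static constexpr uint64_t mask = ((uint64_t{1} << width) - 1) << start;

  // Extracts the field value from an encoded word.
  static constexpr uint64_t decode(uint64_t word) {
    return (word & mask) >> start;
  }
};
} // namespace Sketch

namespace RVC {
// Per-ISA alias: N is fixed once here, so use sites only name start and stop.
template <unsigned start, unsigned stop>
using BitRange = Sketch::BitRange<start, stop, 16>;

using Funct3 = BitRange<13, 15>; // hypothetical field occupying bits 13..15
} // namespace RVC

static_assert(RVC::Funct3::width == 3);
static_assert(RVC::Funct3::decode(0xE000) == 7);
```

A range such as `RVC::BitRange<13, 16>` would fail the `stop < N` check at compile time, which is one benefit of keeping `N` in the type.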

## Future Changes

There are also a few other improvements that can be made after this change.

One is in the `RegInfoBase` class that (as of this change) defines information about all of the registers for an ISA. Although it didn't make it in time for this commit, I've been working on changing this class so that it defines a single register file of an ISA. There will then be a `RegInfoSet` class that holds all of the currently enabled register files for an ISA. This will allow indexing into registers from either a specific register file or all enabled register files as a whole.

Another improvement is adding more compile-time instruction verifications. Once instructions are combined into a struct at compile time, as described above, the compiler can verify that there are no duplicate instructions or instruction mnemonics. Other similar verifications could be added to further ensure that ISA implementations are correct.
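A duplicate-mnemonic check of this kind could be sketched as follows. This is an illustration rather than code from this commit; `hasDuplicates()`, `uniqueMnemonics()`, `InstrSet`, and the placeholder instruction types are hypothetical names, and C++17 is assumed.

```c++
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical instruction types carrying a compile-time mnemonic.
struct Add { static constexpr std::string_view name = "add"; };
struct Sub { static constexpr std::string_view name = "sub"; };

// Quadratic scan; instruction sets are small and this runs in the compiler.
template <std::size_t N>
constexpr bool hasDuplicates(const std::array<std::string_view, N> &names) {
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = i + 1; j < N; ++j)
      if (names[i] == names[j])
        return true;
  return false;
}

// True if every instruction type in the pack has a distinct mnemonic.
template <typename... Instrs>
constexpr bool uniqueMnemonics() {
  constexpr std::array<std::string_view, sizeof...(Instrs)> names = {
      Instrs::name...};
  return !hasDuplicates(names);
}

// Hypothetical instruction set: instantiating it with a repeated mnemonic
// is a compile error.
template <typename... Instrs>
struct InstrSet {
  static_assert(uniqueMnemonics<Instrs...>(), "duplicate instruction mnemonic");
  static constexpr std::size_t size = sizeof...(Instrs);
};

static_assert(InstrSet<Add, Sub>::size == 2); // compiles
static_assert(!uniqueMnemonics<Add, Add>());  // InstrSet<Add, Add> would not
```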

Finally, I think that the implementation of pseudo-instructions could be improved by defining the instructions that they expand into at compile time. Currently, the `PseudoInstruction` method `expander()` determines at runtime what the pseudo-instruction expands to. Although I have not tested it yet, I think that this could be improved by replacing the `expander()` function with a struct containing the expanded instructions (defined at compile time).
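As a rough, untested illustration of that idea, a pseudo-instruction could name its expansion as a type. `PseudoExpansion`, `Expands`, and the placeholder instruction types below are hypothetical; operands are not modeled, and C++17 is assumed.

```c++
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical real-instruction types carrying a compile-time mnemonic.
struct Addi { static constexpr std::string_view name = "addi"; };
struct Auipc { static constexpr std::string_view name = "auipc"; };
struct Jalr { static constexpr std::string_view name = "jalr"; };

// Hypothetical replacement for a runtime expander(): the sequence of
// instructions a pseudo-instruction expands into, fixed at compile time.
template <typename... Expansion>
struct PseudoExpansion {
  static constexpr std::size_t count = sizeof...(Expansion);
  static constexpr std::array<std::string_view, count> mnemonics = {
      Expansion::name...};
};

// In RISC-V, `nop` is encoded as a single addi and `call` expands to
// auipc followed by jalr. Only the instruction sequence is captured here.
struct Nop {
  static constexpr std::string_view name = "nop";
  using Expands = PseudoExpansion<Addi>;
};
struct Call {
  static constexpr std::string_view name = "call";
  using Expands = PseudoExpansion<Auipc, Jalr>;
};

static_assert(Nop::Expands::count == 1);
static_assert(Call::Expands::count == 2);
static_assert(Call::Expands::mnemonics[0] == "auipc");
```

Because the expansion is a type, it could also feed the compile-time checks mentioned above, for example verifying that every expanded instruction belongs to an enabled instruction set.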
raccog committed Nov 9, 2023
1 parent 83b1d6f commit 2a042c9
Showing 73 changed files with 3,854 additions and 3,149 deletions.
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -107,6 +107,7 @@ add_subdirectory(external)

# Fix the name of the ripes library.
set(RIPES_LIB ripes_lib)
set(ISA_LIB isa_lib)
add_subdirectory(src)

option(RIPES_BUILD_TESTS "Build Ripes tests" OFF)
6 changes: 5 additions & 1 deletion src/CMakeLists.txt
@@ -10,7 +10,7 @@ endfunction()
# will be linked to the newly defined library.
function(create_ripes_lib NAME)
cmake_parse_arguments(OPTIONS
"LINK_TO_RIPES_LIB;FIXED_NAME;EXCLUDE_SRC_INC" # options
"LINK_TO_RIPES_LIB;LINK_ISA_LIB;FIXED_NAME;EXCLUDE_SRC_INC" # options
"" # 1-valued keywords
"" # multi-valued keywords
${ARGN})
@@ -40,6 +40,10 @@ function(create_ripes_lib NAME)
if(OPTIONS_LINK_TO_RIPES_LIB)
target_link_libraries(${RIPES_LIB} PUBLIC ${LIB_NAME})
endif()

if(OPTIONS_LINK_ISA_LIB)
target_link_libraries(${LIB_NAME} PUBLIC ${ISA_LIB})
endif()
endfunction()

# Error flags on everything but MSVC
2 changes: 1 addition & 1 deletion src/assembler/CMakeLists.txt
@@ -1 +1 @@
create_ripes_lib(assembler LINK_TO_RIPES_LIB)
create_ripes_lib(assembler LINK_TO_RIPES_LIB LINK_ISA_LIB)
107 changes: 42 additions & 65 deletions src/assembler/assembler.h
@@ -2,12 +2,11 @@

#include <QRegularExpression>

#include "instruction.h"
#include "isa/isainfo.h"
#include "assembler_defines.h"
#include "isa/instruction.h"
#include "isa/isa_defines.h"
#include "isa/pseudoinstruction.h"
#include "matcher.h"
#include "parserutilities.h"
#include "pseudoinstruction.h"
#include "relocation.h"
#include "ripes_types.h"

#include <cstdint>
@@ -17,6 +16,7 @@

#include "STLExtras.h"
#include "assemblerbase.h"
#include "isa/isainfo.h"

namespace Ripes {
namespace Assembler {
@@ -56,36 +56,15 @@ namespace Assembler {
continue; \
}

// Macro for defining type aliases for register/instruction width specific types
// of the assembler types
#define AssemblerTypes(_Reg_T) \
using _InstrVec = InstrVec<_Reg_T>; \
using _InstrMap = InstrMap<_Reg_T>; \
using _PseudoInstrVec = PseudoInstrVec<_Reg_T>; \
using _PseudoInstrMap = PseudoInstrMap<_Reg_T>; \
using _Instruction = Instruction<_Reg_T>; \
using _PseudoInstruction = PseudoInstruction<_Reg_T>; \
using _Opcode = Opcode<_Reg_T>; \
using _Imm = Imm<_Reg_T>; \
using _Reg = Reg<_Reg_T>; \
using _Matcher = Matcher<_Reg_T>; \
using _FieldLinkRequest = FieldLinkRequest<_Reg_T>; \
using _RelocationsVec = RelocationsVec<_Reg_T>; \
using _RelocationsMap = RelocationsMap<_Reg_T>; \
using _AssembleRes = AssembleRes<_Reg_T>; \
using _InstrRes = InstrRes<_Reg_T>;

/**
* Reg_T: type equal in size to the register width of the target
* Instr_T: type equal in size to the instruction width of the target
*/
template <typename Reg_T>
class Assembler : public AssemblerBase {
static_assert(std::numeric_limits<Reg_T>::is_integer,
"Register type must be integer");

public:
AssemblerTypes(Reg_T);
explicit Assembler(const ISAInfoBase *isa) : m_isa(isa) {}

AssembleResult
@@ -164,9 +143,9 @@ class Assembler : public AssemblerBase {
}

// Got match, disassemble
auto instruction = std::get<const _Instruction *>(match);
auto instruction = std::get<const InstructionBase *>(match);
auto tokensVar = instruction->disassemble(word, baseAddress, symbols);
if (auto *error = std::get_if<Error>(&match)) {
if (auto *error = std::get_if<Error>(&tokensVar)) {
// Error during disassembling
opres.repr = "Invalid instruction";
opres.err = *error;
@@ -185,7 +164,7 @@ class Assembler : public AssemblerBase {
return opres;
}

const _Matcher &getMatcher() { return *m_matcher; }
const Matcher &getMatcher() { return *m_matcher; }

std::set<QString> getOpcodes() const override {
std::set<QString> opcodes;
@@ -198,7 +177,7 @@ class Assembler : public AssemblerBase {
return opcodes;
}

void setRelocations(const _RelocationsVec &relocations) {
void setRelocations(const RelocationsVec &relocations) {
if (m_relocations.size() != 0) {
throw std::runtime_error("Directives already set");
}
@@ -220,11 +199,12 @@ class Assembler : public AssemblerBase {
LinkRequest(const Location &location) : Location(location) {}
Reg_T
offset; // Offset of instruction in segment which needs link resolution
Section section; // Section which instruction was emitted in
Section section; // Section which instruction was emitted in
unsigned instrAlignment; // Alignment of instruction in bytes

// Reference to the immediate field which resolves the symbol and the
// requested symbol
_FieldLinkRequest fieldRequest;
FieldLinkRequest fieldRequest;
};

Reg_T linkReqAddress(const LinkRequest &req) const {
@@ -403,7 +383,7 @@ class Assembler : public AssemblerBase {
addr_offset = currentSection->data.size();
if (!wasDirective) {
/// Maintain a pointer to the instruction that was assembled.
std::shared_ptr<_Instruction> assembledWith;
std::shared_ptr<InstructionBase> assembledWith;
runOperation(machineCode, assembleInstruction, line, assembledWith);
assert(assembledWith && "Expected the assembler instruction to be set");
program.sourceMapping[addr_offset].insert(line.sourceLine());
@@ -413,6 +393,7 @@ class Assembler : public AssemblerBase {
req.offset = addr_offset;
req.fieldRequest = machineCode.linksWithSymbol;
req.section = m_currentSection;
req.instrAlignment = m_isa->instrByteAlignment();
needsLinkage.push_back(req);
}

@@ -490,25 +471,20 @@ class Assembler : public AssemblerBase {

// Decode instruction at link-request position
assert(static_cast<unsigned>(section.size()) >=
(linkRequest.offset + 4) &&
(linkRequest.offset + linkRequest.instrAlignment) &&
"Error: position of link request is not within program");
Instr_T instr =
*reinterpret_cast<Instr_T *>(section.data() + linkRequest.offset);

// Re-apply immediate resolution using the value acquired from the symbol
// map
if (auto *immField =
dynamic_cast<const _Imm *>(linkRequest.fieldRequest.field)) {
if (auto res = immField->applySymbolResolution(
linkRequest, symbolValue, instr, linkReqAddress(linkRequest));
res.isError()) {
errors.push_back(res.error());
continue;
}
} else {
assert(
false &&
"Something other than an immediate field has requested linkage?");
assert(linkRequest.fieldRequest.resolveSymbol &&
"Something other than an immediate field has requested linkage?");
if (auto res = linkRequest.fieldRequest.resolveSymbol(
linkRequest, symbolValue, instr, linkReqAddress(linkRequest));
res.isError()) {
errors.push_back(res.error());
continue;
}

// Finally, overwrite the instruction in the section
@@ -535,10 +511,11 @@ class Assembler : public AssemblerBase {
if (auto *error = std::get_if<Error>(&res)) {
Q_UNUSED(error);
if (m_instructionMap.count(opcode) != 0) {
// If this pseudo-instruction aliases with an instruction but threw an
// error (could arise if ie. arguments provided were intended for the
// normal instruction and not the pseudoinstruction), then return as if
// not a pseudo-instruction, falling to normal instruction handling
// If this pseudo-instruction aliases with an instruction but
// threw an error (could arise if ie. arguments provided were
// intended for the normal instruction and not the
// pseudoinstruction), then return as if not a pseudo-instruction,
// falling to normal instruction handling
return Error(line, "");
}
}
@@ -548,9 +525,9 @@ class Assembler : public AssemblerBase {
return {res};
}

virtual _AssembleRes
virtual AssembleRes
assembleInstruction(const TokenizedSrcLine &line,
std::shared_ptr<_Instruction> &assembledWith) const {
std::shared_ptr<InstructionBase> &assembledWith) const {
if (line.tokens.empty()) {
return {
Error(line, "Empty source lines should be impossible at this point")};
@@ -564,7 +541,7 @@ class Assembler : public AssemblerBase {
return assembledWith->assemble(line);
}

void setPseudoInstructions(const _PseudoInstrVec &pseudoInstructions) {
void setPseudoInstructions(const PseudoInstrVec &pseudoInstructions) {
if (m_pseudoInstructions.size() != 0) {
throw std::runtime_error("Pseudoinstructions already set");
}
@@ -612,18 +589,18 @@ class Assembler : public AssemblerBase {
return {remainingTokens};
}

void initialize(const _InstrVec &instructions,
const _PseudoInstrVec &pseudoinstructions,
void initialize(const InstrVec &instructions,
const PseudoInstrVec &pseudoinstructions,
const DirectiveVec &directives,
const _RelocationsVec &relocations) {
const RelocationsVec &relocations) {
setInstructions(instructions);
setPseudoInstructions(pseudoinstructions);
setDirectives(directives);
setRelocations(relocations);
m_matcher = std::make_unique<_Matcher>(m_instructions);
m_matcher = std::make_unique<Matcher>(m_instructions);
}

void setInstructions(const _InstrVec &instructions) {
void setInstructions(const InstrVec &instructions) {
if (m_instructions.size() != 0) {
throw std::runtime_error("Instructions already set");
}
@@ -643,25 +620,25 @@ class Assembler : public AssemblerBase {
* @brief m_instructions is the set of instructions which can be matched from
* an instruction string as well as be disassembled from a program.
*/
_InstrVec m_instructions;
_InstrMap m_instructionMap;
InstrVec m_instructions;
InstrMap m_instructionMap;

/**
* @brief m_pseudoInstructions is the set of instructions which can be matched
* from an instruction string but cannot be disassembled from a program.
* Typically, pseudoinstructions will expand to one or more non-pseudo
* instructions.
*/
_PseudoInstrVec m_pseudoInstructions;
_PseudoInstrMap m_pseudoInstructionMap;
PseudoInstrVec m_pseudoInstructions;
PseudoInstrMap m_pseudoInstructionMap;

/**
* @brief m_relocations is the set of supported assembler relocation hints
*/
_RelocationsVec m_relocations;
_RelocationsMap m_relocationsMap;
RelocationsVec m_relocations;
RelocationsMap m_relocationsMap;

std::unique_ptr<_Matcher> m_matcher;
std::unique_ptr<Matcher> m_matcher;

const ISAInfoBase *m_isa;
};
31 changes: 1 addition & 30 deletions src/assembler/assembler_defines.h
@@ -8,41 +8,12 @@
#include <set>
#include <variant>

#include "assemblererror.h"
#include "isa/isa_defines.h"
#include "program.h"

namespace Ripes {
namespace Assembler {

/// Type for instruction data. Should encompass all possible instruction widths.
using Instr_T = uint64_t;

class Token : public QString {
public:
Token(const QString &t) : QString(t) {}
Token(const QString &t, const QString &relocation)
: QString(t), m_relocation(relocation) {}
Token() : QString() {}
void setRelocation(const QString &relocation) { m_relocation = relocation; }
bool hasRelocation() const { return !m_relocation.isEmpty(); }
const QString &relocation() const { return m_relocation; }

private:
QString m_relocation;
};
using LineTokens = QVector<Token>;
using LineTokensVec = std::vector<LineTokens>;
using Symbols = std::set<Symbol>;
using DirectiveLinePair = std::pair<QString, LineTokens>;

struct TokenizedSrcLine : public Location {
explicit TokenizedSrcLine(unsigned sourceLine) : Location(sourceLine) {}
Symbols symbols;
LineTokens tokens;
QString directive;
AInt programAddress = -1;
};

using SymbolLinePair = std::pair<Symbols, LineTokens>;
using SourceProgram = std::vector<TokenizedSrcLine>;
using NoPassResult = std::monostate;
2 changes: 1 addition & 1 deletion src/assembler/assemblerbase.h
@@ -7,7 +7,7 @@
#include "assembler_defines.h"
#include "directive.h"
#include "expreval.h"
#include "symbolmap.h"
#include "isa/symbolmap.h"

namespace Ripes {
namespace Assembler {