[JITLink][AArch32] Add TableGen Backend for Instr Encodings #76996
✅ With the latest revision this PR passed the C/C++ code formatter.
The build-system integration is not very clean in its current state. I would appreciate any comments on including the generated file in the header file itself. As a temporary solution, you can find the generated file at
Hello everyone, before we start with detailed reviews, it might be worth mentioning that this is more of a design sketch and not ready to land yet. @eymay Can we mark it as Draft please?
I really like the idea of pulling all kinds of data from TableGen! And it's good to see that it can work already for opcodes, masks, register- and immediate bits. Thanks for working on this @eymay! As explained inline, I think the generated structures must still be extensible and one question is how much noise a macro-based approach would add.
Another question is how much linker-specific information can be distilled from TableGen. For example, can TableGen tell us the position of the `BLX` or `H` bits in an ARM `BL` instruction? And otherwise, would it be interesting to add this data to the `.td` files? @smithp35 What do you think?
Thanks for the note! Yes, it is a proof of concept at the moment.
Force-pushed from 1269f9b to b5ee968.
  static constexpr uint32_t Opcode = 0x03000000;
};
template <>
struct FixupInfo<Arm_MovwAbsNC> : public FixupInfoArmMov<Arm_MovwAbsNC> {};

template <> struct FixupInfo<Thumb_Jump24> : public FixupInfoThumb {
The Thumb `mov` fixups are not modified yet, since we can still discuss the extraction method further. The Arm fixup changes provide a baseline of what is currently possible.
Wow, that's pretty neat. Thanks for the update!
I realize that we pull in a lot more instructions (~3.7K) than we actually need (<50). We only include the header in 6 compile units, but the full table will still be copied into each of them. Is there any way to filter unrelated instructions out of the `.inc` file? For example, all arithmetic instructions?
/// FixupInfo checks for Arm edge kinds work on 32-bit words
template <EdgeKind_aarch32 Kind> struct FixupInfoArm : public FixupInfoArmBase {
  static constexpr uint32_t Opcode =
      InstrTable[getInstrFromJITLinkEdgeKind(Kind)].Opcode;
Can we turn these into something like `getTableGenInfoArm(Kind).Opcode`? Then `getInstrFromJITLinkEdgeKind()` could become an implementation detail, and for Thumb we can also convert to `HalfWords` in there.
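For illustration, the suggested accessor could be sketched roughly as follows. This is a minimal sketch and not the PR's code: the `InstrInfo` layout, the `InstrId` enum, the `MOVi16`/`MOVTi16` entry names, the `Arm_MovtAbs` edge kind, and the table values are assumptions chosen to mirror the `mov` fixups; only the names `InstrTable`, `getInstrFromJITLinkEdgeKind`, `getTableGenInfoArm`, and `FixupInfoArm` come from the discussion above.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the edge kinds and the generated instruction ids.
enum EdgeKind_aarch32 { Arm_MovwAbsNC, Arm_MovtAbs };
enum InstrId { MOVi16, MOVTi16 };

// Assumed shape of one generated table row.
struct InstrInfo {
  uint32_t Opcode;
  uint32_t OpcodeMask;
  uint32_t ImmMask;
  uint32_t RegMask;
};

// Illustrative values only; in the PR these would come from the .inc file.
constexpr InstrInfo InstrTable[] = {
    {0x03000000, 0x0ff00000, 0x000f0fff, 0x0000f000}, // MOVi16 (movw)
    {0x03400000, 0x0ff00000, 0x000f0fff, 0x0000f000}, // MOVTi16 (movt)
};

// Implementation detail: maps an edge kind to its table index.
constexpr InstrId getInstrFromJITLinkEdgeKind(EdgeKind_aarch32 Kind) {
  return Kind == Arm_MovtAbs ? MOVTi16 : MOVi16;
}

// Single entry point for the fixup structs; a Thumb variant could convert
// the 32-bit values to half-words in an analogous helper.
constexpr const InstrInfo &getTableGenInfoArm(EdgeKind_aarch32 Kind) {
  return InstrTable[getInstrFromJITLinkEdgeKind(Kind)];
}

template <EdgeKind_aarch32 Kind> struct FixupInfoArm {
  static constexpr uint32_t Opcode = getTableGenInfoArm(Kind).Opcode;
};

static_assert(FixupInfoArm<Arm_MovwAbsNC>::Opcode == 0x03000000, "");
```

With a wrapper like this, the fixup structs would only refer to `getTableGenInfoArm`, so the table layout and the kind-to-instruction mapping could change without touching them.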
@@ -212,19 +272,18 @@ template <> struct FixupInfo<Arm_Call> : public FixupInfoArmBranch {
  static constexpr uint32_t BitBlx = 0x10000000;
Looks like we could infer this as:
BitBLX = InstrTable[BLXi] & ~InstrTable[BL] == 0xfa000000 & 0x14000000 == 0x10000000
Opcode and OpcodeMask don't match exactly what we have. They seem to include the condition mask and H bit. Is there a way to get this info from TableGen? And if so, can we integrate it without adding another `InstrInfo` entry for each line in the generated `.inc`?
Otherwise, we will have to do some adjustments in the implementation. In general, I'd be happy to follow whatever TableGen can give us.
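As a worked check of the inference above, the arithmetic can be written out with compile-time assertions. This is a sketch under assumptions: the `BL` opcode value `0xeb000000` (that is, `BL` with the AL condition in the top nibble) is inferred from the `0x14000000` complement quoted above and is not taken from the generated table; the constant and function names are hypothetical.

```cpp
#include <cstdint>

// Assumed table values: BLX (immediate) and BL with the AL condition.
constexpr uint32_t OpcodeBLXi = 0xfa000000;
constexpr uint32_t OpcodeBL = 0xeb000000;

// Only the top byte carries opcode, condition and H bits here;
// the low 24 bits hold the branch immediate and are masked out.
constexpr uint32_t TopByteMask = 0xff000000;

// Bits that are set in BLX but clear in BL.
constexpr uint32_t inferBitBlx(uint32_t Blx, uint32_t Bl) {
  return Blx & ~Bl & TopByteMask;
}

static_assert((~OpcodeBL & TopByteMask) == 0x14000000, "");
static_assert(inferBitBlx(OpcodeBLXi, OpcodeBL) == 0x10000000, "");
```

Under these assumed values the result matches the hand-written `BitBlx` constant. With a different condition nibble in the `BL` entry the result would differ, which is one reason the condition mask question matters.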
Here is my attempt at integrating the TableGen info for branch instructions with our bit-level info. I realized that the more our information diverges from what TableGen provides, the harder it gets to inspect and make sense of the resulting encodings. I plan to revert this branch-specific TableGen integration, since it adds more complexity than simply using the previous values.
This TableGen backend uses the Target info in lib/Target/ARM to produce the instruction encodings needed by the JITLink AArch32 backend. Currently the opcode, opcode mask, register mask and immediate mask are generated. These were used to replace the `mov`-related instruction information in aarch32.h.
Generating a file that can be included in the header requires the TableGen backend to be independent of the headers; otherwise a cyclic dependency occurs. Macro-based generation is therefore preferred, and the generated macros are processed in the header file through constexpr lookups.
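To make the macro-based idea concrete, here is a minimal sketch of how a header could consume a generated list without the generated file depending on any header types. The list macro stands in for the contents of the `.inc` file; its name, the per-entry macro names, the entry names, and the values are assumptions for illustration and not the PR's actual output.

```cpp
#include <cstdint>

// Stand-in for the generated .inc file. It only invokes a caller-supplied
// macro, so it needs no types from the header that includes it.
#define HYPOTHETICAL_INSTR_LIST(INSTR)                                         \
  INSTR(MOVi16, 0x03000000, 0x0ff00000)                                        \
  INSTR(MOVTi16, 0x03400000, 0x0ff00000)

// First expansion: one enumerator per generated entry.
#define EMIT_ENUM(Name, Opcode, Mask) Name,
enum InstrId { HYPOTHETICAL_INSTR_LIST(EMIT_ENUM) NumInstrs };
#undef EMIT_ENUM

struct InstrInfo {
  uint32_t Opcode;
  uint32_t OpcodeMask;
};

// Second expansion: one table row per generated entry, in the same order.
#define EMIT_ROW(Name, Opcode, Mask) {Opcode, Mask},
constexpr InstrInfo InstrTable[] = {HYPOTHETICAL_INSTR_LIST(EMIT_ROW)};
#undef EMIT_ROW

static_assert(InstrTable[MOVi16].Opcode == 0x03000000, "");
```

Because the header decides what each entry expands to, the same generated list can feed both the enum and the lookup table, and the constexpr lookups stay entirely on the header side.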
Force-pushed from a219c96 to ce1db9d.
Absolutely! We now emit only Move and Branch instructions, around 20 in total.
We discussed this in our last call. The approach is interesting and it would be great to integrate TableGen, but we are not sure about the details yet. In particular, it seems useful to have the specific byte-sequences at hand and easily discoverable. The TableGen approach will hide them from the code. For the moment we leave this here as a proposal for further consideration. Some situations to consider: