[JITLink][AArch32] Add TableGen Backend for Instr Encodings #76996

eymay · 2024-01-04T20:12:49Z

This TableGen backend uses the Target info in lib/Target/ARM to
produce the instruction encodings needed by the JITLink AArch32
backend. Currently the opcode, opcode mask, register mask and
immediate mask are generated. They replace the mov-related
instruction information in aarch32.h.
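
As a rough illustration of the four generated values, here is a minimal sketch of what one entry could look like for the Arm MOVW instruction. The `InstrInfo` struct, its field names and the helper functions are hypothetical and not taken from the PR; only the opcode `0x03000000` appears in the diff, and the mask values are assumptions based on the A2 encoding layout (cond 0011 0000 imm4 Rd imm12).

```cpp
#include <cstdint>

// Hypothetical shape of one generated entry; the fields mirror the four
// values listed in the description, not the PR's actual output format.
struct InstrInfo {
  uint32_t Opcode;     // fixed bits that identify the instruction
  uint32_t OpcodeMask; // bits that take part in the opcode comparison
  uint32_t RegMask;    // bits holding the destination register
  uint32_t ImmMask;    // bits holding the immediate
};

// Assumed values for Arm MOVW (A2): cond 0011 0000 imm4 Rd imm12.
constexpr InstrInfo ArmMovw = {0x03000000, 0x0ff00000, 0x0000f000, 0x000f0fff};

// True if the 32-bit word R is an instance of the instruction described by I.
constexpr bool matchesOpcode(uint32_t R, const InstrInfo &I) {
  return (R & I.OpcodeMask) == I.Opcode;
}

// Reassemble the 16-bit immediate from the split imm4:imm12 fields.
constexpr uint16_t decodeImmMovw(uint32_t R) {
  return static_cast<uint16_t>(((R >> 4) & 0xf000) | (R & 0x0fff));
}

// movw r1, #0x1234 with condition AL.
constexpr uint32_t Word = 0xe3011234;
static_assert(matchesOpcode(Word, ArmMovw), "opcode bits should match");
static_assert((Word & ArmMovw.RegMask) >> 12 == 1, "Rd should be r1");
static_assert(decodeImmMovw(Word) == 0x1234, "immediate should round-trip");
```

The three masks are disjoint, so a fixup can check the opcode, keep the register bits and rewrite only the immediate bits.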

github-actions · 2024-01-04T20:15:48Z

✅ With the latest revision this PR passed the C/C++ code formatter.

eymay · 2024-01-04T20:24:36Z

The build-system integration is not very clean in its current state. I would appreciate any comments on including the generated file in the header file itself. As a temporary solution, the generated file is checked in at llvm/unittests/ExecutionEngine/JITLink/JITLinkAArch32.inc.

weliveindetail

Hello everyone, before we start with detailed reviews, it might be worth mentioning that this is more of a design sketch and not ready to land yet. @eymay Can we mark it as Draft please?

I really like the idea of pulling all kinds of data from TableGen! And it's good to see that it can work already for opcodes, masks, register- and immediate bits. Thanks for working on this @eymay! As explained inline, I think the generated structures must still be extensible and one question is how much noise a macro-based approach would add.

Another question is how much linker-specific information can be distilled from TableGen. For example, can TableGen tell us the position of the BLX or H bits in an ARM BL instruction? And otherwise, would it be interesting to add this data to the .td files? @smithp35 What do you think?

llvm/lib/ExecutionEngine/JITLink/aarch32.cpp

llvm/unittests/ExecutionEngine/JITLink/JITLinkAArch32.inc

llvm/include/llvm/ExecutionEngine/JITLink/aarch32.h

eymay · 2024-01-06T18:38:38Z

> Hello everyone, before we start with detailed reviews, it might be worth mentioning that this is more of a design sketch and not ready to land yet. @eymay Can we mark it as Draft please?

Thanks for the note! Yes it is a proof of concept at the moment.

eymay · 2024-01-06T19:10:45Z

llvm/include/llvm/ExecutionEngine/JITLink/aarch32.h

-  static constexpr uint32_t Opcode = 0x03000000;
-};
+template <>
+struct FixupInfo<Arm_MovwAbsNC> : public FixupInfoArmMov<Arm_MovwAbsNC> {};

 template <> struct FixupInfo<Thumb_Jump24> : public FixupInfoThumb {


The Thumb movs are not modified yet, so that we can first discuss the extraction method further. The Arm fixup changes provide a baseline of what is currently possible.

weliveindetail

Wow, that's pretty neat. Thanks for the update!

I realize that we pull in a lot more instructions (~3.7K) than we actually need (<50). We only include the header in 6 compile units, but still they will be copied into each of them. Is there any way to filter out unrelated instructions from the .inc file? For example, all arithmetic instructions?

weliveindetail · 2024-01-08T12:00:17Z

llvm/include/llvm/ExecutionEngine/JITLink/aarch32.h

+/// FixupInfo checks for Arm edge kinds work on 32-bit words
+template <EdgeKind_aarch32 Kind> struct FixupInfoArm : public FixupInfoArmBase {
+  static constexpr uint32_t Opcode =
+      InstrTable[getInstrFromJITLinkEdgeKind(Kind)].Opcode;


Can we turn these into something like getTableGenInfoArm(Kind).Opcode? Then getInstrFromJITLinkEdgeKind() could become an implementation detail and for Thumb we can also convert to HalfWords in there.
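
A minimal sketch of how that suggestion might look, assuming a tiny stand-in table. The enums, the `InstrInfo` struct and the table contents below are hypothetical; only the names `getTableGenInfoArm`, `getInstrFromJITLinkEdgeKind`, `InstrTable` and `Arm_MovwAbsNC` come from this thread.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the generated data. The opcodes follow the Arm
// MOVW/MOVT encodings; everything else is made up for illustration.
enum InstrKind { MOVi16, MOVTi16, NumInstrs };
struct InstrInfo {
  uint32_t Opcode;
  uint32_t OpcodeMask;
};
constexpr InstrInfo InstrTable[NumInstrs] = {
    {0x03000000, 0x0ff00000}, // MOVi16
    {0x03400000, 0x0ff00000}, // MOVTi16
};

// Simplified edge-kind enum with just the two kinds used here.
enum EdgeKind_aarch32 { Arm_MovwAbsNC, Arm_MovtAbs };

namespace detail {
// Implementation detail: maps an edge kind to its table index.
constexpr InstrKind getInstrFromJITLinkEdgeKind(EdgeKind_aarch32 Kind) {
  return Kind == Arm_MovwAbsNC ? MOVi16 : MOVTi16;
}
} // namespace detail

// The only accessor the FixupInfo classes would need to call.
constexpr InstrInfo getTableGenInfoArm(EdgeKind_aarch32 Kind) {
  return InstrTable[detail::getInstrFromJITLinkEdgeKind(Kind)];
}

template <EdgeKind_aarch32 Kind> struct FixupInfoArmMov {
  static constexpr uint32_t Opcode = getTableGenInfoArm(Kind).Opcode;
  static constexpr uint32_t OpcodeMask = getTableGenInfoArm(Kind).OpcodeMask;
};

static_assert(FixupInfoArmMov<Arm_MovwAbsNC>::Opcode == 0x03000000, "MOVW");
static_assert(FixupInfoArmMov<Arm_MovtAbs>::Opcode == 0x03400000, "MOVT");
```

With this shape, a Thumb counterpart could do the conversion to HalfWords inside its own accessor, so the FixupInfo classes would never touch the raw table.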

weliveindetail · 2024-01-08T12:30:53Z

llvm/include/llvm/ExecutionEngine/JITLink/aarch32.h

@@ -212,19 +272,18 @@ template <> struct FixupInfo<Arm_Call> : public FixupInfoArmBranch {
  static constexpr uint32_t BitBlx = 0x10000000;


Looks like we could infer this as:

BitBlx = InstrTable[BLXi].Opcode & ~InstrTable[BL].Opcode = 0xfa000000 & ~0xeb000000 = 0x10000000

Opcode and OpcodeMask don't match exactly what we have. They seem to include the condition mask and H bit. Is there a way to get this info from TableGen? And if so, can we integrate it without adding another InstrInfo entry for each line in the generated .inc?

Otherwise, we will have to do some adjustments in the implementation. In general, I'd be happy to follow whatever TableGen can give us.
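
The inference above can be checked at compile time. This is only a sketch: the two constants stand in for the generated table entries, and the BL value `0xeb000000` (BL with condition AL) is an assumption derived from the complement quoted in the comment, not something read from the generated file.

```cpp
#include <cstdint>

// Assumed opcode values standing in for InstrTable[BLXi] and InstrTable[BL].
constexpr uint32_t OpcodeBLXi = 0xfa000000; // BLX (immediate), H bit clear
constexpr uint32_t OpcodeBL = 0xeb000000;   // BL with cond = AL (assumption)

// Bits that are set in BLX but clear in BL.
constexpr uint32_t BitBlx = OpcodeBLXi & ~OpcodeBL;
static_assert(BitBlx == 0x10000000, "matches the hand-written constant");

// The result depends on the condition bits baked into the BL entry: the
// top nibble of the opcode is the AL condition, not part of the opcode.
constexpr uint32_t CondField = 0xf0000000;
static_assert((OpcodeBL & CondField) == 0xe0000000, "BL entry carries cond = AL");
```

The inference only works because the BL entry carries the AL condition; with a different condition in the table the difference would no longer isolate a single bit.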

eymay · 2024-01-26

Here is my attempt at integrating the TableGen info for branch instructions with our bit-level info. I realized that the more information we need beyond what TableGen provides, the harder it gets to inspect and make sense of the resulting encodings. I plan to revert this branch-specific TableGen integration, since it adds more complexity than simply using the previous values.

Including the generated file in the header requires the TableGen backend to be independent of that header; otherwise a cyclic dependency occurs. Macro-based generation is preferred, and the header file manipulates the macros through constexpr lookups.
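
A minimal sketch of how macro-based generation can stay independent of the header, assuming an X-macro layout. The macro names, the `InstrInfo` struct and the two entries are hypothetical; in a real build the list would live in the generated .inc file and be pulled in with `#include`.

```cpp
#include <cstdint>

// Stand-in for the generated .inc file: plain macro invocations with no
// dependency on any JITLink type.
#define HYPOTHETICAL_INSTR_LIST(X)                                             \
  X(MOVi16, 0x03000000, 0x0ff00000)                                            \
  X(MOVTi16, 0x03400000, 0x0ff00000)

// The header owns the types and decides how each entry is expanded.
struct InstrInfo {
  uint32_t Opcode;
  uint32_t OpcodeMask;
};

// First expansion: one enumerator per generated instruction.
#define MAKE_ENUM(Name, Opcode, Mask) Name,
enum InstrKind { HYPOTHETICAL_INSTR_LIST(MAKE_ENUM) NumInstrs };
#undef MAKE_ENUM

// Second expansion: one constexpr table entry per generated instruction.
#define MAKE_ENTRY(Name, Opcode, Mask) {Opcode, Mask},
constexpr InstrInfo InstrTable[] = {HYPOTHETICAL_INSTR_LIST(MAKE_ENTRY)};
#undef MAKE_ENTRY

static_assert(sizeof(InstrTable) / sizeof(InstrTable[0]) == NumInstrs,
              "one table entry per generated instruction");
static_assert(InstrTable[MOVTi16].Opcode == 0x03400000, "constexpr lookup");
```

Because the list itself names no types, the backend does not need to see the header, and the header remains free to extend the generated structures.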

eymay · 2024-01-26T18:36:20Z

> Wow, that's pretty neat. Thanks for the update!
>
> I realize that we pull in a lot more instructions (~3.7K) than we actually need (<50). We only include the header in 6 compile units, but still they will be copied into each of them. Is there any way to filter out unrelated instructions from the .inc file? For example, all arithmetic instructions?

Absolutely! We now emit only Move and Branch instructions, around 20 instructions.

weliveindetail · 2024-02-02T10:22:04Z

We discussed this in our last call. The approach is interesting and it would be great to integrate TableGen, but we are not sure about the details yet. In particular, it seems useful to have the specific byte-sequences at hand and easily discoverable. The TableGen approach will hide them from the code. For the moment we leave this here as a proposal for further consideration. Some situations to consider:

- In tests we write assembly to produce the expected opcodes. TableGen names are similar to assembly instructions but not always obvious, so we still need to compare instruction bytes.
- We often work with disassembly when debugging and compare instruction bytes.
- Instruction bytes are very relevant for reviews as well.
- We must be able to match byte sequences for a range of instructions, for example B/B.W/BL/BLX (we'd need to combine TableGen results with bitwise operations).
- Encode/decode functions are hand-written (and we probably want to keep that).
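
To make the point about matching a range of instructions concrete, here is one way per-instruction results could be combined with bitwise operations. It is a sketch only: the `Encoding` struct and the helpers are hypothetical, and the Thumb BL/BLX values are illustrative assumptions (first halfword in the upper 16 bits), not output of the TableGen backend.

```cpp
#include <cstdint>

struct Encoding {
  uint32_t Opcode;
  uint32_t Mask;
};

// Bits that are fixed in both encodings but differ between them.
constexpr uint32_t distinguishingBits(Encoding A, Encoding B) {
  return A.Mask & B.Mask & (A.Opcode ^ B.Opcode);
}

// Keep only the bits that both encodings fix to the same value.
constexpr Encoding commonPattern(Encoding A, Encoding B) {
  return Encoding{A.Opcode & A.Mask & B.Mask & ~(A.Opcode ^ B.Opcode),
                  A.Mask & B.Mask & ~(A.Opcode ^ B.Opcode)};
}

constexpr bool matches(uint32_t Instr, Encoding E) {
  return (Instr & E.Mask) == E.Opcode;
}

// Illustrative Thumb encodings (assumed values).
constexpr Encoding ThumbBL = {0xf000d000, 0xf800d000};
constexpr Encoding ThumbBLX = {0xf000c000, 0xf800d000};

// One pattern that accepts both BL and BLX, plus the bit that tells them apart.
constexpr Encoding ThumbCall = commonPattern(ThumbBL, ThumbBLX);
static_assert(ThumbCall.Opcode == 0xf000c000, "shared opcode bits");
static_assert(ThumbCall.Mask == 0xf800c000, "shared mask bits");
static_assert(distinguishingBits(ThumbBL, ThumbBLX) == 0x00001000, "BLX bit");
```

The derived constants are exactly the kind of byte patterns that would no longer be visible in the source, which is the discoverability concern raised above.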

eymay requested a review from weliveindetail January 4, 2024 20:15

eymay requested review from weirdsmiley and smithp35 January 4, 2024 20:19

tschuett requested a review from MaskRay January 4, 2024 22:09

weliveindetail reviewed Jan 5, 2024

eymay marked this pull request as draft January 5, 2024 13:20

eymay force-pushed the tablegen_jitlink branch from 1269f9b to b5ee968 Compare January 6, 2024 19:07

eymay commented Jan 6, 2024

weliveindetail reviewed Jan 8, 2024

eymay added 5 commits January 26, 2024 11:48

Reformat and indent by 2

63e28a2

Include the generated file in header

7e63dd2

Generating an includable file in header requires the TableGen backend to be independent of the headers, if not a cyclic dependency occurs. A macro based generation is prioritised and its manipulation is done in the header file by constexpr lookups.

Remove unused header

f6b812d

Reduce generated instructions by selecting mov and branch only

ce1db9d

eymay force-pushed the tablegen_jitlink branch from a219c96 to ce1db9d Compare January 26, 2024 18:12

Demo of branch instructions being integrated

b4c3546

eymay added 4 commits January 26, 2024 21:37

clang format

3b762f1

Revert branch instruction demo

ae2bafd

TableGen Thumb HalfWords and generate Thumb Movs

05577fa

clang-format

33af899
