-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JITLink][AArch32] Add dynamic lookup for relocation fixup infos #71649
[JITLink][AArch32] Add dynamic lookup for relocation fixup infos #71649
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a first function version. It's missing some polishing, comments, etc. and I think it will still evolve a bit, but it would be nice to hear some early feedback. Thanks!
@@ -326,26 +359,16 @@ Expected<int64_t> readAddendData(LinkGraph &G, Block &B, const Edge &E) { | |||
Expected<int64_t> readAddendArm(LinkGraph &G, Block &B, const Edge &E) { | |||
ArmRelocation R(B.getContent().data() + E.getOffset()); | |||
Edge::Kind Kind = E.getKind(); | |||
if (!checkOpcode(R, Kind)) | |||
return makeUnexpectedOpcodeError(G, R, Kind); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will get the dynamic opcode check here and get rid of the "templated" checks below. I think it significantly reduces noise and allows for much better grouping of case labels. What do you think?
assert(getFixupInfoTableArm().at(Kind - FirstArmRelocation) != nullptr && | ||
"Call populateFixupInfos() before dynamic access"); | ||
return getFixupInfoTableArm()[Kind - FirstArmRelocation]->checkOpcode(R.Wd); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The opcode checks are on the hot-path of relocation processing. Doing them dynamically involves a virtual function call. This will add a little overhead. I haven't benchmarked it yet, but my gut feeling is: measurable but acceptable (the amount of code going through this backend won't be gigabytes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick check running all llvm-jitlink
invocations from existing LIT tests and measuring CPU time suggests a 0.67% slowdown. If this is getting too expensive, I think we can still make the opcode checks debug-only.
#!/bin/bash
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_arm_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs external_func=0x76bbe880 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_thumb_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_arm_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs external_func=0x76bbe880 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_thumb_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_arm_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs external_func=0x76bbe880 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_thumb_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs target=0x76bbe88f test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_data_reloc.s.tmp_armv7.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs target=0x76bbe88f test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_data_reloc.s.tmp_thumbv7.o
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the co-maintainer of this backend I think you're in the best position to decide on the performance trade-offs.
I'm going to be pushing for more speed in JITLink in general, especially in the backends I'm responsible for (JITLink is being used in increasingly performance-sensitive areas), but this change would only affect aarch32 so there's no problem there from my perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using explicit function points instead of virtual functions (as you proposed below) gave a significant speedup and the slowdown turned into noise.
populateFixupInfos<FirstArmRelocation, LastArmRelocation>(getFixupInfoTableArm()); | ||
populateFixupInfos<FirstThumbRelocation, LastThumbRelocation>(getFixupInfoTableThumb()); | ||
}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a precondition for using dynamic opcode checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest patch uses ManagedStatic
to avoid the explicit init call.
961aeaa
to
5f919ac
Compare
339f27a
to
c7ef2b3
Compare
Local branch amd-gfx 72cb88f Merged main:d896b1f5a614 into amd-gfx:741a0a2ce1bd Remote branch main b86420c [JITLink][AArch32] Add dynamic lookup for relocation fixup infos (llvm#71649)
…classes (NFC) This fixes various buildbots reporting subobject-linkage warnings after #71649: <class> has a base <name> whose type uses the anonymous namespace
Specifying relocation fixup constants with name and type facilitates readability and compile-time checks. The
FixupInfo<EdgeKind>
facade organizes the information into entries per relocation type and provides uniform access across Arm and Thumb relocations. Since it uses template specializations, it doesn't limit potential entries. We cannot access the entries dynamically though, becauseEdgeKind
must be given as a compile-time constant. It's a drawback that this patch aims to compensate for.