Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JITLink][AArch32] Add dynamic lookup for relocation fixup infos #71649

Merged

Conversation

weliveindetail
Copy link
Contributor

Specifying relocation fixup constants with name and type facilitates readability and compile-time checks. The FixupInfo<EdgeKind> facade organizes the information into entries per relocation type and provides uniform access across Arm and Thumb relocations. Since it uses template specializations, it doesn't limit potential entries. We cannot access the entries dynamically though, because EdgeKind must be given as a compile-time constant. It's a drawback that this patch aims to compensate for.

Copy link

github-actions bot commented Nov 8, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor Author

@weliveindetail weliveindetail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a first function version. It's missing some polishing, comments, etc. and I think it will still evolve a bit, but it would be nice to hear some early feedback. Thanks!

@@ -326,26 +359,16 @@ Expected<int64_t> readAddendData(LinkGraph &G, Block &B, const Edge &E) {
Expected<int64_t> readAddendArm(LinkGraph &G, Block &B, const Edge &E) {
ArmRelocation R(B.getContent().data() + E.getOffset());
Edge::Kind Kind = E.getKind();
if (!checkOpcode(R, Kind))
return makeUnexpectedOpcodeError(G, R, Kind);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will get the dynamic opcode check here and get rid of the "templated" checks below. I think it significantly reduces noise and allows for much better grouping of case labels. What do you think?

assert(getFixupInfoTableArm().at(Kind - FirstArmRelocation) != nullptr &&
"Call populateFixupInfos() before dynamic access");
return getFixupInfoTableArm()[Kind - FirstArmRelocation]->checkOpcode(R.Wd);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The opcode checks are on the hot-path of relocation processing. Doing them dynamically involves a virtual function call. This will add a little overhead. I haven't benchmarked it yet, but my gut feeling is: measurable but acceptable (the amount of code going through this backend won't be gigabytes).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick check running all llvm-jitlink invocations from existing LIT tests and measuring CPU time suggests a 0.67% slowdown. If this is getting too expensive, I think we can still make the opcode checks debug-only.

#!/bin/bash
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_arm_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs external_func=0x76bbe880 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_thumb_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_arm_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs external_func=0x76bbe880 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_thumb_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_arm_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs external_func=0x76bbe880 test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_thumb_reloc.s.tmp.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs target=0x76bbe88f test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_data_reloc.s.tmp_armv7.o
bin/llvm-jitlink -noexec -slab-address 0x76ff0000 -slab-allocate 10Kb -slab-page-size 4096 -abs target=0x76bbe88f test/ExecutionEngine/JITLink/AArch32/Output/ELF_static_data_reloc.s.tmp_thumbv7.o

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the co-maintainer of this backend I think you're in the best position to decide on the performance trade-offs.

I'm going to be pushing for more speed in JITLink in general, especially in the backends I'm responsible for (JITLink is being used in increasingly performance-sensitive areas), but this change would only affect aarch32 so there's no problem there from my perspective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using explicit function points instead of virtual functions (as you proposed below) gave a significant speedup and the slowdown turned into noise.

populateFixupInfos<FirstArmRelocation, LastArmRelocation>(getFixupInfoTableArm());
populateFixupInfos<FirstThumbRelocation, LastThumbRelocation>(getFixupInfoTableThumb());
});
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a precondition for using dynamic opcode checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest patch uses ManagedStatic to avoid the explicit init call.

@weliveindetail weliveindetail merged commit b86420c into llvm:main Nov 24, 2023
3 checks passed
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Nov 30, 2023
Local branch amd-gfx 72cb88f Merged main:d896b1f5a614 into amd-gfx:741a0a2ce1bd
Remote branch main b86420c [JITLink][AArch32] Add dynamic lookup for relocation fixup infos (llvm#71649)
weliveindetail added a commit that referenced this pull request Dec 12, 2023
…classes (NFC)

This fixes various buildbots reporting subobject-linkage warnings after #71649:
<class> has a base <name> whose type uses the anonymous namespace
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants