SPMI: Handle BRANCH26 relocs similar to REL32 relocs #116340
Conversation
BRANCH26 relocs were handled by checking whether a relative offset to the actual target fits in the instruction and, if so, using it. Otherwise we would hardcode a jump to the end of the code.

It turns out that outside macOS we never hit the former case, so we always hardcode a jump to the end of the code section. That makes diffing work outside macOS.

On macOS, it occasionally happens that the target (which comes from the original SPMI collection) is actually within range of the code that was allocated during SPMI replay. When that happens, the base and diff compilers provide two different immediates to the instruction, and diffing fails.

Now, `NearDiffer::compareOffsets` actually has code that tries to compensate for relative offsets like this, by checking whether the absolute offset computed as "address + instrLen + immediate" is the same for the base and diff compilers. However, it turns out that LLVM does not scale the immediate in the b/bl instruction representation, so for that to work the formula would actually need to be something like `instrLen + (immediate << 2)`. We could do that, but it seems better to align this with x64, which sidesteps the trouble entirely by hardcoding the lower bits of the absolute offset as if it were the relative offset.
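To make the scaling point concrete, here is a minimal, hypothetical C++ sketch (not the actual differ code) of how an ARM64 `b`/`bl` target is decoded: the instruction stores a signed 26-bit immediate counted in 4-byte instructions, so the byte displacement is the immediate shifted left by 2, relative to the branch instruction itself.

```cpp
#include <cstdint>

// Hypothetical helper illustrating ARM64 b/bl target computation.
// The low 26 bits of the instruction are a signed offset measured in
// 4-byte instructions, so the byte displacement is (imm26 << 2),
// relative to the address of the branch itself.
uint64_t Arm64BranchTarget(uint64_t branchAddress, uint32_t instruction)
{
    int64_t imm26 = (int64_t)(instruction & 0x03FFFFFF);
    if (imm26 & 0x02000000)   // sign-extend from bit 25
        imm26 -= 0x04000000;

    return branchAddress + (uint64_t)(imm26 << 2);
}
```

An unscaled `immediate` plugged into the "address + instrLen + immediate" formula above therefore cannot reconstruct the real target.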
Pull Request Overview
This PR adjusts ARM64 BRANCH26 relocation handling to mimic x64’s IMAGE_REL_BASED_REL32 strategy by embedding the target’s lower bits directly, removing the previous range check and jump-stub fallback.
- Removed the delta calculation and `FitsInRel28` check
- Always call `PutArm64Rel28` with the absolute target value (see the sketch below)
- Added a comment aligning the behavior with x64 rel32 handling
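A hedged sketch of the before/after, with simplified stand-ins for the helpers: `FitsInRel28` and `PutArm64Rel28` are real names from the PR, but the bodies below are illustrative, and the before/after call sites are paraphrased from the PR description rather than quoted from compileresult.cpp.

```cpp
#include <cstdint>

// Illustrative stand-ins; the real helpers live in the CoreCLR sources.
bool FitsInRel28(int64_t imm28)
{
    // A b/bl can reach +/-128MB (signed 28-bit byte displacement).
    return (imm28 >= -0x08000000LL) && (imm28 < 0x08000000LL);
}

void PutArm64Rel28(uint32_t* pCode, int32_t imm28)
{
    uint32_t branchInstr = *pCode;
    branchInstr &= 0xFC000000;                           // keep the b/bl opcode bits
    branchInstr |= ((uint32_t)imm28 >> 2) & 0x03FFFFFF;  // map imm28 to imm26
    *pCode = branchInstr;
}

// Before (paraphrased): only use the real displacement when it fits,
// otherwise fall back to a hardcoded jump to the end of the code.
//
//   int64_t delta = (int64_t)tmp.target - (int64_t)address;
//   if (FitsInRel28(delta))
//       PutArm64Rel28((uint32_t*)address, (int32_t)delta);
//   else
//       /* hardcode a jump to the end of the code */;
//
// After: mirror x64's IMAGE_REL_BASED_REL32 handling by always embedding
// the lower bits of the absolute target:
//
//   PutArm64Rel28((uint32_t*)address, (int32_t)tmp.target);
```

Since SPMI diffing only needs the base and diff compilers to emit identical bytes, a consistent (if not architecturally meaningful) immediate is sufficient.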
Comments suppressed due to low confidence (1)
src/coreclr/tools/superpmi/superpmi-shared/compileresult.cpp:860
- Passing `tmp.target` (an absolute address) directly to `PutArm64Rel28` will embed the wrong immediate. You need to derive the 28-bit relative displacement or mask the lower bits of `tmp.target` (e.g., `tmp.target & Rel28Mask`) so the instruction encodes the correct branch offset.
PutArm64Rel28((UINT32*)address, (INT32)tmp.target);
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
cc @dotnet/jit-contrib PTAL @kunalspathak There's a superpmi-diffs run over in #116299 that now shows no diffs (it was showing spurious diffs without this change).
Do you know when/why it broke?
Ah, we found out because we have (will) start running the differ natively on osx, I assume.
Yes, I expect this has always been causing spurious diffs when running natively on macOS.
@kunalspathak can you approve?
LGTM