[AArch64] Fix pairing different types of registers when computing CSRs. #66642
@TNorthover could you please have a look at the patch and add reviewers for me (I don't have commit access)? @kyulee-com @kyulee1
Thanks for finding and fixing this issue!
First of all let me thank you for the great work on the However, the app instantly crashed when I opened it. The crash log indicated that there were illegal instructions in the outlined helper functions. After some debugging around Moreover, I think it is fair to call it a bug, which should be addressed, for
As long as the compact unwind info is not generated for the unpaired cases, I think we are safe. This patch does not change the behavior of unwind info generation. Correct me if I am wrong.
I wanted to make the pass work for the unpaired cases with minimal effort, so I decided to keep the load/store pair form. If this introduces a harmful side effect that I am not aware of, please let me know. I'm OK to add
Force-pushed from 19cc004 to f004f15.
Intuitively, this doesn't seem right as you also save and restore If you compare your test before/after I suggest adding a real test case to confirm the change. I also suggest adding more comments, as the underlying assumption has changed quite a bit. In fact, I would prefer a correct implementation for easier maintenance and understanding, although it might lead to more substantial changes.
Another note on debugging: although the original implementation does not have a flag to expand the helper in place, I think the transformation should still be semantically valid even if
If a function has an odd number of registers of the same type to save, and the calling convention also requires an odd number of CSRs of that type, an FP register would be accidentally marked as saved when producePairRegisters returns true. This patch also fixes the AArch64LowerHomogeneousPrologEpilog pass not handling AArch64::NoRegister; this pass must be fixed along with the register pairing so that I can write a test for it.
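The pairing rule described in this commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the LLVM implementation: `Reg`, `RegClass`, `RegPair`, `pairCalleeSaves`, and the register numbers are hypothetical names made up for illustration, and the `NoRegister` constant merely stands in for `AArch64::NoRegister`. The idea shown is that a register left over from an odd-sized run of one class is paired with a placeholder instead of with the first register of the next class.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified model; none of these names are LLVM's.
enum class RegClass { GPR, FPR };

struct Reg {
  int Id;         // illustrative register number (non-zero)
  RegClass Class; // register type used to decide whether two registers may pair
};

// Stand-in for AArch64::NoRegister (assumed placeholder value).
constexpr int NoRegister = 0;

using RegPair = std::pair<int, int>;

// Pair adjacent callee-saved registers, but only within the same class.
// The last register of an odd-sized run gets NoRegister as its partner
// instead of being paired with a register of a different class.
std::vector<RegPair> pairCalleeSaves(const std::vector<Reg> &CSRs) {
  std::vector<RegPair> Pairs;
  for (std::size_t I = 0; I < CSRs.size();) {
    bool HasPartner =
        I + 1 < CSRs.size() && CSRs[I + 1].Class == CSRs[I].Class;
    if (HasPartner) {
      Pairs.emplace_back(CSRs[I].Id, CSRs[I + 1].Id);
      I += 2;
    } else {
      Pairs.emplace_back(CSRs[I].Id, NoRegister);
      I += 1;
    }
  }
  return Pairs;
}
```

With three GPRs followed by two FPRs, this model yields one GPR pair, one GPR paired with the placeholder, and one FPR pair. Any consumer of such pairs, which is the role the lowering pass plays in this discussion, then has to tolerate the placeholder in a pair.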
Force-pushed from f004f15 to 24d840e.
I think this is the expected behavior. When
Sample IR:
declare void @bar(i32 %i)
define void @test(i32 %i) nounwind minsize {
call void asm sideeffect "mov x0, #42", "~{x0},~{x19},~{x20},~{x21},~{x22},~{x23},~{x24},~{x25},~{x26},~{x27}"() nounwind
call void @bar(i32 %i)
ret void
}
Output:
.section __TEXT,__text,regular,pure_instructions
.ios_version_min 7, 0
.globl _test ; -- Begin function test
.p2align 2
_test: ; @test
; %bb.0:
stp x28, x27, [sp, #-96]! ; 16-byte Folded Spill
stp x26, x25, [sp, #16] ; 16-byte Folded Spill
mov w8, w0
stp x24, x23, [sp, #32] ; 16-byte Folded Spill
stp x22, x21, [sp, #48] ; 16-byte Folded Spill
stp x20, x19, [sp, #64] ; 16-byte Folded Spill
stp x29, x30, [sp, #80] ; 16-byte Folded Spill
; InlineAsm Start
mov x0, #42 ; =0x2a
; InlineAsm End
mov w0, w8
bl _bar
ldp x29, x30, [sp, #80] ; 16-byte Folded Reload
ldp x20, x19, [sp, #64] ; 16-byte Folded Reload
ldp x22, x21, [sp, #48] ; 16-byte Folded Reload
ldp x24, x23, [sp, #32] ; 16-byte Folded Reload
ldp x26, x25, [sp, #16] ; 16-byte Folded Reload
ldp x28, x27, [sp], #96 ; 16-byte Folded Reload
ret
; -- End function
.subsections_via_symbols
But I noticed that
I believe that
Review comment on llvm/lib/Target/AArch64/AArch64LowerHomogeneousPrologEpilog.cpp (outdated, resolved)
Thanks for clearing this up! Overall it looks good to me modulo minor comments.
Fixed the typos and a bad test expectation.
LGTM
@kyulee-com Could you please merge the PR for me since I don't have commit access? Or do you think it is necessary for someone else to review it?