-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Fix tryMergeAdjacentSTG function in PrologEpilog pass #68873
Conversation
The tryMergeAdjacentSTG function tries to merge multiple stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV flag before moving around STGloop which also alters NZCV flags. This was not issue before the patch 5e612bc as these stack tag stores does not alter the NZCV flags. But after the change, this merge function leads to miscompilation because of control flow change in instructions. Added the check to to see if the first instruction after insert point reads or writes to NZCV flag and it's liveout state. This check happens after the filling of merge list just before merge and bails out if necessary.
@llvm/pr-subscribers-backend-aarch64 Author: None (kartcq) ChangesThe tryMergeAdjacentSTG function tries to merge multiple stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV flag before moving around STGloop which also alters NZCV flags. This was not issue before the patch 5e612bc as these stack tag stores does not alter the NZCV flags. But after the change, this merge function leads to miscompilation because of control flow change in instructions. Added the check to to see if the first instruction after insert point reads or writes to NZCV flag and it's liveout state. This check happens after the filling of merge list just before merge and bails out if necessary. Full diff: https://github.com/llvm/llvm-project/pull/68873.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index 78f85faaf69bf96..3adf005243cd8a2 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -3703,6 +3703,17 @@ bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
return true;
}
+bool isNZCVLiveOut(MachineBasicBlock &MBB) {
+ // Loop over all of the successors of the basic block, checking to see if
+ // the value is either live in the block, or if it is killed in the block.
+ // We are checking for LiveIns alone
+ for (MachineBasicBlock *SuccMBB : MBB.successors()) {
+ if (SuccMBB->isLiveIn(AArch64::NZCV))
+ return true;
+ }
+ return false;
+}
+
// Detect a run of memory tagging instructions for adjacent stack frame slots,
// and replace them with a shorter instruction sequence:
// * replace STG + STG with ST2G
@@ -3763,6 +3774,33 @@ MachineBasicBlock::iterator tryMergeAdjacentSTG(MachineBasicBlock::iterator II,
MachineBasicBlock::iterator InsertI = Instrs.back().MI;
InsertI++;
+ // All the gathered stack tag instructions are merged and placed after
+ // last tag store in the list. The check should be made if the nzcv
+ // flag that we might be overriding will affect upcoming instructions
+ // that reads the flag(Here NZCV) or if the register is live out. If
+ // the flag is defined again we don't have this problem as the flag
+ // is anyway subject to modification.
+ // Though there could be different ways to curb this maltransformation,
+ // this seems to be simple and straight forward without changing
+ // existing algorithm.
+ // FIXME : This approach of bailing out from merge is also conservative
+ // in some ways like even if stg loops are not present after merge
+ // these checks are done (which are not needed).
+ bool modifiesFirst = false, readsFirst = false;
+ for (MachineBasicBlock::iterator I = InsertI, E = MBB->end(); I != E; I++) {
+ MachineInstr &MI = *I;
+ if (MI.readsRegister(AArch64::NZCV)) {
+ readsFirst = true;
+ break;
+ }
+ if (MI.modifiesRegister(AArch64::NZCV)) {
+ modifiesFirst = true;
+ break;
+ }
+ }
+ if (!modifiesFirst && (readsFirst || isNZCVLiveOut(*MBB)))
+ return InsertI;
+
llvm::stable_sort(Instrs,
[](const TagStoreInstr &Left, const TagStoreInstr &Right) {
return Left.Offset < Right.Offset;
diff --git a/llvm/test/CodeGen/AArch64/settag-merge.ll b/llvm/test/CodeGen/AArch64/settag-merge.ll
index 0c00931a1fd0c74..1f56368c41df75c 100644
--- a/llvm/test/CodeGen/AArch64/settag-merge.ll
+++ b/llvm/test/CodeGen/AArch64/settag-merge.ll
@@ -289,3 +289,72 @@ entry:
call void @llvm.aarch64.settag(ptr %c2, i64 128)
ret void
}
+
+; Function Attrs: nounwind
+declare i32 @printf(ptr, ...) #0
+
+@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1
+
+; Case 1
+; Insert point of stg merge is followed by nzcv read
+; Don't merge in this case
+
+define i32 @stg_no_merge(i32 %in) {
+entry:
+; CHECK-LABEL: stg_no_merge:
+; CHECK: stg sp, [sp, #528]
+; CHECK-NEXT: .LBB10_1:
+; CHECK: st2g x9, [x9], #32
+; CHECK-NEXT: subs x8, x8, #32
+; CHECK-NEXT: b.ne .LBB10_1
+; CHECK-NEXT: // %bb.2:
+; CHECK-NEXT: cmp w0, #10
+; CHECK-NEXT: stg sp, [sp]
+; CHECK-NEXT: b.ge .LBB10_4
+
+ %a = alloca i8, i32 16, align 16
+ %b = alloca i8, i32 512, align 16
+ %c = alloca i8, i32 16, align 16
+ call void @llvm.aarch64.settag(ptr %a, i64 16)
+ call void @llvm.aarch64.settag(ptr %b, i64 512)
+ %cmp = icmp slt i32 %in, 10
+ call void @llvm.aarch64.settag(ptr %c, i64 16)
+ br i1 %cmp, label %return0, label %return1
+
+return0: ; preds = %entry
+ %call = call i32 (ptr, ...) @printf(ptr @.str, i32 10) #1
+ ret i32 0
+
+return1:
+ ret i32 1
+}
+
+; Case 2
+; Insert point of stg merge is not followed by nzcv read
+; Merge in this case
+
+define i32 @stg_merge(i32 %in) {
+entry:
+; CHECK-LABEL: stg_merge:
+; CHECK: mov x8, #544
+; CHECK-NEXT: .LBB11_1:
+; CHECK: st2g sp, [sp], #32
+; CHECK-NEXT: subs x8, x8, #32
+; CHECK-NEXT: b.ne .LBB11_1
+
+
+ %a = alloca i8, i32 16, align 16
+ %b = alloca i8, i32 512, align 16
+ %c = alloca i8, i32 16, align 16
+ call void @llvm.aarch64.settag(ptr %a, i64 16)
+ call void @llvm.aarch64.settag(ptr %b, i64 512)
+ call void @llvm.aarch64.settag(ptr %c, i64 16)
+ br label %return1
+
+return0: ; preds = %entry
+ %call = call i32 (ptr, ...) @printf(ptr @.str, i32 10) #1
+ ret i32 0
+
+return1:
+ ret i32 1
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -3703,6 +3703,17 @@ bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset, | |||
return true; | |||
} | |||
|
|||
bool isNZCVLiveOut(MachineBasicBlock &MBB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering why this need a new function to be implemented.
Wouldn't calling MBB.isLiveOut(MBB, AArch64::NZCV)
work the same as calling isNZCVLiveOut(MBB)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can not find MBB.isLiveOut
anywhere. Does it exist? There is LiveVariables::isLiveOut, but that only works for virtual regs.
Anyway, I've not touched backend code in a long while. This seems very conservative but correct to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it seems like there is no MBB.isLiveOut
.
@hctim FYI |
Thanks for the heads up. This looks like it fixes the stack miscompile that prevents fullmte from booting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this patch mostly looks good.
I have one question to verify that I'm understanding the code correctly; and 1 very minor suggestion for an improvement of the name of one of the added tests.
@@ -3703,6 +3703,17 @@ bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset, | |||
return true; | |||
} | |||
|
|||
bool isNZCVLiveOut(MachineBasicBlock &MBB) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it seems like there is no MBB.isLiveOut
.
break; | ||
} | ||
} | ||
if (!modifiesFirst && (readsFirst || isNZCVLiveOut(*MBB))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly that a more accurate condition for when this transform needs to be blocked would be:
- The last STG instruction in
Instrs
does not already clobberNZCV
. e.g. the last STG instruction is not one of the loop instructions. - The merged instruction being produced does clobber
NZCV
(e.g. at least one of the "merged" instructions inserted does clobberNZCV
) NZCV
is live at the location of the last STG instruction inInstrs
.
Assuming the above is correct, this code basically checks the 3rd condition: is NZCV
live at the location of the last STG instruction in Instrs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to look into this @kbeyls. Yes you are right. From what I understand first two conditions are most accurate to block the merging. But the third condition is kind of superset of both. Once we are steer clear of the third the first two should not matter. That's the reason I mentioned the patch is conservative and might block merging at places where we need not to. But Checking Liveness ensures safety and avoid mis-compilation even if we did not include all possible conditions right now.
Also I am trying to replace the Liveness logic I put above using the LivePhysRegs class APIs. I am seeing some crashed while running tests. Debugging on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking liveness of NZCV flag using more appropriate APIs.
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have a few minor comments mostly on variable naming; otherwise LGTM.
I'm not the best reviewer for the contents of the patch, but I can confirm the most up-to-date version continues to fix the boot issue on stack-MTE enabled Android builds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
…#68873) The tryMergeAdjacentSTG function tries to merge multiple stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV flag before moving around STGloop which also alters NZCV flags. This was not issue before the patch 5e612bc as these stack tag stores does not alter the NZCV flags. But after the change, this merge function leads to miscompilation because of control flow change in instructions. Added the check to to see if the first instruction after insert point reads or writes to NZCV flag and it's liveout state. This check happens after the filling of merge list just before merge and bails out if necessary.
The tryMergeAdjacentSTG function tries to merge multiple stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV flag before moving around STGloop which also alters NZCV flags. This was not issue before the patch 5e612bc as these stack tag stores does not alter the NZCV flags. But after the change, this merge function leads to miscompilation because of control flow change in instructions. Added the check to to see if the first instruction after insert point reads or writes to NZCV flag and it's liveout state. This check happens after the filling of merge list just before merge and bails out if necessary.