Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64] Fix tryMergeAdjacentSTG function in PrologEpilog pass #68873

Merged
merged 5 commits into from
Nov 14, 2023

Conversation

kartcq
Copy link
Contributor

@kartcq kartcq commented Oct 12, 2023

The tryMergeAdjacentSTG function tries to merge multiple stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV flag before moving around STGloop which also alters NZCV flags. This was not issue before the patch 5e612bc as these stack tag stores does not alter the NZCV flags. But after the change, this merge function leads to miscompilation because of control flow change in instructions. Added the check to to see if the first instruction after insert point reads or writes to NZCV flag and it's liveout state. This check happens after the filling of merge list just before merge and bails out if necessary.

The tryMergeAdjacentSTG function tries to merge multiple
stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV
flag before moving around STGloop which also alters NZCV flags. This
was not issue before the patch 5e612bc as these stack tag stores does
not alter the NZCV flags. But after the change, this merge function
leads to miscompilation because of control flow change in instructions.
Added the check to to see if the first instruction after insert point
reads or writes to NZCV flag and it's liveout state. This check happens
after the filling of merge list just before merge and bails out if
necessary.
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 12, 2023

@llvm/pr-subscribers-backend-aarch64

Author: None (kartcq)

Changes

The tryMergeAdjacentSTG function tries to merge multiple stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV flag before moving around STGloop which also alters NZCV flags. This was not issue before the patch 5e612bc as these stack tag stores does not alter the NZCV flags. But after the change, this merge function leads to miscompilation because of control flow change in instructions. Added the check to to see if the first instruction after insert point reads or writes to NZCV flag and it's liveout state. This check happens after the filling of merge list just before merge and bails out if necessary.


Full diff: https://github.com/llvm/llvm-project/pull/68873.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+38)
  • (modified) llvm/test/CodeGen/AArch64/settag-merge.ll (+69)
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index 78f85faaf69bf96..3adf005243cd8a2 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -3703,6 +3703,17 @@ bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
   return true;
 }
 
+bool isNZCVLiveOut(MachineBasicBlock &MBB) {
+  // Loop over all of the successors of the basic block, checking to see if
+  // the value is either live in the block, or if it is killed in the block.
+  // We are checking for LiveIns alone
+  for (MachineBasicBlock *SuccMBB : MBB.successors()) {
+    if (SuccMBB->isLiveIn(AArch64::NZCV))
+      return true;
+  }
+  return false;
+}
+
 // Detect a run of memory tagging instructions for adjacent stack frame slots,
 // and replace them with a shorter instruction sequence:
 // * replace STG + STG with ST2G
@@ -3763,6 +3774,33 @@ MachineBasicBlock::iterator tryMergeAdjacentSTG(MachineBasicBlock::iterator II,
   MachineBasicBlock::iterator InsertI = Instrs.back().MI;
   InsertI++;
 
+  // All the gathered stack tag instructions are merged and placed after
+  // last tag store in the list. The check should be made if the nzcv
+  // flag that we might be overriding will affect upcoming instructions
+  // that reads the flag(Here NZCV) or if the register is live out. If
+  // the flag is defined again we don't have this problem as the flag
+  // is anyway subject to modification.
+  // Though there could be different ways to curb this maltransformation,
+  // this seems to be simple and straight forward without changing
+  // existing algorithm.
+  // FIXME : This approach of bailing out from merge is also conservative
+  // in some ways like even if stg loops are not present after merge
+  // these checks are done (which are not needed).
+  bool modifiesFirst = false, readsFirst = false;
+  for (MachineBasicBlock::iterator I = InsertI, E = MBB->end(); I != E; I++) {
+    MachineInstr &MI = *I;
+    if (MI.readsRegister(AArch64::NZCV)) {
+      readsFirst = true;
+      break;
+    }
+    if (MI.modifiesRegister(AArch64::NZCV)) {
+      modifiesFirst = true;
+      break;
+    }
+  }
+  if (!modifiesFirst && (readsFirst || isNZCVLiveOut(*MBB)))
+    return InsertI;
+
   llvm::stable_sort(Instrs,
                     [](const TagStoreInstr &Left, const TagStoreInstr &Right) {
                       return Left.Offset < Right.Offset;
diff --git a/llvm/test/CodeGen/AArch64/settag-merge.ll b/llvm/test/CodeGen/AArch64/settag-merge.ll
index 0c00931a1fd0c74..1f56368c41df75c 100644
--- a/llvm/test/CodeGen/AArch64/settag-merge.ll
+++ b/llvm/test/CodeGen/AArch64/settag-merge.ll
@@ -289,3 +289,72 @@ entry:
   call void @llvm.aarch64.settag(ptr %c2, i64 128)
   ret void
 }
+
+; Function Attrs: nounwind
+declare i32 @printf(ptr, ...) #0
+
+@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1
+
+; Case 1
+; Insert point of stg merge is followed by nzcv read
+; Don't merge in this case
+
+define i32 @stg_no_merge(i32 %in) {
+entry:
+; CHECK-LABEL: stg_no_merge:
+; CHECK: stg sp, [sp, #528]
+; CHECK-NEXT: .LBB10_1:
+; CHECK: st2g x9, [x9], #32
+; CHECK-NEXT: subs x8, x8, #32
+; CHECK-NEXT: b.ne .LBB10_1
+; CHECK-NEXT: // %bb.2:
+; CHECK-NEXT: cmp w0, #10
+; CHECK-NEXT: stg sp, [sp]
+; CHECK-NEXT: b.ge .LBB10_4
+
+  %a = alloca i8, i32 16, align 16
+  %b = alloca i8, i32 512, align 16
+  %c = alloca i8, i32 16, align 16
+  call void @llvm.aarch64.settag(ptr %a, i64 16)
+  call void @llvm.aarch64.settag(ptr %b, i64 512)
+  %cmp = icmp slt i32 %in, 10
+  call void @llvm.aarch64.settag(ptr %c, i64 16)
+  br i1 %cmp, label %return0, label %return1
+
+return0:                                           ; preds = %entry
+  %call = call i32 (ptr, ...) @printf(ptr @.str, i32 10) #1
+  ret i32 0
+
+return1:
+  ret i32 1
+}
+
+; Case 2
+; Insert point of stg merge is not followed by nzcv read
+; Merge in this case
+
+define i32 @stg_merge(i32 %in) {
+entry:
+; CHECK-LABEL: stg_merge:
+; CHECK: mov x8, #544
+; CHECK-NEXT: .LBB11_1:
+; CHECK: st2g sp, [sp], #32
+; CHECK-NEXT: subs x8, x8, #32
+; CHECK-NEXT: b.ne .LBB11_1
+
+
+  %a = alloca i8, i32 16, align 16
+  %b = alloca i8, i32 512, align 16
+  %c = alloca i8, i32 16, align 16
+  call void @llvm.aarch64.settag(ptr %a, i64 16)
+  call void @llvm.aarch64.settag(ptr %b, i64 512)
+  call void @llvm.aarch64.settag(ptr %c, i64 16)
+  br label %return1
+
+return0:                                           ; preds = %entry
+  %call = call i32 (ptr, ...) @printf(ptr @.str, i32 10) #1
+  ret i32 0
+
+return1:
+  ret i32 1
+}

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this!

Let me ping @eugenis as the original author of this code. @eugenis : we had to change the code sequence produced for TagLoop pseudos in #5e612bc291 to avoid a conflict with SpeculativeLoadHardening. Before the TagLoop pseudos did not clobber NZCV, now they do.

@@ -3703,6 +3703,17 @@ bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
return true;
}

bool isNZCVLiveOut(MachineBasicBlock &MBB) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why this need a new function to be implemented.
Wouldn't calling MBB.isLiveOut(MBB, AArch64::NZCV) work the same as calling isNZCVLiveOut(MBB)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not find MBB.isLiveOut anywhere. Does it exist? There is LiveVariables::isLiveOut, but that only works for virtual regs.

Anyway, I've not touched backend code in a long while. This seems very conservative but correct to me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems like there is no MBB.isLiveOut.

@eugenis
Copy link
Contributor

eugenis commented Oct 12, 2023

@hctim FYI

@hctim
Copy link
Collaborator

hctim commented Oct 16, 2023

@hctim FYI

Thanks for the heads up. This looks like it fixes the stack miscompile that prevents fullmte from booting.

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this patch mostly looks good.
I have one question to verify that I'm understanding the code correctly; and 1 very minor suggestion for an improvement of the name of one of the added tests.

@@ -3703,6 +3703,17 @@ bool isMergeableStackTaggingInstruction(MachineInstr &MI, int64_t &Offset,
return true;
}

bool isNZCVLiveOut(MachineBasicBlock &MBB) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it seems like there is no MBB.isLiveOut.

llvm/test/CodeGen/AArch64/settag-merge.ll Outdated Show resolved Hide resolved
break;
}
}
if (!modifiesFirst && (readsFirst || isNZCVLiveOut(*MBB)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that a more accurate condition for when this transform needs to be blocked would be:

  1. The last STG instruction in Instrs does not already clobber NZCV. e.g. the last STG instruction is not one of the loop instructions.
  2. The merged instruction being produced does clobber NZCV (e.g. at least one of the "merged" instructions inserted does clobber NZCV)
  3. NZCV is live at the location of the last STG instruction in Instrs.

Assuming the above is correct, this code basically checks the 3rd condition: is NZCV live at the location of the last STG instruction in Instrs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking the time to look into this @kbeyls. Yes you are right. From what I understand first two conditions are most accurate to block the merging. But the third condition is kind of superset of both. Once we are steer clear of the third the first two should not matter. That's the reason I mentioned the patch is conservative and might block merging at places where we need not to. But Checking Liveness ensures safety and avoid mis-compilation even if we did not include all possible conditions right now.
Also I am trying to replace the Liveness logic I put above using the LivePhysRegs class APIs. I am seeing some crashed while running tests. Debugging on that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kbeyls @eugenis @hctim
Have pushed updated patch. Kindly go through and provide review comments whenever possible.

Checking liveness of NZCV flag using more appropriate APIs.
Copy link

github-actions bot commented Nov 7, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a few minor comments mostly on variable naming; otherwise LGTM.

llvm/lib/Target/AArch64/AArch64FrameLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AArch64/AArch64FrameLowering.cpp Outdated Show resolved Hide resolved
@hctim
Copy link
Collaborator

hctim commented Nov 9, 2023

I'm not the best reviewer for the contents of the patch, but I can confirm the most up-to-date version continues to fix the boot issue on stack-MTE enabled Android builds.

@kartcq
Copy link
Contributor Author

kartcq commented Nov 10, 2023

Thanks @kbeyls and @hctim for looking into this. Resolved the minor suggestion changes from @kbeyls.

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@efriedma-quic efriedma-quic merged commit 6726c99 into llvm:main Nov 14, 2023
2 of 3 checks passed
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…#68873)

The tryMergeAdjacentSTG function tries to merge multiple
stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV
flag before moving around STGloop which also alters NZCV flags. This was
not issue before the patch 5e612bc as these stack tag stores does not
alter the NZCV flags. But after the change, this merge function leads to
miscompilation because of control flow change in instructions. Added the
check to to see if the first instruction after insert point reads or
writes to NZCV flag and it's liveout state. This check happens after the
filling of merge list just before merge and bails out if necessary.
@kartcq kartcq deleted the my_change branch December 12, 2023 03:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants