Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reland "[llvm][AArch64] Copy all operands when expanding BLR_BTI bundle (#78267)" #78719

Merged
merged 1 commit into from
Jan 23, 2024

Conversation

DavidSpickett
Copy link
Collaborator

This reverts commit 955417a.

The problem with the previous version was that the bundle instruction had arguments like "target arg1 arg2". When it's expanded we produced a BL or BLR which can only accept one argument, the target of the branch.

Now I realise why expandCALL_RVMARKER has to copy them in mutiple steps. The operands for the called function need to be changed to implicit arguments of the branch instruction.

  • Copy the branch target.
  • Copy all register operands, marking them as implicit.
  • Copy any other operands without modifying them.

Prior to any attempt to fix #77915:
BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp

Which is dropping the use of the arguments for the called function.

My first fix attempt produced:
BL @_setjmp, $x0, $w1, <regmask $fp ...>, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp

It copied the arguments but as explicit arguments to the BL which only expects 1, failing verification.

With this new change we produce:
BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $x0, implicit $w1, implicit-def dead $lr, implicit $sp, implicit-def $sp

Note specifically the added "implicit $x0, implicit $w1". So BL only has 1 explicit argument, but the arguments to the function are still used.

…le (llvm#78267)"

This reverts commit 955417a.

The problem with the previous version was that the bundle instruction had arguments
like "target arg1 arg2". When it's expanded we produced a BL or BLR which can only
accept one argument, the target of the branch.

Now I realise why expandCALL_RVMARKER has to copy them in mutiple steps.
The operands for the called function need to be changed to implicit arguments
of the branch instruction.

* Copy the branch target.
* Copy all register operands, marking them as implicit.
* Copy any other operands without modifying them.

Prior to any attempt to fix llvm#77915:
BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp

Which is dropping the use of the arguments for the called function.

My first fix attempt produced:
BL @_setjmp, $x0, $w1, <regmask $fp ...>, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp

It copied the arguments but as explicit arguments to the BL which only expects 1, failing verification.

With this new change we produce:
BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $x0, implicit $w1, implicit-def dead $lr, implicit $sp, implicit-def $sp

Note specifically the added "implicit $x0, implicit $w1". So BL only has 1 explicit argument, but the arguments
to the function are still used.
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 19, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Spickett (DavidSpickett)

Changes

This reverts commit 955417a.

The problem with the previous version was that the bundle instruction had arguments like "target arg1 arg2". When it's expanded we produced a BL or BLR which can only accept one argument, the target of the branch.

Now I realise why expandCALL_RVMARKER has to copy them in mutiple steps. The operands for the called function need to be changed to implicit arguments of the branch instruction.

  • Copy the branch target.
  • Copy all register operands, marking them as implicit.
  • Copy any other operands without modifying them.

Prior to any attempt to fix #77915:
BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp

Which is dropping the use of the arguments for the called function.

My first fix attempt produced:
BL @_setjmp, $x0, $w1, <regmask $fp ...>, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp

It copied the arguments but as explicit arguments to the BL which only expects 1, failing verification.

With this new change we produce:
BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $x0, implicit $w1, implicit-def dead $lr, implicit $sp, implicit-def $sp

Note specifically the added "implicit $x0, implicit $w1". So BL only has 1 explicit argument, but the arguments to the function are still used.


Full diff: https://github.com/llvm/llvm-project/pull/78719.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp (+17-1)
  • (added) llvm/test/CodeGen/AArch64/blr-bti-preserves-operands.mir (+28)
  • (removed) llvm/test/CodeGen/AArch64/blr-bti-preserves-regmask.mir (-23)
diff --git a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
index bb7f4d907ffd7f..352c61d48e2fff 100644
--- a/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
@@ -843,8 +843,24 @@ bool AArch64ExpandPseudo::expandCALL_BTI(MachineBasicBlock &MBB,
   MachineInstr *Call =
       BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(Opc)).getInstr();
   Call->addOperand(CallTarget);
+
+  // 1 because we already added the branch target above.
+  unsigned RegMaskStartIdx = 1;
+  // The branch is BL <target>, so we cannot attach the arguments of the called
+  // function to it. Those must be added as implicitly used by the branch.
+  while (!MI.getOperand(RegMaskStartIdx).isRegMask()) {
+    auto MOP = MI.getOperand(RegMaskStartIdx);
+    assert(MOP.isReg() && "can only add register operands");
+    Call->addOperand(MachineOperand::CreateReg(
+        MOP.getReg(), /*Def=*/false, /*Implicit=*/true, /*isKill=*/false,
+        /*isDead=*/false, /*isUndef=*/MOP.isUndef()));
+    RegMaskStartIdx++;
+  }
+  for (const MachineOperand &MO :
+       llvm::drop_begin(MI.operands(), RegMaskStartIdx))
+    Call->addOperand(MO);
+
   Call->setCFIType(*MBB.getParent(), MI.getCFIType());
-  Call->copyImplicitOps(*MBB.getParent(), MI);
 
   MachineInstr *BTI =
       BuildMI(MBB, MBBI, MI.getDebugLoc(), TII->get(AArch64::HINT))
diff --git a/llvm/test/CodeGen/AArch64/blr-bti-preserves-operands.mir b/llvm/test/CodeGen/AArch64/blr-bti-preserves-operands.mir
new file mode 100644
index 00000000000000..760ae4794e3043
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/blr-bti-preserves-operands.mir
@@ -0,0 +1,28 @@
+# RUN: llc -mtriple=aarch64-none-linux-gnu -run-pass=aarch64-expand-pseudo -o - %s | FileCheck %s
+
+# When expanding a BLR_BTI, we should copy all the operands to the branch in the
+# bundle. Otherwise we could end up using a register after the BL which was
+# clobbered by the function that was called, or overwriting an argument to that
+# function before we take the branch.
+#
+# The arguments to the call must become implicit arguments, because the branch
+# only expects to get 1 explicit operand which is the branch target.
+
+# CHECK:    BUNDLE implicit-def $lr, implicit-def $w30, implicit-def $sp, implicit-def $wsp, implicit $sp, implicit $x0, implicit $w1 {
+# CHECK:      BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit $x0, implicit $w1, implicit-def dead $lr, implicit $sp, implicit-def $sp
+# CHECK:      HINT 36
+# CHECK:    }
+
+--- |
+  define void @a() {
+    ret void
+  }
+
+  declare void @_setjmp(...)
+...
+---
+name: a
+body: |
+  bb.0:
+    BLR_BTI @_setjmp, $x0, $w1, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
+...
diff --git a/llvm/test/CodeGen/AArch64/blr-bti-preserves-regmask.mir b/llvm/test/CodeGen/AArch64/blr-bti-preserves-regmask.mir
deleted file mode 100644
index 91652c6e20c8f8..00000000000000
--- a/llvm/test/CodeGen/AArch64/blr-bti-preserves-regmask.mir
+++ /dev/null
@@ -1,23 +0,0 @@
-# RUN: llc -mtriple=aarch64-none-linux-gnu -run-pass=aarch64-expand-pseudo -o - %s | FileCheck %s
-
-# When expanding a BLR_BTI, we should keep the regmask that was attached to it.
-# Otherwise we could end up using a register after the BL which was clobbered by
-# the function that was called.
-# CHECK:    BUNDLE implicit-def $lr, implicit-def $w30, implicit-def $sp, implicit-def $wsp, implicit $sp {
-# CHECK:      BL @_setjmp, csr_aarch64_aapcs, implicit-def $lr, implicit $sp, implicit-def dead $lr, implicit $sp, implicit-def $sp
-# CHECK:      HINT 36
-# CHECK:    }
-
---- |
-  define void @a() {
-    ret void
-  }
-
-  declare void @_setjmp(...)
-...
----
-name: a
-body: |
-  bb.0:
-    BLR_BTI @_setjmp, csr_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
-...

@DavidSpickett
Copy link
Collaborator Author

Turns out expandCALL_RVMARKER is doing the right thing, just took me a while to understand it. So the code is almost the same now. I could factor it out but keeping the PR small for now.

@efriedma-quic
Copy link
Collaborator

Oh, I see, that makes more sense.

Is there some reason we can't just change the definition of BLR_BTI so it takes the same operands as BLR? Defining it to take different operands, then rewriting them to the form we want, seems like it's more complicated than necessary.

@mstorsjo
Copy link
Member

Oh, I see, that makes more sense.

Is there some reason we can't just change the definition of BLR_BTI so it takes the same operands as BLR? Defining it to take different operands, then rewriting them to the form we want, seems like it's more complicated than necessary.

FWIW, 18.x gets branched on Tuesday, and I’d love to have this issue fixed there (not necessarily in the first RC, but at least by the first stable release). Does changing the definition of BLR_BTI make it less backportable (especially if the fix lands after the branch), rather than applying this fix first and looking into changing the definition later?

@DavidSpickett
Copy link
Collaborator Author

I think I'd prefer to land this first regardless of whether I do Eli's suggestion, to have a solid foundation to work from.

On Monday I will looking into changing BLR_BTI, so we can know the trade-offs before Tuesday's deadline.

@mstorsjo
Copy link
Member

I think I'd prefer to land this first regardless of whether I do Eli's suggestion, to have a solid foundation to work from.

On Monday I will looking into changing BLR_BTI, so we can know the trade-offs before Tuesday's deadline.

Thanks, that sounds like a very reasonable plan!

@DavidSpickett
Copy link
Collaborator Author

Is there some reason we can't just change the definition of BLR_BTI so it takes the same operands as BLR? Defining it to take different operands, then rewriting them to the form we want, seems like it's more complicated than necessary.

I think I see what you mean. We don't have custom lowering code for BL and BLR so why would we need it for these pseudo instructions.

We select how we're going to lower the calls in AArch64CallLowering::lowerCall. In there there is a comment:

  // Create a temporarily-floating call instruction so we can add the implicit
  // uses of arg registers.

Which is exactly what I'm doing here, and RVMARKER does currently. But I cannot see where the arg registers are actually added.

It must happen somewhere otherwise the normal branches wouldn't work., looking for something along the lines of if known branch types copy.....

@DavidSpickett
Copy link
Collaborator Author

But given my track record on this fix, I think the best way to proceed is to land this for 18 and I'll continue to look for a better solution on trunk.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can just do this for now. LGTM

@DavidSpickett
Copy link
Collaborator Author

Windows failure is Driver/linker-wrapper.c which is unrelated (fixed since this PR was posted I think), so going ahead with landing this.

@DavidSpickett DavidSpickett merged commit f205566 into llvm:main Jan 23, 2024
4 of 5 checks passed
@DavidSpickett DavidSpickett deleted the bti-attempt-2 branch January 23, 2024 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Miscompilation of calls to setjmp after "Preserve regmask when expanding the BLR_BTI pseudo instruction"
4 participants