-
Notifications
You must be signed in to change notification settings - Fork 15k
[AArch64] Widen GPR32 zero cycle zeroing #164244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Widen GPR32 zero cycle zeroing #164244
Conversation
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
64d1a8e to
0462d59
Compare
|
@llvm/pr-subscribers-backend-aarch64 Author: Tomer Shafir (tomershafir) ChangesGiven a GPR32 zeroing instruction, if the target supports zero cycle zeroing for GPR64 but not for GPR32, widen the zeroing instruction. It also aligns naming in the generic zeroing test. Full diff: https://github.com/llvm/llvm-project/pull/164244.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 84b54b3c707af..b74202e70d7a3 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -5116,7 +5116,15 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
// GPR32 zeroing
if (AArch64::GPR32spRegClass.contains(DestReg) && SrcReg == AArch64::WZR) {
- if (Subtarget.hasZeroCycleZeroingGPR32()) {
+ if (Subtarget.hasZeroCycleZeroingGPR64() &&
+ !Subtarget.hasZeroCycleZeroingGPR32()) {
+ MCRegister DestRegX = RI.getMatchingSuperReg(DestReg, AArch64::sub_32,
+ &AArch64::GPR64spRegClass);
+ assert(DestRegX.isValid() && "Destination super-reg not valid");
+ BuildMI(MBB, I, DL, get(AArch64::MOVZXi), DestRegX)
+ .addImm(0)
+ .addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 0));
+ } else if (Subtarget.hasZeroCycleZeroingGPR32()) {
BuildMI(MBB, I, DL, get(AArch64::MOVZWi), DestReg)
.addImm(0)
.addImm(AArch64_AM::getShifterImm(AArch64_AM::LSL, 0));
diff --git a/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir b/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir
index f34d3ed510a97..6b2a31b02c097 100644
--- a/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir
+++ b/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir
@@ -35,7 +35,7 @@ body: |
; CHECK-NOZCZ-GPR32-ZCZ-GPR64-LABEL: name: f0
; CHECK-NOZCZ-GPR32-ZCZ-GPR64: liveins: $x0, $lr
; CHECK-NOZCZ-GPR32-ZCZ-GPR64-NEXT: {{ $}}
- ; CHECK-NOZCZ-GPR32-ZCZ-GPR64-NEXT: $w0 = ORRWrr $wzr, $wzr
+ ; CHECK-NOZCZ-GPR32-ZCZ-GPR64-NEXT: $x0 = MOVZXi 0, 0
; CHECK-NOZCZ-GPR32-ZCZ-GPR64-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
;
; CHECK-ZCZ-GPR32-ZCZ-GPR64-LABEL: name: f0
diff --git a/llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing-gpr.ll b/llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing-gpr.ll
index dc643062d8697..0f284aa9ad282 100644
--- a/llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing-gpr.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-zero-cycle-zeroing-gpr.ll
@@ -1,41 +1,44 @@
-; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR
+; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR32-NOZCZ-GPR64
; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=+zcz-gpr32 | FileCheck %s -check-prefixes=ALL,ZCZ-GPR32
-; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=+zcz-gpr64 | FileCheck %s -check-prefixes=ALL,ZCZ-GPR64
-; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=generic | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=+zcz-gpr64 | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR32-ZCZ-GPR64
+; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=generic | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR32-NOZCZ-GPR64
; RUN: llc < %s -mtriple=arm64-apple-ios -mcpu=cyclone | FileCheck %s -check-prefixes=ALL,ZCZ-GPR32,ZCZ-GPR64
; RUN: llc < %s -mtriple=arm64-apple-macosx -mcpu=apple-m1 | FileCheck %s -check-prefixes=ALL,ZCZ-GPR32,ZCZ-GPR64
-; RUN: llc < %s -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 | FileCheck %s -check-prefixes=ALL,NOZCZ-GPR32-NOZCZ-GPR64
; RUN: llc < %s -mtriple=aarch64-linux-gnu -mcpu=kryo | FileCheck %s -check-prefixes=ALL,ZCZ-GPR32,ZCZ-GPR64
; RUN: llc < %s -mtriple=aarch64-linux-gnu -mcpu=falkor | FileCheck %s -check-prefixes=ALL,ZCZ-GPR32,ZCZ-GPR64
define i8 @ti8() {
entry:
; ALL-LABEL: ti8:
-; NOZCZ-GPR: mov w0, wzr
+; NOZCZ-GPR32-NOZCZ-GPR64: mov w0, wzr
; ZCZ-GPR32: mov w0, #0
+; NOZCZ-GPR32-ZCZ-GPR64: mov x0, #0
ret i8 0
}
define i16 @ti16() {
entry:
; ALL-LABEL: ti16:
-; NOZCZ-GPR: mov w0, wzr
+; NOZCZ-GPR32-NOZCZ-GPR64: mov w0, wzr
; ZCZ-GPR32: mov w0, #0
+; NOZCZ-GPR32-ZCZ-GPR64: mov x0, #0
ret i16 0
}
define i32 @ti32() {
entry:
; ALL-LABEL: ti32:
-; NOZCZ-GPR: mov w0, wzr
+; NOZCZ-GPR32-NOZCZ-GPR64: mov w0, wzr
; ZCZ-GPR32: mov w0, #0
+; NOZCZ-GPR32-ZCZ-GPR64: mov x0, #0
ret i32 0
}
define i64 @ti64() {
entry:
; ALL-LABEL: ti64:
-; NOZCZ-GPR: mov x0, xzr
+; NOZCZ-GPR32-NOZCZ-GPR64 mov x0, xzr
; ZCZ-GPR64: mov x0, #0
ret i64 0
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message could be a bit more precise, mentioning that you emit $xn = MOVZXi 0, 0 even when copying into $wn to take advantage of the GPR64 zero-cycle move.
Other than that, lgtm.
Given a GPR32 zeroing instruction, if the target supports zero cycle zeroing for GPR64 but not for GPR32, widen the instruction to 64 bit `$xn = MOVZXi 0, 0` instead of writing to `$wn` to exploit zero cycle zeroing. It also aligns naming in the generic zeroing test.
0462d59 to
d56419f
Compare
Given a GPR32 zeroing instruction, if the target supports zero cycle zeroing for GPR64 but not for GPR32, widen the instruction to 64 bit `$xn = MOVZXi 0, 0` instead of writing to `$wn` to exploit zero cycle zeroing. It also aligns naming in the generic zeroing test.
Given a GPR32 zeroing instruction, if the target supports zero cycle zeroing for GPR64 but not for GPR32, widen the instruction to 64 bit `$xn = MOVZXi 0, 0` instead of writing to `$wn` to exploit zero cycle zeroing. It also aligns naming in the generic zeroing test.
Given a GPR32 zeroing instruction, if the target supports zero cycle zeroing for GPR64 but not for GPR32, widen the instruction to 64 bit `$xn = MOVZXi 0, 0` instead of writing to `$wn` to exploit zero cycle zeroing. It also aligns naming in the generic zeroing test.
Given a GPR32 zeroing instruction, if the target supports zero cycle zeroing for GPR64 but not for GPR32, widen the instruction to 64 bit
$xn = MOVZXi 0, 0instead of writing to$wnto exploit zero cycle zeroing.It also aligns naming in the generic zeroing test.