-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLVM][CodeGen][AArch64] Improve PTEST removal by looking through copies. #132041
Conversation
…ies. The general predicates of the PTEST and PTEST_like instructions may belong to different register classes. This can lead to the insertion of a COPY instruction, making them appear different. However, for the purpose of PTEST removal, such copies are irrelevant, and we can look through them to improve the likelihood of finding a match.
@llvm/pr-subscribers-backend-aarch64 Author: Paul Walker (paulwalker-arm) ChangesThe general predicates of the PTEST and PTEST_like instructions may belong to different register classes. This can lead to the insertion of a COPY instruction, making them appear different. However, for the purpose of PTEST removal, such copies are irrelevant, and we can look through them to improve the likelihood of finding a match. Full diff: https://github.com/llvm/llvm-project/pull/132041.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index c91590fa43601..9f8082b64ab18 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -1491,13 +1491,22 @@ AArch64InstrInfo::canRemovePTestInstr(MachineInstr *PTest, MachineInstr *Mask,
if ((Mask == Pred) && PTest->getOpcode() == AArch64::PTEST_PP_ANY)
return PredOpcode;
+ auto PTestLikeMask = MRI->getUniqueVRegDef(Pred->getOperand(1).getReg());
+
+ // If the PTEST like instruction's general predicate is not `Mask`, attempt
+ // to look through a copy and try again. This is because some instructions
+ // take a predicate whose register class is a subset of its result class.
+ if (Mask != PTestLikeMask && PTestLikeMask->isFullCopy() &&
+ PTestLikeMask->getOperand(1).getReg().isVirtual())
+ PTestLikeMask =
+ MRI->getUniqueVRegDef(PTestLikeMask->getOperand(1).getReg());
+
// For PTEST(PTRUE_ALL, PTEST_LIKE), the PTEST is redundant if the
// the element size matches and either the PTEST_LIKE instruction uses
// the same all active mask or the condition is "any".
if (isPTrueOpcode(MaskOpcode) && Mask->getOperand(1).getImm() == 31 &&
getElementSizeForOpcode(MaskOpcode) ==
getElementSizeForOpcode(PredOpcode)) {
- auto PTestLikeMask = MRI->getUniqueVRegDef(Pred->getOperand(1).getReg());
if (Mask == PTestLikeMask || PTest->getOpcode() == AArch64::PTEST_PP_ANY)
return PredOpcode;
}
@@ -1524,7 +1533,6 @@ AArch64InstrInfo::canRemovePTestInstr(MachineInstr *PTest, MachineInstr *Mask,
// active flag, whereas the PTEST instruction with the same mask doesn't.
// For PTEST_ANY this doesn't apply as the flags in this case would be
// identical regardless of element size.
- auto PTestLikeMask = MRI->getUniqueVRegDef(Pred->getOperand(1).getReg());
uint64_t PredElementSize = getElementSizeForOpcode(PredOpcode);
if (Mask == PTestLikeMask && (PredElementSize == AArch64::ElementSizeB ||
PTest->getOpcode() == AArch64::PTEST_PP_ANY))
diff --git a/llvm/test/CodeGen/AArch64/sve-ptest-removal-cmpeq.mir b/llvm/test/CodeGen/AArch64/sve-ptest-removal-cmpeq.mir
index bd1ff4c2c9726..7ae9b60a3d331 100644
--- a/llvm/test/CodeGen/AArch64/sve-ptest-removal-cmpeq.mir
+++ b/llvm/test/CodeGen/AArch64/sve-ptest-removal-cmpeq.mir
@@ -661,3 +661,48 @@ body: |
RET_ReallyLR implicit $w0
...
+---
+name: cmpeq_nxv16i8_ptest_with_register_class_mismatch
+alignment: 2
+tracksRegLiveness: true
+registers:
+ - { id: 0, class: ppr }
+ - { id: 1, class: zpr }
+ - { id: 2, class: zpr }
+ - { id: 3, class: ppr_3b }
+ - { id: 4, class: ppr }
+ - { id: 5, class: gpr32 }
+ - { id: 6, class: gpr32 }
+liveins:
+ - { reg: '$z0', virtual-reg: '%1' }
+ - { reg: '$z1', virtual-reg: '%2' }
+frameInfo:
+ maxCallFrameSize: 0
+body: |
+ bb.0:
+ liveins: $z0, $z1
+
+ ; CHECK-LABEL: name: cmpeq_nxv16i8_ptest_with_register_class_mismatch
+ ; CHECK: liveins: $z0, $z1
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:zpr = COPY $z0
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:zpr = COPY $z1
+ ; CHECK-NEXT: [[PTRUE_B:%[0-9]+]]:ppr = PTRUE_B 31, implicit $vg
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:ppr_3b = COPY [[PTRUE_B]]
+ ; CHECK-NEXT: [[CMPEQ_PPzZZ_B:%[0-9]+]]:ppr = CMPEQ_PPzZZ_B [[COPY2]], [[COPY]], [[COPY1]], implicit-def $nzcv
+ ; CHECK-NEXT: [[COPY3:%[0-9]+]]:gpr32 = COPY $wzr
+ ; CHECK-NEXT: [[CSINCWr:%[0-9]+]]:gpr32 = CSINCWr [[COPY3]], $wzr, 0, implicit $nzcv
+ ; CHECK-NEXT: $w0 = COPY [[CSINCWr]]
+ ; CHECK-NEXT: RET_ReallyLR implicit $w0
+ %1:zpr = COPY $z0
+ %2:zpr = COPY $z1
+ %0:ppr = PTRUE_B 31, implicit $vg
+ %3:ppr_3b = COPY %0
+ %4:ppr = CMPEQ_PPzZZ_B %3, %1, %2, implicit-def dead $nzcv
+ PTEST_PP %0, killed %4, implicit-def $nzcv
+ %5:gpr32 = COPY $wzr
+ %6:gpr32 = CSINCWr %5, $wzr, 0, implicit $nzcv
+ $w0 = COPY %6
+ RET_ReallyLR implicit $w0
+
+...
|
@@ -689,8 +689,7 @@ body: | | |||
; CHECK-NEXT: [[COPY1:%[0-9]+]]:zpr = COPY $z1 | |||
; CHECK-NEXT: [[PTRUE_B:%[0-9]+]]:ppr = PTRUE_B 31, implicit $vg | |||
; CHECK-NEXT: [[COPY2:%[0-9]+]]:ppr_3b = COPY [[PTRUE_B]] | |||
; CHECK-NEXT: [[CMPEQ_PPzZZ_B:%[0-9]+]]:ppr = CMPEQ_PPzZZ_B [[COPY2]], [[COPY]], [[COPY1]], implicit-def dead $nzcv | |||
; CHECK-NEXT: PTEST_PP [[PTRUE_B]], killed [[CMPEQ_PPzZZ_B]], implicit-def $nzcv | |||
; CHECK-NEXT: [[CMPEQ_PPzZZ_B:%[0-9]+]]:ppr = CMPEQ_PPzZZ_B [[COPY2]], [[COPY]], [[COPY1]], implicit-def $nzcv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also have the other case where you have something like:
[[PTRUE_B:%[0-9]+]]:ppr_3b = PTRUE_B 31, implicit $vg
[[CMPEQ_PPzZZ_B:%[0-9]+]]:ppr = CMPEQ_PPzZZ_B [[PTRUE_B]], [[COPY]], [[COPY1]], implicit-def dead $nzcv
[[COPY2:%[0-9]+]]:ppr = COPY [[PTRUE_B]]
PTEST_PP [[COPY2]], killed [[CMPEQ_PPzZZ_B]], implicit-def $nzcv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so because PTRUE_B
supports all predicate registers and so at this stage I would expect its virtual register to always be a PPR
rather than any of the subclasses.
The scenario can happen after register coalescing but that occurs after the peephole pass in which the PTEST
removal is performed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/9382 Here is the relevant piece of the build log for the reference
|
The general predicates of the PTEST and PTEST_like instructions may belong to different register classes. This can lead to the insertion of a COPY instruction, making them appear different. However, for the purpose of PTEST removal, such copies are irrelevant, and we can look through them to improve the likelihood of finding a match.