[RISCV] Combine vslide{up,down} x, poison -> x #169013
Conversation
The motivation for this is that it would be useful to express a vslideup/vslidedown in a target-independent way, e.g. from the loop vectorizer.
We can do this today with @llvm.vector.splice by setting one operand to poison:
- A slide down can be achieved with @llvm.vector.splice(%x, poison, slideamt)
- A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)
E.g.:
splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>
These splices get lowered to a vslideup + vslidedown pair in which one of the vs2 operands is poison. The poison-operand slide can be folded away, leaving a single vslideup or vslidedown.
@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Full diff: https://github.com/llvm/llvm-project/pull/169013.diff
3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 2d6bb06d689c3..209e2969046c9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -21834,6 +21834,11 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
return N->getOperand(0);
break;
}
+ case RISCVISD::VSLIDEDOWN_VL:
+ case RISCVISD::VSLIDEUP_VL:
+ if (N->getOperand(1)->isUndef())
+ return N->getOperand(0);
+ break;
case RISCVISD::VSLIDE1UP_VL:
case RISCVISD::VFSLIDE1UP_VL: {
using namespace SDPatternMatch;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
new file mode 100644
index 0000000000000..42d1c0321cfd1
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple riscv32 -mattr=+v < %s | FileCheck %s
+; RUN: llc -mtriple riscv64 -mattr=+v < %s | FileCheck %s
+; RUN: llc -mtriple riscv32 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
+; RUN: llc -mtriple riscv64 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
+
+define <4 x i32> @splice_v4i32_slidedown(<4 x i32> %a, <4 x i32> %b) #0 {
+; CHECK-LABEL: splice_v4i32_slidedown:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT: vrgather.vi v9, v8, 3
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <4 x i32> @llvm.vector.splice(<4 x i32> %a, <4 x i32> poison, i32 3)
+ ret <4 x i32> %res
+}
+
+define <4 x i32> @splice_4i32_slideup(<4 x i32> %a) #0 {
+; CHECK-LABEL: splice_4i32_slideup:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT: vrgather.vi v9, v8, 0
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <4 x i32> @llvm.vector.splice(<4 x i32> poison, <4 x i32> %a, i32 -3)
+ ret <4 x i32> %res
+}
+
+define <8 x i32> @splice_v8i32_slidedown(<8 x i32> %a, <8 x i32> %b) #0 {
+; CHECK-LABEL: splice_v8i32_slidedown:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT: vslidedown.vi v8, v8, 3
+; CHECK-NEXT: ret
+ %res = call <8 x i32> @llvm.vector.splice(<8 x i32> %a, <8 x i32> poison, i32 3)
+ ret <8 x i32> %res
+}
+
+define <8 x i32> @splice_v8i32_slideup(<8 x i32> %a) #0 {
+; CHECK-LABEL: splice_v8i32_slideup:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT: vslideup.vi v10, v8, 3
+; CHECK-NEXT: vmv.v.v v8, v10
+; CHECK-NEXT: ret
+ %res = call <8 x i32> @llvm.vector.splice(<8 x i32> poison, <8 x i32> %a, i32 -3)
+ ret <8 x i32> %res
+}
+
diff --git a/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll b/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
index acc2a97dd5d1f..31936d3a084b2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
@@ -4247,4 +4247,34 @@ define <vscale x 8 x double> @splice_nxv8f64_offset_max(<vscale x 8 x double> %a
ret <vscale x 8 x double> %res
}
+define <vscale x 2 x i32> @splice_nxv2i32_slidedown(<vscale x 2 x i32> %a) #0 {
+; NOVLDEP-LABEL: splice_nxv2i32_slidedown:
+; NOVLDEP: # %bb.0:
+; NOVLDEP-NEXT: vsetvli a0, zero, e32, m1, ta, ma
+; NOVLDEP-NEXT: vslidedown.vi v8, v8, 3
+; NOVLDEP-NEXT: ret
+;
+; VLDEP-LABEL: splice_nxv2i32_slidedown:
+; VLDEP: # %bb.0:
+; VLDEP-NEXT: csrr a0, vlenb
+; VLDEP-NEXT: srli a0, a0, 2
+; VLDEP-NEXT: addi a0, a0, -3
+; VLDEP-NEXT: vsetvli zero, a0, e32, m1, ta, ma
+; VLDEP-NEXT: vslidedown.vi v8, v8, 3
+; VLDEP-NEXT: ret
+ %res = call <vscale x 2 x i32> @llvm.vector.splice(<vscale x 2 x i32> %a, <vscale x 2 x i32> poison, i32 3)
+ ret <vscale x 2 x i32> %res
+}
+
+define <vscale x 2 x i32> @splice_nxv2i32_slideup(<vscale x 2 x i32> %a) #0 {
+; CHECK-LABEL: splice_nxv2i32_slideup:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT: vslideup.vi v9, v8, 3
+; CHECK-NEXT: vmv.v.v v8, v9
+; CHECK-NEXT: ret
+ %res = call <vscale x 2 x i32> @llvm.vector.splice(<vscale x 2 x i32> poison, <vscale x 2 x i32> %a, i32 -3)
+ ret <vscale x 2 x i32> %res
+}
+
attributes #0 = { vscale_range(2,0) }
4vtomat left a comment:
LGTM~

wangpc-pp left a comment:
LGTM.

preames left a comment:
LGTM
; CHECK-LABEL: splice_v4i32_slidedown:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
; CHECK-NEXT: vrgather.vi v9, v8, 3

OT: We should probably prefer vslidedown.vi for the case where we have a single element in the resulting shuffle mask. vrgather.vi forces the use of an extra register where vslidedown.vi doesn't. (This only works for vslidedown; vslideup does have the register constraint.)
}
case RISCVISD::VSLIDEDOWN_VL:
case RISCVISD::VSLIDEUP_VL:
  if (N->getOperand(1)->isUndef())

Does this work with isPoison? isUndef returns true for both poison and undef, but we only tested poison.
Independent of what we test: I think this transformation still holds for undef, right?

I think it's valid for undef.

If we add an undef test then the undef detector linter will complain. For the use case in the loop vectorizer, just handling poison is fine if we want to avoid isUndef. Are we moving away from it in SelectionDAG?

SelectionDAG support for poison is still at an early stage, I think. It probably still converts ISD::POISON to ISD::UNDEF in many cases. I wouldn't be surprised if the vector was ISD::UNDEF by the time it gets here, especially for the fixed vector case.

Ok, just to check then: should we add a test case with undef anyway?

Yes, please ignore the linter.
; RUN: llc -mtriple riscv32 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
; RUN: llc -mtriple riscv64 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s

define <4 x i32> @splice_v4i32_slidedown(<4 x i32> %a, <4 x i32> %b) #0 {

nit: #0 is a dangling attribute; you can probably remove it.
You can test this locally with the following command:

git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/test/CodeGen/RISCV/rvv/vector-splice.ll

The following files introduce new uses of undef:

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. In tests, avoid using undef whenever possible.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/45/builds/18477

LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/18454

LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/42472