Skip to content

Conversation

@lukel97
Copy link
Contributor

@lukel97 lukel97 commented Nov 21, 2025

The motivation for this is that it would be useful to express a vslideup/vslidedown in a target independent way e.g. from the loop vectorizer.

We can do this today with @llvm.vector.splice by setting one operand to poison:

  • A slide down can be achieved with @llvm.vector.splice(%x, poison, slideamt)
  • A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)

E.g.:

splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>

These splices get lowered to a vslideup + vslidedown pair with one of the vs2s being poison. We can optimize this away so that we are just left with a single slideup/slidedown.

The motivation for this is that it would be useful to express a vslideup/vslidedown in a target independent way e.g. from the loop vectorizer.

We can do this today with @llvm.vector.splice by setting one operand to poison:

- A slide down can be achieved with @llvm.vector.splice(%x, poison, slideamt)
- A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)

E.g.:

    splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
    splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>

These splices get lowered to a vslideup + vslidedown pair with one of the vs2s being poison. We can optimize this away so that we are just left with a single slideup/slidedown.
@llvmbot
Copy link
Member

llvmbot commented Nov 21, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

The motivation for this is that it would be useful to express a vslideup/vslidedown in a target independent way e.g. from the loop vectorizer.

We can do this today with @llvm.vector.splice by setting one operand to poison:

  • A slide down can be achieved with @llvm.vector.splice(%x, poison, slideamt)
  • A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)

E.g.:

splice(&lt;a,b,c,d&gt;, poison, 3) = &lt;d,poison,poison,poison&gt;
splice(poison, &lt;a,b,c,d&gt;, -3) = &lt;poison,poison,poison,a&gt;

These splices get lowered to a vslideup + vslidedown pair with one of the vs2s being poison. We can optimize this away so that we are just left with a single slideup/slidedown.


Full diff: https://github.com/llvm/llvm-project/pull/169013.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+5)
  • (added) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll (+49)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-splice.ll (+30)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 2d6bb06d689c3..209e2969046c9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -21834,6 +21834,11 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
       return N->getOperand(0);
     break;
   }
+  case RISCVISD::VSLIDEDOWN_VL:
+  case RISCVISD::VSLIDEUP_VL:
+    if (N->getOperand(1)->isUndef())
+      return N->getOperand(0);
+    break;
   case RISCVISD::VSLIDE1UP_VL:
   case RISCVISD::VFSLIDE1UP_VL: {
     using namespace SDPatternMatch;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
new file mode 100644
index 0000000000000..42d1c0321cfd1
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple riscv32 -mattr=+v < %s | FileCheck %s
+; RUN: llc -mtriple riscv64 -mattr=+v < %s | FileCheck %s
+; RUN: llc -mtriple riscv32 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
+; RUN: llc -mtriple riscv64 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
+
+define <4 x i32> @splice_v4i32_slidedown(<4 x i32> %a, <4 x i32> %b) #0 {
+; CHECK-LABEL: splice_v4i32_slidedown:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:    vrgather.vi v9, v8, 3
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %res = call <4 x i32> @llvm.vector.splice(<4 x i32> %a, <4 x i32> poison, i32 3)
+  ret <4 x i32> %res
+}
+
+define <4 x i32> @splice_4i32_slideup(<4 x i32> %a) #0 {
+; CHECK-LABEL: splice_4i32_slideup:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
+; CHECK-NEXT:    vrgather.vi v9, v8, 0
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %res = call <4 x i32> @llvm.vector.splice(<4 x i32> poison, <4 x i32> %a, i32 -3)
+  ret <4 x i32> %res
+}
+
+define <8 x i32> @splice_v8i32_slidedown(<8 x i32> %a, <8 x i32> %b) #0 {
+; CHECK-LABEL: splice_v8i32_slidedown:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT:    vslidedown.vi v8, v8, 3
+; CHECK-NEXT:    ret
+  %res = call <8 x i32> @llvm.vector.splice(<8 x i32> %a, <8 x i32> poison, i32 3)
+  ret <8 x i32> %res
+}
+
+define <8 x i32> @splice_v8i32_slideup(<8 x i32> %a) #0 {
+; CHECK-LABEL: splice_v8i32_slideup:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e32, m2, ta, ma
+; CHECK-NEXT:    vslideup.vi v10, v8, 3
+; CHECK-NEXT:    vmv.v.v v8, v10
+; CHECK-NEXT:    ret
+  %res = call <8 x i32> @llvm.vector.splice(<8 x i32> poison, <8 x i32> %a, i32 -3)
+  ret <8 x i32> %res
+}
+
diff --git a/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll b/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
index acc2a97dd5d1f..31936d3a084b2 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vector-splice.ll
@@ -4247,4 +4247,34 @@ define <vscale x 8 x double> @splice_nxv8f64_offset_max(<vscale x 8 x double> %a
   ret <vscale x 8 x double> %res
 }
 
+define <vscale x 2 x i32> @splice_nxv2i32_slidedown(<vscale x 2 x i32> %a) #0 {
+; NOVLDEP-LABEL: splice_nxv2i32_slidedown:
+; NOVLDEP:       # %bb.0:
+; NOVLDEP-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; NOVLDEP-NEXT:    vslidedown.vi v8, v8, 3
+; NOVLDEP-NEXT:    ret
+;
+; VLDEP-LABEL: splice_nxv2i32_slidedown:
+; VLDEP:       # %bb.0:
+; VLDEP-NEXT:    csrr a0, vlenb
+; VLDEP-NEXT:    srli a0, a0, 2
+; VLDEP-NEXT:    addi a0, a0, -3
+; VLDEP-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; VLDEP-NEXT:    vslidedown.vi v8, v8, 3
+; VLDEP-NEXT:    ret
+  %res = call <vscale x 2 x i32> @llvm.vector.splice(<vscale x 2 x i32> %a, <vscale x 2 x i32> poison, i32 3)
+  ret <vscale x 2 x i32> %res
+}
+
+define <vscale x 2 x i32> @splice_nxv2i32_slideup(<vscale x 2 x i32> %a) #0 {
+; CHECK-LABEL: splice_nxv2i32_slideup:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT:    vslideup.vi v9, v8, 3
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %res = call <vscale x 2 x i32> @llvm.vector.splice(<vscale x 2 x i32> poison, <vscale x 2 x i32> %a, i32 -3)
+  ret <vscale x 2 x i32> %res
+}
+
 attributes #0 = { vscale_range(2,0) }

@github-actions
Copy link

🐧 Linux x64 Test Results

  • 186433 tests passed
  • 4868 tests skipped

Copy link
Member

@4vtomat 4vtomat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM~

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

; CHECK-LABEL: splice_v4i32_slidedown:
; CHECK: # %bb.0:
; CHECK-NEXT: vsetivli zero, 4, e32, m1, ta, ma
; CHECK-NEXT: vrgather.vi v9, v8, 3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OT: We should probably be preferring vslidedown.vi for case where we have a single element in the resulting shuffle mask. vrgather.vi forces the use of an extra register where vslidedown.vi doesn't. (This only works for vslidedown, vslideup does have the register constraint.)

}
case RISCVISD::VSLIDEDOWN_VL:
case RISCVISD::VSLIDEUP_VL:
if (N->getOperand(1)->isUndef())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work with isPoison? isUndef returns true for poison and for undef, but we only tested poison.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

independent from what we test: I think this transformation still holds for undef, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's valid for undef.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add an undef test then the undef detector linter will complain.

For the use case in the loop vectorizer just handling poison is fine if we want to avoid isUndef. Are we moving away from it in SelectionDAG?

Copy link
Collaborator

@topperc topperc Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectionDAG support for poison is still in an early stage I think. It probably converts ISD::POISON to ISD::UNDEF in many cases still. I wouldn't be surprised if the vector was ISD::UNDEF by the time it gets here. Especially for the fixed vector case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just want to check then should we add a test case with undef anyway?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please ignore the linter.

; RUN: llc -mtriple riscv32 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s
; RUN: llc -mtriple riscv64 -mattr=+v,+vl-dependent-latency < %s | FileCheck %s

define <4 x i32> @splice_v4i32_slidedown(<4 x i32> %a, <4 x i32> %b) #0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#0

nit: dangling attribute, you probably can remove it.

@lukel97 lukel97 enabled auto-merge (squash) November 24, 2025 03:28
@github-actions
Copy link

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef([^a-zA-Z0-9_-]|$)|UndefValue::get)' 'HEAD~1' HEAD llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/test/CodeGen/RISCV/rvv/vector-splice.ll

The following files introduce new uses of undef:

  • llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vector-splice.ll
  • llvm/test/CodeGen/RISCV/rvv/vector-splice.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@lukel97 lukel97 merged commit 7851b8a into llvm:main Nov 24, 2025
8 of 10 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 24, 2025

LLVM Buildbot has detected a new failure on builder publish-sphinx-docs running on as-worker-4 while building llvm at step 6 "Publish docs-llvm-html".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/45/builds/18477

Here is the relevant piece of the build log for the reference
Step 6 (Publish docs-llvm-html) failure: 'rsync -vrl ...' (failure)
Connection closed by 54.67.122.174 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(232) [sender=3.2.7]
Step 11 (Publish docs-flang-html) failure: 'rsync -vrl ...' (failure)
Connection closed by 54.67.122.174 port 22
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(232) [sender=3.2.7]

@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 24, 2025

LLVM Buildbot has detected a new failure on builder openmp-s390x-linux running on systemz-1 while building llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/88/builds/18454

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libomp :: tasking/issue-94260-2.c' FAILED ********************
Exit Code: -11

Command Output (stdout):
--
# RUN: at line 1
/home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp   -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic && /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/./bin/clang -fopenmp -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test -L /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -fno-omit-frame-pointer -mbackchain -I /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/ompt /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.src/openmp/runtime/test/tasking/issue-94260-2.c -o /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp -lm -latomic
# executed command: /home/uweigand/sandbox/buildbot/openmp-s390x-linux/llvm.build/runtimes/runtimes-bins/openmp/runtime/test/tasking/Output/issue-94260-2.c.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -11

--

********************


@llvm-ci
Copy link
Collaborator

llvm-ci commented Nov 24, 2025

LLVM Buildbot has detected a new failure on builder ppc64le-flang-rhel-clang running on ppc64le-flang-rhel-test while building llvm at step 7 "test-build-unified-tree-check-flang-rt".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/157/builds/42472

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-flang-rt) failure: 1200 seconds without output running [b'ninja', b'check-flang-rt'], attempting to kill

aadeshps-mcw pushed a commit to aadeshps-mcw/llvm-project that referenced this pull request Nov 26, 2025
The motivation for this is that it would be useful to express a
vslideup/vslidedown in a target independent way e.g. from the loop
vectorizer.

We can do this today with @llvm.vector.splice by setting one operand to
poison:

- A slide down can be achieved with @llvm.vector.splice(%x, poison,
slideamt)
- A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)

E.g.:

    splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
    splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>

These splices get lowered to a vslideup + vslidedown pair with one of
the vs2s being poison. We can optimize this away so that we are just
left with a single slideup/slidedown.
Priyanshu3820 pushed a commit to Priyanshu3820/llvm-project that referenced this pull request Nov 26, 2025
The motivation for this is that it would be useful to express a
vslideup/vslidedown in a target independent way e.g. from the loop
vectorizer.

We can do this today with @llvm.vector.splice by setting one operand to
poison:

- A slide down can be achieved with @llvm.vector.splice(%x, poison,
slideamt)
- A slide up can be done by @llvm.vector.splice(poison, %x, -slideamt)

E.g.:

    splice(<a,b,c,d>, poison, 3) = <d,poison,poison,poison>
    splice(poison, <a,b,c,d>, -3) = <poison,poison,poison,a>

These splices get lowered to a vslideup + vslidedown pair with one of
the vs2s being poison. We can optimize this away so that we are just
left with a single slideup/slidedown.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants