
[RISCV] Add isCommutable for VDOTA4 and VDOTA4U #190090

Merged
tclin914 merged 2 commits into llvm:main from
tclin914:add-iscommutable-vdota4
Apr 8, 2026

@tclin914 tclin914 commented Apr 2, 2026

Mark PseudoVDOTA4_VV and PseudoVDOTA4U_VV as commutable since both
source operands have the same signedness. VDOTA4SU is left
non-commutable because its operands differ in signedness (signed x
unsigned).

Add findCommutedOpIndices cases for the new commutable pseudos and
a test covering commutable and non-commutable dot product variants.
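
To make the signedness argument concrete, here is a minimal scalar sketch of a single 32-bit lane. It assumes each dot-product variant splits its two source words into four 8-bit fields, extends each field according to the variant's signedness, and adds the four products to the accumulator. The helper `dot4` and its `signedA`/`signedB` parameters are hypothetical names chosen for illustration; they are not part of LLVM or of the extension's specification.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 32-bit lane (an assumption for
// illustration, not the ISA text): split a and b into four 8-bit fields,
// extend each field as signed or unsigned, multiply pairwise, and add the
// four products to acc with 32-bit wraparound.
constexpr uint32_t dot4(uint32_t a, uint32_t b, uint32_t acc,
                        bool signedA, bool signedB) {
  for (int i = 0; i < 4; ++i) {
    const uint8_t fa = static_cast<uint8_t>(a >> (8 * i));
    const uint8_t fb = static_cast<uint8_t>(b >> (8 * i));
    const int32_t x = signedA ? static_cast<int8_t>(fa) : fa;
    const int32_t y = signedB ? static_cast<int8_t>(fb) : fb;
    acc += static_cast<uint32_t>(x * y);
  }
  return acc;
}

// Same signedness on both sources: swapping them leaves the result unchanged.
static_assert(dot4(0xFF, 0x02, 0, true, true) ==
              dot4(0x02, 0xFF, 0, true, true), "signed x signed");
static_assert(dot4(0xFF, 0x02, 0, false, false) ==
              dot4(0x02, 0xFF, 0, false, false), "unsigned x unsigned");

// Mixed signedness: the byte 0xFF is -1 on the signed side but 255 on the
// unsigned side, so swapping the sources changes the result.
static_assert(dot4(0xFF, 0x02, 0, true, false) !=
              dot4(0x02, 0xFF, 0, true, false), "signed x unsigned");
```

Under this model the same-signedness forms are symmetric in their two sources while the mixed form is not, which is consistent with leaving VDOTA4SU non-commutable.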

tclin914 and others added 2 commits April 1, 2026 15:24
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
llvmbot commented Apr 2, 2026

@llvm/pr-subscribers-backend-risc-v

Author: Jim Lin (tclin914)

Full diff: https://github.com/llvm/llvm-project/pull/190090.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+10)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoZvdot4a8i.td (+4-5)
  • (added) llvm/test/CodeGen/RISCV/rvv/commutable-zvdot4a8i.ll (+126)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 089683a43f800..fd13c0e3607b3 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -4116,6 +4116,16 @@ bool RISCVInstrInfo::findCommutedOpIndices(const MachineInstr &MI,
   case CASE_RVV_OPCODE(VAADD_VV):
   case CASE_RVV_OPCODE(VAADDU_VV):
   case CASE_RVV_OPCODE(VSMUL_VV):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4_VV, MF2):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4_VV, M1):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4_VV, M2):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4_VV, M4):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4_VV, M8):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4U_VV, MF2):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4U_VV, M1):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4U_VV, M2):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4U_VV, M4):
+  case CASE_RVV_OPCODE_LMUL(VDOTA4U_VV, M8):
     // Operands 2 and 3 are commutable.
     return fixCommutedOpIndices(SrcOpIdx1, SrcOpIdx2, 2, 3);
   case CASE_VFMA_SPLATS(FMADD):
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZvdot4a8i.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZvdot4a8i.td
index b9fbe3e5286ee..d8c60dc9a584c 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZvdot4a8i.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZvdot4a8i.td
@@ -61,9 +61,9 @@ let HasPassthruOp = true, HasMaskOp = true in {
 // Pseudo Instructions for CodeGen
 //===----------------------------------------------------------------------===//
 
-multiclass VPseudoVDOTA4_VV_VX {
+multiclass VPseudoVDOTA4_VV_VX<bit Commutable = 0> {
   foreach m = MxSet<32>.m in {
-    defm "" : VPseudoBinaryV_VV<m>,
+    defm "" : VPseudoBinaryV_VV<m, Commutable=Commutable>,
               SchedBinary<"WriteVIMulAddV", "ReadVIMulAddV", "ReadVIMulAddV", m.MX,
                           forcePassthruRead=true>;
     defm "" : VPseudoBinaryV_VX<m>,
@@ -72,11 +72,10 @@ multiclass VPseudoVDOTA4_VV_VX {
   }
 }
 
-// TODO: Add isCommutable for VDOTA4 and VDOTA4U
 let Predicates = [HasStdExtZvdot4a8i], mayLoad = 0, mayStore = 0,
     hasSideEffects = 0 in {
-  defm PseudoVDOTA4 : VPseudoVDOTA4_VV_VX;
-  defm PseudoVDOTA4U : VPseudoVDOTA4_VV_VX;
+  defm PseudoVDOTA4 : VPseudoVDOTA4_VV_VX<Commutable=1>;
+  defm PseudoVDOTA4U : VPseudoVDOTA4_VV_VX<Commutable=1>;
   defm PseudoVDOTA4SU : VPseudoVDOTA4_VV_VX;
   // VDOTA4US does not have a VV variant
   foreach m = MxListVF4 in {
diff --git a/llvm/test/CodeGen/RISCV/rvv/commutable-zvdot4a8i.ll b/llvm/test/CodeGen/RISCV/rvv/commutable-zvdot4a8i.ll
new file mode 100644
index 0000000000000..e5b3324651fce
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/commutable-zvdot4a8i.ll
@@ -0,0 +1,126 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -mattr=+zve64x,+experimental-zvdot4a8i \
+; RUN:   -verify-machineinstrs | FileCheck %s
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -mattr=+zve64x,+experimental-zvdot4a8i \
+; RUN:   -verify-machineinstrs | FileCheck %s
+
+; vdota4.vv - commutable
+define <vscale x 2 x i32> @commutable_vdota4_vv(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1, iXLen %2) nounwind {
+; CHECK-LABEL: commutable_vdota4_vv:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    vdota4.vv v8, v8, v9
+; CHECK-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v8, v8, v8
+; CHECK-NEXT:    ret
+entry:
+  %a = call <vscale x 2 x i32> @llvm.riscv.vdota4.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i32> %1,
+    iXLen %2, iXLen 1)
+  %b = call <vscale x 2 x i32> @llvm.riscv.vdota4.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i32> %0,
+    iXLen %2, iXLen 1)
+  %ret = add <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %ret
+}
+
+define <vscale x 2 x i32> @commutable_vdota4_vv_masked(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1, <vscale x 2 x i1> %mask, iXLen %2) {
+; CHECK-LABEL: commutable_vdota4_vv_masked:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    vdota4.vv v8, v8, v9, v0.t
+; CHECK-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v8, v8, v8
+; CHECK-NEXT:    ret
+  %a = call <vscale x 2 x i32> @llvm.riscv.vdota4.mask.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i1> %mask,
+    iXLen %2, iXLen 1)
+  %b = call <vscale x 2 x i32> @llvm.riscv.vdota4.mask.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i1> %mask,
+    iXLen %2, iXLen 1)
+  %ret = add <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %ret
+}
+
+; vdota4u.vv - commutable
+define <vscale x 2 x i32> @commutable_vdota4u_vv(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1, iXLen %2) nounwind {
+; CHECK-LABEL: commutable_vdota4u_vv:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    vdota4u.vv v8, v8, v9
+; CHECK-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v8, v8, v8
+; CHECK-NEXT:    ret
+entry:
+  %a = call <vscale x 2 x i32> @llvm.riscv.vdota4u.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i32> %1,
+    iXLen %2, iXLen 1)
+  %b = call <vscale x 2 x i32> @llvm.riscv.vdota4u.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i32> %0,
+    iXLen %2, iXLen 1)
+  %ret = add <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %ret
+}
+
+define <vscale x 2 x i32> @commutable_vdota4u_vv_masked(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1, <vscale x 2 x i1> %mask, iXLen %2) {
+; CHECK-LABEL: commutable_vdota4u_vv_masked:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    vdota4u.vv v8, v8, v9, v0.t
+; CHECK-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v8, v8, v8
+; CHECK-NEXT:    ret
+  %a = call <vscale x 2 x i32> @llvm.riscv.vdota4u.mask.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i1> %mask,
+    iXLen %2, iXLen 1)
+  %b = call <vscale x 2 x i32> @llvm.riscv.vdota4u.mask.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i1> %mask,
+    iXLen %2, iXLen 1)
+  %ret = add <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %ret
+}
+
+; vdota4su.vv - NOT commutable (signed x unsigned, operand order matters)
+define <vscale x 2 x i32> @commutable_vdota4su_vv(<vscale x 2 x i32> %0, <vscale x 2 x i32> %1, iXLen %2) nounwind {
+; CHECK-LABEL: commutable_vdota4su_vv:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
+; CHECK-NEXT:    vdota4su.vv v10, v8, v9
+; CHECK-NEXT:    vdota4su.vv v8, v9, v8
+; CHECK-NEXT:    vsetvli a0, zero, e32, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v8, v10, v8
+; CHECK-NEXT:    ret
+entry:
+  %a = call <vscale x 2 x i32> @llvm.riscv.vdota4su.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %0,
+    <vscale x 2 x i32> %1,
+    iXLen %2, iXLen 1)
+  %b = call <vscale x 2 x i32> @llvm.riscv.vdota4su.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> poison,
+    <vscale x 2 x i32> %1,
+    <vscale x 2 x i32> %0,
+    iXLen %2, iXLen 1)
+  %ret = add <vscale x 2 x i32> %a, %b
+  ret <vscale x 2 x i32> %ret
+}

@topperc topperc left a comment

LGTM

@sweiglbosker sweiglbosker left a comment

LGTM

@tclin914 tclin914 merged commit 0f16b90 into llvm:main Apr 8, 2026
12 checks passed
@tclin914 tclin914 deleted the add-iscommutable-vdota4 branch April 8, 2026 02:32