[RISCV][llvm] Support VFADD, VFSUB, VFMUL codegen for Zvfbfa #170612

4vtomat · 2025-12-04T06:35:43Z

Support both fixed-length vectors and scalable vectors.

Note: the VP versions of these trivial instructions are not going to be
supported, since they're going to be removed soon.


llvmbot · 2025-12-04T06:36:14Z

@llvm/pr-subscribers-backend-risc-v

Author: Brandon Wu (4vtomat)

Changes

Support both fixed-length vectors and scalable vectors.

Note: VP version is not gonna be supported for trivial instructions
since they're going to be removed soon.

Patch is 129.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/170612.diff

10 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+40-5)
(modified) llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td (+75)
(added) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfadd-sdnode.ll (+163)
(added) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfmul-sdnode.ll (+163)
(added) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfrsub-sdnode.ll (+76)
(added) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfsub-sdnode.ll (+163)
(modified) llvm/test/CodeGen/RISCV/rvv/vfadd-sdnode.ll (+18-124)
(modified) llvm/test/CodeGen/RISCV/rvv/vfmul-sdnode.ll (+653-177)
(added) llvm/test/CodeGen/RISCV/rvv/vfrsub-sdnode.ll (+75)
(modified) llvm/test/CodeGen/RISCV/rvv/vfsub-sdnode.ll (+653-177)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index ab2652eac3823..5942236a1ce8f 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -90,8 +90,9 @@ static cl::opt<bool>
 // TODO: Support more ops
 static const unsigned ZvfbfaVPOps[] = {
     ISD::VP_FNEG, ISD::VP_FABS, ISD::VP_FCOPYSIGN, ISD::EXPERIMENTAL_VP_SPLAT};
-static const unsigned ZvfbfaOps[] = {ISD::FNEG, ISD::FABS, ISD::FCOPYSIGN,
-                                     ISD::SPLAT_VECTOR};
+static const unsigned ZvfbfaOps[] = {
+    ISD::FNEG, ISD::FABS, ISD::FCOPYSIGN, ISD::SPLAT_VECTOR,
+    ISD::FADD, ISD::FSUB, ISD::FMUL};
 
 RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
                                          const RISCVSubtarget &STI)
@@ -1090,6 +1091,36 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
         ISD::VECREDUCE_FMINIMUM,
         ISD::VECREDUCE_FMAXIMUM};
 
+    // TODO: support more ops.
+    static const unsigned ZvfbfaPromoteOps[] = {ISD::FMINNUM,
+                                                ISD::FMAXNUM,
+                                                ISD::FMINIMUMNUM,
+                                                ISD::FMAXIMUMNUM,
+                                                ISD::FDIV,
+                                                ISD::FMA,
+                                                ISD::FSQRT,
+                                                ISD::FCEIL,
+                                                ISD::FTRUNC,
+                                                ISD::FFLOOR,
+                                                ISD::FROUND,
+                                                ISD::FROUNDEVEN,
+                                                ISD::FRINT,
+                                                ISD::FNEARBYINT,
+                                                ISD::IS_FPCLASS,
+                                                ISD::SETCC,
+                                                ISD::FMAXIMUM,
+                                                ISD::FMINIMUM,
+                                                ISD::STRICT_FADD,
+                                                ISD::STRICT_FSUB,
+                                                ISD::STRICT_FMUL,
+                                                ISD::STRICT_FDIV,
+                                                ISD::STRICT_FSQRT,
+                                                ISD::STRICT_FMA,
+                                                ISD::VECREDUCE_FMIN,
+                                                ISD::VECREDUCE_FMAX,
+                                                ISD::VECREDUCE_FMINIMUM,
+                                                ISD::VECREDUCE_FMAXIMUM};
+
     // TODO: support more vp ops.
     static const unsigned ZvfhminZvfbfminPromoteVPOps[] = {
         ISD::VP_FADD,
@@ -1294,11 +1325,11 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
 
       // Custom split nxv32[b]f16 since nxv32[b]f32 is not legal.
       if (getLMUL(VT) == RISCVVType::LMUL_8) {
-        setOperationAction(ZvfhminZvfbfminPromoteOps, VT, Custom);
+        setOperationAction(ZvfbfaPromoteOps, VT, Custom);
         setOperationAction(ZvfhminZvfbfminPromoteVPOps, VT, Custom);
       } else {
         MVT F32VecVT = MVT::getVectorVT(MVT::f32, VT.getVectorElementCount());
-        setOperationPromotedToType(ZvfhminZvfbfminPromoteOps, VT, F32VecVT);
+        setOperationPromotedToType(ZvfbfaPromoteOps, VT, F32VecVT);
         setOperationPromotedToType(ZvfhminZvfbfminPromoteVPOps, VT, F32VecVT);
       }
     };
@@ -1615,7 +1646,11 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
           // TODO: could split the f16 vector into two vectors and do promotion.
           if (!isTypeLegal(F32VecVT))
             continue;
-          setOperationPromotedToType(ZvfhminZvfbfminPromoteOps, VT, F32VecVT);
+
+          if (Subtarget.hasStdExtZvfbfa())
+            setOperationPromotedToType(ZvfbfaPromoteOps, VT, F32VecVT);
+          else
+            setOperationPromotedToType(ZvfhminZvfbfminPromoteOps, VT, F32VecVT);
           setOperationPromotedToType(ZvfhminZvfbfminPromoteVPOps, VT, F32VecVT);
           continue;
         }
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td
index ffb2ac0756da4..7faac137fd41d 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td
@@ -523,6 +523,71 @@ multiclass VPatConversionVF_WF_BF16<string intrinsic, string instruction,
   }
 }
 
+multiclass VPatBinaryFPSDNode_VV_VF_RM_BF16<SDPatternOperator vop,
+                                            string instruction_name> {
+  foreach vti = AllBF16Vectors in {
+    let Predicates = GetVTypePredicates<vti>.Predicates in {
+      def : VPatBinarySDNode_VV_RM<vop, instruction_name,
+                                   vti.Vector, vti.Vector, vti.Log2SEW,
+                                   vti.LMul, vti.AVL, vti.RegClass, isSEWAware=1>;
+      def : VPatBinarySDNode_VF_RM<vop, instruction_name#"_V"#vti.ScalarSuffix,
+                                   vti.Vector, vti.Vector, vti.Scalar,
+                                   vti.Log2SEW, vti.LMul, vti.AVL, vti.RegClass,
+                                   vti.ScalarRegClass, isSEWAware=1>;
+    }
+  }
+}
+
+multiclass VPatBinaryFPSDNode_R_VF_RM_BF16<SDPatternOperator vop, string instruction_name> {
+  foreach fvti = AllBF16Vectors in
+    let Predicates = GetVTypePredicates<fvti>.Predicates in
+    def : Pat<(fvti.Vector (vop (fvti.Vector (SplatFPOp fvti.Scalar:$rs2)),
+                                (fvti.Vector fvti.RegClass:$rs1))),
+              (!cast<Instruction>(
+                             instruction_name#"_V"#fvti.ScalarSuffix#"_"#fvti.LMul.MX#"_E"#fvti.SEW)
+                           (fvti.Vector (IMPLICIT_DEF)),
+                           fvti.RegClass:$rs1,
+                           (fvti.Scalar fvti.ScalarRegClass:$rs2),
+                           // Value to indicate no rounding mode change in
+                           // RISCVInsertReadWriteCSR
+                           FRM_DYN,
+                           fvti.AVL, fvti.Log2SEW, TA_MA)>;
+}
+
+multiclass VPatBinaryFPVL_VV_VF_RM_BF16<SDPatternOperator vop, string instruction_name> {
+  foreach vti = AllBF16Vectors in {
+    let Predicates = GetVTypePredicates<vti>.Predicates in {
+      def : VPatBinaryVL_V_RM<vop, instruction_name, "VV",
+                              vti.Vector, vti.Vector, vti.Vector, vti.Mask,
+                              vti.Log2SEW, vti.LMul, vti.RegClass, vti.RegClass,
+                              vti.RegClass, isSEWAware=1>;
+      def : VPatBinaryVL_VF_RM<vop, instruction_name#"_V"#vti.ScalarSuffix,
+                               vti.Vector, vti.Vector, vti.Vector, vti.Mask,
+                               vti.Log2SEW, vti.LMul, vti.RegClass, vti.RegClass,
+                               vti.ScalarRegClass, isSEWAware=1>;
+      }
+  }
+}
+
+multiclass VPatBinaryFPVL_R_VF_RM_BF16<SDPatternOperator vop, string instruction_name> {
+  foreach fvti = AllBF16Vectors in {
+    let Predicates = GetVTypePredicates<fvti>.Predicates in
+    def : Pat<(fvti.Vector (vop (SplatFPOp fvti.ScalarRegClass:$rs2),
+                                fvti.RegClass:$rs1,
+                                (fvti.Vector fvti.RegClass:$passthru),
+                                (fvti.Mask VMV0:$vm),
+                                VLOpFrag)),
+              (!cast<Instruction>(instruction_name#"_V"#fvti.ScalarSuffix#"_"#fvti.LMul.MX#"_E"#fvti.SEW#"_MASK")
+                   fvti.RegClass:$passthru,
+                   fvti.RegClass:$rs1, fvti.ScalarRegClass:$rs2,
+                   (fvti.Mask VMV0:$vm),
+                   // Value to indicate no rounding mode change in
+                   // RISCVInsertReadWriteCSR
+                   FRM_DYN,
+                   GPR:$vl, fvti.Log2SEW, TAIL_AGNOSTIC)>;
+  }
+}
+
 let Predicates = [HasStdExtZvfbfa] in {
 defm : VPatBinaryV_VV_VX_RM<"int_riscv_vfadd", "PseudoVFADD_ALT",
                             AllBF16Vectors, isSEWAware = 1>;
@@ -783,4 +848,14 @@ let Predicates = [HasStdExtZvfbfa] in {
                    TAIL_AGNOSTIC)>;
     }
   }
+
+  defm : VPatBinaryFPSDNode_VV_VF_RM_BF16<any_fadd, "PseudoVFADD_ALT">;
+  defm : VPatBinaryFPSDNode_VV_VF_RM_BF16<any_fsub, "PseudoVFSUB_ALT">;
+  defm : VPatBinaryFPSDNode_VV_VF_RM_BF16<any_fmul, "PseudoVFMUL_ALT">;
+  defm : VPatBinaryFPSDNode_R_VF_RM_BF16<any_fsub, "PseudoVFRSUB_ALT">;
+
+  defm : VPatBinaryFPVL_VV_VF_RM_BF16<any_riscv_fadd_vl, "PseudoVFADD_ALT">;
+  defm : VPatBinaryFPVL_VV_VF_RM_BF16<any_riscv_fsub_vl, "PseudoVFSUB_ALT">;
+  defm : VPatBinaryFPVL_VV_VF_RM_BF16<any_riscv_fmul_vl, "PseudoVFMUL_ALT">;
+  defm : VPatBinaryFPVL_R_VF_RM_BF16<any_riscv_fsub_vl, "PseudoVFRSUB_ALT">;
 } // Predicates = [HasStdExtZvfbfa]
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfadd-sdnode.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfadd-sdnode.ll
new file mode 100644
index 0000000000000..14432bd9b1a45
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfadd-sdnode.ll
@@ -0,0 +1,163 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+experimental-zvfbfa,+v \
+; RUN:     -target-abi=ilp32d -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv64 -mattr=+experimental-zvfbfa,+v \
+; RUN:     -target-abi=lp64d -verify-machineinstrs < %s | FileCheck %s
+
+define <1 x bfloat> @vfadd_vv_v1bf16(<1 x bfloat> %va, <1 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v1bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 1, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fadd <1 x bfloat> %va, %vb
+  ret <1 x bfloat> %vc
+}
+
+define <1 x bfloat> @vfadd_vf_v1bf16(<1 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v1bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 1, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <1 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <1 x bfloat> %head, <1 x bfloat> poison, <1 x i32> zeroinitializer
+  %vc = fadd <1 x bfloat> %va, %splat
+  ret <1 x bfloat> %vc
+}
+
+define <2 x bfloat> @vfadd_vv_v2bf16(<2 x bfloat> %va, <2 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v2bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 2, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fadd <2 x bfloat> %va, %vb
+  ret <2 x bfloat> %vc
+}
+
+define <2 x bfloat> @vfadd_vf_v2bf16(<2 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v2bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 2, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <2 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <2 x bfloat> %head, <2 x bfloat> poison, <2 x i32> zeroinitializer
+  %vc = fadd <2 x bfloat> %va, %splat
+  ret <2 x bfloat> %vc
+}
+
+define <4 x bfloat> @vfadd_vv_v4bf16(<4 x bfloat> %va, <4 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v4bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e16alt, mf2, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fadd <4 x bfloat> %va, %vb
+  ret <4 x bfloat> %vc
+}
+
+define <4 x bfloat> @vfadd_vf_v4bf16(<4 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v4bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e16alt, mf2, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <4 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <4 x bfloat> %head, <4 x bfloat> poison, <4 x i32> zeroinitializer
+  %vc = fadd <4 x bfloat> %va, %splat
+  ret <4 x bfloat> %vc
+}
+
+define <8 x bfloat> @vfadd_vv_v8bf16(<8 x bfloat> %va, <8 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v8bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e16alt, m1, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fadd <8 x bfloat> %va, %vb
+  ret <8 x bfloat> %vc
+}
+
+define <8 x bfloat> @vfadd_vf_v8bf16(<8 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v8bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e16alt, m1, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <8 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <8 x bfloat> %head, <8 x bfloat> poison, <8 x i32> zeroinitializer
+  %vc = fadd <8 x bfloat> %va, %splat
+  ret <8 x bfloat> %vc
+}
+
+define <16 x bfloat> @vfadd_vv_v16bf16(<16 x bfloat> %va, <16 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v16bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 16, e16alt, m2, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v10
+; CHECK-NEXT:    ret
+  %vc = fadd <16 x bfloat> %va, %vb
+  ret <16 x bfloat> %vc
+}
+
+define <16 x bfloat> @vfadd_vf_v16bf16(<16 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v16bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 16, e16alt, m2, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <16 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <16 x bfloat> %head, <16 x bfloat> poison, <16 x i32> zeroinitializer
+  %vc = fadd <16 x bfloat> %va, %splat
+  ret <16 x bfloat> %vc
+}
+
+define <32 x bfloat> @vfadd_vv_v32bf16(<32 x bfloat> %va, <32 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v32bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 32
+; CHECK-NEXT:    vsetvli zero, a0, e16alt, m4, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v12
+; CHECK-NEXT:    ret
+  %vc = fadd <32 x bfloat> %va, %vb
+  ret <32 x bfloat> %vc
+}
+
+define <32 x bfloat> @vfadd_vf_v32bf16(<32 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v32bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 32
+; CHECK-NEXT:    vsetvli zero, a0, e16alt, m4, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <32 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <32 x bfloat> %head, <32 x bfloat> poison, <32 x i32> zeroinitializer
+  %vc = fadd <32 x bfloat> %va, %splat
+  ret <32 x bfloat> %vc
+}
+
+define <64 x bfloat> @vfadd_vv_v64bf16(<64 x bfloat> %va, <64 x bfloat> %vb) {
+; CHECK-LABEL: vfadd_vv_v64bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 64
+; CHECK-NEXT:    vsetvli zero, a0, e16alt, m8, ta, ma
+; CHECK-NEXT:    vfadd.vv v8, v8, v16
+; CHECK-NEXT:    ret
+  %vc = fadd <64 x bfloat> %va, %vb
+  ret <64 x bfloat> %vc
+}
+
+define <64 x bfloat> @vfadd_vf_v64bf16(<64 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfadd_vf_v64bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 64
+; CHECK-NEXT:    vsetvli zero, a0, e16alt, m8, ta, ma
+; CHECK-NEXT:    vfadd.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <64 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <64 x bfloat> %head, <64 x bfloat> poison, <64 x i32> zeroinitializer
+  %vc = fadd <64 x bfloat> %va, %splat
+  ret <64 x bfloat> %vc
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfmul-sdnode.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfmul-sdnode.ll
new file mode 100644
index 0000000000000..8dca21f85c5f9
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfmul-sdnode.ll
@@ -0,0 +1,163 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv32 -mattr=+experimental-zvfbfa,+v \
+; RUN:     -target-abi=ilp32d -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv64 -mattr=+experimental-zvfbfa,+v \
+; RUN:     -target-abi=lp64d -verify-machineinstrs < %s | FileCheck %s
+
+define <1 x bfloat> @vfmul_vv_v1bf16(<1 x bfloat> %va, <1 x bfloat> %vb) {
+; CHECK-LABEL: vfmul_vv_v1bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 1, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfmul.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fmul <1 x bfloat> %va, %vb
+  ret <1 x bfloat> %vc
+}
+
+define <1 x bfloat> @vfmul_vf_v1bf16(<1 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfmul_vf_v1bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 1, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfmul.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <1 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <1 x bfloat> %head, <1 x bfloat> poison, <1 x i32> zeroinitializer
+  %vc = fmul <1 x bfloat> %va, %splat
+  ret <1 x bfloat> %vc
+}
+
+define <2 x bfloat> @vfmul_vv_v2bf16(<2 x bfloat> %va, <2 x bfloat> %vb) {
+; CHECK-LABEL: vfmul_vv_v2bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 2, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfmul.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fmul <2 x bfloat> %va, %vb
+  ret <2 x bfloat> %vc
+}
+
+define <2 x bfloat> @vfmul_vf_v2bf16(<2 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfmul_vf_v2bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 2, e16alt, mf4, ta, ma
+; CHECK-NEXT:    vfmul.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <2 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <2 x bfloat> %head, <2 x bfloat> poison, <2 x i32> zeroinitializer
+  %vc = fmul <2 x bfloat> %va, %splat
+  ret <2 x bfloat> %vc
+}
+
+define <4 x bfloat> @vfmul_vv_v4bf16(<4 x bfloat> %va, <4 x bfloat> %vb) {
+; CHECK-LABEL: vfmul_vv_v4bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e16alt, mf2, ta, ma
+; CHECK-NEXT:    vfmul.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fmul <4 x bfloat> %va, %vb
+  ret <4 x bfloat> %vc
+}
+
+define <4 x bfloat> @vfmul_vf_v4bf16(<4 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfmul_vf_v4bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 4, e16alt, mf2, ta, ma
+; CHECK-NEXT:    vfmul.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <4 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <4 x bfloat> %head, <4 x bfloat> poison, <4 x i32> zeroinitializer
+  %vc = fmul <4 x bfloat> %va, %splat
+  ret <4 x bfloat> %vc
+}
+
+define <8 x bfloat> @vfmul_vv_v8bf16(<8 x bfloat> %va, <8 x bfloat> %vb) {
+; CHECK-LABEL: vfmul_vv_v8bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e16alt, m1, ta, ma
+; CHECK-NEXT:    vfmul.vv v8, v8, v9
+; CHECK-NEXT:    ret
+  %vc = fmul <8 x bfloat> %va, %vb
+  ret <8 x bfloat> %vc
+}
+
+define <8 x bfloat> @vfmul_vf_v8bf16(<8 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfmul_vf_v8bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 8, e16alt, m1, ta, ma
+; CHECK-NEXT:    vfmul.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <8 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <8 x bfloat> %head, <8 x bfloat> poison, <8 x i32> zeroinitializer
+  %vc = fmul <8 x bfloat> %va, %splat
+  ret <8 x bfloat> %vc
+}
+
+define <16 x bfloat> @vfmul_vv_v16bf16(<16 x bfloat> %va, <16 x bfloat> %vb) {
+; CHECK-LABEL: vfmul_vv_v16bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 16, e16alt, m2, ta, ma
+; CHECK-NEXT:    vfmul.vv v8, v8, v10
+; CHECK-NEXT:    ret
+  %vc = fmul <16 x bfloat> %va, %vb
+  ret <16 x bfloat> %vc
+}
+
+define <16 x bfloat> @vfmul_vf_v16bf16(<16 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfmul_vf_v16bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 16, e16alt, m2, ta, ma
+; CHECK-NEXT:    vfmul.vf v8, v8, fa0
+; CHECK-NEXT:    ret
+  %head = insertelement <16 x bfloat> poison, bfloat %b, i32 0
+  %splat = shufflevector <16 x bfloat> %head, <16 x bfloat> poison, <16 x i32> zeroinitializer
+  %vc = fmul <16 x bfloat> %va, %splat
+  ret <16 x bfloat> %vc
+}
+
+define <32 x bfloat> @vfmul_vv_v32bf16(<32 x bfloat> %va, <32 x bfloat> %vb) {
+; CHECK-LABEL: vfmul_vv_v32bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 32
+; CHECK-NEXT:    vsetvli zero, a0, e16alt, m4, ta, ma
+; CHECK-NEXT:    vfmul.vv v8, v8, v12
+; CHECK-NEXT:    ret
+  %vc = fmul <32 x bfloat> %va, %vb
+  ret <32 x bfloat> %vc
+}
+
+define <32 x bfloat> @vfmul_vf_v32bf16(<32 x bfloat> %va, bfloat %b) {
+; CHECK-LABEL: vfmul_vf_v32bf16:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    li a0, 32
+; CHECK-NEXT:    vsetvli zero, a0, e16alt, m4, ta, ma
+; CHECK...
[truncated]

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfadd-sdnode.ll

lukel97 · 2025-12-04T06:59:12Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

        ISD::VECREDUCE_FMINIMUM,
        ISD::VECREDUCE_FMAXIMUM};

+    // TODO: support more ops.


Just to check, this list should become smaller over time, right? Should the comment say something like

Suggested change

-    // TODO: support more ops.
+    // TODO: Make more of these ops legal.
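
For readers unfamiliar with what "promote" means for the ops that stay in this list: the operands are extended to f32, the operation runs in f32, and the result is rounded back to bf16. The following is a minimal scalar sketch of that idea, not LLVM code. The helper names (`bf16ToFloat`, `floatToBF16`, `promotedFDiv`) are hypothetical, and it assumes round-to-nearest-even and a simple NaN-quieting rule.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not LLVM APIs. bf16 is modeled as the upper
// 16 bits of an IEEE-754 binary32 value.
static float bf16ToFloat(uint16_t bits) {
  uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &wide, sizeof f);
  return f;
}

static uint16_t floatToBF16(float f) {
  uint32_t wide;
  std::memcpy(&wide, &f, sizeof wide);
  // NaN: keep the top bits and set the quiet bit so that dropping the
  // low half cannot turn the NaN into an infinity.
  if ((wide & 0x7fffffffu) > 0x7f800000u)
    return static_cast<uint16_t>((wide >> 16) | 0x0040u);
  // Round to nearest, ties to even, then drop the low 16 bits.
  uint32_t roundingBias = 0x7fffu + ((wide >> 16) & 1u);
  return static_cast<uint16_t>((wide + roundingBias) >> 16);
}

// Model of a promoted op such as FDIV: extend, compute in f32, round back.
static uint16_t promotedFDiv(uint16_t a, uint16_t b) {
  return floatToBF16(bf16ToFloat(a) / bf16ToFloat(b));
}
```

With this model, dividing 1.0 (`0x3F80`) by 2.0 (`0x4000`) gives 0.5 (`0x3F00`). Making an op legal, as this patch does for FADD, FSUB and FMUL, removes the two conversions around it.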

lukel97 · 2025-12-04T07:04:24Z

llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td

  }
 }

+multiclass VPatBinaryFPSDNode_VV_VF_RM_BF16<SDPatternOperator vop,


Instead of creating new pattern multiclasses, can we move them into VPatBinaryFPSDNode_VV_VF_RM? That way we don't need to defm everything again

Yeah, I actually thought about it, but I had also defined new ones in previous patches, so I just went with it lol
But I think you're right: we should not define new ones, so that it's easier to maintain

lukel97

LGTM

lukel97 · 2025-12-04T11:43:52Z

llvm/lib/Target/RISCV/RISCVInstrInfoVSDPatterns.td

-                                       bit isSEWAware = 0> {
-  foreach vti = AllFloatVectors in {
+                                       bit isSEWAware = 0, bit isBF16 = 0> {
+  foreach vti = !if(isBF16, AllBF16Vectors, AllFloatVectors) in {


Oops, I meant as in VPatBinaryFPSDNode_VV_VF_RM would generate both patterns for AllFloatVectors and AllBF16Vectors at the same time. That way you wouldn't need to define the patterns in a separate file.

But I see that you've defined all the patterns in one place for zvfbfa anyway, so maybe we can revisit this later

Oh, I think it should be a separate pattern definition, since there’s an “_ALT” suffix in the pseudo instruction names for bf16 types

You can just append _ALT to instruction_name when isBF16 is true

Yep you are right!

If you used a single instantiation, the isBF16 argument wouldn't exist. So you'd have to get it from fvti.

We also need to fix the Predicates. Currently it's relying on the let Predicates in RISCVInstrInfoZvfbf.td overriding the Predicates that are defined inside the class.
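
The naming point in this thread can be sketched outside TableGen. The C++ below is an illustration only: it mirrors the concatenation `instruction_name#"_V"#fvti.ScalarSuffix#"_"#fvti.LMul.MX#"_E"#fvti.SEW` from the patch and adds the `_ALT` suffix when the type is bf16. The `VTypeInfo` struct and `pseudoVFName` function are hypothetical, and the field values in the usage note ("FPR16", "M1") are placeholders rather than values confirmed in this thread.

```cpp
#include <string>

// Hypothetical stand-in for the TableGen VTypeInfo record; only the
// fields that feed into the pseudo name are modeled.
struct VTypeInfo {
  bool isBF16;              // would be derived from the type, per the review
  std::string scalarSuffix; // placeholder value, e.g. "FPR16"
  std::string mx;           // LMUL suffix, e.g. "M1"
  unsigned sew;             // element width in bits
};

// Build the _VF pseudo name, appending "_ALT" to the base for bf16 types.
static std::string pseudoVFName(const std::string &base,
                                const VTypeInfo &vti) {
  std::string name = base;
  if (vti.isBF16)
    name += "_ALT";
  name += "_V" + vti.scalarSuffix + "_" + vti.mx + "_E" +
          std::to_string(vti.sew);
  return name;
}
```

For example, `pseudoVFName("PseudoVFADD", VTypeInfo{true, "FPR16", "M1", 16})` yields `PseudoVFADD_ALT_VFPR16_M1_E16` under these assumptions. Taking the flag from the type record rather than from a multiclass argument is what a single instantiation covering both float and bf16 vectors would need.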

wangpc-pp · 2025-12-04T12:53:08Z

llvm/lib/Target/RISCV/RISCVInstrInfoZvfbf.td

  }
+
+  defm : VPatBinaryFPSDNode_VV_VF_RM<any_fadd, "PseudoVFADD_ALT",
+                                     /*isSEWAware*/ 1, /*isBF16*/ 1>;


Suggested change

-                                     /*isSEWAware*/ 1, /*isBF16*/ 1>;
+                                     isSEWAware=1, isBF16=1>;

You don't need these comments, you can use named arguments. :-)

Oops, that's right

topperc

LGTM

tclin914 · 2025-12-05T01:20:01Z

llvm/test/CodeGen/RISCV/rvv/vfadd-sdnode.ll

 ; RUN:     --check-prefixes=CHECK,ZVFHMIN
-; RUN: llc -mtriple=riscv64 -mattr=+zvfh,+experimental-zvfbfa,+v \
-; RUN:     -target-abi=lp64d -verify-machineinstrs < %s | FileCheck %s \
+; RUN: llc -mtriple=riscv32 -mattr=+d,+zvfh,+experimental-zvfbfa,+v \


Why is it +zvfh here, but +zvfhmin in vfmul-sdnode.ll/vfsub-sdnode.ll?

I previously had a patch that supported fpround and fpextend, which only updated this sd-node test case. When doing this patch I tried not to change unrelated stuff, to avoid making the test case hard to read lol
It should be zvfhmin I guess, so that people won't get confused. That can be my follow-up NFC patch


[RISCV][llvm] Support VFADD, VFSUB, VFMUL codegen for Zvfbfa

cdb279a

Support both fixed-length vectors and scalable vectors. Note: VP version is not gonna be supported for trivial instructions since they're going to be removed soon.

llvmbot added the backend:RISC-V label Dec 4, 2025

4vtomat requested review from lukel97, tclin914, topperc and wangpc-pp December 4, 2025 06:36

lukel97 reviewed Dec 4, 2025


fixup! fix Luke's comments

8385e6c

lukel97 approved these changes Dec 4, 2025


wangpc-pp reviewed Dec 4, 2025


fixup! use named argument

d018acb

topperc approved these changes Dec 4, 2025


tclin914 reviewed Dec 5, 2025


4vtomat merged commit e48d49f into llvm:main Dec 5, 2025
10 checks passed

4vtomat deleted the improve_zvfbfa_codegen3 branch December 5, 2025 03:56

