Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV][ISel] Use vaaddu with rounding mode rdn for ISD::AVGFLOORU. #76550

Merged
merged 1 commit into from
Jan 9, 2024

Conversation

sun-jacobi
Copy link
Member

@sun-jacobi sun-jacobi commented Dec 29, 2023

This patch aims to use vaaddu with rounding mode rdn (i.e vxrm[1:0] = 0b10) for ISD::AVGFLOORU.

Source code

define <8 x i8> @vaaddu_auto(ptr %x, ptr %y, ptr %z) {
  %xv = load <8 x i8>, ptr %x, align 2
  %yv = load <8 x i8>, ptr %y, align 2
  %xzv = zext <8 x i8> %xv to <8 x i16>
  %yzv = zext <8 x i8> %yv to <8 x i16>
  %add = add nuw nsw <8 x i16> %xzv, %yzv
  %div = lshr <8 x i16> %add, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
  %ret = trunc <8 x i16> %div to <8 x i8>
  ret <8 x i8> %ret 
}

Before this patch

vaaddu_auto: 
        vsetivli        zero, 8, e8, mf2, ta, ma
        vle8.v  v8, (a0)
        vle8.v  v9, (a1)
        vwaddu.vv       v10, v8, v9
        vnsrl.wi        v8, v10, 1
        ret

After this patch

vaaddu_auto: 
	vsetivli	zero, 8, e8, mf2, ta, ma
	vle8.v	v8, (a0)
	vle8.v	v9, (a1)
	csrwi	vxrm, 2
	vaaddu.vv	v8, v8, v9
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 29, 2023

@llvm/pr-subscribers-backend-risc-v

Author: Chia (sun-jacobi)

Changes

This patch aims to use vaaddu for averaging unsigned addition.

Source code

define &lt;8 x i8&gt; @<!-- -->vaaddu_auto(ptr %x, ptr %y, ptr %z) {
  %xv = load &lt;8 x i8&gt;, ptr %x, align 2
  %yv = load &lt;8 x i8&gt;, ptr %y, align 2
  %xzv = zext &lt;8 x i8&gt; %xv to &lt;8 x i16&gt;
  %yzv = zext &lt;8 x i8&gt; %yv to &lt;8 x i16&gt;
  %add = add nuw nsw &lt;8 x i16&gt; %xzv, %yzv
  %div = lshr &lt;8 x i16&gt; %add, &lt;i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1&gt;
  %ret = trunc &lt;8 x i16&gt; %div to &lt;8 x i8&gt;
  ret &lt;8 x i8&gt; %ret 
}

Before this patch

vaaddu_auto: 
        vsetivli        zero, 8, e8, mf2, ta, ma
        vle8.v  v8, (a0)
        vle8.v  v9, (a1)
        vwaddu.vv       v10, v8, v9
        vnsrl.wi        v8, v10, 1
        ret

After this patch

vaaddu_auto: 
	vsetivli	zero, 8, e8, mf2, ta, ma
	vle8.v	v8, (a0)
	vle8.v	v9, (a1)
	csrwi	vxrm, 2
	vaaddu.vv	v8, v8, v9
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu


Full diff: https://github.com/llvm/llvm-project/pull/76550.diff

1 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td (+29)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
index 33bdc3366aa3e3..67fbd6141967a6 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td
@@ -2338,6 +2338,35 @@ defm : VPatBinaryVL_VV_VX_VI<riscv_uaddsat_vl, "PseudoVSADDU">;
 defm : VPatBinaryVL_VV_VX<riscv_ssubsat_vl, "PseudoVSSUB">;
 defm : VPatBinaryVL_VV_VX<riscv_usubsat_vl, "PseudoVSSUBU">;
 
+// 12.2 Vector Single-Width Averaging Add and Subtract
+
+// trunc_vl (srl_vl (vwaddu X, Y), splat 1) -> vaaddu X, Y
+class VPatAverageAddTruncShift<string inst_name,
+                               SDNode vop,
+                               VTypeInfo vti,
+                               VTypeInfo wti>
+  : Pat<
+    (vti.Vector (riscv_trunc_vector_vl
+      (wti.Vector (riscv_srl_vl
+        (wti.Vector (vop
+          (vti.Vector vti.RegClass:$rs1),
+          (vti.Vector vti.RegClass:$rs2),
+          (wti.Vector undef), (vti.Mask V0), VLOpFrag)),
+        (wti.Vector (riscv_vmv_v_x_vl (wti.Vector undef), 1, VLOpFrag)),
+        (wti.Vector undef), (wti.Mask V0), VLOpFrag)),
+      (wti.Mask V0), VLOpFrag)),
+    (!cast<Instruction>(inst_name#"_VV_"#vti.LMul.MX#"_MASK")
+      (vti.Vector (IMPLICIT_DEF)), vti.RegClass:$rs1, vti.RegClass:$rs2,
+      (vti.Mask V0), 0b10, GPR:$vl, vti.Log2SEW, TA_MA)>;
+
+foreach vtiToWti = AllWidenableIntVectors in {
+  defvar vti = vtiToWti.Vti;
+  defvar wti = vtiToWti.Wti;
+  let Predicates = !listconcat(GetVTypePredicates<vti>.Predicates,
+                               GetVTypePredicates<wti>.Predicates) in
+    def : VPatAverageAddTruncShift<"PseudoVAADDU", riscv_vwaddu_vl, vti, wti>;
+}
+
 // 13. Vector Floating-Point Instructions
 
 // 13.2. Vector Single-Width Floating-Point Add/Subtract Instructions

@topperc
Copy link
Collaborator

topperc commented Dec 29, 2023

I don't think this is the right way to do this. We should make ISD::AVGFLOORU Custom for RISC-V and let DAGCombiner create an ISD::AVGFLOORU node that we can custom lower to a new RISCVISD::VAADDU_VL opcode.

@sun-jacobi
Copy link
Member Author

I don't think this is the right way to do this. We should make ISD::AVGFLOORU Custom for RISC-V and let DAGCombiner create an ISD::AVGFLOORU node that we can custom lower to a new RISCVISD::VAADDU_VL opcode.

Thank you for the advice, I will try this way.

@sun-jacobi sun-jacobi changed the title [RISCV] fold trunc_vl (srl_vl (vwaddu X, Y), splat 1) -> vaaddu X, Y [RISCV][ISel] Fold trunc (lshr (add (zext X), (zext Y)), 1) -> vaaddu X, Y Jan 4, 2024
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVISelLowering.cpp Outdated Show resolved Hide resolved
llvm/test/CodeGen/RISCV/rvv/vaaddu-autogen.ll Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td Outdated Show resolved Hide resolved
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vaaddu.ll Outdated Show resolved Hide resolved
llvm/test/CodeGen/RISCV/rvv/vaaddu-sdnode.ll Outdated Show resolved Hide resolved
llvm/lib/Target/RISCV/RISCVInstrInfoVVLPatterns.td Outdated Show resolved Hide resolved
Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of nits but otherwise LGTM, thanks! Should probably wait for a second approval too in case anyone has other opinions

@@ -111,6 +111,7 @@ def riscv_ctlz_vl : SDNode<"RISCVISD::CTLZ_VL", SDT_RISCVIntUnOp_VL>
def riscv_cttz_vl : SDNode<"RISCVISD::CTTZ_VL", SDT_RISCVIntUnOp_VL>;
def riscv_ctpop_vl : SDNode<"RISCVISD::CTPOP_VL", SDT_RISCVIntUnOp_VL>;

def riscv_avgflooru_vl : SDNode<"RISCVISD::AVGFLOORU_VL", SDT_RISCVIntBinOp_VL, [SDNPCommutative]>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: Can align the : here

; CHECK: # %bb.0:
; CHECK-NEXT: vsetivli zero, 8, e16, m1, ta, ma
; CHECK-NEXT: csrwi vxrm, 2
; CHECK-NEXT: vaaddu.vv v8, v8, v9
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this vx test needs a splat?

ret <8 x i8> %ret
}

define <8 x i8> @vaaddu_i8_lshr2(<8 x i8> %x, <8 x i8> %y) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit

Suggested change
define <8 x i8> @vaaddu_i8_lshr2(<8 x i8> %x, <8 x i8> %y) {
define <8 x i8> @vaaddu_v8i8_lshr2(<8 x i8> %x, <8 x i8> %y) {

Copy link
Contributor

@wangpc-pp wangpc-pp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sun-jacobi sun-jacobi changed the title [RISCV][ISel] Fold trunc (lshr (add (zext X), (zext Y)), 1) -> vaaddu X, Y [RISCV][ISel] Use vaaddu with rounding mode rdn for ISD::AVGFLOORU. Jan 9, 2024
This patch aims to use `vaaddu` with rounding mode rdn (i.e `vxrm[1:0] = 0b10`) for `ISD::AVGFLOORU`.

Source code
```
define <8 x i8> @vaaddu_auto(ptr %x, ptr %y, ptr %z) {
  %xv = load <8 x i8>, ptr %x, align 2
  %yv = load <8 x i8>, ptr %y, align 2
  %xzv = zext <8 x i8> %xv to <8 x i16>
  %yzv = zext <8 x i8> %yv to <8 x i16>
  %add = add nuw nsw <8 x i16> %xzv, %yzv
  %div = lshr <8 x i16> %add, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
  %ret = trunc <8 x i16> %div to <8 x i8>
  ret <8 x i8> %ret
}
```
Before this patch
```
vaaddu_auto:
        vsetivli        zero, 8, e8, mf2, ta, ma
        vle8.v  v8, (a0)
        vle8.v  v9, (a1)
        vwaddu.vv       v10, v8, v9
        vnsrl.wi        v8, v10, 1
        ret
```
After this patch
```
vaaddu_auto:
	vsetivli	zero, 8, e8, mf2, ta, ma
	vle8.v	v8, (a0)
	vle8.v	v9, (a1)
	csrwi	vxrm, 2
	vaaddu.vv	v8, v8, v9
	ret
```

Note on signed averaging addition

Based on the rvv spec,  there is also a variant for signed averaging addition called `vaadd`.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through `vaadd`. Thus this patch only introduces `vaaddu`.
@sun-jacobi sun-jacobi merged commit 0c24c17 into llvm:main Jan 9, 2024
4 checks passed
@sun-jacobi sun-jacobi deleted the vaadd branch January 9, 2024 08:29
sun-jacobi added a commit that referenced this pull request Jan 10, 2024
…77473)

Similar to #76550, but for `ISD::AVGCEILU`.
Specifically, this patch aims to use `vaaddu` with rounding mode rnu
(i.e `vxrm[1:0] = 0b00`) for `ISD::AVGCEILU`.

### Source code 
```
define <vscale x 8 x i8> @vaaddu_vv_nxv8i8_ceil(<vscale x 8 x i8> %x, <vscale x 8 x i8> %y) {
  %xzv = zext <vscale x 8 x i8> %x to <vscale x 8 x i16>
  %yzv = zext <vscale x 8 x i8> %y to <vscale x 8 x i16>
  %add = add nuw nsw <vscale x 8 x i16> %xzv, %yzv
  %one = insertelement <vscale x 8 x i16> poison, i16 1, i32 0
  %splat = shufflevector <vscale x 8 x i16> %one, <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
  %add1 = add nuw nsw <vscale x 8 x i16> %add, %splat
  %div = lshr <vscale x 8 x i16> %add1, %splat
  %ret = trunc <vscale x 8 x i16> %div to <vscale x 8 x i8>
  ret <vscale x 8 x i8> %ret
}
```

### Before this patch 
```
vaaddu_vv_nxv8i8_ceil:
        vsetvli a0, zero, e8, m1, ta, ma
        vwaddu.vv       v10, v8, v9
        vsetvli zero, zero, e16, m2, ta, ma
        vadd.vi v10, v10, 1
        vsetvli zero, zero, e8, m1, ta, ma
        vnsrl.wi        v8, v10, 1
        ret
```
### After this patch 
```
vaaddu_vv_nxv8i8_ceil:
        vsetvli a0, zero, e8, m1, ta, ma
        csrwi vxrm, 0
        vaaddu.vv v8, v8, v9
        ret
```
nstester pushed a commit to nstester/gcc that referenced this pull request Jan 10, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…lvm#76550)

This patch aims to use `vaaddu` with rounding mode rdn (i.e `vxrm[1:0] =
0b10`) for `ISD::AVGFLOORU`.

### Source code 
```
define <8 x i8> @vaaddu_auto(ptr %x, ptr %y, ptr %z) {
  %xv = load <8 x i8>, ptr %x, align 2
  %yv = load <8 x i8>, ptr %y, align 2
  %xzv = zext <8 x i8> %xv to <8 x i16>
  %yzv = zext <8 x i8> %yv to <8 x i16>
  %add = add nuw nsw <8 x i16> %xzv, %yzv
  %div = lshr <8 x i16> %add, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
  %ret = trunc <8 x i16> %div to <8 x i8>
  ret <8 x i8> %ret 
}
```

### Before this patch 
```
vaaddu_auto: 
        vsetivli        zero, 8, e8, mf2, ta, ma
        vle8.v  v8, (a0)
        vle8.v  v9, (a1)
        vwaddu.vv       v10, v8, v9
        vnsrl.wi        v8, v10, 1
        ret
```
### After this patch 
```
vaaddu_auto: 
	vsetivli	zero, 8, e8, mf2, ta, ma
	vle8.v	v8, (a0)
	vle8.v	v9, (a1)
	csrwi	vxrm, 2
	vaaddu.vv	v8, v8, v9
	ret
```

### Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging
addition called `vaadd`.
But AFAIU, no matter in which rounding mode, we cannot achieve the
semantic of signed averaging addition through `vaadd`.
Thus this patch only introduces `vaaddu`.
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
…lvm#77473)

Similar to llvm#76550, but for `ISD::AVGCEILU`.
Specifically, this patch aims to use `vaaddu` with rounding mode rnu
(i.e `vxrm[1:0] = 0b00`) for `ISD::AVGCEILU`.

### Source code 
```
define <vscale x 8 x i8> @vaaddu_vv_nxv8i8_ceil(<vscale x 8 x i8> %x, <vscale x 8 x i8> %y) {
  %xzv = zext <vscale x 8 x i8> %x to <vscale x 8 x i16>
  %yzv = zext <vscale x 8 x i8> %y to <vscale x 8 x i16>
  %add = add nuw nsw <vscale x 8 x i16> %xzv, %yzv
  %one = insertelement <vscale x 8 x i16> poison, i16 1, i32 0
  %splat = shufflevector <vscale x 8 x i16> %one, <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
  %add1 = add nuw nsw <vscale x 8 x i16> %add, %splat
  %div = lshr <vscale x 8 x i16> %add1, %splat
  %ret = trunc <vscale x 8 x i16> %div to <vscale x 8 x i8>
  ret <vscale x 8 x i8> %ret
}
```

### Before this patch 
```
vaaddu_vv_nxv8i8_ceil:
        vsetvli a0, zero, e8, m1, ta, ma
        vwaddu.vv       v10, v8, v9
        vsetvli zero, zero, e16, m2, ta, ma
        vadd.vi v10, v10, 1
        vsetvli zero, zero, e8, m1, ta, ma
        vnsrl.wi        v8, v10, 1
        ret
```
### After this patch 
```
vaaddu_vv_nxv8i8_ceil:
        vsetvli a0, zero, e8, m1, ta, ma
        csrwi vxrm, 0
        vaaddu.vv v8, v8, v9
        ret
```
XYenChi pushed a commit to XYenChi/gcc that referenced this pull request Mar 8, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 11, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 11, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 13, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 13, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 13, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/ruyi-gcc that referenced this pull request Mar 15, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
yulong18 pushed a commit to yulong18/ruyisdk-gcc that referenced this pull request Mar 17, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 19, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 21, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
XYenChi pushed a commit to XYenChi/gcc that referenced this pull request Mar 25, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Liaoshihua pushed a commit to Liaoshihua/gcc that referenced this pull request Mar 27, 2024
This patch is inspired by LLVM patches:
llvm/llvm-project#76550
llvm/llvm-project#77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

	vsetivli	zero,8,e8,mf2,ta,ma
	csrwi	vxrm,0
	vle8.v	v1,0(a1)
	vle8.v	v2,0(a2)
	vaaddu.vv	v1,v1,v2
	vse8.v	v1,0(a0)
	ret

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
riscv/riscv-v-spec#935
riscv/riscv-v-spec#934

Tested on both RV32 and RV64 no regression.

Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
	(avg<v_double_trunc>3_floor): New pattern.
	(<u>avg<v_double_trunc>3_ceil): Remove.
	(avg<v_double_trunc>3_ceil): New pattern.
	(uavg<mode>3_floor): Ditto.
	(uavg<mode>3_ceil): Ditto.
	* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Ditto.
	* config/riscv/vector-iterators.md (ashiftrt): Remove.
	(ASHIFTRT): Ditto.
	* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
	* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
	* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
	* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants