[DAG] SimplifyDemandedVectorElts - add ISD::AVGCEILS/AVGCEILU/AVGFLOORS/AVGFLOORU nodes #86284

aniplcc · 2024-03-22T13:37:41Z

Fixes #84768

llvmbot · 2024-03-22T13:38:10Z

@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-aarch64

Author: aniplcc (aniplcc)

Changes

Fixes #84768

Full diff: https://github.com/llvm/llvm-project/pull/86284.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+4)
(modified) llvm/test/CodeGen/AArch64/hadd-combine.ll (+52)

diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index da29b1d5b312f8..58d9394feb1237 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -3524,6 +3524,10 @@ bool TargetLowering::SimplifyDemandedVectorElts(
     }
     [[fallthrough]];
   }
+  case ISD::AVGCEILS:
+  case ISD::AVGCEILU:
+  case ISD::AVGFLOORS:
+  case ISD::AVGFLOORU:
   case ISD::OR:
   case ISD::XOR:
   case ISD::SUB:
diff --git a/llvm/test/CodeGen/AArch64/hadd-combine.ll b/llvm/test/CodeGen/AArch64/hadd-combine.ll
index e12502980790da..7c1c089839edc5 100644
--- a/llvm/test/CodeGen/AArch64/hadd-combine.ll
+++ b/llvm/test/CodeGen/AArch64/hadd-combine.ll
@@ -879,6 +879,58 @@ define <8 x i16> @uhadd_fixedwidth_v4i32(<8 x i16> %a0, <8 x i16> %a1)  {
   ret <8 x i16> %res
 }
 
+define <8 x i16> @shadd_demandedelts(<8 x i16> %a0, <8 x i16> %a1) {
+; CHECK-LABEL: shadd_demandedelts:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    shadd v0.8h, v0.8h, v1.8h
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    ret
+  %s0 = shufflevector <8 x i16> %a0, <8 x i16> undef, <8 x i32> zeroinitializer
+  %op = call <8 x i16> @llvm.aarch64.neon.shadd.v8i16(<8 x i16> %s0, <8 x i16> %a1)
+  %r0 = shufflevector <8 x i16> %op, <8 x i16> undef, <8 x i32> zeroinitializer
+  ret <8 x i16> %r0
+}
+
+define <8 x i16> @srhadd_demandedelts(<8 x i16> %a0, <8 x i16> %a1) {
+; CHECK-LABEL: srhadd_demandedelts:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    srhadd v0.8h, v0.8h, v1.8h
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    ret
+  %s0 = shufflevector <8 x i16> %a0, <8 x i16> undef, <8 x i32> zeroinitializer
+  %op = call <8 x i16> @llvm.aarch64.neon.srhadd.v8i16(<8 x i16> %s0, <8 x i16> %a1)
+  %r0 = shufflevector <8 x i16> %op, <8 x i16> undef, <8 x i32> zeroinitializer
+  ret <8 x i16> %r0
+}
+
+define <8 x i16> @uhadd_demandedelts(<8 x i16> %a0, <8 x i16> %a1) {
+; CHECK-LABEL: uhadd_demandedelts:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    uhadd v0.8h, v0.8h, v1.8h
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    ret
+  %s0 = shufflevector <8 x i16> %a0, <8 x i16> undef, <8 x i32> zeroinitializer
+  %op = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %s0, <8 x i16> %a1)
+  %r0 = shufflevector <8 x i16> %op, <8 x i16> undef, <8 x i32> zeroinitializer
+  ret <8 x i16> %r0
+}
+
+define <8 x i16> @urhadd_demandedelts(<8 x i16> %a0, <8 x i16> %a1) {
+; CHECK-LABEL: urhadd_demandedelts:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    urhadd v0.8h, v0.8h, v1.8h
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    ret
+  %s0 = shufflevector <8 x i16> %a0, <8 x i16> undef, <8 x i32> zeroinitializer
+  %op = call <8 x i16> @llvm.aarch64.neon.urhadd.v8i16(<8 x i16> %s0, <8 x i16> %a1)
+  %r0 = shufflevector <8 x i16> %op, <8 x i16> undef, <8 x i32> zeroinitializer
+  ret <8 x i16> %r0
+}
+
 declare <8 x i8> @llvm.aarch64.neon.shadd.v8i8(<8 x i8>, <8 x i8>)
 declare <4 x i16> @llvm.aarch64.neon.shadd.v4i16(<4 x i16>, <4 x i16>)
 declare <2 x i32> @llvm.aarch64.neon.shadd.v2i32(<2 x i32>, <2 x i32>)

RKSimon · 2024-03-22T15:28:24Z

llvm/test/CodeGen/AArch64/hadd-combine.ll

+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    shadd v0.8h, v0.8h, v1.8h
+; CHECK-NEXT:    dup v0.8h, v0.h[0]
+; CHECK-NEXT:    ret


These don't appear to be working? I'd expect:

; CHECK:       // %bb.0:
; CHECK-NEXT:    shadd v0.8h, v0.8h, v1.8h
; CHECK-NEXT:    dup v0.8h, v0.h[0]
; CHECK-NEXT:    ret

aniplcc · 2024-03-22

My bad, I forgot to run the tests against my build. I will update.

aniplcc · 2024-03-25

It seems the added cases aren't doing anything: I don't see any change in the resulting assembly.

RKSimon · 2024-03-25T11:45:40Z

If I had to guess, the problem is that we don't reach this code until after lowering, and the AArch64 DUP node combines don't call SimplifyDemandedVectorElts. @davemgreen, does that make sense to you?

aniplcc · 2024-03-25T13:50:41Z

Also, the ISD:: opcode passed to SimplifyDemandedVectorElts is INTRINSIC_WO_CHAIN(45) rather than one of AVGCEIL(S/U)/AVGFLOOR(S/U)?

RKSimon · 2024-03-25T13:55:44Z

Also, the ISD:: opcode passed to SimplifyDemandedVectorElts is INTRINSIC_WO_CHAIN(45) rather than one of AVGCEIL(S/U)/AVGFLOOR(S/U)?

That's prior to lowering. After legalization the intrinsic call will lower to an ISD::AVG* node, but by that point the shuffles have become AArch64ISD::DUPLANE* nodes.
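
As a rough illustration of why the first dup is expected to be removable: the averaging nodes work lane by lane, so when only lane 0 of the result is demanded, only lane 0 of each operand matters. The sketch below models this in plain C++ and does not use any LLVM API. The names `V8`, `avgFloorU`, `avgCeilU`, `splatLane0`, `withInputSplat` and `withoutInputSplat` are hypothetical helpers made up for this example, and only the unsigned variants are modelled (the signed ones follow the same per-lane argument).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an <8 x i16> vector value.
using V8 = std::array<std::uint16_t, 8>;
using BinOp = V8 (*)(const V8 &, const V8 &);

// Lane-wise model of ISD::AVGFLOORU: floor((a + b) / 2), computed in a wider
// type so the intermediate sum cannot overflow.
V8 avgFloorU(const V8 &a, const V8 &b) {
  V8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint16_t>(
        (static_cast<std::uint32_t>(a[i]) + static_cast<std::uint32_t>(b[i])) >> 1);
  return r;
}

// Lane-wise model of ISD::AVGCEILU: ceil((a + b) / 2).
V8 avgCeilU(const V8 &a, const V8 &b) {
  V8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint16_t>(
        (static_cast<std::uint32_t>(a[i]) + static_cast<std::uint32_t>(b[i]) + 1) >> 1);
  return r;
}

// Models a shufflevector with a zeroinitializer mask: broadcast lane 0.
V8 splatLane0(const V8 &v) {
  V8 r{};
  r.fill(v[0]);
  return r;
}

// Shape of the test IR: splat the first operand, average, splat the result.
V8 withInputSplat(BinOp op, const V8 &a, const V8 &b) {
  return splatLane0(op(splatLane0(a), b));
}

// Shape of the expected output: the input splat is dropped because only
// lane 0 of the average is demanded by the final splat.
V8 withoutInputSplat(BinOp op, const V8 &a, const V8 &b) {
  return splatLane0(op(a, b));
}
```

Under this model, `withInputSplat(op, a, b)` and `withoutInputSplat(op, a, b)` produce the same vector for any inputs, which is the equivalence the expected CHECK lines rely on.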

RKSimon · 2024-03-27T17:19:51Z

@aniplcc Please can you rebase after 5d3ef06 and regenerate the test checks?

RKSimon

LGTM (keep the AArch64 tests - I'll create a followup issue to improve DUPLANE demanded elts handling)

[DAG]Add ISD::AVGCEILS/AVGCEILU/AVGFLOORS/AVGFLOORU nodes to SimplifyDemandedVectorElts

9cea11a

llvmbot added the backend:AArch64 and llvm:SelectionDAG labels Mar 22, 2024

RKSimon requested review from RKSimon and davemgreen March 22, 2024 15:25

RKSimon reviewed Mar 22, 2024

RKSimon added a commit that referenced this pull request Mar 27, 2024

[X86] combine-pavg.ll - add demandedelts test coverage for #86284

5d3ef06

aniplcc added 2 commits March 30, 2024 10:00

Merge branch 'main' into dag1

d1c8ccf

update combine-pavg.ll

7b495b8

llvmbot added the backend:X86 label Mar 30, 2024

RKSimon approved these changes Apr 3, 2024

RKSimon mentioned this pull request Apr 3, 2024

[AArch64] Add SimplifyDemandedVectorEltsForTargetNode support for AArch64ISD::DUPLANE nodes #87497

Open

RKSimon merged commit d650fcd into llvm:main Apr 3, 2024
5 checks passed
