Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AArch64] Remove redundant FDIV Combine. #91924

Merged

Conversation

sdesmalen-arm
Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm commented May 13, 2024

The target combine is no longer required because InstCombine will
transform the DIV by a power of 2 into a multiply, so in practice
this case will never trigger.

Additionally, the generated code would have been incorrect for
streaming(-compatible) functions, because it assumed NEON was available.

@llvmbot
Copy link
Collaborator

llvmbot commented May 13, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Sander de Smalen (sdesmalen-arm)

Changes

NEON fixed-point SCVTF instruction is not available in Streaming-SVE
mode. There is no equivalent SVE instruction, so we simply expand to a
regular FDIV operation.


Full diff: https://github.com/llvm/llvm-project/pull/91924.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-arith.ll (+18)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 7344387ffe552..a76dc7b140b07 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -17912,7 +17912,7 @@ static SDValue performFpToIntCombine(SDNode *N, SelectionDAG &DAG,
 static SDValue performFDivCombine(SDNode *N, SelectionDAG &DAG,
                                   TargetLowering::DAGCombinerInfo &DCI,
                                   const AArch64Subtarget *Subtarget) {
-  if (!Subtarget->hasNEON())
+  if (!Subtarget->isNeonAvailable())
     return SDValue();
 
   SDValue Op = N->getOperand(0);
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-arith.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-arith.ll
index c2d6ed4e9ccf9..c20e8521375a9 100644
--- a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-arith.ll
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-arith.ll
@@ -214,6 +214,24 @@ define <2 x float> @fdiv_v2f32(<2 x float> %op1, <2 x float> %op2) {
   ret <2 x float> %res
 }
 
+; Test that we don't optimise this using a NEON instruction, when
+; NEON is not available.
+define <2 x float> @fdiv_v232_pow2(<2 x i32> %in) {
+; CHECK-LABEL: fdiv_v232_pow2:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    ptrue p0.s, vl2
+; CHECK-NEXT:    // kill: def $d0 killed $d0 def $z0
+; CHECK-NEXT:    fmov z1.s, #16.00000000
+; CHECK-NEXT:    scvtf z0.s, p0/m, z0.s
+; CHECK-NEXT:    fdiv z0.s, p0/m, z0.s, z1.s
+; CHECK-NEXT:    // kill: def $d0 killed $d0 killed $z0
+; CHECK-NEXT:    ret
+entry:
+  %vcvt.i = sitofp <2 x i32> %in to <2 x float>
+  %div.i = fdiv <2 x float> %vcvt.i, <float 16.0, float 16.0>
+  ret <2 x float> %div.i
+}
+
 define <4 x float> @fdiv_v4f32(<4 x float> %op1, <4 x float> %op2) {
 ; CHECK-LABEL: fdiv_v4f32:
 ; CHECK:       // %bb.0:

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just delete this transform? It's specifically checking for a divide by a power of two... but instcombine always transforms that to a multiply, so this will basically never trigger.

The target combine is no longer required because InstCombine will
transform the DIV by a power of 2 into a multiply, so in practice
this case will never trigger.

Additionally, the generated code would have been incorrect for
streaming(-compatible) functions, because it assumed NEON was available.
@sdesmalen-arm sdesmalen-arm force-pushed the sme-fix-neon-scvtf-in-streaming-mode branch from 9d4307a to f744db3 Compare May 13, 2024 10:19
@sdesmalen-arm sdesmalen-arm changed the title [AArch64] Avoid NEON fixed-point SCVTF in Streaming-SVE mode. [AArch64] Remove redundant FDIV Combine. May 13, 2024
@sdesmalen-arm
Copy link
Collaborator Author

Can we just delete this transform? It's specifically checking for a divide by a power of two... but instcombine always transforms that to a multiply, so this will basically never trigger.

Good point! I've repurposed the PR to instead remove the combine.

Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see aarch64_neon_vcvtfxs2fp used anywhere else so I guess this means we're missing the equivalent fmul combine, which seems unfortunate but not your fault.

@sdesmalen-arm sdesmalen-arm merged commit 3d6f18d into llvm:main May 14, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants