@sgundapa
The fclass intrinsics generate sub-optimal code: the predicate result is transferred to a general register and then compared against zero. This patch adds selection patterns that remove the transfer and compare and use the predicate directly.
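
As a rough illustration of why this rewrite preserves behavior, the host-side C sketch below models both instruction sequences. It is not part of the patch. All function names are hypothetical, the class-bit order of the `#u5` mask (zero, normal, subnormal, infinite, NaN) is an assumption, and the transferred predicate is assumed to read as 0xFF when true.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the sfclass predicate. Bit i of mask selects one
   floating-point class; the bit order used here is an assumption. */
static bool sfclass_pred(float x, unsigned mask) {
    switch (fpclassify(x)) {
    case FP_ZERO:      return (mask & 1u) != 0;
    case FP_NORMAL:    return (mask & 2u) != 0;
    case FP_SUBNORMAL: return (mask & 4u) != 0;
    case FP_INFINITE:  return (mask & 8u) != 0;
    case FP_NAN:       return (mask & 16u) != 0;
    default:           return false;
    }
}

/* Old sequence: r2 = p0, then p0 = cmp.eq(r2, #0). */
static bool eq_zero_via_transfer(bool p) {
    uint32_t r = p ? 0xFFu : 0u; /* assumed value of a transferred predicate */
    return r == 0u;
}

/* Old sequence for the setne form: transfer, then compare not-equal to 0. */
static bool ne_zero_via_transfer(bool p) {
    uint32_t r = p ? 0xFFu : 0u;
    return r != 0u;
}

/* New sequences: C2_not on the predicate (seteq) or the predicate itself (setne). */
static bool eq_zero_direct(bool p) { return !p; }
static bool ne_zero_direct(bool p) { return p; }

/* Compare old and new sequences over every 5-bit mask and one sample per class. */
static void check_rewrite(void) {
    const float samples[] = {0.0f, 1.0f, 1e-40f, INFINITY, NAN};
    for (unsigned mask = 0; mask < 32; ++mask) {
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
            bool p = sfclass_pred(samples[i], mask);
            assert(eq_zero_via_transfer(p) == eq_zero_direct(p));
            assert(ne_zero_via_transfer(p) == ne_zero_direct(p));
        }
    }
}
```

Calling `check_rewrite()` from a test harness exercises both patterns of the multiclass. The model only argues logical equivalence; it says nothing about packet scheduling or instruction latency.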

@llvmbot
llvmbot commented Oct 30, 2025

@llvm/pr-subscribers-backend-hexagon

Author: Sumanth Gundapaneni (sgundapa)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/165735.diff

2 Files Affected:

  • (modified) llvm/lib/Target/Hexagon/HexagonPatterns.td (+11)
  • (added) llvm/test/CodeGen/Hexagon/isel-fclass.ll (+86)
diff --git a/llvm/lib/Target/Hexagon/HexagonPatterns.td b/llvm/lib/Target/Hexagon/HexagonPatterns.td
index 85ce9447c2028..44e9d2402d0a3 100644
--- a/llvm/lib/Target/Hexagon/HexagonPatterns.td
+++ b/llvm/lib/Target/Hexagon/HexagonPatterns.td
@@ -3434,6 +3434,17 @@ let AddedComplexity = 100 in {
            (C2_not (S4_stored_locked I32:$Rs, I64:$Rt))>;
 }
 
+multiclass FloatClass<SDPatternOperator IntOp, InstHexagon MI,
+                      PatFrag RegPred> {
+  def: Pat<(i1 (seteq (IntOp RegPred:$Rs, u5_0ImmPred_timm:$u5), 0)),
+            (C2_not (MI RegPred:$Rs, u5_0ImmPred_timm:$u5))>;
+  def: Pat<(i1 (setne (IntOp RegPred:$Rs, u5_0ImmPred_timm:$u5), 0)),
+           (MI RegPred:$Rs, u5_0ImmPred_timm:$u5)>;
+}
+
+defm : FloatClass<int_hexagon_F2_sfclass, F2_sfclass, F32>;
+defm : FloatClass<int_hexagon_F2_dfclass, F2_dfclass, F64>;
+
 def: Pat<(int_hexagon_instrprof_custom (HexagonAtPcrel tglobaladdr:$addr), u32_0ImmPred:$I),
          (PS_call_instrprof_custom tglobaladdr:$addr, imm:$I)>;
 
diff --git a/llvm/test/CodeGen/Hexagon/isel-fclass.ll b/llvm/test/CodeGen/Hexagon/isel-fclass.ll
new file mode 100644
index 0000000000000..96b02106fa807
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/isel-fclass.ll
@@ -0,0 +1,86 @@
+; Tests lowering of sfclass/dfclass compares.
+; Sub-optimal code
+;         {
+;                p0 = sfclass(r0,#16)
+;                r0 = sfadd(r0,r0)
+;        }
+;        {
+;                r2 = p0
+;        }
+;        {
+;                if (p0.new) r0 = ##1065353216
+;                p0 = cmp.eq(r2,#0)
+;                jumpr r31
+;        }
+; With the patterns added, we should be generating
+;        {
+;                p0 = sfclass(r0,#16)
+;                r0 = sfadd(r0,r0)
+;        }
+;        {
+;                if (!p0) r0 = ##1065353216
+;                jumpr r31
+;        }
+
+; RUN: llc -march=hexagon -stop-after=hexagon-isel %s -o - | FileCheck %s
+
+; CHECK: bb.0.entry1
+; CHECK: F2_sfclass
+; CHECK-NOT: C2_cmp
+; CHECK: C2_not
+; CHECK: F2_sfadd
+; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none)
+define float @test1(float noundef %x) {
+entry1:
+  %0 = tail call i32 @llvm.hexagon.F2.sfclass(float %x, i32 16)
+  %tobool.not = icmp eq i32 %0, 0
+  %add = fadd float %x, %x
+  %spec.select = select i1 %tobool.not, float 1.000000e+00, float %add
+  ret float %spec.select
+}
+
+; CHECK: bb.0.entry2
+; CHECK: F2_sfclass
+; CHECK-NOT: C2_cmp
+; CHECK: F2_sfadd
+define float @test2(float noundef %x) {
+entry2:
+  %0 = tail call i32 @llvm.hexagon.F2.sfclass(float %x, i32 16)
+  %tobool.not = icmp eq i32 %0, 0
+  %add = fadd float %x, %x
+  %spec.select = select i1 %tobool.not, float %add, float 1.000000e+00
+  ret float %spec.select
+}
+
+; CHECK: bb.0.entry3
+; CHECK: F2_dfclass
+; CHECK-NOT: C2_cmp
+; CHECK: C2_not
+; CHECK: F2_dfadd
+define double @test3(double noundef %x) {
+entry3:
+  %0 = tail call i32 @llvm.hexagon.F2.dfclass(double %x, i32 16)
+  %tobool.not = icmp eq i32 %0, 0
+  %add = fadd double %x, %x
+  %spec.select = select i1 %tobool.not, double 1.000000e+00, double %add
+  ret double %spec.select
+}
+
+; CHECK: bb.0.entry4
+; CHECK: F2_dfclass
+; CHECK-NOT: C2_cmp
+; CHECK: F2_dfadd
+define double @test4(double noundef %x) {
+entry4:
+  %0 = tail call i32 @llvm.hexagon.F2.dfclass(double %x, i32 16)
+  %tobool.not = icmp eq i32 %0, 0
+  %add = fadd double %x, %x
+  %spec.select = select i1 %tobool.not, double %add, double 1.000000e+00
+  ret double %spec.select
+}
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(none)
+declare i32 @llvm.hexagon.F2.dfclass(double, i32 immarg)
+
+; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn memory(none)
+declare i32 @llvm.hexagon.F2.sfclass(float, i32 immarg)

Copilot AI left a comment

Pull Request Overview

This pull request optimizes the code generation for Hexagon floating-point class comparison operations (sfclass/dfclass). The optimization eliminates unnecessary comparison and register transfer instructions when the result of F2_sfclass/F2_dfclass intrinsics is compared to zero.

Key changes:

  • Added pattern matching rules to fold comparisons of sfclass/dfclass results with zero directly into predicate operations
  • Eliminates unnecessary C2_cmp instructions and register moves between integer and predicate registers

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File and description:

  • llvm/lib/Target/Hexagon/HexagonPatterns.td: Adds FloatClass multiclass with patterns to fold F2_sfclass/F2_dfclass comparisons with zero into predicate operations
  • llvm/test/CodeGen/Hexagon/isel-fclass.ll: Adds regression tests verifying the optimization generates expected instruction patterns
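
For readers who want a source-level picture of what the regression tests exercise, here is a hedged C sketch of functions shaped like `@test1` and `@test2`. The function names and the `SFCLASS_NAN` macro are hypothetical. The Hexagon branch assumes the clang builtin `__builtin_HEXAGON_F2_sfclass`, and the host fallback approximates only the `#16` mask under the assumption that it selects the NaN class.

```c
#include <assert.h>
#include <math.h>

/* Stand-in for the intrinsic call with mask #16. On Hexagon this assumes
   the clang builtin; elsewhere it approximates the mask with isnan(). */
#if defined(__hexagon__)
#define SFCLASS_NAN(x) __builtin_HEXAGON_F2_sfclass((x), 16)
#else
#define SFCLASS_NAN(x) (isnan(x) ? 1 : 0)
#endif

/* Shaped like @test1: a zero result selects 1.0f, otherwise x + x. */
static float test1_like(float x) {
    return SFCLASS_NAN(x) == 0 ? 1.0f : x + x;
}

/* Shaped like @test2: the two select operands are swapped. */
static float test2_like(float x) {
    return SFCLASS_NAN(x) == 0 ? x + x : 1.0f;
}

/* Small self-check of the host fallback. */
static void check_examples(void) {
    assert(test1_like(2.0f) == 1.0f);
    assert(isnan(test1_like(NAN)));
    assert(test2_like(2.0f) == 4.0f);
    assert(test2_like(NAN) == 1.0f);
}
```

The first shape corresponds to the `seteq` pattern, which the tests expect to select `C2_not`. The second shape carries no `CHECK: C2_not` line in the test file, only the `CHECK-NOT: C2_cmp` requirement.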

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@sgundapa sgundapa requested a review from Copilot October 30, 2025 15:51
Copilot AI left a comment

  def: Pat<(i1 (seteq (IntOp RegPred:$Rs, u5_0ImmPred_timm:$u5), 0)),
            (C2_not (MI RegPred:$Rs, u5_0ImmPred_timm:$u5))>;
  def: Pat<(i1 (setne (IntOp RegPred:$Rs, u5_0ImmPred_timm:$u5), 0)),
           (MI RegPred:$Rs, u5_0ImmPred_timm:$u5)>;
Copilot AI Oct 30, 2025

Inconsistent indentation between lines 3441 and 3443. Line 3441 uses 14 spaces while line 3443 uses 13 spaces. Both should use the same indentation to align with the opening parenthesis of the pattern's first argument, consistent with the style used in lines 3428 and 3430 above.

Suggested change
(MI RegPred:$Rs, u5_0ImmPred_timm:$u5)>;
(MI RegPred:$Rs, u5_0ImmPred_timm:$u5)>;

@sgundapa sgundapa requested a review from androm3da October 30, 2025 16:03
@iajbar iajbar left a comment

LGTM.

@sgundapa sgundapa merged commit 3c4fece into llvm:main Oct 31, 2025
10 checks passed
DEBADRIBASAK pushed a commit to DEBADRIBASAK/llvm-project that referenced this pull request Nov 3, 2025
ckoparkar pushed a commit to ckoparkar/llvm-project that referenced this pull request Nov 6, 2025
@sgundapa sgundapa deleted the hex_fclass branch November 6, 2025 14:44