[DAGCombiner] Extend FP-to-Int cast without requiring nsz #161093

yichi170 · 2025-09-28T20:07:29Z

This patch updates the FP-to-Int conversion handling:

For signed integers: use ftrunc followed by clamping to the target integer range.
For unsigned integers: apply fabs + ftrunc, then clamp.

This removes the previous dependence on nsz and ensures correct lowering for both signed and unsigned cases.

I've tested code generation with -mtriple=amdgcn. The generated assembly looks as expected, but I'm not sure how to write a general test case that covers every target.

Fixes #160623.

apply clang-format

llvmbot · 2025-09-28T20:08:02Z

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-selectiondag

Author: Yi-Chi Lee (yichi170)

Full diff: https://github.com/llvm/llvm-project/pull/161093.diff

1 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+41-11)

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index a6ba6e518899f..65cea64e0982d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -18862,27 +18862,57 @@ SDValue DAGCombiner::visitFPOW(SDNode *N) {
 
 static SDValue foldFPToIntToFP(SDNode *N, const SDLoc &DL, SelectionDAG &DAG,
                                const TargetLowering &TLI) {
-  // We only do this if the target has legal ftrunc. Otherwise, we'd likely be
-  // replacing casts with a libcall. We also must be allowed to ignore -0.0
-  // because FTRUNC will return -0.0 for (-1.0, -0.0), but using integer
-  // conversions would return +0.0.
+  // We can fold the fpto[us]i -> [us]itofp pattern into a single ftrunc.
+  // If NoSignedZerosFPMath is enabled, this is a direct replacement.
+  // Otherwise, for strict math, we must handle edge cases:
+  // 1. For signed conversions, clamp out-of-range values to the valid
+  //    integer range before the trunc.
+  // 2. For unsigned conversions, use FABS. A negative float becomes integer 0,
+  //    which must convert back to +0.0. FTRUNC on its own could produce -0.0.
+
   // FIXME: We should be able to use node-level FMF here.
-  // TODO: If strict math, should we use FABS (+ range check for signed cast)?
   EVT VT = N->getValueType(0);
-  if (!TLI.isOperationLegal(ISD::FTRUNC, VT) ||
-      !DAG.getTarget().Options.NoSignedZerosFPMath)
+  if (!TLI.isOperationLegal(ISD::FTRUNC, VT))
     return SDValue();
 
   // fptosi/fptoui round towards zero, so converting from FP to integer and
   // back is the same as an 'ftrunc': [us]itofp (fpto[us]i X) --> ftrunc X
   SDValue N0 = N->getOperand(0);
   if (N->getOpcode() == ISD::SINT_TO_FP && N0.getOpcode() == ISD::FP_TO_SINT &&
-      N0.getOperand(0).getValueType() == VT)
-    return DAG.getNode(ISD::FTRUNC, DL, VT, N0.getOperand(0));
+      N0.getOperand(0).getValueType() == VT) {
+    if (DAG.getTarget().Options.NoSignedZerosFPMath)
+      return DAG.getNode(ISD::FTRUNC, DL, VT, N0.getOperand(0));
+
+    // Strict math: clamp to the signed integer range before truncating.
+    unsigned IntWidth = N0.getValueSizeInBits();
+    APInt APMax = APInt::getSignedMaxValue(IntWidth);
+    APInt APMin = APInt::getSignedMinValue(IntWidth);
+
+    APFloat MaxAPF(VT.getFltSemantics());
+    MaxAPF.convertFromAPInt(APMax, true, APFloat::rmTowardZero);
+    APFloat MinAPF(VT.getFltSemantics());
+    MinAPF.convertFromAPInt(APMin, true, APFloat::rmTowardZero);
+
+    SDValue MaxFP = DAG.getConstantFP(MaxAPF, DL, VT);
+    SDValue MinFP = DAG.getConstantFP(MinAPF, DL, VT);
+
+    SDValue Clamped = DAG.getNode(
+        ISD::FMINNUM, DL, VT,
+        DAG.getNode(ISD::FMAXNUM, DL, VT, N0->getOperand(0), MinFP), MaxFP);
+    return DAG.getNode(ISD::FTRUNC, DL, VT, Clamped);
+  }
 
   if (N->getOpcode() == ISD::UINT_TO_FP && N0.getOpcode() == ISD::FP_TO_UINT &&
-      N0.getOperand(0).getValueType() == VT)
-    return DAG.getNode(ISD::FTRUNC, DL, VT, N0.getOperand(0));
+      N0.getOperand(0).getValueType() == VT) {
+    if (DAG.getTarget().Options.NoSignedZerosFPMath)
+      return DAG.getNode(ISD::FTRUNC, DL, VT, N0.getOperand(0));
+
+    // Strict math: use FABS to handle negative inputs correctly.
+    if (TLI.isFAbsFree(VT)) {
+      SDValue Abs = DAG.getNode(ISD::FABS, DL, VT, N0.getOperand(0));
+      return DAG.getNode(ISD::FTRUNC, DL, VT, Abs);
+    }
+  }
 
   return SDValue();
 }

paperchalice · 2025-09-29T06:24:35Z

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

+    if (DAG.getTarget().Options.NoSignedZerosFPMath)
+      return DAG.getNode(ISD::FTRUNC, DL, VT, N0.getOperand(0));
+
+    // Strict math: use FABS to handle negative inputs correctly.


uitofp has nneg flag support. Check nneg flag before generating FABS?

No, the nneg flag does not help! The problematic cases are inputs like -0.5. In that case the input to uitofp would be 0, which is not negative.

Oops, my fault. It seems both nneg and nsz are required, but that is impossible in the current implementation 🥲

Good catch! I'll address this.
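To make the -0.5 case from this thread concrete, here is a minimal standalone C++ sketch. It is not code from the patch: the function names are hypothetical, and `std::trunc` / `std::fabs` stand in for the FTRUNC and FABS nodes. It assumes inputs whose truncated value fits in `uint32_t`, since anything else is undefined behaviour in C++ (and poison for fptoui in LLVM IR).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// The pattern the combine matches: uitofp (fptoui X).
// Assumption: only called with inputs whose truncated value fits in uint32_t.
inline float roundTripUnsigned(float X) {
  return static_cast<float>(static_cast<std::uint32_t>(X));
}

// Plain ftrunc replacement; only equivalent when the sign of zero is ignored.
inline float foldToTrunc(float X) { return std::trunc(X); }

// fabs + ftrunc replacement; keeps the +0.0 result for inputs in (-1.0, -0.0].
inline float foldToFabsTrunc(float X) { return std::trunc(std::fabs(X)); }

// Hypothetical helper that checks the sign-of-zero behaviour for -0.5.
inline void checkSignedZeroExamples() {
  assert(!std::signbit(roundTripUnsigned(-0.5f))); // integer 0 converts to +0.0
  assert(std::signbit(foldToTrunc(-0.5f)));        // trunc(-0.5) is -0.0
  assert(!std::signbit(foldToFabsTrunc(-0.5f)));   // trunc(0.5) is +0.0
}
```

The integer fed to uitofp here is 0, which is why an nneg flag on uitofp says nothing about the sign of the original floating-point input.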

paperchalice · 2025-09-29T06:28:19Z

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

+    // Strict math: clamp to the signed integer range before truncating.
+    unsigned IntWidth = N0.getValueSizeInBits();
+    APInt APMax = APInt::getSignedMaxValue(IntWidth);
+    APInt APMin = APInt::getSignedMinValue(IntWidth);


NoSignedZerosFPMath might not be enough to cover the range-check part. BTW, constrained floating-point intrinsics are converted into SDNodes with the STRICT_ prefix, so the "Strict math" wording in the comment might cause confusion.
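As an illustration of the signed path quoted above, the following C++ sketch models clamp-then-ftrunc for a float to i32 round trip. It is an approximation under stated assumptions, not the patch itself: the function names are hypothetical, `std::fmax` / `std::fmin` stand in for FMAXNUM / FMINNUM, and the bounds are INT32_MIN and INT32_MAX converted to float with round-toward-zero, as the hunk does via `convertFromAPInt`. For an input of -0.5 the clamp leaves the value unchanged, so the sketch still yields -0.0 where the integer round trip yields +0.0.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// The pattern the combine matches: sitofp (fptosi X).
// Assumption: only called with inputs whose truncated value fits in int32_t.
inline float roundTripSigned(float X) {
  return static_cast<float>(static_cast<std::int32_t>(X));
}

// Model of the clamp + ftrunc sequence for float and a 32-bit signed integer.
inline float clampThenTrunc(float X) {
  const float MinFP = -2147483648.0f; // INT32_MIN, exactly representable
  const float MaxFP = 2147483520.0f;  // INT32_MAX rounded toward zero
  return std::trunc(std::fmin(std::fmax(X, MinFP), MaxFP));
}

// Hypothetical helper comparing the two on a few in-range inputs.
inline void checkSignedExamples() {
  assert(roundTripSigned(-3.75f) == -3.0f);
  assert(clampThenTrunc(-3.75f) == -3.0f);
  assert(!std::signbit(roundTripSigned(-0.5f))); // +0.0
  assert(std::signbit(clampThenTrunc(-0.5f)));   // -0.0
}
```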

arsenm · 2025-09-29T06:38:28Z

I've tested code generation with -mtriple=amdgcn. The generated assembly looks as expected, but I'm not sure how to write a general test case that covers every target.

It's missing from the PR?

arsenm · 2025-09-29T06:37:51Z

llvm/test/CodeGen/AArch64/cvt-fp-int-fp.ll

@@ -16,8 +21,12 @@ entry:
 define float @t2(float %x) {
 ; CHECK-LABEL: t2:
 ; CHECK:       // %bb.0: // %entry
-; CHECK-NEXT:    fcvtzs s0, s0
-; CHECK-NEXT:    scvtf s0, s0
+; CHECK-NEXT:    movi v1.2s, #207, lsl #24


Regression, this should probably be guarded on an isFAbsFree check

arsenm · 2025-09-29T06:39:26Z

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

+        ISD::FMINNUM, DL, VT,
+        DAG.getNode(ISD::FMAXNUM, DL, VT, N0->getOperand(0), MinFP), MaxFP);
+    return DAG.getNode(ISD::FTRUNC, DL, VT, Clamped);
+  }


It would be easier to handle the signed and unsigned cases in separate PRs; the signed case is less obviously profitable.

Sure, will focus on handling unsigned cases in this PR.

github-actions · 2025-09-29T16:29:38Z

✅ With the latest revision this PR passed the C/C++ code formatter.

yichi170 · 2025-09-29T16:29:41Z

In the newest commit, I only handle the unsigned case, as @arsenm suggested, and I also added a test for the AMDGPU backend. The expected code doesn't seem to be generated when the FP type is double, and I'm not sure why.

arsenm

LGTM but should regenerate the test with the suggested run lines

arsenm · 2025-10-10T08:29:35Z

llvm/test/CodeGen/AMDGPU/fptoui_uitofp.ll

@@ -0,0 +1,209 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc < %s -mtriple=amdgcn | FileCheck %s


Suggested change

-; RUN: llc < %s -mtriple=amdgcn | FileCheck %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx600 < %s | FileCheck -check-prefix=GFX6 %s
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 < %s | FileCheck -check-prefix=GFX9 %s

Should test with and without legal 16-bit operations; the 16-bit checks are missing the fabs.

I've updated the test case, and the results look correct. I found that gfx900 also generates different code for f64. Are f64 operations also illegal on gfx600?

(I don't have permission to merge the PR, so if everything looks good to you, could you please help merge it?)

yichi170 added 2 commits September 28, 2025 10:42

[DAGCombiner] Extend FP-to-Int cast without requiring nsz

51bf419

[DAGCombiner] Modify the comment to fit the current implementation and apply clang-format

cccc073

llvmbot added the llvm:SelectionDAG label Sep 28, 2025

update testcases

37413f3

llvmbot added backend:AArch64 backend:AMDGPU labels Sep 29, 2025

paperchalice reviewed Sep 29, 2025


paperchalice requested a review from arsenm September 29, 2025 06:32

arsenm added the floating-point label Sep 29, 2025

arsenm reviewed Sep 29, 2025


[DAGCombiner] Fold fp-uint-fp to fabs + ftrunc

b127081

yichi170 force-pushed the extend-fp-to-int-combine branch from 71fddaa to b127081 Compare September 30, 2025 12:52

arsenm approved these changes Oct 10, 2025


update testcase (test with and without legal 16-bit operations)

36447e7

yichi170 force-pushed the extend-fp-to-int-combine branch from fff0273 to 36447e7 Compare October 10, 2025 14:33

arsenm approved these changes Oct 10, 2025


arsenm merged commit a9c8e94 into llvm:main Oct 10, 2025
10 checks passed
