[DAG] Add ComputeNumSignBits(FREEZE(X)) handling #161507

RKSimon · 2025-10-01T11:12:34Z

If X is known to never be undef/poison, skip the freeze and return ComputeNumSignBits(X).
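For readers less familiar with the analysis, here is a minimal sketch of what a "number of sign bits" means for a single concrete 16-bit value. The helper name `numSignBits16` is hypothetical and not part of LLVM; the real `ComputeNumSignBits` reasons about DAG nodes rather than runtime values, so this only illustrates the quantity being computed.

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM's API: counts how many of the top bits of a
// 16-bit value are copies of the sign bit, including the sign bit itself.
unsigned numSignBits16(int16_t V) {
  uint16_t U = static_cast<uint16_t>(V);
  unsigned Sign = (U >> 15) & 1u;
  unsigned Count = 0;
  for (int Bit = 15; Bit >= 0; --Bit) {
    if (((U >> Bit) & 1u) != Sign)
      break; // first bit that differs from the sign bit ends the run
    ++Count;
  }
  return Count;
}
```

Any value that fits in an i8 has at least 9 sign bits when held in an i16 lane, which is the property the AArch64 test changes rely on.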

llvmbot · 2025-10-01T11:13:06Z

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-aarch64

Author: Simon Pilgrim (RKSimon)

Changes

If X is known to never be undef/poison, skip the freeze and return ComputeNumSignBits(X).

Full diff: https://github.com/llvm/llvm-project/pull/161507.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+5)
(modified) llvm/test/CodeGen/AArch64/freeze.ll (-6)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 8fc7eabf90ea8..95f53fe0bfdba 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -4762,6 +4762,11 @@ unsigned SelectionDAG::ComputeNumSignBits(SDValue Op, const APInt &DemandedElts,
   case ISD::AssertZext:
     Tmp = cast<VTSDNode>(Op.getOperand(1))->getVT().getSizeInBits();
     return VTBits-Tmp;
+  case ISD::FREEZE:
+    if (isGuaranteedNotToBeUndefOrPoison(Op.getOperand(0), DemandedElts,
+                                         /*PoisonOnly=*/false))
+      return ComputeNumSignBits(Op.getOperand(0), DemandedElts, Depth + 1);
+    break;
   case ISD::MERGE_VALUES:
     return ComputeNumSignBits(Op.getOperand(Op.getResNo()), DemandedElts,
                               Depth + 1);
diff --git a/llvm/test/CodeGen/AArch64/freeze.ll b/llvm/test/CodeGen/AArch64/freeze.ll
index fae3bbe2dcfba..fb909fec90434 100644
--- a/llvm/test/CodeGen/AArch64/freeze.ll
+++ b/llvm/test/CodeGen/AArch64/freeze.ll
@@ -466,15 +466,12 @@ define <8 x i16> @freeze_urhadd(<8 x i16> %a0, <8 x i16> %a1) {
   ret <8 x i16> %masked
 }
 
-; TODO: Unnecessary sext_inreg
 define <8 x i16> @freeze_shadd(<8 x i8> %a0, <8 x i16> %a1) {
 ; CHECK-LABEL: freeze_shadd:
 ; CHECK:       // %bb.0:
 ; CHECK-NEXT:    sshll v0.8h, v0.8b, #0
 ; CHECK-NEXT:    sshr v1.8h, v1.8h, #8
 ; CHECK-NEXT:    shadd v0.8h, v0.8h, v1.8h
-; CHECK-NEXT:    shl v0.8h, v0.8h, #8
-; CHECK-NEXT:    sshr v0.8h, v0.8h, #8
 ; CHECK-NEXT:    ret
   %x0 = sext <8 x i8> %a0 to <8 x i16>
   %x1 = ashr <8 x i16> %a1, splat (i16 8)
@@ -485,15 +482,12 @@ define <8 x i16> @freeze_shadd(<8 x i8> %a0, <8 x i16> %a1) {
   ret <8 x i16> %sext
 }
 
-; TODO: Unnecessary sext_inreg
 define <8 x i16> @freeze_srhadd(<8 x i8> %a0, <8 x i16> %a1) {
 ; CHECK-LABEL: freeze_srhadd:
 ; CHECK:       // %bb.0:
 ; CHECK-NEXT:    sshll v0.8h, v0.8b, #0
 ; CHECK-NEXT:    sshr v1.8h, v1.8h, #8
 ; CHECK-NEXT:    srhadd v0.8h, v0.8h, v1.8h
-; CHECK-NEXT:    shl v0.8h, v0.8h, #8
-; CHECK-NEXT:    sshr v0.8h, v0.8h, #8
 ; CHECK-NEXT:    ret
   %x0 = sext <8 x i8> %a0 to <8 x i16>
   %x1 = ashr <8 x i16> %a1, splat (i16 8)
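As a rough check of why the removed `shl #8` / `sshr #8` pair (a sign-extend-in-register from 8 bits) is redundant here, the sketch below models one 16-bit lane with plain integers. It assumes both inputs already fit in an i8, as the `sext` and `ashr ... 8` in the test suggest, and that the halving adds compute floor((a+b)/2) and floor((a+b+1)/2). The function names are hypothetical, and the parts of the test body elided from the diff are not modeled.

```cpp
#include <cstdint>

// Floor of S / 2, written without right-shifting negative values.
int floorHalf(int S) { return S >= 0 ? S / 2 : -((-S + 1) / 2); }

// Models shl #8 followed by sshr #8 on one 16-bit lane:
// sign-extend from the low 8 bits.
int sextInReg8(int V) {
  int Low = static_cast<int>(static_cast<uint16_t>(V) & 0xFFu);
  return Low >= 128 ? Low - 256 : Low;
}

// Exhaustively checks that the truncating and rounding halving adds of two
// i8-range inputs are left unchanged by the sign-extend-in-register.
bool halvingAddsStayInI8() {
  for (int A = -128; A <= 127; ++A) {
    for (int B = -128; B <= 127; ++B) {
      int Floor = floorHalf(A + B);     // shadd-style result
      int Round = floorHalf(A + B + 1); // srhadd-style result
      if (sextInReg8(Floor) != Floor || sextInReg8(Round) != Round)
        return false;
    }
  }
  return true;
}
```

The extension is only a no-op because the result has at least 9 sign bits; for a value such as -129 it would change the lane.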

topperc

LGTM

bjope · 2025-10-02T00:55:56Z

I think it is a bit unclear why this is needed. If the operand is known not to be undef/poison, then I would assume that the freeze is removed. So is it some kind of internal phase order at isel, or are we keeping freeze for some reason?

RKSimon · 2025-10-02T06:45:43Z

It's the DemandedElts mask in particular that makes this useful: we can confirm that the specific elements we care about for analysis are not undef/poison, even if other elements are (e.g. due to vector widening, shuffles etc.), which is what prevents the freeze from being removed entirely.

Another reason is the lack of topological sorting of the DAG, which is going to take years to fix at the current rate of progress :(
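The DemandedElts point can be illustrated with a toy model. This is a sketch only: `Lane`, `laneSignBits` and `signBitsThroughFreeze` are hypothetical names, lanes carry concrete values instead of DAG nodes, and the undef/poison check stands in for the `isGuaranteedNotToBeUndefOrPoison` call in the patch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model, not LLVM's data structures: a lane either holds a known 16-bit
// value or is undef/poison (std::nullopt).
using Lane = std::optional<int16_t>;

// Number of leading bits equal to the sign bit, including the sign bit.
unsigned laneSignBits(int16_t V) {
  uint16_t U = static_cast<uint16_t>(V);
  unsigned Sign = (U >> 15) & 1u;
  unsigned Count = 0;
  for (int Bit = 15; Bit >= 0 && ((U >> Bit) & 1u) == Sign; --Bit)
    ++Count;
  return Count;
}

// Mirrors the shape of the new FREEZE case: look through the freeze only if
// every demanded lane is known not to be undef/poison; otherwise return the
// conservative answer of 1.
unsigned signBitsThroughFreeze(const std::vector<Lane> &Lanes,
                               uint32_t DemandedElts) {
  if (DemandedElts == 0)
    return 1; // nothing demanded: stay conservative
  unsigned Min = 16;
  for (std::size_t I = 0; I < Lanes.size(); ++I) {
    if (!((DemandedElts >> I) & 1u))
      continue; // lane not demanded, so its undef/poison status is irrelevant
    if (!Lanes[I])
      return 1; // a demanded lane may be undef/poison
    Min = std::min(Min, laneSignBits(*Lanes[I]));
  }
  return Min;
}
```

With lanes `{100, -3, poison, 5}`, demanding only lanes 0 and 1 still yields a useful answer, even though the whole-vector freeze cannot be removed because of lane 2.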

bjope · 2025-10-02T07:08:50Z

Ok. That is what I suspected. Could perhaps be worth mentioning the reasoning for it in the commit msg. But LGTM on the patch.

llvm-ci · 2025-10-02T07:39:09Z

LLVM Buildbot has detected a new failure on builder mlir-nvidia running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/19860

Here is the relevant piece of the build log for the reference

Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x59740467df60 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x59740467df60 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

[DAG] Add ComputeNumSignBits(FREEZE(X)) handling

6f29ad5

If X is known to never be undef/poison, skip the freeze and return ComputeNumSignBits(X).

RKSimon requested review from bjope and topperc October 1, 2025 11:12

llvmbot added the backend:AArch64 and llvm:SelectionDAG labels Oct 1, 2025

Merge branch 'main' into dag-computenumsignbits-freeze

2cc31ec

topperc approved these changes Oct 1, 2025

bjope added 2 commits October 2, 2025 02:51

Merge branch 'main' into dag-computenumsignbits-freeze

14588c9

Merge branch 'main' into dag-computenumsignbits-freeze

3e4e735

Merge branch 'main' into dag-computenumsignbits-freeze

b2a61c4

RKSimon enabled auto-merge (squash) October 2, 2025 06:46

RKSimon merged commit e5b8c24 into llvm:main Oct 2, 2025
9 checks passed

RKSimon deleted the dag-computenumsignbits-freeze branch October 2, 2025 07:28

RKSimon mentioned this pull request Oct 2, 2025

[DAG] Add computeKnownBits(FREEZE(X)) handling #161642

Open

mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025

[DAG] Add ComputeNumSignBits(FREEZE(X)) handling (llvm#161507)

539213e

If X is known to never be undef/poison, skip the freeze and return ComputeNumSignBits(X).
