Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ValueTracking] Infer signedness from dom conditions #72049

Closed
wants to merge 1 commit into from

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Nov 12, 2023

This patch infers the signedness of value based on dominating conditions. After #71534 and this patch, we can drop zext -> zext nneg transform in RISCVCodeGenPrepare:

// Look for an opportunity to infer nneg on a zext if we can determine that
// the sign bit of X is zero via a dominating condition. This often occurs
// with widened induction variables.
if (isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
Constant::getNullValue(Src->getType()), &ZExt,
*DL).value_or(false)) {
ZExt.setNonNeg(true);
++NumZExtToSExt;
return true;
}

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 12, 2023

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch infers the signedness of value based on dominating conditions. After #71534 and this patch, we can drop zext -> zext nneg transform in RISCVCodeGenPrepare:

// Look for an opportunity to infer nneg on a zext if we can determine that
// the sign bit of X is zero via a dominating condition. This often occurs
// with widened induction variables.
if (isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
Constant::getNullValue(Src->getType()), &ZExt,
*DL).value_or(false)) {
ZExt.setNonNeg(true);
++NumZExtToSExt;
return true;
}


Patch is 48.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72049.diff

13 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+16)
  • (modified) llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/sub-of-negatible.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+40-40)
  • (modified) llvm/test/Transforms/LoopVectorize/if-conversion-nest.ll (+6-6)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction.ll (+21-21)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-loop-unpredicated-body-scalar-tail.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pixel-splat.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/simplifycfg-late.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll (+2-2)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 923ff043fe71057..4dac9ca016ff608 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -290,6 +290,10 @@ bool llvm::isKnownNonNegative(const Value *V, const DataLayout &DL,
                               unsigned Depth, AssumptionCache *AC,
                               const Instruction *CxtI, const DominatorTree *DT,
                               bool UseInstrInfo) {
+  if (isImpliedByDomCondition(ICmpInst::ICMP_SGE, V,
+                              Constant::getNullValue(V->getType()), CxtI, DL)
+          .value_or(false))
+    return true;
   KnownBits Known = computeKnownBits(V, DL, Depth, AC, CxtI, DT, UseInstrInfo);
   return Known.isNonNegative();
 }
@@ -300,6 +304,10 @@ bool llvm::isKnownPositive(const Value *V, const DataLayout &DL, unsigned Depth,
   if (auto *CI = dyn_cast<ConstantInt>(V))
     return CI->getValue().isStrictlyPositive();
 
+  if (isImpliedByDomCondition(ICmpInst::ICMP_SGT, V,
+                              Constant::getNullValue(V->getType()), CxtI, DL)
+          .value_or(false))
+    return true;
   // TODO: We'd doing two recursive queries here.  We should factor this such
   // that only a single query is needed.
   return isKnownNonNegative(V, DL, Depth, AC, CxtI, DT, UseInstrInfo) &&
@@ -309,6 +317,10 @@ bool llvm::isKnownPositive(const Value *V, const DataLayout &DL, unsigned Depth,
 bool llvm::isKnownNegative(const Value *V, const DataLayout &DL, unsigned Depth,
                            AssumptionCache *AC, const Instruction *CxtI,
                            const DominatorTree *DT, bool UseInstrInfo) {
+  if (isImpliedByDomCondition(ICmpInst::ICMP_SLT, V,
+                              Constant::getNullValue(V->getType()), CxtI, DL)
+          .value_or(false))
+    return true;
   KnownBits Known = computeKnownBits(V, DL, Depth, AC, CxtI, DT, UseInstrInfo);
   return Known.isNegative();
 }
@@ -3197,6 +3209,10 @@ static bool isKnownNonEqual(const Value *V1, const Value *V2, unsigned Depth,
   if (isNonEqualSelect(V1, V2, Depth, Q) || isNonEqualSelect(V2, V1, Depth, Q))
     return true;
 
+  if (isImpliedByDomCondition(ICmpInst::ICMP_NE, V1, V2, Q.CxtI, Q.DL)
+          .value_or(false))
+    return true;
+
   return false;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
index e924e17baad0075..1b02eed8bf97d21 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible-inseltpoison.ll
@@ -865,7 +865,7 @@ define i16 @negation_of_zeroext_of_nonnegative(i8 %x) {
 ; CHECK-NEXT:    [[T1:%.*]] = icmp sgt i8 [[T0]], -1
 ; CHECK-NEXT:    br i1 [[T1]], label [[NONNEG_BB:%.*]], label [[NEG_BB:%.*]]
 ; CHECK:       nonneg_bb:
-; CHECK-NEXT:    [[T2:%.*]] = zext i8 [[T0]] to i16
+; CHECK-NEXT:    [[T2:%.*]] = zext nneg i8 [[T0]] to i16
 ; CHECK-NEXT:    [[T3:%.*]] = sub nsw i16 0, [[T2]]
 ; CHECK-NEXT:    ret i16 [[T3]]
 ; CHECK:       neg_bb:
@@ -889,7 +889,7 @@ define i16 @negation_of_zeroext_of_positive(i8 %x) {
 ; CHECK-NEXT:    [[T1:%.*]] = icmp sgt i8 [[T0]], 0
 ; CHECK-NEXT:    br i1 [[T1]], label [[NONNEG_BB:%.*]], label [[NEG_BB:%.*]]
 ; CHECK:       nonneg_bb:
-; CHECK-NEXT:    [[T2:%.*]] = zext i8 [[T0]] to i16
+; CHECK-NEXT:    [[T2:%.*]] = zext nneg i8 [[T0]] to i16
 ; CHECK-NEXT:    [[T3:%.*]] = sub nsw i16 0, [[T2]]
 ; CHECK-NEXT:    ret i16 [[T3]]
 ; CHECK:       neg_bb:
@@ -961,7 +961,7 @@ define i16 @negation_of_signext_of_nonnegative__wrong_cast(i8 %x) {
 ; CHECK-NEXT:    [[T1:%.*]] = icmp sgt i8 [[T0]], -1
 ; CHECK-NEXT:    br i1 [[T1]], label [[NONNEG_BB:%.*]], label [[NEG_BB:%.*]]
 ; CHECK:       nonneg_bb:
-; CHECK-NEXT:    [[T2:%.*]] = sext i8 [[T0]] to i16
+; CHECK-NEXT:    [[T2:%.*]] = zext nneg i8 [[T0]] to i16
 ; CHECK-NEXT:    [[T3:%.*]] = sub nsw i16 0, [[T2]]
 ; CHECK-NEXT:    ret i16 [[T3]]
 ; CHECK:       neg_bb:
diff --git a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
index 2aed639da6dd7f0..23e742db94e3a87 100644
--- a/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
+++ b/llvm/test/Transforms/InstCombine/sub-of-negatible.ll
@@ -889,7 +889,7 @@ define i16 @negation_of_zeroext_of_nonnegative(i8 %x) {
 ; CHECK-NEXT:    [[T1:%.*]] = icmp sgt i8 [[T0]], -1
 ; CHECK-NEXT:    br i1 [[T1]], label [[NONNEG_BB:%.*]], label [[NEG_BB:%.*]]
 ; CHECK:       nonneg_bb:
-; CHECK-NEXT:    [[T2:%.*]] = zext i8 [[T0]] to i16
+; CHECK-NEXT:    [[T2:%.*]] = zext nneg i8 [[T0]] to i16
 ; CHECK-NEXT:    [[T3:%.*]] = sub nsw i16 0, [[T2]]
 ; CHECK-NEXT:    ret i16 [[T3]]
 ; CHECK:       neg_bb:
@@ -913,7 +913,7 @@ define i16 @negation_of_zeroext_of_positive(i8 %x) {
 ; CHECK-NEXT:    [[T1:%.*]] = icmp sgt i8 [[T0]], 0
 ; CHECK-NEXT:    br i1 [[T1]], label [[NONNEG_BB:%.*]], label [[NEG_BB:%.*]]
 ; CHECK:       nonneg_bb:
-; CHECK-NEXT:    [[T2:%.*]] = zext i8 [[T0]] to i16
+; CHECK-NEXT:    [[T2:%.*]] = zext nneg i8 [[T0]] to i16
 ; CHECK-NEXT:    [[T3:%.*]] = sub nsw i16 0, [[T2]]
 ; CHECK-NEXT:    ret i16 [[T3]]
 ; CHECK:       neg_bb:
@@ -985,7 +985,7 @@ define i16 @negation_of_signext_of_nonnegative__wrong_cast(i8 %x) {
 ; CHECK-NEXT:    [[T1:%.*]] = icmp sgt i8 [[T0]], -1
 ; CHECK-NEXT:    br i1 [[T1]], label [[NONNEG_BB:%.*]], label [[NEG_BB:%.*]]
 ; CHECK:       nonneg_bb:
-; CHECK-NEXT:    [[T2:%.*]] = sext i8 [[T0]] to i16
+; CHECK-NEXT:    [[T2:%.*]] = zext nneg i8 [[T0]] to i16
 ; CHECK-NEXT:    [[T3:%.*]] = sub nsw i16 0, [[T2]]
 ; CHECK-NEXT:    ret i16 [[T3]]
 ; CHECK:       neg_bb:
diff --git a/llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll b/llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll
index 2c78b7208c19ff2..baeaef0b67b041c 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll
@@ -20,11 +20,11 @@ define void @fp_iv_loop1(ptr noalias nocapture %A, i32 %N) #0 {
 ; AUTO_VEC-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; AUTO_VEC-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; AUTO_VEC:       for.body.preheader:
-; AUTO_VEC-NEXT:    [[ZEXT:%.*]] = zext i32 [[N]] to i64
+; AUTO_VEC-NEXT:    [[ZEXT:%.*]] = zext nneg i32 [[N]] to i64
 ; AUTO_VEC-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 32
 ; AUTO_VEC-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
 ; AUTO_VEC:       vector.ph:
-; AUTO_VEC-NEXT:    [[N_VEC:%.*]] = and i64 [[ZEXT]], 4294967264
+; AUTO_VEC-NEXT:    [[N_VEC:%.*]] = and i64 [[ZEXT]], 2147483616
 ; AUTO_VEC-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; AUTO_VEC-NEXT:    [[TMP0:%.*]] = fmul fast float [[DOTCAST]], 5.000000e-01
 ; AUTO_VEC-NEXT:    [[IND_END:%.*]] = fadd fast float [[TMP0]], 1.000000e+00
@@ -103,12 +103,12 @@ define void @fp_iv_loop2(ptr noalias nocapture %A, i32 %N) {
 ; AUTO_VEC-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; AUTO_VEC-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; AUTO_VEC:       for.body.preheader:
-; AUTO_VEC-NEXT:    [[ZEXT:%.*]] = zext i32 [[N]] to i64
+; AUTO_VEC-NEXT:    [[ZEXT:%.*]] = zext nneg i32 [[N]] to i64
 ; AUTO_VEC-NEXT:    [[XTRAITER:%.*]] = and i64 [[ZEXT]], 7
 ; AUTO_VEC-NEXT:    [[TMP0:%.*]] = icmp ult i32 [[N]], 8
 ; AUTO_VEC-NEXT:    br i1 [[TMP0]], label [[FOR_END_LOOPEXIT_UNR_LCSSA:%.*]], label [[FOR_BODY_PREHEADER_NEW:%.*]]
 ; AUTO_VEC:       for.body.preheader.new:
-; AUTO_VEC-NEXT:    [[UNROLL_ITER:%.*]] = and i64 [[ZEXT]], 4294967288
+; AUTO_VEC-NEXT:    [[UNROLL_ITER:%.*]] = and i64 [[ZEXT]], 2147483640
 ; AUTO_VEC-NEXT:    br label [[FOR_BODY:%.*]]
 ; AUTO_VEC:       for.body:
 ; AUTO_VEC-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ 0, [[FOR_BODY_PREHEADER_NEW]] ], [ [[INDVARS_IV_NEXT_7:%.*]], [[FOR_BODY]] ]
diff --git a/llvm/test/Transforms/LoopVectorize/float-induction.ll b/llvm/test/Transforms/LoopVectorize/float-induction.ll
index 44f3c3db26d9514..da2d8eb1b41666d 100644
--- a/llvm/test/Transforms/LoopVectorize/float-induction.ll
+++ b/llvm/test/Transforms/LoopVectorize/float-induction.ll
@@ -24,11 +24,11 @@ define void @fp_iv_loop1_fast_FMF(float %init, ptr noalias nocapture %A, i32 %N)
 ; VEC4_INTERL1-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC4_INTERL1:       for.body.lr.ph:
 ; VEC4_INTERL1-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC4_INTERL1-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC4_INTERL1-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC4_INTERL1-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 4
 ; VEC4_INTERL1-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC4_INTERL1:       vector.ph:
-; VEC4_INTERL1-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967292
+; VEC4_INTERL1-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483644
 ; VEC4_INTERL1-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC4_INTERL1-NEXT:    [[TMP1:%.*]] = fmul fast float [[FPINC]], [[DOTCAST]]
 ; VEC4_INTERL1-NEXT:    [[IND_END:%.*]] = fsub fast float [[INIT:%.*]], [[TMP1]]
@@ -79,11 +79,11 @@ define void @fp_iv_loop1_fast_FMF(float %init, ptr noalias nocapture %A, i32 %N)
 ; VEC4_INTERL2-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC4_INTERL2:       for.body.lr.ph:
 ; VEC4_INTERL2-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC4_INTERL2-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC4_INTERL2-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC4_INTERL2-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 8
 ; VEC4_INTERL2-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC4_INTERL2:       vector.ph:
-; VEC4_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967288
+; VEC4_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483640
 ; VEC4_INTERL2-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC4_INTERL2-NEXT:    [[TMP1:%.*]] = fmul fast float [[FPINC]], [[DOTCAST]]
 ; VEC4_INTERL2-NEXT:    [[IND_END:%.*]] = fsub fast float [[INIT:%.*]], [[TMP1]]
@@ -137,11 +137,11 @@ define void @fp_iv_loop1_fast_FMF(float %init, ptr noalias nocapture %A, i32 %N)
 ; VEC1_INTERL2-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC1_INTERL2:       for.body.lr.ph:
 ; VEC1_INTERL2-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC1_INTERL2-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC1_INTERL2-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC1_INTERL2-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[N]], 1
 ; VEC1_INTERL2-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC1_INTERL2:       vector.ph:
-; VEC1_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967294
+; VEC1_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483646
 ; VEC1_INTERL2-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC1_INTERL2-NEXT:    [[TMP1:%.*]] = fmul fast float [[FPINC]], [[DOTCAST]]
 ; VEC1_INTERL2-NEXT:    [[IND_END:%.*]] = fsub fast float [[INIT:%.*]], [[TMP1]]
@@ -188,11 +188,11 @@ define void @fp_iv_loop1_fast_FMF(float %init, ptr noalias nocapture %A, i32 %N)
 ; VEC2_INTERL1_PRED_STORE-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC2_INTERL1_PRED_STORE:       for.body.lr.ph:
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[N]], 1
 ; VEC2_INTERL1_PRED_STORE-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC2_INTERL1_PRED_STORE:       vector.ph:
-; VEC2_INTERL1_PRED_STORE-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967294
+; VEC2_INTERL1_PRED_STORE-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483646
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP1:%.*]] = fmul fast float [[FPINC]], [[DOTCAST]]
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[IND_END:%.*]] = fsub fast float [[INIT:%.*]], [[TMP1]]
@@ -271,11 +271,11 @@ define void @fp_iv_loop1_reassoc_FMF(float %init, ptr noalias nocapture %A, i32
 ; VEC4_INTERL1-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC4_INTERL1:       for.body.lr.ph:
 ; VEC4_INTERL1-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC4_INTERL1-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC4_INTERL1-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC4_INTERL1-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 4
 ; VEC4_INTERL1-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC4_INTERL1:       vector.ph:
-; VEC4_INTERL1-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967292
+; VEC4_INTERL1-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483644
 ; VEC4_INTERL1-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC4_INTERL1-NEXT:    [[TMP1:%.*]] = fmul reassoc float [[FPINC]], [[DOTCAST]]
 ; VEC4_INTERL1-NEXT:    [[IND_END:%.*]] = fsub reassoc float [[INIT:%.*]], [[TMP1]]
@@ -326,11 +326,11 @@ define void @fp_iv_loop1_reassoc_FMF(float %init, ptr noalias nocapture %A, i32
 ; VEC4_INTERL2-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC4_INTERL2:       for.body.lr.ph:
 ; VEC4_INTERL2-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC4_INTERL2-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC4_INTERL2-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC4_INTERL2-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 8
 ; VEC4_INTERL2-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC4_INTERL2:       vector.ph:
-; VEC4_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967288
+; VEC4_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483640
 ; VEC4_INTERL2-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC4_INTERL2-NEXT:    [[TMP1:%.*]] = fmul reassoc float [[FPINC]], [[DOTCAST]]
 ; VEC4_INTERL2-NEXT:    [[IND_END:%.*]] = fsub reassoc float [[INIT:%.*]], [[TMP1]]
@@ -384,11 +384,11 @@ define void @fp_iv_loop1_reassoc_FMF(float %init, ptr noalias nocapture %A, i32
 ; VEC1_INTERL2-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC1_INTERL2:       for.body.lr.ph:
 ; VEC1_INTERL2-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC1_INTERL2-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC1_INTERL2-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC1_INTERL2-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[N]], 1
 ; VEC1_INTERL2-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC1_INTERL2:       vector.ph:
-; VEC1_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967294
+; VEC1_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483646
 ; VEC1_INTERL2-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC1_INTERL2-NEXT:    [[TMP1:%.*]] = fmul reassoc float [[FPINC]], [[DOTCAST]]
 ; VEC1_INTERL2-NEXT:    [[IND_END:%.*]] = fsub reassoc float [[INIT:%.*]], [[TMP1]]
@@ -437,11 +437,11 @@ define void @fp_iv_loop1_reassoc_FMF(float %init, ptr noalias nocapture %A, i32
 ; VEC2_INTERL1_PRED_STORE-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_LR_PH:%.*]], label [[FOR_END:%.*]]
 ; VEC2_INTERL1_PRED_STORE:       for.body.lr.ph:
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[FPINC:%.*]] = load float, ptr @fp_inc, align 4
-; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[N]], 1
 ; VEC2_INTERL1_PRED_STORE-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC2_INTERL1_PRED_STORE:       vector.ph:
-; VEC2_INTERL1_PRED_STORE-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967294
+; VEC2_INTERL1_PRED_STORE-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483646
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP1:%.*]] = fmul reassoc float [[FPINC]], [[DOTCAST]]
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[IND_END:%.*]] = fsub reassoc float [[INIT:%.*]], [[TMP1]]
@@ -521,11 +521,11 @@ define void @fp_iv_loop2(float %init, ptr noalias nocapture %A, i32 %N) #0 {
 ; VEC4_INTERL1-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; VEC4_INTERL1-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; VEC4_INTERL1:       for.body.preheader:
-; VEC4_INTERL1-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC4_INTERL1-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC4_INTERL1-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 4
 ; VEC4_INTERL1-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC4_INTERL1:       vector.ph:
-; VEC4_INTERL1-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967292
+; VEC4_INTERL1-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483644
 ; VEC4_INTERL1-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC4_INTERL1-NEXT:    [[TMP1:%.*]] = fmul fast float [[DOTCAST]], 5.000000e-01
 ; VEC4_INTERL1-NEXT:    [[IND_END:%.*]] = fadd fast float [[TMP1]], [[INIT:%.*]]
@@ -569,11 +569,11 @@ define void @fp_iv_loop2(float %init, ptr noalias nocapture %A, i32 %N) #0 {
 ; VEC4_INTERL2-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; VEC4_INTERL2-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; VEC4_INTERL2:       for.body.preheader:
-; VEC4_INTERL2-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC4_INTERL2-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC4_INTERL2-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 8
 ; VEC4_INTERL2-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC4_INTERL2:       vector.ph:
-; VEC4_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967288
+; VEC4_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483640
 ; VEC4_INTERL2-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC4_INTERL2-NEXT:    [[TMP1:%.*]] = fmul fast float [[DOTCAST]], 5.000000e-01
 ; VEC4_INTERL2-NEXT:    [[IND_END:%.*]] = fadd fast float [[TMP1]], [[INIT:%.*]]
@@ -620,11 +620,11 @@ define void @fp_iv_loop2(float %init, ptr noalias nocapture %A, i32 %N) #0 {
 ; VEC1_INTERL2-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; VEC1_INTERL2-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; VEC1_INTERL2:       for.body.preheader:
-; VEC1_INTERL2-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC1_INTERL2-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC1_INTERL2-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[N]], 1
 ; VEC1_INTERL2-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; VEC1_INTERL2:       vector.ph:
-; VEC1_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 4294967294
+; VEC1_INTERL2-NEXT:    [[N_VEC:%.*]] = and i64 [[TMP0]], 2147483646
 ; VEC1_INTERL2-NEXT:    [[DOTCAST:%.*]] = sitofp i64 [[N_VEC]] to float
 ; VEC1_INTERL2-NEXT:    [[TMP1:%.*]] = fmul fast float [[DOTCAST]], 5.000000e-01
 ; VEC1_INTERL2-NEXT:    [[IND_END:%.*]] = fadd fast float [[TMP1]], [[INIT:%.*]]
@@ -670,11 +670,11 @@ define void @fp_iv_loop2(float %init, ptr noalias nocapture %A, i32 %N) #0 {
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
 ; VEC2_INTERL1_PRED_STORE-NEXT:    br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
 ; VEC2_INTERL1_PRED_STORE:       for.body.preheader:
-; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP0:%.*]] = zext i32 [[N]] to i64
+; VEC2_INTERL1_PRED_STORE-NEXT:    [[TMP0:%.*]] = zext nneg i32 [[N]] to i64
 ; VEC2_INTERL1_PRED_STORE-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp eq i32 [[N]], 1
 ; VEC2_INTERL1_PRED_STORE-NEXT:    br i1 [[MIN_ITERS_CHECK]], label [[FOR_BODY:%.*]], la...
[truncated]

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done in CVP and IPSCCP instead.

@dtcxzyw dtcxzyw closed this Nov 12, 2023
dtcxzyw added a commit that referenced this pull request Nov 13, 2023
This patch infers `nneg` flags for existing zext instructions in CVP.
After #71534 and this patch, we
can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`:


https://github.com/llvm/llvm-project/blob/40671bbdefb6ff83e2685576a3cb041b62f25bbe/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp#L74-L83

This is an alternative to #72049.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
This patch infers `nneg` flags for existing zext instructions in CVP.
After llvm#71534 and this patch, we
can drop `zext -> zext nneg` transform in `RISCVCodeGenPrepare`:


https://github.com/llvm/llvm-project/blob/40671bbdefb6ff83e2685576a3cb041b62f25bbe/llvm/lib/Target/RISCV/RISCVCodeGenPrepare.cpp#L74-L83

This is an alternative to llvm#72049.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants