Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TypePromotion] Support positive addition amounts in isSafeWrap. #81690

Merged
merged 1 commit into from
Mar 1, 2024

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Feb 14, 2024

We can support these by changing the sext promotion to -zext(-C) and replacing a sgt check with ugt. Reframing the logic in terms of how the unsigned range are affected.

More comments in the patch.

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 14, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-aarch64

Author: Craig Topper (topperc)

Changes

We can support these by changing the sext promotion to -zext(-C) and replacing a sgt check with ugt. Reframing the logic in terms of how the unsigned range are affected.

More comments in the patch.


Patch is 54.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/81690.diff

11 Files Affected:

  • (modified) llvm/lib/CodeGen/TypePromotion.cpp (+52-57)
  • (modified) llvm/test/CodeGen/AArch64/and-mask-removal.ll (+31-31)
  • (modified) llvm/test/CodeGen/AArch64/lack-of-signed-truncation-check.ll (+36-26)
  • (modified) llvm/test/CodeGen/AArch64/signed-truncation-check.ll (+37-27)
  • (modified) llvm/test/CodeGen/AArch64/typepromotion-overflow.ll (+2-3)
  • (modified) llvm/test/CodeGen/AArch64/typepromotion-signed.ll (+4-3)
  • (modified) llvm/test/CodeGen/RISCV/lack-of-signed-truncation-check.ll (+150-90)
  • (modified) llvm/test/CodeGen/RISCV/signed-truncation-check.ll (+146-60)
  • (modified) llvm/test/CodeGen/RISCV/typepromotion-overflow.ll (+2-3)
  • (modified) llvm/test/Transforms/TypePromotion/ARM/icmps.ll (+4-3)
  • (modified) llvm/test/Transforms/TypePromotion/ARM/wrapping.ll (+6-4)
diff --git a/llvm/lib/CodeGen/TypePromotion.cpp b/llvm/lib/CodeGen/TypePromotion.cpp
index 48ad8de778010e..2116027f22c6bd 100644
--- a/llvm/lib/CodeGen/TypePromotion.cpp
+++ b/llvm/lib/CodeGen/TypePromotion.cpp
@@ -272,64 +272,58 @@ bool TypePromotionImpl::isSink(Value *V) {
 
 /// Return whether this instruction can safely wrap.
 bool TypePromotionImpl::isSafeWrap(Instruction *I) {
-  // We can support a potentially wrapping instruction (I) if:
+  // We can support a potentially wrapping Add/Sub instruction (I) if:
   // - It is only used by an unsigned icmp.
   // - The icmp uses a constant.
-  // - The wrapping value (I) is decreasing, i.e would underflow - wrapping
-  //   around zero to become a larger number than before.
   // - The wrapping instruction (I) also uses a constant.
   //
-  // We can then use the two constants to calculate whether the result would
-  // wrap in respect to itself in the original bitwidth. If it doesn't wrap,
-  // just underflows the range, the icmp would give the same result whether the
-  // result has been truncated or not. We calculate this by:
-  // - Zero extending both constants, if needed, to RegisterBitWidth.
-  // - Take the absolute value of I's constant, adding this to the icmp const.
-  // - Check that this value is not out of range for small type. If it is, it
-  //   means that it has underflowed enough to wrap around the icmp constant.
+  // This a common pattern emitted to check if a value is within a range.
   //
   // For example:
   //
-  // %sub = sub i8 %a, 2
-  // %cmp = icmp ule i8 %sub, 254
+  // %sub = sub i8 %a, C1
+  // %cmp = icmp ule i8 %sub, C2
+  //
+  // or
   //
-  // If %a = 0, %sub = -2 == FE == 254
-  // But if this is evalulated as a i32
-  // %sub = -2 == FF FF FF FE == 4294967294
-  // So the unsigned compares (i8 and i32) would not yield the same result.
+  // %add = add i8 %a, C1
+  // %cmp = icmp ule i8 %add, C2.
   //
-  // Another way to look at it is:
-  // %a - 2 <= 254
-  // %a + 2 <= 254 + 2
-  // %a <= 256
-  // And we can't represent 256 in the i8 format, so we don't support it.
+  // We will treat an add as though it were a subtract by -C1. To promote
+  // the Add/Sub we will zero extend the LHS and the subtracted amount. For Add,
+  // this means we need to negate the constant, zero extend to RegisterBitWidth,
+  // and negate in the larger type.
   //
-  // Whereas:
+  // This will produce a value in the range [-zext(C1), zext(X)-zext(C1)] where
+  // C1 is the subtracted amount. This is either a small unsigned number or a
+  // large unsigned number in the promoted type.
   //
-  // %sub i8 %a, 1
+  // Now we need to correct the compare constant C2. Values >= C1 in the
+  // original add result range have been remapped to large values in the
+  // promoted range. If the compare constant fell into this range we need to
+  // remap it as well. We can do this as -(zext(-C2)).
+  //
+  // For example:
+  //
+  // %sub = sub i8 %a, 2
   // %cmp = icmp ule i8 %sub, 254
   //
-  // If %a = 0, %sub = -1 == FF == 255
-  // As i32:
-  // %sub = -1 == FF FF FF FF == 4294967295
+  // becomes
   //
-  // In this case, the unsigned compare results would be the same and this
-  // would also be true for ult, uge and ugt:
-  // - (255 < 254) == (0xFFFFFFFF < 254) == false
-  // - (255 <= 254) == (0xFFFFFFFF <= 254) == false
-  // - (255 > 254) == (0xFFFFFFFF > 254) == true
-  // - (255 >= 254) == (0xFFFFFFFF >= 254) == true
+  // %zext = zext %a to i32
+  // %sub = sub i32 %zext, 2
+  // %cmp = icmp ule i32 %sub, 4294967294
   //
-  // To demonstrate why we can't handle increasing values:
+  // Another example:
   //
-  // %add = add i8 %a, 2
-  // %cmp = icmp ult i8 %add, 127
+  // %sub = sub i8 %a, 1
+  // %cmp = icmp ule i8 %sub, 254
   //
-  // If %a = 254, %add = 256 == (i8 1)
-  // As i32:
-  // %add = 256
+  // becomes
   //
-  // (1 < 127) != (256 < 127)
+  // %zext = zext %a to i32
+  // %sub = sub i32 %zext, 1
+  // %cmp = icmp ule i32 %sub, 254
 
   unsigned Opc = I->getOpcode();
   if (Opc != Instruction::Add && Opc != Instruction::Sub)
@@ -356,15 +350,10 @@ bool TypePromotionImpl::isSafeWrap(Instruction *I) {
   APInt OverflowConst = cast<ConstantInt>(I->getOperand(1))->getValue();
   if (Opc == Instruction::Sub)
     OverflowConst = -OverflowConst;
-  if (!OverflowConst.isNonPositive())
-    return false;
 
   SafeWrap.insert(I);
 
-  // Using C1 = OverflowConst and C2 = ICmpConst, we can either prove that:
-  //   zext(x) + sext(C1) <u zext(C2)  if C1 < 0 and C1 >s C2
-  //   zext(x) + sext(C1) <u sext(C2)  if C1 < 0 and C1 <=s C2
-  if (OverflowConst.sgt(ICmpConst)) {
+  if (OverflowConst.ugt(ICmpConst)) {
     LLVM_DEBUG(dbgs() << "IR Promotion: Allowing safe overflow for sext "
                       << "const of " << *I << "\n");
     return true;
@@ -487,18 +476,24 @@ void IRPromoter::PromoteTree() {
         continue;
 
       if (auto *Const = dyn_cast<ConstantInt>(Op)) {
-        // For subtract, we don't need to sext the constant. We only put it in
+        // For subtract, we only need to zext the constant. We only put it in
         // SafeWrap because SafeWrap.size() is used elsewhere.
-        // For cmp, we need to sign extend a constant appearing in either
-        // operand. For add, we should only sign extend the RHS.
-        Constant *NewConst =
-            ConstantInt::get(Const->getContext(),
-                             (SafeWrap.contains(I) &&
-                              (I->getOpcode() == Instruction::ICmp || i == 1) &&
-                              I->getOpcode() != Instruction::Sub)
-                                 ? Const->getValue().sext(PromotedWidth)
-                                 : Const->getValue().zext(PromotedWidth));
-        I->setOperand(i, NewConst);
+        // For Add and ICmp we need to find how far the constant is from the
+        // top of its original unsigned range and place it the same distance
+        // from the top of its new unsigned range. We can do this by negating
+        // the constant, zero extending it, then negating in the new type.
+        APInt NewConst;
+        if (SafeWrap.contains(I)) {
+          if (I->getOpcode() == Instruction::ICmp)
+            NewConst = -((-Const->getValue()).zext(PromotedWidth));
+          else if (I->getOpcode() == Instruction::Add && i == 1)
+            NewConst = -((-Const->getValue()).zext(PromotedWidth));
+          else
+            NewConst = Const->getValue().zext(PromotedWidth);
+        } else
+          NewConst = Const->getValue().zext(PromotedWidth);
+
+        I->setOperand(i, ConstantInt::get(Const->getContext(), NewConst));
       } else if (isa<UndefValue>(Op))
         I->setOperand(i, ConstantInt::get(ExtTy, 0));
     }
diff --git a/llvm/test/CodeGen/AArch64/and-mask-removal.ll b/llvm/test/CodeGen/AArch64/and-mask-removal.ll
index 17ff0159701689..4ed482c8f2ba39 100644
--- a/llvm/test/CodeGen/AArch64/and-mask-removal.ll
+++ b/llvm/test/CodeGen/AArch64/and-mask-removal.ll
@@ -65,9 +65,8 @@ if.end:                                           ; preds = %if.then, %entry
 define zeroext i1 @test8_0(i8 zeroext %x)  align 2 {
 ; CHECK-LABEL: test8_0:
 ; CHECK:       ; %bb.0: ; %entry
-; CHECK-NEXT:    add w8, w0, #74
-; CHECK-NEXT:    and w8, w8, #0xff
-; CHECK-NEXT:    cmp w8, #236
+; CHECK-NEXT:    sub w8, w0, #182
+; CHECK-NEXT:    cmn w8, #20
 ; CHECK-NEXT:    cset w0, lo
 ; CHECK-NEXT:    ret
 entry:
@@ -295,20 +294,20 @@ ret_true:
 define zeroext i1 @test16_2(i16 zeroext %x)  align 2 {
 ; CHECK-SD-LABEL: test16_2:
 ; CHECK-SD:       ; %bb.0: ; %entry
-; CHECK-SD-NEXT:    mov w8, #16882 ; =0x41f2
-; CHECK-SD-NEXT:    mov w9, #40700 ; =0x9efc
+; CHECK-SD-NEXT:    mov w8, #-48654 ; =0xffff41f2
+; CHECK-SD-NEXT:    mov w9, #-24836 ; =0xffff9efc
 ; CHECK-SD-NEXT:    add w8, w0, w8
-; CHECK-SD-NEXT:    cmp w9, w8, uxth
-; CHECK-SD-NEXT:    cset w0, hi
+; CHECK-SD-NEXT:    cmp w8, w9
+; CHECK-SD-NEXT:    cset w0, lo
 ; CHECK-SD-NEXT:    ret
 ;
 ; CHECK-GI-LABEL: test16_2:
 ; CHECK-GI:       ; %bb.0: ; %entry
-; CHECK-GI-NEXT:    mov w8, #16882 ; =0x41f2
-; CHECK-GI-NEXT:    mov w9, #40699 ; =0x9efb
+; CHECK-GI-NEXT:    mov w8, #-48654 ; =0xffff41f2
+; CHECK-GI-NEXT:    mov w9, #-24837 ; =0xffff9efb
 ; CHECK-GI-NEXT:    add w8, w0, w8
-; CHECK-GI-NEXT:    cmp w9, w8, uxth
-; CHECK-GI-NEXT:    cset w0, hs
+; CHECK-GI-NEXT:    cmp w8, w9
+; CHECK-GI-NEXT:    cset w0, ls
 ; CHECK-GI-NEXT:    ret
 entry:
   %0 = add i16 %x, 16882
@@ -349,20 +348,20 @@ ret_true:
 define zeroext i1 @test16_4(i16 zeroext %x)  align 2 {
 ; CHECK-SD-LABEL: test16_4:
 ; CHECK-SD:       ; %bb.0: ; %entry
-; CHECK-SD-NEXT:    mov w8, #29985 ; =0x7521
+; CHECK-SD-NEXT:    mov w8, #-35551 ; =0xffff7521
 ; CHECK-SD-NEXT:    mov w9, #15676 ; =0x3d3c
 ; CHECK-SD-NEXT:    add w8, w0, w8
-; CHECK-SD-NEXT:    cmp w9, w8, uxth
-; CHECK-SD-NEXT:    cset w0, lo
+; CHECK-SD-NEXT:    cmp w8, w9
+; CHECK-SD-NEXT:    cset w0, hi
 ; CHECK-SD-NEXT:    ret
 ;
 ; CHECK-GI-LABEL: test16_4:
 ; CHECK-GI:       ; %bb.0: ; %entry
-; CHECK-GI-NEXT:    mov w8, #29985 ; =0x7521
+; CHECK-GI-NEXT:    mov w8, #-35551 ; =0xffff7521
 ; CHECK-GI-NEXT:    mov w9, #15677 ; =0x3d3d
 ; CHECK-GI-NEXT:    add w8, w0, w8
-; CHECK-GI-NEXT:    cmp w9, w8, uxth
-; CHECK-GI-NEXT:    cset w0, ls
+; CHECK-GI-NEXT:    cmp w8, w9
+; CHECK-GI-NEXT:    cset w0, hs
 ; CHECK-GI-NEXT:    ret
 entry:
   %0 = add i16 %x, -35551
@@ -431,20 +430,20 @@ ret_true:
 define zeroext i1 @test16_7(i16 zeroext %x)  align 2 {
 ; CHECK-SD-LABEL: test16_7:
 ; CHECK-SD:       ; %bb.0: ; %entry
-; CHECK-SD-NEXT:    mov w8, #9272 ; =0x2438
-; CHECK-SD-NEXT:    mov w9, #22619 ; =0x585b
+; CHECK-SD-NEXT:    mov w8, #-56264 ; =0xffff2438
+; CHECK-SD-NEXT:    mov w9, #-42917 ; =0xffff585b
 ; CHECK-SD-NEXT:    add w8, w0, w8
-; CHECK-SD-NEXT:    cmp w9, w8, uxth
-; CHECK-SD-NEXT:    cset w0, lo
+; CHECK-SD-NEXT:    cmp w8, w9
+; CHECK-SD-NEXT:    cset w0, hi
 ; CHECK-SD-NEXT:    ret
 ;
 ; CHECK-GI-LABEL: test16_7:
 ; CHECK-GI:       ; %bb.0: ; %entry
-; CHECK-GI-NEXT:    mov w8, #9272 ; =0x2438
-; CHECK-GI-NEXT:    mov w9, #22620 ; =0x585c
+; CHECK-GI-NEXT:    mov w8, #-56264 ; =0xffff2438
+; CHECK-GI-NEXT:    mov w9, #-42916 ; =0xffff585c
 ; CHECK-GI-NEXT:    add w8, w0, w8
-; CHECK-GI-NEXT:    cmp w9, w8, uxth
-; CHECK-GI-NEXT:    cset w0, ls
+; CHECK-GI-NEXT:    cmp w8, w9
+; CHECK-GI-NEXT:    cset w0, hs
 ; CHECK-GI-NEXT:    ret
 entry:
   %0 = add i16 %x, 9272
@@ -508,16 +507,17 @@ define i64 @pr58109(i8 signext %0) {
 define i64 @pr58109b(i8 signext %0, i64 %a, i64 %b) {
 ; CHECK-SD-LABEL: pr58109b:
 ; CHECK-SD:       ; %bb.0:
-; CHECK-SD-NEXT:    add w8, w0, #1
-; CHECK-SD-NEXT:    tst w8, #0xfe
-; CHECK-SD-NEXT:    csel x0, x1, x2, eq
+; CHECK-SD-NEXT:    and w8, w0, #0xff
+; CHECK-SD-NEXT:    sub w8, w8, #255
+; CHECK-SD-NEXT:    cmn w8, #254
+; CHECK-SD-NEXT:    csel x0, x1, x2, lo
 ; CHECK-SD-NEXT:    ret
 ;
 ; CHECK-GI-LABEL: pr58109b:
 ; CHECK-GI:       ; %bb.0:
-; CHECK-GI-NEXT:    add w8, w0, #1
-; CHECK-GI-NEXT:    and w8, w8, #0xff
-; CHECK-GI-NEXT:    cmp w8, #2
+; CHECK-GI-NEXT:    mov w8, #-255 ; =0xffffff01
+; CHECK-GI-NEXT:    add w8, w8, w0, uxtb
+; CHECK-GI-NEXT:    cmn w8, #254
 ; CHECK-GI-NEXT:    csel x0, x1, x2, lo
 ; CHECK-GI-NEXT:    ret
   %2 = add i8 %0, 1
diff --git a/llvm/test/CodeGen/AArch64/lack-of-signed-truncation-check.ll b/llvm/test/CodeGen/AArch64/lack-of-signed-truncation-check.ll
index 56a18b00a974cb..8e464d16dd90c5 100644
--- a/llvm/test/CodeGen/AArch64/lack-of-signed-truncation-check.ll
+++ b/llvm/test/CodeGen/AArch64/lack-of-signed-truncation-check.ll
@@ -187,10 +187,11 @@ define i1 @add_ulecmp_i16_i8(i16 %x) nounwind {
 define i1 @add_ugecmp_i16_i8(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_i16_i8:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    sxtb w8, w0
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, w0, uxth
-; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-65281 // =0xffff00ff
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
   %tmp1 = icmp uge i16 %tmp0, 256 ; 1U << 8
@@ -256,10 +257,11 @@ define i1 @add_ugecmp_i64_i8(i64 %x) nounwind {
 define i1 @add_ugtcmp_i16_i8(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugtcmp_i16_i8:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    sxtb w8, w0
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, w0, uxth
-; CHECK-NEXT:    cset w0, ne
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-65281 // =0xffff00ff
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
   %tmp1 = icmp ugt i16 %tmp0, 255 ; (1U << 8) - 1
@@ -301,9 +303,10 @@ define i1 @add_ugecmp_bad_i16_i8_cmp(i16 %x, i16 %y) nounwind {
 define i1 @add_ugecmp_bad_i8_i16(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_bad_i8_i16:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #128
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #127
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-65409 // =0xffff007f
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
@@ -315,9 +318,10 @@ define i1 @add_ugecmp_bad_i8_i16(i16 %x) nounwind {
 define i1 @add_ugecmp_bad_i16_i8_c0notpoweroftwo(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_bad_i16_i8_c0notpoweroftwo:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #192
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #255
+; CHECK-NEXT:    mov w8, #-65344 // =0xffff00c0
+; CHECK-NEXT:    mov w9, #-65281 // =0xffff00ff
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 192 ; (1U << (8-1)) + (1U << (8-1-1))
@@ -329,9 +333,10 @@ define i1 @add_ugecmp_bad_i16_i8_c0notpoweroftwo(i16 %x) nounwind {
 define i1 @add_ugecmp_bad_i16_i8_c1notpoweroftwo(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_bad_i16_i8_c1notpoweroftwo:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #128
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #767
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-64769 // =0xffff02ff
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
@@ -343,9 +348,10 @@ define i1 @add_ugecmp_bad_i16_i8_c1notpoweroftwo(i16 %x) nounwind {
 define i1 @add_ugecmp_bad_i16_i8_magic(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_bad_i16_i8_magic:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #64
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #255
+; CHECK-NEXT:    mov w8, #-65472 // =0xffff0040
+; CHECK-NEXT:    mov w9, #-65281 // =0xffff00ff
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 64 ; 1U << (8-1-1)
@@ -357,9 +363,10 @@ define i1 @add_ugecmp_bad_i16_i8_magic(i16 %x) nounwind {
 define i1 @add_ugecmp_bad_i16_i4(i16 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_bad_i16_i4:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #8
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #15
+; CHECK-NEXT:    mov w8, #-65528 // =0xffff0008
+; CHECK-NEXT:    mov w9, #-65521 // =0xffff000f
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 8 ; 1U << (4-1)
@@ -371,9 +378,12 @@ define i1 @add_ugecmp_bad_i16_i4(i16 %x) nounwind {
 define i1 @add_ugecmp_bad_i24_i8(i24 %x) nounwind {
 ; CHECK-LABEL: add_ugecmp_bad_i24_i8:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #128
-; CHECK-NEXT:    and w8, w8, #0xffffff
-; CHECK-NEXT:    cmp w8, #255
+; CHECK-NEXT:    mov w8, #128 // =0x80
+; CHECK-NEXT:    and w9, w0, #0xffffff
+; CHECK-NEXT:    movk w8, #65280, lsl #16
+; CHECK-NEXT:    add w8, w9, w8
+; CHECK-NEXT:    mov w9, #-16776961 // =0xff0000ff
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, hi
 ; CHECK-NEXT:    ret
   %tmp0 = add i24 %x, 128 ; 1U << (8-1)
diff --git a/llvm/test/CodeGen/AArch64/signed-truncation-check.ll b/llvm/test/CodeGen/AArch64/signed-truncation-check.ll
index ab42e6463feeed..6a5795ff337dd7 100644
--- a/llvm/test/CodeGen/AArch64/signed-truncation-check.ll
+++ b/llvm/test/CodeGen/AArch64/signed-truncation-check.ll
@@ -200,10 +200,11 @@ define i1 @add_ugtcmp_i16_i8(i16 %x) nounwind {
 define i1 @add_ultcmp_i16_i8(i16 %x) nounwind {
 ; CHECK-LABEL: add_ultcmp_i16_i8:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    sxtb w8, w0
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, w0, uxth
-; CHECK-NEXT:    cset w0, eq
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-65280 // =0xffff0100
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    cset w0, lo
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
   %tmp1 = icmp ult i16 %tmp0, 256 ; 1U << 8
@@ -269,10 +270,11 @@ define i1 @add_ultcmp_i64_i8(i64 %x) nounwind {
 define i1 @add_ulecmp_i16_i8(i16 %x) nounwind {
 ; CHECK-LABEL: add_ulecmp_i16_i8:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    sxtb w8, w0
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, w0, uxth
-; CHECK-NEXT:    cset w0, eq
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-65280 // =0xffff0100
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    cset w0, lo
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
   %tmp1 = icmp ule i16 %tmp0, 255 ; (1U << 8) - 1
@@ -314,9 +316,9 @@ define i1 @add_ultcmp_bad_i16_i8_cmp(i16 %x, i16 %y) nounwind {
 define i1 @add_ultcmp_bad_i8_i16(i16 %x) nounwind {
 ; CHECK-LABEL: add_ultcmp_bad_i8_i16:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    and w8, w0, #0xffff
-; CHECK-NEXT:    add w8, w8, #128
-; CHECK-NEXT:    lsr w0, w8, #16
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    cmn w8, w0, uxth
+; CHECK-NEXT:    cset w0, hs
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
   %tmp1 = icmp ult i16 %tmp0, 128 ; 1U << (8-1)
@@ -327,9 +329,10 @@ define i1 @add_ultcmp_bad_i8_i16(i16 %x) nounwind {
 define i1 @add_ultcmp_bad_i16_i8_c0notpoweroftwo(i16 %x) nounwind {
 ; CHECK-LABEL: add_ultcmp_bad_i16_i8_c0notpoweroftwo:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #192
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #256
+; CHECK-NEXT:    mov w8, #-65344 // =0xffff00c0
+; CHECK-NEXT:    mov w9, #-65280 // =0xffff0100
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, lo
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 192 ; (1U << (8-1)) + (1U << (8-1-1))
@@ -341,9 +344,10 @@ define i1 @add_ultcmp_bad_i16_i8_c0notpoweroftwo(i16 %x) nounwind {
 define i1 @add_ultcmp_bad_i16_i8_c1notpoweroftwo(i16 %x) nounwind {
 ; CHECK-LABEL: add_ultcmp_bad_i16_i8_c1notpoweroftwo:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #128
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #768
+; CHECK-NEXT:    mov w8, #-65408 // =0xffff0080
+; CHECK-NEXT:    mov w9, #-64768 // =0xffff0300
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, lo
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 128 ; 1U << (8-1)
@@ -355,9 +359,10 @@ define i1 @add_ultcmp_bad_i16_i8_c1notpoweroftwo(i16 %x) nounwind {
 define i1 @add_ultcmp_bad_i16_i8_magic(i16 %x) nounwind {
 ; CHECK-LABEL: add_ultcmp_bad_i16_i8_magic:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #64
-; CHECK-NEXT:    and w8, w8, #0xffff
-; CHECK-NEXT:    cmp w8, #256
+; CHECK-NEXT:    mov w8, #-65472 // =0xffff0040
+; CHECK-NEXT:    mov w9, #-65280 // =0xffff0100
+; CHECK-NEXT:    add w8, w8, w0, uxth
+; CHECK-NEXT:    cmp w8, w9
 ; CHECK-NEXT:    cset w0, lo
 ; CHECK-NEXT:    ret
   %tmp0 = add i16 %x, 64 ; 1U << (8-1-1)
@@ -369,9 +374,10 @@ define i1 @add_ultcmp_bad_i16_i8_magic(i16 %x) nounwind {
 define i1 @add_ultcmp_bad_i16_i4(i16 %x) nounwind {
 ; CHECK-LABEL: add_ultcmp_bad_i16_i4:
 ; CHECK:       // %bb.0:
-; CHECK-NEXT:    add w8, w0, #8
-; CHEC...
[truncated]

@AtariDreams
Copy link
Contributor

Given how this seems to increase the number of generated instructions, I think this should be done in cases where if OverflowConst.isNonPositive() holds false, instead of doing it to all numbers.

@topperc
Copy link
Collaborator Author

topperc commented Feb 14, 2024

Given how this seems to increase the number of generated instructions, I think this should be done in cases where if OverflowConst.isNonPositive() holds false, instead of doing it to all numbers.

A lot of the signed-truncation-check and lack-of-signed-truncation-check test cases are not the output that InstCombine produces. So I'm not sure increases there reflect real behavior.

I don't think the increases are directly related to the sign of the constants. We see increases on these test files from negative constants too. See #81574 If anything we should be costing the constants based on target specific cost like getImmCost not making decisions based on the sign.

@sparker-arm
Copy link
Contributor

A lot of the signed-truncation-check and lack-of-signed-truncation-check test cases are not the output that InstCombine produces

Then, to allay fears, would you mind adding a test run that first processes the file with instcombine before codegen?

If anything we should be costing the constants based on target specific cost like getImmCost

There's already some basic heuristics, at the end of TryToPromote, so tracking increased costs of immediates SGTM.

@topperc topperc marked this pull request as draft February 15, 2024 00:32
We can support these by changing the sext promotion to -zext(-C)
and replacing a sgt check with ugt. Reframing the logic in terms of
how the unsigned range are affected.

More comments in the patch.
@topperc
Copy link
Collaborator Author

topperc commented Feb 29, 2024

I've updated the patch to check isLegalAddImmediate for the constant we that promotion will generate. If it's not legal we continue treating the add as a source instead of promoting it.

Copy link
Contributor

@sparker-arm sparker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for making the change.

@topperc topperc marked this pull request as ready for review February 29, 2024 18:50
@topperc topperc merged commit 0813b90 into llvm:main Mar 1, 2024
6 checks passed
@topperc topperc deleted the pr/typeprom-positive branch March 1, 2024 17:17
topperc added a commit that referenced this pull request Mar 11, 2024
…ap. (#81690)"

This reverts commit 0813b90.

Fixes miscompile reported in #84718.
topperc added a commit that referenced this pull request Mar 11, 2024
…Wrap. (#81690)"

With special case with Add constant is 0.

Original message:
We can support these by changing the sext promotion to -zext(-C) and
replacing a sgt check with ugt. Reframing the logic in terms of how the
unsigned range are affected. More comments in the patch.

The new cases check isLegalAddImmediate to avoid some
regressions in lit tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants