
[InstCombine] Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 #82280

Merged (2 commits) on Mar 11, 2024

Conversation

elhewaty
Contributor

@elhewaty elhewaty commented Feb 19, 2024

@llvmbot
Collaborator

llvmbot commented Feb 19, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (elhewaty)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/82280.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (+17)
  • (modified) llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll (+71)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index 5266808c5abab4..7f59ad1d61d91e 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -2139,6 +2139,23 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
       }
     }
 
+    // usub_sat((sub nuw C1, A), C2) -> usub_sat(C1 - C2, A) if C2 u< C1
+    // usub_sat((sub nuw C1, A), C2) -> 0 otherwise
+    const APInt *C1, *C2;
+    Value *A;
+    if (IID == Intrinsic::usub_sat &&
+        match(Arg0, m_OneUse(m_NUWSub(m_APInt(C1), m_Value(A)))) &&
+        match(Arg1, m_APInt(C2)) && SI->hasOneUse()) {
+
+      if (C2->ult(*C1)) {
+        auto *New = Builder.CreateBinaryIntrinsic(
+            Intrinsic::usub_sat, ConstantInt::get(A->getType(), *C1 - *C2), A);
+        return replaceInstUsesWith(*SI, New);
+      } else {
+        return replaceInstUsesWith(*SI, ConstantInt::get(Ty, 0));
+      }
+    }
+
     // ssub.sat(X, C) -> sadd.sat(X, -C) if C != MIN
     Constant *C;
     if (IID == Intrinsic::ssub_sat && match(Arg1, m_Constant(C)) &&
diff --git a/llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll b/llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll
index 5cece931b8d987..4d60a7cc5a54bf 100644
--- a/llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll
+++ b/llvm/test/Transforms/InstCombine/unsigned_saturated_sub.ll
@@ -7,6 +7,77 @@
 declare void @use(i64)
 declare void @usei32(i32)
 declare void @usei1(i1)
+declare i32 @llvm.usub.sat.i32(i32, i32)
+declare i16 @llvm.usub.sat.i16(i16, i16)
+
+; usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0
+
+define i32 @usub_sat_C1_C2(i32 %a){
+; CHECK-LABEL: @usub_sat_C1_C2(
+; CHECK-NEXT:    [[COND:%.*]] = call i32 @llvm.usub.sat.i32(i32 50, i32 [[A:%.*]])
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+  %add = sub nuw i32 64, %a
+  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
+  ret i32 %cond
+}
+
+define i32 @usub_sat_C1_C2_produce_0(i32 %a){
+; CHECK-LABEL: @usub_sat_C1_C2_produce_0(
+; CHECK-NEXT:    ret i32 0
+;
+  %add = sub nuw i32 14, %a
+  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
+  ret i32 %cond
+}
+
+define i32 @usub_sat_C1_C2_produce_0_too(i32 %a){
+; CHECK-LABEL: @usub_sat_C1_C2_produce_0_too(
+; CHECK-NEXT:    ret i32 0
+;
+  %add = sub nuw i32 12, %a
+  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
+  ret i32 %cond
+}
+
+; negative tests: these should not fold
+
+define i32 @usub_sat_C1_C2_without_nuw(i32 %a){
+; CHECK-LABEL: @usub_sat_C1_C2_without_nuw(
+; CHECK-NEXT:    [[ADD:%.*]] = sub i32 12, [[A:%.*]]
+; CHECK-NEXT:    [[COND:%.*]] = call i32 @llvm.usub.sat.i32(i32 [[ADD]], i32 14)
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+  %add = sub i32 12, %a
+  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
+  ret i32 %cond
+}
+
+define i32 @usub_sat_C1_C2_more_than_one_use_with_add(i32 %a){
+; CHECK-LABEL: @usub_sat_C1_C2_more_than_one_use_with_add(
+; CHECK-NEXT:    [[ADD:%.*]] = sub nuw i32 12, [[A:%.*]]
+; CHECK-NEXT:    [[COND:%.*]] = call i32 @llvm.usub.sat.i32(i32 [[ADD]], i32 14)
+; CHECK-NEXT:    call void @usei32(i32 [[ADD]])
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+  %add = sub nuw i32 12, %a
+  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
+  call void @usei32(i32 %add)
+  ret i32 %cond
+}
+
+define i32 @usub_sat_C1_C2_more_than_one_use_with_cond(i32 %a){
+; CHECK-LABEL: @usub_sat_C1_C2_more_than_one_use_with_cond(
+; CHECK-NEXT:    [[ADD:%.*]] = sub nuw i32 12, [[A:%.*]]
+; CHECK-NEXT:    [[COND:%.*]] = call i32 @llvm.usub.sat.i32(i32 [[ADD]], i32 14)
+; CHECK-NEXT:    call void @usei32(i32 [[COND]])
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+  %add = sub nuw i32 12, %a
+  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
+  call void @usei32(i32 %cond)
+  ret i32 %cond
+}
 
 ; (a > b) ? a - b : 0 -> usub.sat(a, b)
 

@elhewaty
Contributor Author

elhewaty commented Feb 19, 2024

@nikic, @dtcxzyw, @goldsteinn

@nikic
Contributor

nikic commented Mar 7, 2024

Reverse ping

@elhewaty
Contributor Author

elhewaty commented Mar 7, 2024

@nikic, @XChy Sorry, I will update it now.

* vector test

Do we need the vector test?

* test with commuted sub like (sub nuw i32 %a, 12), proof: https://alive2.llvm.org/ce/z/kVXDgW

For this test, I think the optimizer converts the sub into an add of a negative constant. Is there a way to match additions with overflow, like this:

  %res = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 %c1, i8 %c2)
  %overflow = extractvalue {i8, i1} %res, 1
  %notoverflow = xor i1 %overflow, -1
  call void @llvm.assume(i1 %notoverflow)

  %add = sub nuw i8 %a, %c1
  %cond = call i8 @llvm.usub.sat.i8(i8 %add, i8 %c2)
  ret i8 %cond

@XChy
Member

XChy commented Mar 9, 2024

Do we need the vector test?

We always add vector tests if the pattern could handle vectors.

For this test I think the optimizer converts sub to add and then add a negative number, Is there is a way to match additions with over flow like this:

I think APInt::uadd_ov may help. Or, if the commuted pattern is difficult for now, I think it's OK to solve the pattern in the original issue first and then propose another patch for the commuted one, since they are independent.

@nikic
Contributor

nikic commented Mar 9, 2024

Yes, we should not try to handle the commuted pattern in this patch.

@elhewaty elhewaty requested review from dtcxzyw and XChy March 9, 2024 19:29
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 10, 2024
Member

@dtcxzyw dtcxzyw left a comment


LGTM. Thank you!

const APInt *C1, *C2;
Value *A;
if (IID == Intrinsic::usub_sat &&
match(Arg0, m_OneUse(m_NUWSub(m_APInt(C1), m_Value(A)))) &&
Contributor


I think we can drop this m_OneUse as well. We'll get the same number of instructions, but smaller dependence distance.

} else {
return replaceInstUsesWith(*SI, ConstantInt::get(Ty, 0));
}
}
Contributor


An alternative way to phrase this would be like this: https://alive2.llvm.org/ce/z/_Gw9fQ

This way we don't have to check the C2->ult(C1) condition explicitly, which means that we can more easily generalize this to non-splat vectors.

It would look something like this:

if (IID == Intrinsic::usub_sat &&
    match(Arg0, m_NUWSub(m_ImmConstant(C1), m_Value(A))) &&
    match(Arg1, m_ImmConstant(C2))) {
  auto *NewC = Builder.CreateBinaryIntrinsic(Intrinsic::usub_sat, C1, C2);
  auto *NewSub = Builder.CreateBinaryIntrinsic(Intrinsic::usub_sat, NewC, A);
  return replaceInstUsesWith(*SI, NewSub);
}

What do you think?

Contributor


Worth noting that when phrased like this, the following generalization to uadd.sat + add nuw also holds: https://alive2.llvm.org/ce/z/Y4VGfs

Member

@XChy XChy left a comment


LGTM

Contributor

@nikic nikic left a comment


LGTM with some nits.

@elhewaty elhewaty requested a review from nikic March 11, 2024 12:29
@nikic nikic merged commit 3f302ea into llvm:main Mar 11, 2024
4 checks passed
@elhewaty elhewaty deleted the usub_sat branch March 11, 2024 14:29
Successfully merging this pull request may close these issues.

[InstCombine] Missed optimization: Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0