[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs #90017

nikic · 2024-04-25T05:29:59Z

Currently, the OptimizePointerDifference fold does not trigger when working on the sub of two geps where one of the geps has multiple uses, to avoid duplicating the offset arithmetic too much.

However, there are clearly cases where performing it would still be clearly profitable, e.g. test_sub_ptradd_multiuse.

This patch drops the one-use restriction using the same strategy we use in GEP comparison folds: If there are multiple uses, we rewrite the GEP to use the expanded offset arithmetic instead (effectively canonicalizing it into ptradd representation).

Fixes #88231.

Currently, the OptimizePointerDifference fold does not trigger when working on the sub of two geps where one of the geps has multiple uses, to avoid duplicating the offset arithmetic too much. However, there are clearly cases where performing it would still be clearly profitable, e.g. test_sub_ptradd_multiuse. This patch drops the one-use restriction using the same strategy we use in GEP comparison folds: If there are multiple uses, we rewrite the GEP to use the expanded offset arithmetic instead (effectively canonicalizing it into ptradd representation). Fixes llvm#88231.

llvmbot · 2024-04-25T05:30:37Z

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

Currently, the OptimizePointerDifference fold does not trigger when working on the sub of two geps where one of the geps has multiple uses, to avoid duplicating the offset arithmetic too much.

However, there are clearly cases where performing it would still be clearly profitable, e.g. test_sub_ptradd_multiuse.

This patch drops the one-use restriction using the same strategy we use in GEP comparison folds: If there are multiple uses, we rewrite the GEP to use the expanded offset arithmetic instead (effectively canonicalizing it into ptradd representation).

Fixes #88231.

Full diff: https://github.com/llvm/llvm-project/pull/90017.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (+6-22)
(modified) llvm/test/Transforms/InstCombine/sub.ll (+14-18)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index 88b7e496897e1f..d9193cfe284017 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -2001,29 +2001,13 @@ Value *InstCombinerImpl::OptimizePointerDifference(Value *LHS, Value *RHS,
   if (!GEP1)
     return nullptr;
 
-  if (GEP2) {
-    // (gep X, ...) - (gep X, ...)
-    //
-    // Avoid duplicating the arithmetic if there are more than one non-constant
-    // indices between the two GEPs and either GEP has a non-constant index and
-    // multiple users. If zero non-constant index, the result is a constant and
-    // there is no duplication. If one non-constant index, the result is an add
-    // or sub with a constant, which is no larger than the original code, and
-    // there's no duplicated arithmetic, even if either GEP has multiple
-    // users. If more than one non-constant indices combined, as long as the GEP
-    // with at least one non-constant index doesn't have multiple users, there
-    // is no duplication.
-    unsigned NumNonConstantIndices1 = GEP1->countNonConstantIndices();
-    unsigned NumNonConstantIndices2 = GEP2->countNonConstantIndices();
-    if (NumNonConstantIndices1 + NumNonConstantIndices2 > 1 &&
-        ((NumNonConstantIndices1 > 0 && !GEP1->hasOneUse()) ||
-         (NumNonConstantIndices2 > 0 && !GEP2->hasOneUse()))) {
-      return nullptr;
-    }
-  }
+  // To avoid duplicating the offset arithmetic, rewrite the GEP to use the
+  // computed offset.
+  // TODO: We should probably do this even if there is only one GEP.
+  bool RewriteGEPs = GEP2 != nullptr;
 
   // Emit the offset of the GEP and an intptr_t.
-  Value *Result = EmitGEPOffset(GEP1);
+  Value *Result = EmitGEPOffset(GEP1, RewriteGEPs);
 
   // If this is a single inbounds GEP and the original sub was nuw,
   // then the final multiplication is also nuw.
@@ -2035,7 +2019,7 @@ Value *InstCombinerImpl::OptimizePointerDifference(Value *LHS, Value *RHS,
   // If we have a 2nd GEP of the same base pointer, subtract the offsets.
   // If both GEPs are inbounds, then the subtract does not have signed overflow.
   if (GEP2) {
-    Value *Offset = EmitGEPOffset(GEP2);
+    Value *Offset = EmitGEPOffset(GEP2, RewriteGEPs);
     Result = Builder.CreateSub(Result, Offset, "gepdiff", /* NUW */ false,
                                GEP1->isInBounds() && GEP2->isInBounds());
   }
diff --git a/llvm/test/Transforms/InstCombine/sub.ll b/llvm/test/Transforms/InstCombine/sub.ll
index 56dd9c6a879981..32ed4a787e9262 100644
--- a/llvm/test/Transforms/InstCombine/sub.ll
+++ b/llvm/test/Transforms/InstCombine/sub.ll
@@ -1123,7 +1123,8 @@ define i64 @test58(ptr %foo, i64 %i, i64 %j) {
 
 define i64 @test59(ptr %foo, i64 %i) {
 ; CHECK-LABEL: @test59(
-; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds [100 x [100 x i8]], ptr [[FOO:%.*]], i64 0, i64 42, i64 [[I:%.*]]
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr i8, ptr [[FOO:%.*]], i64 [[I:%.*]]
+; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr i8, ptr [[TMP1]], i64 4200
 ; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 4200
 ; CHECK-NEXT:    store ptr [[GEP1]], ptr @dummy_global1, align 8
 ; CHECK-NEXT:    store ptr [[GEP2]], ptr @dummy_global2, align 8
@@ -1142,13 +1143,12 @@ define i64 @test59(ptr %foo, i64 %i) {
 
 define i64 @test60(ptr %foo, i64 %i, i64 %j) {
 ; CHECK-LABEL: @test60(
-; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds [100 x [100 x i8]], ptr [[FOO:%.*]], i64 0, i64 [[J:%.*]], i64 [[I:%.*]]
-; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 4200
-; CHECK-NEXT:    [[CAST1:%.*]] = ptrtoint ptr [[GEP1]] to i64
-; CHECK-NEXT:    [[CAST2:%.*]] = ptrtoint ptr [[GEP2]] to i64
-; CHECK-NEXT:    [[SUB:%.*]] = sub i64 [[CAST1]], [[CAST2]]
+; CHECK-NEXT:    [[GEP1_IDX:%.*]] = mul nsw i64 [[J:%.*]], 100
+; CHECK-NEXT:    [[GEP1_OFFS:%.*]] = add nsw i64 [[GEP1_IDX]], [[I:%.*]]
+; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i8, ptr [[FOO:%.*]], i64 [[GEP1_OFFS]]
+; CHECK-NEXT:    [[GEPDIFF:%.*]] = add nsw i64 [[GEP1_OFFS]], -4200
 ; CHECK-NEXT:    store ptr [[GEP1]], ptr @dummy_global1, align 8
-; CHECK-NEXT:    ret i64 [[SUB]]
+; CHECK-NEXT:    ret i64 [[GEPDIFF]]
 ;
 ; gep1 has a non-constant index and more than one uses. Shouldn't duplicate the arithmetic.
   %gep1 = getelementptr inbounds [100 x [100 x i8]], ptr %foo, i64 0, i64 %j, i64 %i
@@ -1162,13 +1162,12 @@ define i64 @test60(ptr %foo, i64 %i, i64 %j) {
 
 define i64 @test61(ptr %foo, i64 %i, i64 %j) {
 ; CHECK-LABEL: @test61(
-; CHECK-NEXT:    [[GEP1:%.*]] = getelementptr inbounds i8, ptr [[FOO:%.*]], i64 4200
-; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds [100 x [100 x i8]], ptr [[FOO]], i64 0, i64 [[J:%.*]], i64 [[I:%.*]]
-; CHECK-NEXT:    [[CAST1:%.*]] = ptrtoint ptr [[GEP1]] to i64
-; CHECK-NEXT:    [[CAST2:%.*]] = ptrtoint ptr [[GEP2]] to i64
-; CHECK-NEXT:    [[SUB:%.*]] = sub i64 [[CAST1]], [[CAST2]]
+; CHECK-NEXT:    [[GEP2_IDX:%.*]] = mul nsw i64 [[J:%.*]], 100
+; CHECK-NEXT:    [[GEP2_OFFS:%.*]] = add nsw i64 [[GEP2_IDX]], [[I:%.*]]
+; CHECK-NEXT:    [[GEP2:%.*]] = getelementptr inbounds i8, ptr [[FOO:%.*]], i64 [[GEP2_OFFS]]
+; CHECK-NEXT:    [[GEPDIFF:%.*]] = sub nsw i64 4200, [[GEP2_OFFS]]
 ; CHECK-NEXT:    store ptr [[GEP2]], ptr @dummy_global2, align 8
-; CHECK-NEXT:    ret i64 [[SUB]]
+; CHECK-NEXT:    ret i64 [[GEPDIFF]]
 ;
 ; gep2 has a non-constant index and more than one uses. Shouldn't duplicate the arithmetic.
   %gep1 = getelementptr inbounds [100 x [100 x i8]], ptr %foo, i64 0, i64 42, i64 0
@@ -1186,11 +1185,8 @@ define i64 @test_sub_ptradd_multiuse(ptr %p, i64 %idx1, i64 %idx2) {
 ; CHECK-LABEL: @test_sub_ptradd_multiuse(
 ; CHECK-NEXT:    [[P1:%.*]] = getelementptr inbounds i8, ptr [[P:%.*]], i64 [[IDX1:%.*]]
 ; CHECK-NEXT:    call void @use.ptr(ptr [[P1]])
-; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 [[IDX2:%.*]]
-; CHECK-NEXT:    [[P1_INT:%.*]] = ptrtoint ptr [[P1]] to i64
-; CHECK-NEXT:    [[P2_INT:%.*]] = ptrtoint ptr [[P2]] to i64
-; CHECK-NEXT:    [[SUB:%.*]] = sub i64 [[P1_INT]], [[P2_INT]]
-; CHECK-NEXT:    ret i64 [[SUB]]
+; CHECK-NEXT:    [[GEPDIFF:%.*]] = sub nsw i64 [[IDX1]], [[IDX2:%.*]]
+; CHECK-NEXT:    ret i64 [[GEPDIFF]]
 ;
   %p1 = getelementptr inbounds i8, ptr %p, i64 %idx1
   call void @use.ptr(ptr %p1)

nikic · 2024-04-25T06:44:59Z

Compile-time: https://llvm-compile-time-tracker.com/compare.php?from=873889b7fa23434876533c603e687fc70396b0ab&to=402eea4442aaf4408e68d82aef10bae1f6a002aa&stat=instructions:u

No impact except improvements on stage2 builds, which means this results in better optimization.

PR Link: llvm/llvm-project#90017

dtcxzyw

LGTM.

nikic requested review from dtcxzyw and goldsteinn April 25, 2024 05:30

llvmbot added the llvm:transforms label Apr 25, 2024

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Apr 25, 2024

pre-commit: test PR90017

05c83e9

PR Link: llvm/llvm-project#90017

This was referenced Apr 25, 2024

pre-commit: test PR90017 dtcxzyw/llvm-opt-benchmark#554

Closed

Test PR90017 dtcxzyw/llvm-ci#1130

Closed

dtcxzyw approved these changes Apr 25, 2024

View reviewed changes

nikic merged commit cbe1760 into llvm:main Apr 26, 2024
5 of 6 checks passed

nikic deleted the instcombine-sub-ptradd branch April 26, 2024 01:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs #90017

[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs #90017

nikic commented Apr 25, 2024

llvmbot commented Apr 25, 2024

nikic commented Apr 25, 2024

dtcxzyw left a comment

[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs #90017

[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs #90017

Conversation

nikic commented Apr 25, 2024

llvmbot commented Apr 25, 2024

nikic commented Apr 25, 2024

dtcxzyw left a comment

Choose a reason for hiding this comment