Skip to content

[InstCombine]: Missed Optimization, treat the disjoint Or as an add #86340

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Sh0g0-1758
Copy link
Member

Fixes: #84401

Added tests
@llvmbot
Copy link
Member

llvmbot commented Mar 22, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Shourya Goel (Sh0g0-1758)

Changes

Fixes: #84401


Full diff: https://github.com/llvm/llvm-project/pull/86340.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+14-7)
  • (added) llvm/test/Transforms/InstCombine/gep-disjoint.ll (+21)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 7c40fb4fc86082..507d09077a5cca 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -2889,9 +2889,12 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
     // Try to replace ADD + GEP with GEP + GEP.
     Value *Idx1, *Idx2;
     if (match(GEP.getOperand(1),
-              m_OneUse(m_Add(m_Value(Idx1), m_Value(Idx2))))) {
+              m_OneUse(m_AddLike(m_Value(Idx1), m_Value(Idx2))))) {
       //   %idx = add i64 %idx1, %idx2
       //   %gep = getelementptr i32, ptr %ptr, i64 %idx
+      // or
+      //   %idx = or disjoint i64 %idx1, %idx2
+      //   %gep = getelementptr i32, ptr %ptr, i64 %idx
       // as:
       //   %newptr = getelementptr i32, ptr %ptr, i64 %idx1
       //   %newgep = getelementptr i32, ptr %newptr, i64 %idx2
@@ -2901,14 +2904,18 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
                                        Idx2);
     }
     ConstantInt *C;
-    if (match(GEP.getOperand(1), m_OneUse(m_SExtLike(m_OneUse(m_NSWAdd(
+    if (match(GEP.getOperand(1), m_OneUse(m_SExtLike(m_OneUse(m_NSWAddLike(
                                      m_Value(Idx1), m_ConstantInt(C))))))) {
-      // %add = add nsw i32 %idx1, idx2
-      // %sidx = sext i32 %add to i64
-      // %gep = getelementptr i32, ptr %ptr, i64 %sidx
+      //   %add = add nsw i32 %idx1, %idx2
+      //   %sidx = sext i32 %add to i64
+      //   %gep = getelementptr i32, ptr %ptr, i64 %sidx
+      // or
+      //   %disjointOr = or disjoint i32 %idx1, %idx2
+      //   %sidx = sext i32 %disjointOr to i64
+      //   %gep = getelementptr i32, ptr %ptr, i64 %sidx
       // as:
-      // %newptr = getelementptr i32, ptr %ptr, i32 %idx1
-      // %newgep = getelementptr i32, ptr %newptr, i32 idx2
+      //   %newptr = getelementptr i32, ptr %ptr, i32 %idx1
+      //   %newgep = getelementptr i32, ptr %newptr, i32 idx2
       auto *NewPtr = Builder.CreateGEP(
           GEP.getResultElementType(), GEP.getPointerOperand(),
           Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()));
diff --git a/llvm/test/Transforms/InstCombine/gep-disjoint.ll b/llvm/test/Transforms/InstCombine/gep-disjoint.ll
new file mode 100644
index 00000000000000..fe75519822963b
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/gep-disjoint.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+; test that when offset is know while indexing arrays
+; folding optimization takes place on the addition which gets folded into an OR
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+
+define ptr @test(ptr %arr, i32 %conv) {
+; CHECK-LABEL: define ptr @test(
+; CHECK-SAME: ptr [[ARR:%.*]], i32 [[CONV:%.*]]) {
+; CHECK-NEXT:    [[TMP1:%.*]] = sext i32 [[CONV]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr i8, ptr [[ARR]], i64 [[TMP1]]
+; CHECK-NEXT:    [[ARRAYIDX9:%.*]] = getelementptr i8, ptr [[TMP2]], i64 1
+; CHECK-NEXT:    ret ptr [[ARRAYIDX9]]
+;
+  %or7 = or disjoint i32 %conv, 1
+  %idxprom8 = zext nneg i32 %or7 to i64
+  %arrayidx9 = getelementptr i8, ptr %arr, i64 %idxprom8
+  ret ptr %arrayidx9
+}

Copy link

✅ With the latest revision this PR passed the Python code formatter.

Copy link

✅ With the latest revision this PR passed the C/C++ code formatter.

@topperc
Copy link
Collaborator

topperc commented Mar 22, 2024

Please put [InstCombine] in the title. People use these tags in email filters to find which patches they might be interested in.

@Sh0g0-1758 Sh0g0-1758 changed the title [LLVM]: Missed Optimization, treat the disjoint Or as an add [InstCombine]: Missed Optimization, treat the disjoint Or as an add Mar 22, 2024
@topperc
Copy link
Collaborator

topperc commented Mar 22, 2024

I had written half of this in this PR #76981 I did not get around to looking at @nikic concerns there.

@Sh0g0-1758
Copy link
Member Author

@dtcxzyw

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 23, 2024
@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 23, 2024

Failed Tests (22):
LLVM :: Transforms/LoopUnroll/runtime-multiexit-heuristic.ll
LLVM :: Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
LLVM :: Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll
LLVM :: Transforms/LoopVectorize/ARM/mve-reductions.ll
LLVM :: Transforms/LoopVectorize/SystemZ/addressing.ll
LLVM :: Transforms/LoopVectorize/X86/float-induction-x86.ll
LLVM :: Transforms/LoopVectorize/X86/interleaving.ll
LLVM :: Transforms/LoopVectorize/X86/parallel-loops.ll
LLVM :: Transforms/LoopVectorize/X86/small-size.ll
LLVM :: Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll
LLVM :: Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll
LLVM :: Transforms/LoopVectorize/consecutive-ptr-uniforms.ll
LLVM :: Transforms/LoopVectorize/float-induction.ll
LLVM :: Transforms/LoopVectorize/forked-pointers.ll
LLVM :: Transforms/LoopVectorize/induction.ll
LLVM :: Transforms/LoopVectorize/interleaved-accesses.ll
LLVM :: Transforms/LoopVectorize/reduction-inloop-cond.ll
LLVM :: Transforms/LoopVectorize/reduction-inloop-pred.ll
LLVM :: Transforms/LoopVectorize/reduction-inloop-uf4.ll
LLVM :: Transforms/LoopVectorize/reduction-predselect.ll
LLVM :: Transforms/LoopVectorize/reduction.ll
LLVM :: Transforms/SLPVectorizer/AArch64/getelementptr.ll

Please fix these test failures.

@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 23, 2024

@topperc @Sh0g0-1758 Could you please provide some performance data in SPEC benchmark/llvm-test-suite?

@Sh0g0-1758
Copy link
Member Author

@topperc, I was looking at your PR and I realized that most of the work involves running the script on the tests that are not passing and that the work in InstructionCombining.cpp is done. If you are still working on your PR, I can shift my work there and we can work on it together?

@Sh0g0-1758 Sh0g0-1758 closed this Oct 8, 2024
@Sh0g0-1758 Sh0g0-1758 reopened this Nov 30, 2024
@llvmbot llvmbot added the llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes label Nov 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missed Optimization: Aligned Pointer Optimizations Can't Happen With Prefered OR Instead of ADD
4 participants