Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Fold icmp(constants[x]) when the range of x is given #67093

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

XChy
Copy link
Member

@XChy XChy commented Sep 22, 2023

This patch extends foldCmpLoadFromIndexedGlobal and switch to byte-driven method to fold IR below:

define i1 @cmp_load_constant_array0(i64 %x){
entry:
  %cond = icmp ult i64 %x, 2
  br i1 %cond, label %case1, label %case2

case2:
  ret i1 0

case1:
  %isOK_ptr = getelementptr inbounds i32, ptr @CG, i64 %x
  %isOK = load i32, ptr %isOK_ptr
  %cond_inferred = icmp ult i32 %isOK, 3
  ret i1 %cond_inferred
}

Proof:
alive2

Related issue:
#64238

Migrated from Phabricator

llvm/lib/Transforms/InstCombine/InstructionCombining.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
@nikic nikic requested a review from dtcxzyw December 22, 2023 12:02
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Dec 22, 2023
Copy link

github-actions bot commented Dec 22, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

if (BeginOffset.slt(0))
BeginOffset += OffsetStep;

uint64_t ElementCountToTraverse = (DataSize - BeginOffset).udiv(OffsetStep).getZExtValue() + 1;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed "+1" here, actually BeginOffset indeed includes one more element. Fix it now.

@XChy XChy force-pushed the IndexedGlobal branch 2 times, most recently from d78ded4 to 353c5f3 Compare December 23, 2023 18:48
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
// If the index is larger than the pointer offset size of the target,
// truncate the index down like the GEP would do implicitly. We don't have
// to do this for an inbounds GEP because the index can't be out of range.
if (!GEP->isInBounds() && IdxBitWidth > IndexSize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we canonicalize the index's type of GEPs, I think we can skip the transform when IdxBitWidth != IndexSize.

Value *Idx = ConstantInt::get(
PtrIdxTy, (ConstantOffset - BeginOffset).sdiv(OffsetStep));
uint64_t IdxBitWidth = Idx->getType()->getScalarSizeInBits();
for (auto [Var, Coefficient] : VariableOffsets) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size of VariableOffset is 1.


if (Ty) {
Idx = MaskIdx(Idx);
Idx = LazyGetIndex(Idx);
Value *V = Builder.CreateIntCast(Idx, Ty, false);
V = Builder.CreateLShr(ConstantInt::get(Ty, MagicBitvector), V);
V = Builder.CreateAnd(ConstantInt::get(Ty, 1), V);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid creating multiple instructions when the load has multiple users.
See also https://github.com/dtcxzyw/llvm-opt-benchmark/pull/28/files/f845e103a2e2a78409e1f2aed9e21733056fd134#r1435245448.
But it would be good to do it in a separate PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will post a separate PR for it later.

@XChy XChy force-pushed the IndexedGlobal branch 3 times, most recently from 956ee50 to 992f410 Compare December 29, 2023 13:04
auto [Var, Coefficient] = VariableOffsets.front();
uint64_t VarBitWidth = Var->getType()->getScalarSizeInBits();
assert("GEP indices do not get canonicalized to the index type" &&
VarBitWidth == IdxBitWidth);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should check this condition at the start of the transform and bail out. While it is canonicalized, there's no guarantee that the GEP is canonicalized at this point yet.

// idx < 3, we actually get x + 3 < 3
Value *Bias = ConstantInt::get(
PtrIdxTy, (ConstantOffset - BeginOffset).sdiv(OffsetStep));
uint64_t IdxBitWidth = PtrIdxTy->getScalarSizeInBits();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as the IndexSize variable.

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Outdated Show resolved Hide resolved
llvm/test/Transforms/InstCombine/load-cmp.ll Outdated Show resolved Hide resolved
;
entry:
%cond = icmp ult i64 %x, 2
br i1 %cond, label %case1, label %case2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to omit this condition from the test, so we can see the direct result of the transform, without additional implication reasoning. Same for the next test.

%isOK = load i32, ptr %isOK_ptr
%cond_inferred = icmp ult i32 %isOK, %y
ret i1 %cond_inferred
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see some tests where we also apply an additional offset. In particular also if the offset is greater than the stride, and if the offset is negative. (Preferably for a "non-messy" case, to make it understandable.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[InstCombine] Missed optimization for icmp(constants[x]) when the range of x is implied
4 participants