[InstCombine] Fold icmp(constants[x]) when the range of x is given #67093
✅ With the latest revision this PR passed the C/C++ code formatter.
if (BeginOffset.slt(0))
  BeginOffset += OffsetStep;

uint64_t ElementCountToTraverse = (DataSize - BeginOffset).udiv(OffsetStep).getZExtValue() + 1;
I missed the "+1" here; the traversal starting at BeginOffset does include one more element. Fixed now.
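To make the "+1" concrete, here is a minimal sketch of the count in plain C++. It is not the patch's code: `elementCountToTraverse` is a hypothetical helper, it uses `int64_t` instead of `APInt`, and it assumes that a negative BeginOffset becomes non-negative after adding one OffsetStep. It counts the byte offsets BeginOffset, BeginOffset + OffsetStep, ... that are less than or equal to DataSize.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (assumption, not part of the patch): counts the byte
// offsets BeginOffset, BeginOffset + OffsetStep, ... that are <= DataSize.
uint64_t elementCountToTraverse(int64_t BeginOffset, int64_t OffsetStep,
                                int64_t DataSize) {
  assert(OffsetStep > 0 && "stride must be positive");
  // Mirror the diff: a negative start is moved forward by one step.
  if (BeginOffset < 0)
    BeginOffset += OffsetStep;
  // Assumption of this sketch: one step is enough to get into range.
  assert(BeginOffset >= 0 && BeginOffset <= DataSize);
  // The division counts the steps that fit; the "+1" counts the element at
  // BeginOffset itself.
  return static_cast<uint64_t>((DataSize - BeginOffset) / OffsetStep) + 1;
}
```

Without the "+1", a 16-byte global walked with stride 4 from offset 3 would report three positions (7, 11, 15) and drop the one at offset 3 itself.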
d78ded4 to 353c5f3
// If the index is larger than the pointer offset size of the target,
// truncate the index down like the GEP would do implicitly. We don't have
// to do this for an inbounds GEP because the index can't be out of range.
if (!GEP->isInBounds() && IdxBitWidth > IndexSize)
As we canonicalize the index's type of GEPs, I think we can skip the transform when IdxBitWidth != IndexSize.
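For readers unfamiliar with the implicit truncation that the code comment refers to, a small sketch in plain C++ follows. `truncateToIndexSize` is a hypothetical name, and the index is modeled as a `uint64_t` bit pattern rather than an LLVM `Value`; the sketch only shows that a wider index and its truncated form address the same element.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (assumption, not part of the patch): keep only the low
// IndexSize bits of an index, as a GEP does implicitly when the index is
// wider than the target's index type.
uint64_t truncateToIndexSize(uint64_t Idx, unsigned IndexSize) {
  assert(IndexSize >= 1 && IndexSize <= 64 && "unsupported index width");
  if (IndexSize == 64)
    return Idx; // Nothing to drop; also avoids an undefined 64-bit shift.
  return Idx & ((uint64_t{1} << IndexSize) - 1);
}
```

With a 32-bit index type, the 64-bit index 2^32 + 5 selects the same element as 5, which is why the fold would need the explicit mask for a non-inbounds GEP. Skipping the transform when the widths differ, as suggested above, avoids the question entirely.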
Value *Idx = ConstantInt::get(
    PtrIdxTy, (ConstantOffset - BeginOffset).sdiv(OffsetStep));
uint64_t IdxBitWidth = Idx->getType()->getScalarSizeInBits();
for (auto [Var, Coefficient] : VariableOffsets) {
The size of VariableOffsets is 1.
if (Ty) {
  Idx = MaskIdx(Idx);
  Idx = LazyGetIndex(Idx);
  Value *V = Builder.CreateIntCast(Idx, Ty, false);
  V = Builder.CreateLShr(ConstantInt::get(Ty, MagicBitvector), V);
  V = Builder.CreateAnd(ConstantInt::get(Ty, 1), V);
Please avoid creating multiple instructions when the load has multiple users.
See also https://github.com/dtcxzyw/llvm-opt-benchmark/pull/28/files/f845e103a2e2a78409e1f2aed9e21733056fd134#r1435245448.
But it would be good to do it in a separate PR.
OK, I will post a separate PR for it later.
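As background for the MagicBitvector snippet in this thread, the shift-and-mask sequence can be sketched in plain C++. This is an illustration under assumptions, not the patch's code: `buildMagicBitvector` and `testBit` are hypothetical names, the table is limited to 64 entries, and "element < Bound" stands in for whatever comparison is being folded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (assumption): bit I of the result is set exactly when
// the comparison "Table[I] < Bound" is true.
uint64_t buildMagicBitvector(const std::vector<int> &Table, int Bound) {
  assert(Table.size() <= 64 && "table must fit in one machine word");
  uint64_t Magic = 0;
  for (std::size_t I = 0; I < Table.size(); ++I)
    if (Table[I] < Bound)
      Magic |= uint64_t{1} << I;
  return Magic;
}

// Same shape as the CreateLShr + CreateAnd pair in the diff:
// (Magic >> Idx) & 1 replaces "load Table[Idx], then compare".
bool testBit(uint64_t Magic, unsigned Idx) {
  assert(Idx < 64 && "index must be a valid bit position");
  return ((Magic >> Idx) & 1) != 0;
}
```

For the table {1, 5, 2, 9} and Bound 3, only elements 0 and 2 satisfy the comparison, so the bitvector is 0b0101. The load and compare collapse into a shift and a mask, which is only a win when the load has no other users, hence the request above.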
956ee50 to 992f410
auto [Var, Coefficient] = VariableOffsets.front();
uint64_t VarBitWidth = Var->getType()->getScalarSizeInBits();
assert("GEP indices do not get canonicalized to the index type" &&
       VarBitWidth == IdxBitWidth);
You should check this condition at the start of the transform and bail out. While it is canonicalized, there's no guarantee that the GEP is canonicalized at this point yet.
// idx < 3, we actually get x + 3 < 3
Value *Bias = ConstantInt::get(
    PtrIdxTy, (ConstantOffset - BeginOffset).sdiv(OffsetStep));
uint64_t IdxBitWidth = PtrIdxTy->getScalarSizeInBits();
This is the same as the IndexSize variable.
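The Bias computation and the "x + 3 < 3" comment in the snippet can be illustrated with a short sketch. It assumes plain signed `int64_t` arithmetic in place of `APInt` and uses a signed comparison for simplicity; `indexBias` and `foldedCompare` are hypothetical names, not functions from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (assumption): number of elements by which a constant
// byte offset shifts the variable index. C++ signed division truncates
// toward zero, which matches APInt::sdiv.
int64_t indexBias(int64_t ConstantOffset, int64_t BeginOffset,
                  int64_t OffsetStep) {
  assert(OffsetStep > 0 && "stride must be positive");
  return (ConstantOffset - BeginOffset) / OffsetStep;
}

// A condition "idx < Limit" on the element index becomes "X + Bias < Limit"
// on the variable X. A signed compare is used here only to keep the sketch
// simple.
bool foldedCompare(int64_t X, int64_t Bias, int64_t Limit) {
  return X + Bias < Limit;
}
```

With a stride of 4 bytes and a constant offset of 12 bytes, the bias is 3, so "idx < 3" turns into "x + 3 < 3". A negative constant offset of -8 bytes gives a bias of -2.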
;
entry:
  %cond = icmp ult i64 %x, 2
  br i1 %cond, label %case1, label %case2
I think it would be better to omit this condition from the test, so we can see the direct result of the transform, without additional implication reasoning. Same for the next test.
  %isOK = load i32, ptr %isOK_ptr
  %cond_inferred = icmp ult i32 %isOK, %y
  ret i1 %cond_inferred
}
I'd like to see some tests where we also apply an additional offset. In particular also if the offset is greater than the stride, and if the offset is negative. (Preferably for a "non-messy" case, to make it understandable.)
This patch extends foldCmpLoadFromIndexedGlobal and switches to a byte-driven method to fold the IR below:
Proof: alive2
Related issue: #64238
Migrated from Phabricator