-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AggressiveInstCombine] Inline strcmp/strncmp #89371
Conversation
Inline calls to strcmp(s1, s2) and strncmp(s1, s2, N), where N and exactly one of s1 and s2 are constant.
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Franklin Zhang (FLZ101) ChangesInline calls to strcmp(s1, s2) and strncmp(s1, s2, N), where N and exactly one of s1 and s2 are constant. For example: int res = strcmp(s, "ab"); is converted to int res = (int)s[0] - (int)'a';
if (res != 0)
goto END;
res = (int)s[1] - (int)'b';
if (res != 0)
goto END;
res = (int)s[2] - (int)'\0';
END: Ported from a similar gcc feature Inline strcmp with small constant strings. Patch is 31.09 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/89371.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
index e586e9eda1322f..eddd7382c27bbc 100644
--- a/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+++ b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
@@ -19,6 +19,7 @@
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/Analysis/DomTreeUpdater.h"
#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/TargetTransformInfo.h"
@@ -28,6 +29,7 @@
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/PatternMatch.h"
+#include "llvm/Transforms/Utils/BasicBlockUtils.h"
#include "llvm/Transforms/Utils/BuildLibCalls.h"
#include "llvm/Transforms/Utils/Local.h"
@@ -922,6 +924,251 @@ static bool foldPatternedLoads(Instruction &I, const DataLayout &DL) {
return true;
}
+static cl::opt<unsigned> StrNCmpInlineThreshold(
+ "strncmp-inline-threshold", cl::init(3), cl::Hidden,
+ cl::desc("The maximum length of a constant string for a builtin string cmp "
+ "call eligible for inlining. The default value is 3."));
+
+namespace {
+class StrNCmpInliner {
+public:
+ StrNCmpInliner(CallInst *CI, LibFunc Func, Function::iterator &BBNext,
+ DomTreeUpdater *DTU, const DataLayout &DL)
+ : CI(CI), Func(Func), BBNext(BBNext), DTU(DTU), DL(DL) {}
+
+ bool optimizeStrNCmp();
+
+private:
+ bool inlineCompare(Value *LHS, StringRef RHS, uint64_t N, bool Switched);
+
+ CallInst *CI;
+ LibFunc Func;
+ Function::iterator &BBNext;
+ DomTreeUpdater *DTU;
+ const DataLayout &DL;
+};
+
+} // namespace
+
+/// First we normalize calls to strncmp/strcmp to the form of
+/// compare(s1, s2, N), which means comparing first N bytes of s1 and s2
+/// (without considering '\0')
+///
+/// Examples:
+///
+/// \code
+/// strncmp(s, "a", 3) -> compare(s, "a", 2)
+/// strncmp(s, "abc", 3) -> compare(s, "abc", 3)
+/// strncmp(s, "a\0b", 3) -> compare(s, "a\0b", 2)
+/// strcmp(s, "a") -> compare(s, "a", 2)
+///
+/// char s2[] = {'a'}
+/// strncmp(s, s2, 3) -> compare(s, s2, 3)
+///
+/// char s2[] = {'a', 'b', 'c', 'd'}
+/// strncmp(s, s2, 3) -> compare(s, s2, 3)
+/// \endcode
+///
+/// We only handle cases that N and exactly one of s1 and s2 are constant. Cases
+/// that s1 and s2 are both constant are already handled by the instcombine
+/// pass.
+///
+/// We do not handle cases that N > StrNCmpInlineThreshold.
+///
+/// We also do not handles cases that N < 2, which are already
+/// handled by the instcombine pass.
+///
+bool StrNCmpInliner::optimizeStrNCmp() {
+ if (StrNCmpInlineThreshold < 2)
+ return false;
+
+ Value *Str1P = CI->getArgOperand(0);
+ Value *Str2P = CI->getArgOperand(1);
+ // should be handled elsewhere
+ if (Str1P == Str2P)
+ return false;
+
+ StringRef Str1, Str2;
+ bool HasStr1 = getConstantStringInfo(Str1P, Str1, false);
+ bool HasStr2 = getConstantStringInfo(Str2P, Str2, false);
+ if (!(HasStr1 ^ HasStr2))
+ return false;
+
+ // note that '\0' and characters after it are not trimmed
+ StringRef Str = HasStr1 ? Str1 : Str2;
+
+ size_t Idx = Str.find('\0');
+ uint64_t N = Idx == StringRef::npos ? UINT64_MAX : Idx + 1;
+ if (Func == LibFunc_strncmp) {
+ if (!isa<ConstantInt>(CI->getArgOperand(2)))
+ return false;
+ N = std::min(N, cast<ConstantInt>(CI->getArgOperand(2))->getZExtValue());
+ }
+ // now N means how many bytes we need to compare at most
+ if (N > Str.size() || N < 2 || N > StrNCmpInlineThreshold)
+ return false;
+
+ Value *StrP = HasStr1 ? Str2P : Str1P;
+
+ // cases that StrP has two or more dereferenceable bytes might be better
+ // optimized elsewhere
+ bool CanBeNull = false, CanBeFreed = false;
+ if (StrP->getPointerDereferenceableBytes(DL, CanBeNull, CanBeFreed) > 1)
+ return false;
+
+ return inlineCompare(StrP, Str, N, HasStr1);
+}
+
+/// Convert
+///
+/// \code
+/// ret = compare(s1, s2, N)
+/// \endcode
+///
+/// into
+///
+/// \code
+/// ret = (int)s1[0] - (int)s2[0]
+/// if (ret != 0)
+/// goto NE
+/// ...
+/// ret = (int)s1[N-2] - (int)s2[N-2]
+/// if (ret != 0)
+/// goto NE
+/// ret = (int)s1[N-1] - (int)s2[N-1]
+/// NE:
+/// \endcode
+///
+/// CFG before and after the transformation:
+///
+/// (before)
+/// BBCI
+///
+/// (after)
+/// BBBefore -> BBSubs[0] (sub,icmp) --NE-> BBNE -> BBCI
+/// | ^
+/// E |
+/// | |
+/// BBSubs[1] (sub,icmp) --NE-----+
+/// ... |
+/// BBSubs[N-1] (sub) ---------+
+///
+bool StrNCmpInliner::inlineCompare(Value *LHS, StringRef RHS, uint64_t N,
+ bool Switched) {
+ IRBuilder<> B(CI->getContext());
+
+ BasicBlock *BBCI = CI->getParent();
+ bool IsEntry = BBCI->isEntryBlock();
+ BasicBlock *BBBefore = splitBlockBefore(BBCI, CI, DTU, nullptr, nullptr,
+ BBCI->getName() + ".before");
+
+ SmallVector<BasicBlock *> BBSubs;
+ for (uint64_t i = 0; i < N + 1; ++i)
+ BBSubs.push_back(
+ BasicBlock::Create(CI->getContext(), "sub", BBCI->getParent(), BBCI));
+ BasicBlock *BBNE = BBSubs[N];
+
+ cast<BranchInst>(BBBefore->getTerminator())->setSuccessor(0, BBSubs[0]);
+
+ B.SetInsertPoint(BBNE);
+ PHINode *Phi = B.CreatePHI(CI->getType(), N);
+ B.CreateBr(BBCI);
+
+ Value *Base = LHS;
+ for (uint64_t i = 0; i < N; ++i) {
+ B.SetInsertPoint(BBSubs[i]);
+ Value *VL = B.CreateZExt(
+ B.CreateLoad(B.getInt8Ty(),
+ B.CreateInBoundsGEP(B.getInt8Ty(), Base, B.getInt64(i))),
+ CI->getType());
+ Value *VR = ConstantInt::get(CI->getType(), RHS[i]);
+ Value *Sub = Switched ? B.CreateSub(VR, VL) : B.CreateSub(VL, VR);
+ if (i < N - 1)
+ B.CreateCondBr(B.CreateICmpNE(Sub, ConstantInt::get(CI->getType(), 0)),
+ BBNE, BBSubs[i + 1]);
+ else
+ B.CreateBr(BBNE);
+
+ Phi->addIncoming(Sub, BBSubs[i]);
+ }
+
+ CI->replaceAllUsesWith(Phi);
+ CI->eraseFromParent();
+
+ BBNext = BBCI->getIterator();
+
+ // Update DomTree
+ if (DTU) {
+ if (IsEntry) {
+ DTU->recalculate(*BBCI->getParent());
+ } else {
+ SmallVector<DominatorTree::UpdateType, 8> Updates;
+ Updates.push_back({DominatorTree::Delete, BBBefore, BBCI});
+ Updates.push_back({DominatorTree::Insert, BBBefore, BBSubs[0]});
+ for (uint64_t i = 0; i < N; ++i) {
+ if (i < N - 1)
+ Updates.push_back({DominatorTree::Insert, BBSubs[i], BBSubs[i + 1]});
+ Updates.push_back({DominatorTree::Insert, BBSubs[i], BBNE});
+ }
+ Updates.push_back({DominatorTree::Insert, BBNE, BBCI});
+ DTU->applyUpdates(Updates);
+ }
+ }
+ return true;
+}
+
+static bool inlineLibCalls(Function &F, TargetLibraryInfo &TLI,
+ const TargetTransformInfo &TTI, DominatorTree &DT,
+ bool &MadeCFGChange) {
+ MadeCFGChange = false;
+ DomTreeUpdater DTU(&DT, DomTreeUpdater::UpdateStrategy::Lazy);
+
+ bool MadeChange = false;
+
+ Function::iterator CurrBB;
+ for (Function::iterator BB = F.begin(), BE = F.end(); BB != BE;) {
+ CurrBB = BB++;
+
+ for (BasicBlock::iterator II = CurrBB->begin(), IE = CurrBB->end();
+ II != IE; ++II) {
+ CallInst *Call = dyn_cast<CallInst>(&*II);
+ Function *CalledFunc;
+
+ if (!Call || !(CalledFunc = Call->getCalledFunction()))
+ continue;
+
+ if (Call->isNoBuiltin())
+ continue;
+
+ // Skip if function either has local linkage or is not a known library
+ // function.
+ LibFunc LF;
+ if (CalledFunc->hasLocalLinkage() || !TLI.getLibFunc(*CalledFunc, LF) ||
+ !TLI.has(LF))
+ continue;
+
+ switch (LF) {
+ case LibFunc_strcmp:
+ case LibFunc_strncmp: {
+ auto &DL = F.getParent()->getDataLayout();
+ if (StrNCmpInliner(Call, LF, BB, &DTU, DL).optimizeStrNCmp()) {
+ MadeCFGChange = true;
+ break;
+ }
+ continue;
+ }
+ default:
+ continue;
+ }
+
+ MadeChange = true;
+ break;
+ }
+ }
+
+ return MadeChange;
+}
+
/// This is the entry point for folds that could be implemented in regular
/// InstCombine, but they are separated because they are not expected to
/// occur frequently and/or have more than a constant-length pattern match.
@@ -969,11 +1216,12 @@ static bool foldUnusualPatterns(Function &F, DominatorTree &DT,
/// handled in the callers of this function.
static bool runImpl(Function &F, AssumptionCache &AC, TargetTransformInfo &TTI,
TargetLibraryInfo &TLI, DominatorTree &DT,
- AliasAnalysis &AA) {
+ AliasAnalysis &AA, bool &MadeCFGChange) {
bool MadeChange = false;
const DataLayout &DL = F.getParent()->getDataLayout();
TruncInstCombine TIC(AC, TLI, DL, DT);
MadeChange |= TIC.run(F);
+ MadeChange |= inlineLibCalls(F, TLI, TTI, DT, MadeCFGChange);
MadeChange |= foldUnusualPatterns(F, DT, TTI, TLI, AA, AC);
return MadeChange;
}
@@ -985,12 +1233,16 @@ PreservedAnalyses AggressiveInstCombinePass::run(Function &F,
auto &DT = AM.getResult<DominatorTreeAnalysis>(F);
auto &TTI = AM.getResult<TargetIRAnalysis>(F);
auto &AA = AM.getResult<AAManager>(F);
- if (!runImpl(F, AC, TTI, TLI, DT, AA)) {
+ bool MadeCFGChange = false;
+ if (!runImpl(F, AC, TTI, TLI, DT, AA, MadeCFGChange)) {
// No changes, all analyses are preserved.
return PreservedAnalyses::all();
}
// Mark all the analyses that instcombine updates as preserved.
PreservedAnalyses PA;
- PA.preserveSet<CFGAnalyses>();
+ if (MadeCFGChange)
+ PA.preserve<DominatorTreeAnalysis>();
+ else
+ PA.preserveSet<CFGAnalyses>();
return PA;
}
diff --git a/llvm/test/Transforms/AggressiveInstCombine/strcmp.ll b/llvm/test/Transforms/AggressiveInstCombine/strcmp.ll
deleted file mode 100644
index 99dd450e6f44e6..00000000000000
--- a/llvm/test/Transforms/AggressiveInstCombine/strcmp.ll
+++ /dev/null
@@ -1,219 +0,0 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=aggressive-instcombine -S | FileCheck %s
-
-declare i32 @strcmp(ptr, ptr)
-
-@s0 = constant [1 x i8] c"\00"
-@s1 = constant [2 x i8] c"0\00"
-@s2 = constant [3 x i8] c"01\00"
-@s3 = constant [4 x i8] c"012\00"
-@s4 = constant [5 x i8] c"0123\00"
-
-; Expand strcmp(C, "x"), strcmp(C, "xy").
-
-define i1 @expand_strcmp_s0(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_s0(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s0)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s0)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_eq_s1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_eq_s1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_eq_s1_commuted(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_eq_s1_commuted(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr @s1, ptr [[C:%.*]])
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr @s1, ptr %C)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_ne_s1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_ne_s1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp ne i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp ne i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_sgt_s1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_sgt_s1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp sgt i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_sge_s1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_sge_s1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp sge i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp sge i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_slt_s1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_slt_s1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp slt i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_sle_s1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_sle_s1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp sle i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp sle i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_s1_fail_1(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_s1_fail_1(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 1
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- %cmp = icmp eq i32 %call, 1
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_s1_fail_2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_s1_fail_2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr @s1, ptr @s1)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr @s1, ptr @s1)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
-
-define i32 @expand_strcmp_s1_fail_3(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_s1_fail_3(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s1)
-; CHECK-NEXT: ret i32 [[CALL]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s1)
- ret i32 %call
-}
-
-define i1 @expand_strcmp_eq_s2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_eq_s2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s2)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s2)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_ne_s2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_ne_s2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s2)
-; CHECK-NEXT: [[CMP:%.*]] = icmp ne i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s2)
- %cmp = icmp ne i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_sgt_s2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_sgt_s2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s2)
-; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s2)
- %cmp = icmp sgt i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_sge_s2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_sge_s2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s2)
-; CHECK-NEXT: [[CMP:%.*]] = icmp sge i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s2)
- %cmp = icmp sge i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_slt_s2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_slt_s2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s2)
-; CHECK-NEXT: [[CMP:%.*]] = icmp slt i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s2)
- %cmp = icmp slt i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_sle_s2(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_sle_s2(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s2)
-; CHECK-NEXT: [[CMP:%.*]] = icmp sle i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s2)
- %cmp = icmp sle i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_s3(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_s3(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s3)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s3)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
-
-define i1 @expand_strcmp_s4(ptr %C) {
-; CHECK-LABEL: @expand_strcmp_s4(
-; CHECK-NEXT: [[CALL:%.*]] = call i32 @strcmp(ptr [[C:%.*]], ptr @s4)
-; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[CALL]], 0
-; CHECK-NEXT: ret i1 [[CMP]]
-;
- %call = call i32 @strcmp(ptr %C, ptr @s4)
- %cmp = icmp eq i32 %call, 0
- ret i1 %cmp
-}
diff --git a/llvm/test/Transforms/AggressiveInstCombine/strncmp-1.ll b/llvm/test/Transforms/AggressiveInstCombine/strncmp-1.ll
new file mode 100644
index 00000000000000..4679d6d7fca143
--- /dev/null
+++ b/llvm/test/Transforms/AggressiveInstCombine/strncmp-1.ll
@@ -0,0 +1,203 @@
+; RUN: opt -S -passes=aggressive-instcombine -strncmp-inline-threshold=3 < %s | FileCheck %s
+
+declare i32 @strncmp(ptr nocapture, ptr nocapture, i64)
+declare i32 @strcmp(ptr nocapture, ptr nocapture)
+
+@.str = private unnamed_addr constant [3 x i8] c"ab\00", align 1
+@.str.1 = private unnamed_addr constant [2 x i8] c"a\00", align 1
+
+define i32 @test_strncmp_1(ptr nocapture readonly %s) {
+; CHECK-LABEL: @test_strncmp_1(
+; CHECK-NEXT: entry.before:
+; CHECK-NEXT: br label [[SUB:%.*]]
+; CHECK: sub:
+; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[S:%.*]], i64 0
+; CHECK-NEXT: [[TMP1:%.*]] = load i8, ptr [[TMP0]], align 1
+; CHECK-NEXT: [[TMP2:%.*]] = zext i8 [[TMP1]] to i32
+; CHECK-NEXT: [[TMP3:%.*]] = sub i32 97, [[TMP2]]
+; CHECK-NEXT: [[TMP4:%.*]] = icmp ne i32 [[TMP3]], 0
+; CHECK-NEXT: br i1 [[TMP4]], label [[SUB2:%.*]], label [[SUB1:%.*]]
+; CHECK: sub1:
+; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i8, ptr [[S]], i64 1
+; CHECK-NEXT: [[TMP6:%.*]] = load i8, ptr [[TMP5]], align 1
+; CHECK-NEXT: [[TMP7:%.*]] = zext i8 [[TMP6]] to i32
+; CHECK-NEXT: [[TMP8:%.*]] = sub i32 98, [[TMP7]]
+; CHECK-NEXT: br label [[SUB2]]
+; CHECK: sub2:
+; CHECK-NEXT: [[TMP9:%.*]] = phi i32 [ [[TMP3]], [[SUB]] ], [ [[TMP8]], [[SUB1]] ]
+; CHECK-NEXT: br label [[ENTRY:%.*]]
+; CHECK: entry:
+; CHECK-NEXT: ret i32 [[TMP9]]
+;
+entry:
+ %call = tail call i32 @strncmp(ptr nonnull dereferenceable(3) @.str, ptr nonnull dereferenceable(1) %s, i64 2)
+ ret i32 %call
+}
+
+define i32 @test_strncmp_2(ptr nocapture readonly %s) {
+; CHECK-LABEL: @test_strncmp_2(
+; CHECK-NEXT: entry.before:
+; CHECK-NEXT: br label [[SUB:%.*]]
+; CHECK: sub:
+; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[S:%.*]], i64 0
+; CHECK-NEXT: [[TMP1:%.*]] = load i8, ptr [[TMP0]], align 1
+; CHECK-NEXT: [[TMP2:%.*]] = zext i8 [[TMP1]] to i32
+; CHECK-NEXT: [[TMP3:%.*]] = sub i32 97, [[TMP2]]
+; CHECK-NEXT: [[TMP4:%.*]] = icmp ne i32 [[TMP3]], 0
+; CHECK-NEXT: br i1 [[TMP4]], label [[SUB3:%.*]], label [[SUB1:%.*]]
+; CHECK: sub1:
+; CHECK-NEXT: [[TMP5:%.*]] = getelementptr inbounds i8, ptr [[S]], i64 1
+; CHECK-NEXT: [[TMP6:%.*]] = load i8, ptr [[TMP5]], align 1
+; CHECK-NEXT: [[TMP7:%.*]] = zext i8 [[TMP6]] to i32
+; CHECK-NEXT: [[TMP8:%.*]] = sub i32 98, [[TMP7]]
+; CHECK-NEXT: [[TMP9:%.*]] = icmp ne i32 [[TMP8]], 0
+; CHECK-NEXT: br i1 [[TMP9]], label [[SUB3]], label [[SUB2:%.*]]
+; CHECK: sub2:
+; CHECK-NEXT: [[TMP10:%.*]] = getelementptr inbounds i8, ptr [[S]], i64 2
+; CHECK-NEXT: [[TMP11:%.*]] = load i8, ptr [[TMP10]], align 1
+; CHECK-NEXT: [[TMP12:%.*]] = zext i8 [[TMP11]] to i32
+; CHECK-NEXT: [[TMP13:%.*]] = sub i32 0, [[TMP12]]
+; CHECK-NEXT: br label [[SUB3]]
+; CHECK: sub3:
+; CHECK-NEXT: [[TMP14:%.*]] = phi i32 [ [[TMP3]], [[SUB]] ], [ [[TMP8]], [[SUB1]] ], [ ...
[truncated]
|
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
* add missing test cases * remove redundant code
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
It looks like this patch crashes when compiling lencod from llvm-test-suite in ReleaseLTO-g configuration: https://llvm-compile-time-tracker.com/show_error.php?commit=73bf9d626bf3fb7a43453f6731f1aecab83177b4 |
666c908
to
e2d3521
Compare
It seems to be a bug related to the new DebugInfo, I found the change as below could fix it. --- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -1009,7 +1009,7 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
// generate the iterator with begin() / getFirstInsertionPt(), it means
// any trailing debug-info at the end of the block would "normally" have
// been pushed in front of "First". Move it there now.
- DbgMarker *FirstMarker = getMarker(First);
+ DbgMarker *FirstMarker = createMarker(First);
DbgMarker *TrailingDbgRecords = getTrailingDbgRecords();
if (TrailingDbgRecords) {
FirstMarker->absorbDebugValues(*TrailingDbgRecords, true); I am preparing a minimal test case and will create a new pull request to fix this bug soon. |
ede43f9
to
e2d3521
Compare
Shoule have been fixed by #90312 |
c101cdc
to
436675a
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
* isOnlyUsedInZeroComparison * more tests * ...
436675a
to
d5769df
Compare
Looks like there is a hang while building the stage 2 compiler now: https://llvm-compile-time-tracker.com/show_error.php?commit=c690fb9bf301e84b21a239dfafb278bab262a762 |
I should have known what is wrong. How can I run these tests before I push the fix? |
I'd recommend you try and reproduce this locally, which you'll need to investigate. To do that, you need to do a 2 stage build
|
* use SplitBlock rather than splitBlockBefore to avoid invalidating the Instruction iterator
Should have been fixed by latest commit 4ca7ba9. Tested by doing a two stage build locally. |
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
Outdated
Show resolved
Hide resolved
* improve tests
286f9fe
to
8e63645
Compare
; CHECK-NEXT: [[TMP8:%.*]] = icmp ne i32 [[TMP7]], 0 | ||
; CHECK-NEXT: br i1 [[TMP8]], label [[NE]], label [[SUB2:%.*]] | ||
; CHECK: sub_2: | ||
; CHECK-NEXT: [[TMP9:%.*]] = getelementptr inbounds i8, ptr [[S]], i64 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the inbounds on %s
valid here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, otherwise the strncmp call would be UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I guess, was thinking we only have derefeanceable(1)
on %s
, but the entire transform is premised on standard definition of strncmp
so you're right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* improve tests
This caused a miscompile in ffmpeg for x86 platforms. It's reproducible on both Linux and MinGW, for both i686 and x86_64. To reproduce on linux on x86: $ git clone https://github.com/ffmpeg/ffmpeg
$ mkdir ffmpeg-build
$ cd ffmpeg-build
$ ../ffmpeg/configure --cc=clang --samples=$(pwd)/../samples
$ make fate-rsync # download test samples
$ make -j$(nproc) fate-subtitles The miscompilation lies in the object file I presume we should revert this until the issue has been resolved? |
I'll check this issue soon. This optimization could be disabled by specifying "-mllvm -strncmp-inline-threshold=0" and/or "-Wl,-mllvm,-strncmp-inline-threshold=0". |
Hmm, did you remove the message about you being unable to reproduce the issue, or is github glitching? Anyway, I re-reproduced the issue, with clang at d70267f as in your case. To simplify the reproducing and get rid of some potential variables, you can grab https://martin.st/temp/subtitles-preproc.c, compile it with |
Sorry, I thought the issue was a compilation error. Then I realized it was about test failures. Yes, I can reproduce it. |
I should have known what is wrong, and will push the fix soon. |
Should have been fixed by #91204. |
Thanks! I can confirm that the issue is gone now, in all of my regularly tested setups. |
Inline calls to strcmp(s1, s2) and strncmp(s1, s2, N), where N and exactly one of s1 and s2 are constant.
For example:
is converted to
Ported from a similar gcc feature Inline strcmp with small constant strings.