[AggressiveInstCombine] Ignore debug instructions when load combining #70200
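@llvm/pr-subscribers-llvm-transforms
Author: None (mikaelholmen)
Changes
We previously included debug instructions when counting instructions when looking for loads to combine. This meant that the presence of debug instructions could affect optimization, as shown in the updated testcase.
This fixes #69925.
Full diff: https://github.com/llvm/llvm-project/pull/70200.diff
2 Files Affected: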
diff --git a/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
index a55d01645f10eb8..72f55e237ca9151 100644
--- a/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+++ b/llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
@@ -701,12 +701,21 @@ static bool foldLoadsRecursive(Value *V, LoadOps &LOps, const DataLayout &DL,
Loc = Loc.getWithNewSize(LOps.LoadSize);
} else
Loc = MemoryLocation::get(End);
+
+ // Ignore debug info (and other "AssumeLike" intrinsics) so that's not counted
+ // against MaxInstrsToScan. Otherwise debug info could affect codegen.
+ auto IsAssumeLikeIntr = [](const Instruction &I) {
+ if (auto *II = dyn_cast<IntrinsicInst>(&I))
+ return II->isAssumeLikeIntrinsic();
+ return false;
+ };
unsigned NumScanned = 0;
for (Instruction &Inst :
make_range(Start->getIterator(), End->getIterator())) {
if (Inst.mayWriteToMemory() && isModSet(AA.getModRefInfo(&Inst, Loc)))
return false;
- if (++NumScanned > MaxInstrsToScan)
+
+ if (!IsAssumeLikeIntr(Inst) && ++NumScanned > MaxInstrsToScan)
return false;
}
diff --git a/llvm/test/Transforms/AggressiveInstCombine/AArch64/combine_ignore_debug.ll b/llvm/test/Transforms/AggressiveInstCombine/AArch64/combine_ignore_debug.ll
new file mode 100644
index 000000000000000..68455a1f9074ecb
--- /dev/null
+++ b/llvm/test/Transforms/AggressiveInstCombine/AArch64/combine_ignore_debug.ll
@@ -0,0 +1,51 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3
+; RUN: opt -mtriple aarch64 -aggressive-instcombine-max-scan-instrs=1 -passes="aggressive-instcombine" -S < %s | FileCheck %s -check-prefix DBG
+; RUN: opt -strip-debug -mtriple aarch64 -aggressive-instcombine-max-scan-instrs=1 -passes="aggressive-instcombine" -S < %s | FileCheck %s -check-prefix NODBG
+
+; The DBG and NODBG cases should be the same. I.e. we should optimize the DBG
+; case too even if there is a dbg.value.
+
+target datalayout = "E"
+
+%s = type { i16, i16 }
+
+@e = global %s zeroinitializer, align 1
+@l = global %s zeroinitializer, align 1
+
+define void @test() {
+; DBG-LABEL: define void @test() {
+; DBG-NEXT: entry:
+; DBG-NEXT: [[L1:%.*]] = load i32, ptr @e, align 1
+; DBG-NEXT: call void @llvm.dbg.value(metadata i32 undef, metadata [[META3:![0-9]+]], metadata !DIExpression()), !dbg [[DBG5:![0-9]+]]
+; DBG-NEXT: store i32 [[L1]], ptr @l, align 1
+; DBG-NEXT: ret void
+;
+; NODBG-LABEL: define void @test() {
+; NODBG-NEXT: entry:
+; NODBG-NEXT: [[L1:%.*]] = load i32, ptr @e, align 1
+; NODBG-NEXT: store i32 [[L1]], ptr @l, align 1
+; NODBG-NEXT: ret void
+;
+entry:
+ %l1 = load i16, ptr @e, align 1
+ call void @llvm.dbg.value(metadata i32 undef, metadata !3, metadata !DIExpression()), !dbg !5
+ %l2 = load i16, ptr getelementptr inbounds (%s, ptr @e, i16 0, i32 1), align 1
+ %e2 = zext i16 %l2 to i32
+ %e1 = zext i16 %l1 to i32
+ %s1 = shl nuw i32 %e1, 16
+ %o1 = or i32 %s1, %e2
+ store i32 %o1, ptr @l, align 1
+ ret void
+}
+
+declare void @llvm.dbg.value(metadata, metadata, metadata)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1)
+!1 = !DIFile(filename: "foo.c", directory: "/")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = !DILocalVariable(scope: !4)
+!4 = distinct !DISubprogram(unit: !0)
+!5 = !DILocation(scope: !4)
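The effect of the counting change can be illustrated outside LLVM with a small, self-contained model. This is only a sketch under stated assumptions: `Kind` and `withinScanBudget` are hypothetical names that do not exist in LLVM, and the model keeps only the budget check, leaving out the alias-analysis check that the real loop also performs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an IR instruction kind; not an LLVM type.
enum class Kind { Load, DbgValue, Other };

// Walks the half-open range [start, end) and reports whether it fits within
// maxInstrsToScan. When skipDebug is true, DbgValue entries are not counted,
// mirroring the idea of the patched condition.
bool withinScanBudget(const std::vector<Kind> &insts, std::size_t start,
                      std::size_t end, unsigned maxInstrsToScan,
                      bool skipDebug) {
  unsigned numScanned = 0;
  for (std::size_t i = start; i < end; ++i) {
    const bool isDebug = insts[i] == Kind::DbgValue;
    // Short-circuit: a skipped debug entry never increments the counter.
    if (!(skipDebug && isDebug) && ++numScanned > maxInstrsToScan)
      return false;
  }
  return true;
}
```

With a budget of 1, as in the test's `-aggressive-instcombine-max-scan-instrs=1`, the sequence load, debug value, load scanned over its first two entries exceeds the budget when debug entries are counted and fits when they are skipped. That is the with/without debug info difference the test is meant to rule out.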
// against MaxInstrsToScan. Otherwise debug info could affect codegen.
auto IsAssumeLikeIntr = [](const Instruction &I) {
if (auto *II = dyn_cast<IntrinsicInst>(&I))
return II->isAssumeLikeIntrinsic();
Please check for DbgInfoIntrinsic only.
Sure, I can do that. The reason I went for isAssumeLike is that in a previous, similar fix I made in SLPVectorizer, I initially just looked for Dbg intrinsics but then got comments that I should look for isAssumeLike instead.
Done.
LGTM
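The revised patch is not shown on this page, so the following only sketches the difference between the two predicates discussed in the review. It is a self-contained model, not LLVM code: `Op`, `isDebugOnly`, `isAssumeLike`, and `countScanned` are hypothetical names, and the real `isAssumeLikeIntrinsic()` covers more intrinsics than the single `Assume` entry modeled here.

```cpp
#include <vector>

// Hypothetical instruction kinds for illustration; not LLVM types.
enum class Op { Load, DbgValue, Assume, Other };

// Narrow predicate: only debug entries are excluded from the count.
bool isDebugOnly(Op op) { return op == Op::DbgValue; }

// Broader predicate: debug entries and assume-style entries are excluded.
bool isAssumeLike(Op op) { return op == Op::DbgValue || op == Op::Assume; }

// Counts the entries that would be charged against a scan budget, given a
// predicate that says which entries to skip.
template <typename Pred>
unsigned countScanned(const std::vector<Op> &ops, Pred skip) {
  unsigned n = 0;
  for (Op op : ops)
    if (!skip(op))
      ++n;
  return n;
}
```

In this model the debug-only predicate still charges an `Assume` entry against the budget, while the broader predicate does not. Either choice keeps the count independent of debug entries, which is the property the PR is after.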
Thanks for the review. A question: I'd prefer to submit the testcase precommit and the actual fix as two separate commits. How do I go about doing that now that I have both commits in the same pull request? Should I just go to my repo and git push the precommit, then rebase this PR (which would then just contain the fix), and then "Squash and merge"?
See https://discourse.llvm.org/t/how-to-enable-the-function-rebase-and-merge/73990/3?u=nikic for a way to land multi-commit PRs in a way that GitHub understands.
Thanks!
We get different results with/without debug info present.
…llvm#70200) We previously included debug instructions when counting instructions when looking for loads to combine. This meant that the presence of debug instructions could affect optimization, as shown in the updated testcase. This fixes llvm#69925.
c4f8989 to ce0a750
We previously included debug instructions when counting instructions when
looking for loads to combine. This meant that the presence of debug
instructions could affect optimization, as shown in the updated testcase.
This fixes #69925.