[MemoryBuiltins] Cache the result of ObjectOffsetSizeVisitor::visit. #64796 #65326
…lvm#64796 visit will skip visiting instructions it already has visited to avoid issues with cycles in the data graph. However, the result of this skipping behavior is that if we encounter the same instruction twice, and that instruction has a well defined result and isn't part of a cycle, we will introduce unknowns into the analysis even though we knew the size and offset of the instruction's result. Instead of skipping such instructions, keep a cache of the result of visiting them. This result is initialized to unknown() before visiting, so if we happen to visit it again recursively (perhaps as the result of a cycle or a phi), we will get unknown as the cached result and exit out.
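The caching idea can be illustrated outside of LLVM with a minimal sketch. This is not the code from the patch: `Node`, `Result`, `SizeVisitor`, and the "all operands must agree" merge rule are hypothetical stand-ins chosen to keep the example small. It only shows how seeding the cache with unknown before visiting breaks cycles, while a second, non-recursive visit of the same node reuses the known result.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical analysis result; Known == false plays the role of unknown().
struct Result {
  bool Known;
  long Size;
};
static Result unknown() { return Result{false, 0}; }
static Result known(long S) { return Result{true, S}; }

// Hypothetical stand-in for an IR value: a root object with a fixed size
// (think alloca), or a phi-like node merging the nodes listed in Operands.
struct Node {
  bool IsRoot;
  long RootSize;
  std::vector<std::size_t> Operands;
};

class SizeVisitor {
public:
  explicit SizeVisitor(const std::vector<Node> &Nodes) : Nodes(Nodes) {}

  Result compute(std::size_t Id) {
    assert(Id < Nodes.size() && "node index out of range");
    // Seed the cache with unknown before visiting. A recursive re-entry
    // (a cycle) finds this entry and stops instead of looping forever.
    auto P = Seen.insert(std::make_pair(Id, unknown()));
    if (!P.second)
      return P.first->second;
    Result Res = visit(Nodes[Id]);
    // Look the entry up again: the recursion may have rehashed the map.
    Seen[Id] = Res;
    return Res;
  }

private:
  Result visit(const Node &N) {
    if (N.IsRoot)
      return known(N.RootSize);
    // Simplified merge rule (an assumption of this sketch): the result is
    // known only if every operand is known and all sizes agree.
    Result Merged = unknown();
    for (std::size_t Op : N.Operands) {
      Result R = compute(Op);
      if (!R.Known)
        return unknown();
      if (Merged.Known && Merged.Size != R.Size)
        return unknown();
      Merged = R;
    }
    return Merged;
  }

  const std::vector<Node> &Nodes;
  std::unordered_map<std::size_t, Result> Seen;
};
```

In a diamond where two merge nodes both reach the same root, the root is requested twice; a plain "seen" set would report the second request as unknown, whereas the cache returns the stored size. A node that reaches itself through its operands still resolves to unknown.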
LGTM
@@ -61,3 +61,59 @@ if.end:
   %size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 true, i1 true, i1 false)
   ret i64 %size
 }
+
+define dso_local i64 @pick_max_same(i32 noundef %n) local_unnamed_addr {
Drop dso_local, noundef, and local_unnamed_addr; they should not be relevant.
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms

Changes

visit will skip visiting instructions it already has visited to avoid issues with cycles in the data graph. However, the result of this skipping behavior is that if we encounter the same instruction twice, and that instruction has a well defined result and isn't part of a cycle, we will introduce unknowns into the analysis even though we knew the size and offset of the instruction's result.

Instead of skipping such instructions, keep a cache of

--

3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/MemoryBuiltins.h b/llvm/include/llvm/Analysis/MemoryBuiltins.h
index 711bbf6a0afe5f6..66d1885b92a4905 100644
--- a/llvm/include/llvm/Analysis/MemoryBuiltins.h
+++ b/llvm/include/llvm/Analysis/MemoryBuiltins.h
@@ -198,7 +198,7 @@ class ObjectSizeOffsetVisitor
   ObjectSizeOpts Options;
   unsigned IntTyBits;
   APInt Zero;
-  SmallPtrSet<Instruction *, 8> SeenInsts;
+  DenseMap<Instruction *, SizeOffsetType> SeenInsts;

   APInt align(APInt Size, MaybeAlign Align);

diff --git a/llvm/lib/Analysis/MemoryBuiltins.cpp b/llvm/lib/Analysis/MemoryBuiltins.cpp
index 53e089ba1feae57..cacebd987f307f1 100644
--- a/llvm/lib/Analysis/MemoryBuiltins.cpp
+++ b/llvm/lib/Analysis/MemoryBuiltins.cpp
@@ -733,10 +733,14 @@ SizeOffsetType ObjectSizeOffsetVisitor::computeImpl(Value *V) {
   if (Instruction *I = dyn_cast<Instruction>(V)) {
     // If we have already seen this instruction, bail out. Cycles can happen in
     // unreachable code after constant propagation.
-    if (!SeenInsts.insert(I).second)
-      return unknown();
-
-    return visit(*I);
+    auto P = SeenInsts.try_emplace(I, unknown());
+    if (!P.second)
+      return P.first->second;
+    SizeOffsetType Res = visit(*I);
+    // Cache the result for later visits. If we happened to visit this during
+    // the above recursion, we would consider it unknown until now.
+    SeenInsts[I] = Res;
+    return Res;
   }
   if (Argument *A = dyn_cast<Argument>(V))
     return visitArgument(*A);
diff --git a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
index 7937265a69afe1e..4f4d6a88e1693be 100644
--- a/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
+++ b/llvm/test/Transforms/LowerConstantIntrinsics/builtin-object-size-phi.ll
@@ -61,3 +61,59 @@ if.end:
   %size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 true, i1 true, i1 false)
   ret i64 %size
 }
+
+define i64 @pick_max_same(i32 %n) {
+; CHECK-LABEL: @pick_max_same(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BUFFER:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT:    br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    [[OFFSETED:%.*]] = getelementptr i8, ptr [[BUFFER]], i64 10
+; CHECK-NEXT:    br label [[IF_END]]
+; CHECK:       if.end:
+; CHECK-NEXT:    [[P:%.*]] = phi ptr [ [[OFFSETED]], [[IF_ELSE]] ], [ [[BUFFER]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    ret i64 20
+;
+entry:
+  %buffer = alloca i8, i64 20
+  %cond = icmp eq i32 %n, 0
+  br i1 %cond, label %if.else, label %if.end
+
+if.else:
+  %offseted = getelementptr i8, ptr %buffer, i64 10
+  br label %if.end
+
+if.end:
+  %p = phi ptr [ %offseted, %if.else ], [ %buffer, %entry ]
+  %size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 false, i1 true, i1 false)
+  ret i64 %size
+}
+
+define i64 @pick_min_same(i32 %n) {
+; CHECK-LABEL: @pick_min_same(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BUFFER:%.*]] = alloca i8, i64 20, align 1
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i32 [[N:%.*]], 0
+; CHECK-NEXT:    br i1 [[COND]], label [[IF_ELSE:%.*]], label [[IF_END:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    [[OFFSETED:%.*]] = getelementptr i8, ptr [[BUFFER]], i64 10
+; CHECK-NEXT:    br label [[IF_END]]
+; CHECK:       if.end:
+; CHECK-NEXT:    [[P:%.*]] = phi ptr [ [[OFFSETED]], [[IF_ELSE]] ], [ [[BUFFER]], [[ENTRY:%.*]] ]
+; CHECK-NEXT:    ret i64 10
+;
+entry:
+  %buffer = alloca i8, i64 20
+  %cond = icmp eq i32 %n, 0
+  br i1 %cond, label %if.else, label %if.end
+
+if.else:
+  %offseted = getelementptr i8, ptr %buffer, i64 10
+  br label %if.end
+
+if.end:
+  %p = phi ptr [ %offseted, %if.else ], [ %buffer, %entry ]
+  %size = call i64 @llvm.objectsize.i64.p0(ptr %p, i1 true, i1 true, i1 false)
+  ret i64 %size
+}
This partially recovers a major compile-time regression introduced by #65326.
This change caused a large compile-time regression. I've mostly mitigated this with c0a64ec, but there is still some residual regression: http://llvm-compile-time-tracker.com/compare.php?from=0a692b6b9632e1460f9e0e983196f2be5879acd1&to=0bf8763781fa68fa63ee8c1f0d9f6040df97483c&stat=instructions%3Au
That's pretty unfortunate. I'm not sure what more can be done about it; the analysis is bound to iterate further now that it's given the chance to.
getObjectSize() is almost always called on root instructions (identified objects like allocas and globals); actually using it with objectsize intrinsics is rare. I expect this is the additional overhead of the map, not the extra iteration.
Aha, alright. I suppose that makes sense: the tests are probably not built with something like _FORTIFY_SOURCE that would use the intrinsic. Is the remaining regression acceptable?