Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BasicAA] Guess reasonable contexts for separate storage hints #76770

Merged
merged 1 commit into from
Jan 4, 2024

Conversation

davidtgoldblatt
Copy link
Contributor

The definition of the pointer of the memory location being queried is always one such context. Even this conservative guess can be better than no guess at all in some cases.

Fixes #64666

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 3, 2024

@llvm/pr-subscribers-llvm-analysis

Author: David Goldblatt (davidtgoldblatt)

Changes

The definition of the pointer of the memory location being queried is always one such context. Even this conservative guess can be better than no guess at all in some cases.

Fixes #64666


Full diff: https://github.com/llvm/llvm-project/pull/76770.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ValueTracking.h (+7-2)
  • (modified) llvm/lib/Analysis/BasicAliasAnalysis.cpp (+26-5)
  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+4-3)
  • (added) llvm/test/Analysis/BasicAA/separate_storage-alias-sets.ll (+43)
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index baa16306ebf5df..7360edfce1f39a 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -810,9 +810,14 @@ bool isAssumeLikeIntrinsic(const Instruction *I);
 
 /// Return true if it is valid to use the assumptions provided by an
 /// assume intrinsic, I, at the point in the control-flow identified by the
-/// context instruction, CxtI.
+/// context instruction, CxtI. By default, ephemeral values of the assumption
+/// are treated as an invalid context, to prevent the assumption from being used
+/// to optimize away its argument. If the caller can ensure that this won't
+/// happen, it can call with AllowEphemerals set to true to get more valid
+/// assumptions.
 bool isValidAssumeForContext(const Instruction *I, const Instruction *CxtI,
-                             const DominatorTree *DT = nullptr);
+                             const DominatorTree *DT = nullptr,
+                             bool AllowEphemerals = false);
 
 enum class OverflowResult {
   /// Always overflows in the direction of signed/unsigned min value.
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 3de147368f2346..fa7309c29f48dc 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -1543,7 +1543,7 @@ AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
           TLI, NullIsValidLocation)))
     return AliasResult::NoAlias;
 
-  if (CtxI && EnableSeparateStorageAnalysis) {
+  if (EnableSeparateStorageAnalysis) {
     for (auto &AssumeVH : AC.assumptions()) {
       if (!AssumeVH)
         continue;
@@ -1561,10 +1561,31 @@ AliasResult BasicAAResult::aliasCheck(const Value *V1, LocationSize V1Size,
           const Value *HintO1 = getUnderlyingObject(Hint1);
           const Value *HintO2 = getUnderlyingObject(Hint2);
 
-          if (((O1 == HintO1 && O2 == HintO2) ||
-               (O1 == HintO2 && O2 == HintO1)) &&
-              isValidAssumeForContext(Assume, CtxI, DT))
-            return AliasResult::NoAlias;
+          auto ValidAssumeForPtrContext = [&](const Value *Ptr) {
+            if (const Instruction *PtrI = dyn_cast<Instruction>(Ptr)) {
+              return isValidAssumeForContext(Assume, PtrI, DT,
+                                             /* AllowEphemerals */ true);
+            }
+            if (const Argument *PtrA = dyn_cast<Argument>(Ptr)) {
+              const Instruction *FirstI =
+                  &*PtrA->getParent()->getEntryBlock().begin();
+              return isValidAssumeForContext(Assume, FirstI, DT,
+                                             /* AllowEphemerals */ true);
+            }
+            return false;
+          };
+
+          if ((O1 == HintO1 && O2 == HintO2) ||
+              (O1 == HintO2 && O2 == HintO1)) {
+            // Note that we go back to V1 and V2 for the
+            // ValidAssumeForPtrContext checks; they're dominated by O1 and O2,
+            // so strictly more assumptions are valid for them.
+            if ((CtxI && isValidAssumeForContext(Assume, CtxI, DT,
+                                                 /* AllowEphemerals */ true)) ||
+                ValidAssumeForPtrContext(V1) || ValidAssumeForPtrContext(V2)) {
+              return AliasResult::NoAlias;
+            }
+          }
         }
       }
     }
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 16d78c1ded6d7a..856da5b45d94e4 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -485,7 +485,8 @@ bool llvm::isAssumeLikeIntrinsic(const Instruction *I) {
 
 bool llvm::isValidAssumeForContext(const Instruction *Inv,
                                    const Instruction *CxtI,
-                                   const DominatorTree *DT) {
+                                   const DominatorTree *DT,
+                                   bool AllowEphemerals) {
   // There are two restrictions on the use of an assume:
   //  1. The assume must dominate the context (or the control flow must
   //     reach the assume whenever it reaches the context).
@@ -503,7 +504,7 @@ bool llvm::isValidAssumeForContext(const Instruction *Inv,
     // Don't let an assume affect itself - this would cause the problems
     // `isEphemeralValueOf` is trying to prevent, and it would also make
     // the loop below go out of bounds.
-    if (Inv == CxtI)
+    if (!AllowEphemerals && Inv == CxtI)
       return false;
 
     // The context comes first, but they're both in the same block.
@@ -516,7 +517,7 @@ bool llvm::isValidAssumeForContext(const Instruction *Inv,
     if (!isGuaranteedToTransferExecutionToSuccessor(Range, 15))
       return false;
 
-    return !isEphemeralValueOf(Inv, CxtI);
+    return AllowEphemerals || !isEphemeralValueOf(Inv, CxtI);
   }
 
   // Inv and CxtI are in different blocks.
diff --git a/llvm/test/Analysis/BasicAA/separate_storage-alias-sets.ll b/llvm/test/Analysis/BasicAA/separate_storage-alias-sets.ll
new file mode 100644
index 00000000000000..7bd87d1b227b36
--- /dev/null
+++ b/llvm/test/Analysis/BasicAA/separate_storage-alias-sets.ll
@@ -0,0 +1,43 @@
+; We want BasicAA to make a reasonable conservative guess as to context for
+; separate storage hints. This lets alias analysis users (such as the alias set
+; tracker) who can't be context-sensitive still get the benefits of hints.
+
+; RUN: opt < %s -basic-aa-separate-storage -S -passes=print-alias-sets 2>&1 | FileCheck %s
+
+declare void @llvm.assume(i1)
+
+; CHECK-LABEL: Alias sets for function 'arg_arg'
+; CHECK: AliasSet[{{.*}}, 1] must alias, Ref Pointers: (ptr %a1, LocationSize::precise(1))
+; CHECK: AliasSet[{{.*}}, 1] must alias, Ref Pointers: (ptr %a2, LocationSize::precise(1))
+define void @arg_arg(ptr %a1, ptr %a2) {
+entry:
+  call void @llvm.assume(i1 true) [ "separate_storage"(ptr %a1, ptr %a2) ]
+  %0 = load i8, ptr %a1
+  %1 = load i8, ptr %a2
+  ret void
+}
+
+; CHECK-LABEL: Alias sets for function 'arg_inst'
+; CHECK: AliasSet[{{.*}}, 1] must alias, Ref Pointers: (ptr %a1, LocationSize::precise(1))
+; CHECK: AliasSet[{{.*}}, 1] must alias, Ref Pointers: (ptr %0, LocationSize::precise(1))
+define void @arg_inst(ptr %a1, ptr %a2) {
+entry:
+  %0 = getelementptr inbounds i8, ptr %a2, i64 20
+  call void @llvm.assume(i1 true) [ "separate_storage"(ptr %a1, ptr %0) ]
+  %1 = load i8, ptr %a1
+  %2 = load i8, ptr %0
+  ret void
+}
+
+; CHECK-LABEL: Alias sets for function 'inst_inst'
+; CHECK: AliasSet[{{.*}}, 1] must alias, Ref Pointers: (ptr %0, LocationSize::precise(1))
+; CHECK: AliasSet[{{.*}}, 1] must alias, Ref Pointers: (ptr %1, LocationSize::precise(1))
+define void @inst_inst(ptr %a1, ptr %a2) {
+entry:
+  %0 = getelementptr inbounds i8, ptr %a1, i64 20
+  %1 = getelementptr inbounds i8, ptr %a2, i64 20
+  call void @llvm.assume(i1 true) [ "separate_storage"(ptr %0, ptr %1) ]
+  %2 = load i8, ptr %0
+  %3 = load i8, ptr %1
+  ret void
+}

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please first submit a PR to enable the separate storage functionality by default? It seems to be essentially free right now, and we may have trouble enabling it in the future if it is made more expensive first.

This patch doesn't seem to add much overhead on average, but seems to be quite expensive in some cases. For example there is an 8% compile-time increase for Interp/Disasm.cpp (http://llvm-compile-time-tracker.com/compare_clang.php?from=662453637d86179c0a7c55532172f735f9c14c6d&to=cc866a92542e99de74d4a946467bb8a73e227757&stat=instructions%3Au&sortBy=interestingness)

I'm wondering whether the motivating case could be solved by using (must-exec) separate_storage assumptions in the entry block to infer noalias on function parameters. Assuming that is a valid transform, we would benefit from other noalias-based optimizations as well. (We could probably even drop the assume in that case, as noalias should enable strictly more optimizations, and gets preserved during inlining using scoped AA metadata.)

@nikic
Copy link
Contributor

nikic commented Jan 3, 2024

#76806 should help the Interp/Disasm.cpp case by making performance independent from the total number of assumes in the function.

@davidtgoldblatt
Copy link
Contributor Author

Can you please first submit a PR to enable the separate storage functionality by default? It seems to be essentially free right now, and we may have trouble enabling it in the future if it is made more expensive first.

Will do.

I'm wondering whether the motivating case could be solved by using (must-exec) separate_storage assumptions in the entry block to infer noalias on function parameters. Assuming that is a valid transform, we would benefit from other noalias-based optimizations as well. (We could probably even drop the assume in that case, as noalias should enable strictly more optimizations, and gets preserved during inlining using scoped AA metadata.)

I think noalias hinting can't be enough; separate storage hints actually end up being stronger than the scoped noalias stuff post-inlining, because the scopes don't propagate to accesses in the calling function (i.e. it can't be used to tell the caller that the caller's memory accesses don't alias one another; the metadata still just applies to accesses that were there in the inlined function). So these hints can be used in cases where there's sort of a split between core library code owned by experts, who want the hints to benefit the code written by nonexperts. Example: https://gcc.godbolt.org/z/Tfae5d5dM

The definition of the pointer of the memory location being queried is always one
such context. Even this conservative guess can be better than no guess at all in
some cases.

Fixes llvm#64666
davidtgoldblatt added a commit that referenced this pull request Jan 3, 2024
As requested in
#76770 (review)

A few months of experimentation in a large codebase did not reveal any
significant build speed regressions, and b07bf16 speeds up hint lookup
even further.

Co-authored-by: David Goldblatt <davidgoldblatt@meta.com>
@davidtgoldblatt
Copy link
Contributor Author

Can you please first submit a PR to enable the separate storage functionality by default? It seems to be essentially free right now, and we may have trouble enabling it in the future if it is made more expensive first.

Will do.

(Merged in #76864)

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davidtgoldblatt davidtgoldblatt merged commit 852596d into llvm:main Jan 4, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Loop Vectorizer] __builtin_assume_separate_storage is not propagated to LV AA
3 participants