Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MemProf] Remove empty edges once after cloning #85320

Merged
merged 3 commits into from
Mar 27, 2024

Conversation

teresajohnson
Copy link
Contributor

Restructure the handling of edges that become empty during the cloning
process. Instead of removing them as they become empty (no context ids
and alloc type), do this once after all cloning is complete.

This has no effect on the cloning result, but prepares for a follow on
change that does improve the cloning. The structural change here reduces
the diffs for the follow on change, which would be much more difficult
with the previous handling.

Restructure the handling of edges that become empty during the cloning
process. Instead of removing them as they become empty (no context ids
and alloc type), do this once after all cloning is complete.

This has no effect on the cloning result, but prepares for a follow on
change that does improve the cloning. The structural change here reduces
the diffs for the follow on change, which would be much more difficult
with the previous handling.
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 14, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Teresa Johnson (teresajohnson)

Changes

Restructure the handling of edges that become empty during the cloning
process. Instead of removing them as they become empty (no context ids
and alloc type), do this once after all cloning is complete.

This has no effect on the cloning result, but prepares for a follow on
change that does improve the cloning. The structural change here reduces
the diffs for the follow on change, which would be much more difficult
with the previous handling.


Full diff: https://github.com/llvm/llvm-project/pull/85320.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp (+48-29)
diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
index 271d3ed40030b4..742b62a277a4b9 100644
--- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
@@ -306,7 +306,7 @@ class CallsiteContextGraph {
     // True if this node was effectively removed from the graph, in which case
     // its context id set, caller edges, and callee edges should all be empty.
     bool isRemoved() const {
-      assert(ContextIds.empty() ==
+      assert(!ContextIds.empty() ||
              (CalleeEdges.empty() && CallerEdges.empty()));
       return ContextIds.empty();
     }
@@ -349,9 +349,12 @@ class CallsiteContextGraph {
     }
   };
 
-  /// Helper to remove callee edges that have allocation type None (due to not
+  /// Helpers to remove callee edges that have allocation type None (due to not
   /// carrying any context ids) after transformations.
   void removeNoneTypeCalleeEdges(ContextNode *Node);
+  void
+  recursivelyRemoveNoneTypeCalleeEdges(ContextNode *Node,
+                                       DenseSet<const ContextNode *> &Visited);
 
 protected:
   /// Get a list of nodes corresponding to the stack ids in the given callsite
@@ -2427,11 +2430,42 @@ void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::
   }
 }
 
+template <typename DerivedCCG, typename FuncTy, typename CallTy>
+void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::
+    recursivelyRemoveNoneTypeCalleeEdges(
+        ContextNode *Node, DenseSet<const ContextNode *> &Visited) {
+  auto Inserted = Visited.insert(Node);
+  if (!Inserted.second)
+    return;
+
+  removeNoneTypeCalleeEdges(Node);
+
+  for (auto *Clone : Node->Clones)
+    recursivelyRemoveNoneTypeCalleeEdges(Clone, Visited);
+
+  // The recursive call may remove some of this Node's caller edges.
+  // Iterate over a copy and skip any that were removed.
+  auto CallerEdges = Node->CallerEdges;
+  for (auto &Edge : CallerEdges) {
+    // Skip any that have been removed by an earlier recursive call.
+    if (Edge->Callee == nullptr && Edge->Caller == nullptr) {
+      assert(!std::count(Node->CallerEdges.begin(), Node->CallerEdges.end(),
+                         Edge));
+      continue;
+    }
+    recursivelyRemoveNoneTypeCalleeEdges(Edge->Caller, Visited);
+  }
+}
+
 template <typename DerivedCCG, typename FuncTy, typename CallTy>
 void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::identifyClones() {
   DenseSet<const ContextNode *> Visited;
   for (auto &Entry : AllocationCallToContextNodeMap)
     identifyClones(Entry.second, Visited);
+  Visited.clear();
+  for (auto &Entry : AllocationCallToContextNodeMap)
+    recursivelyRemoveNoneTypeCalleeEdges(Entry.second, Visited);
+  check();
 }
 
 // helper function to check an AllocType is cold or notcold or both.
@@ -2446,7 +2480,7 @@ template <typename DerivedCCG, typename FuncTy, typename CallTy>
 void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::identifyClones(
     ContextNode *Node, DenseSet<const ContextNode *> &Visited) {
   if (VerifyNodes)
-    checkNode<DerivedCCG, FuncTy, CallTy>(Node);
+    checkNode<DerivedCCG, FuncTy, CallTy>(Node, /*CheckEdges=*/false);
   assert(!Node->CloneOf);
 
   // If Node as a null call, then either it wasn't found in the module (regular
@@ -2507,8 +2541,16 @@ void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::identifyClones(
   std::stable_sort(Node->CallerEdges.begin(), Node->CallerEdges.end(),
                    [&](const std::shared_ptr<ContextEdge> &A,
                        const std::shared_ptr<ContextEdge> &B) {
-                     assert(checkColdOrNotCold(A->AllocTypes) &&
-                            checkColdOrNotCold(B->AllocTypes));
+                     // Nodes with non-empty context ids should be sorted before
+                     // those with empty context ids.
+                     if (A->ContextIds.empty())
+                       // Either B ContextIds are non-empty (in which case we
+                       // should return false because B < A), or B ContextIds
+                       // are empty, in which case they are equal, and we should
+                       // maintain the original relative ordering.
+                       return false;
+                     if (B->ContextIds.empty())
+                       return true;
 
                      if (A->AllocTypes == B->AllocTypes)
                        // Use the first context id for each edge as a
@@ -2587,39 +2629,16 @@ void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::identifyClones(
            Node->AllocTypes != (uint8_t)AllocationType::None);
     // Sanity check that no alloc types on clone or its edges are None.
     assert(Clone->AllocTypes != (uint8_t)AllocationType::None);
-    assert(llvm::none_of(
-        Clone->CallerEdges, [&](const std::shared_ptr<ContextEdge> &E) {
-          return E->AllocTypes == (uint8_t)AllocationType::None;
-        }));
   }
 
-  // Cloning may have resulted in some cloned callee edges with type None,
-  // because they aren't carrying any contexts. Remove those edges.
-  for (auto *Clone : Node->Clones) {
-    removeNoneTypeCalleeEdges(Clone);
-    if (VerifyNodes)
-      checkNode<DerivedCCG, FuncTy, CallTy>(Clone);
-  }
   // We should still have some context ids on the original Node.
   assert(!Node->ContextIds.empty());
 
-  // Remove any callee edges that ended up with alloc type None after creating
-  // clones and updating callee edges.
-  removeNoneTypeCalleeEdges(Node);
-
   // Sanity check that no alloc types on node or edges are None.
   assert(Node->AllocTypes != (uint8_t)AllocationType::None);
-  assert(llvm::none_of(Node->CalleeEdges,
-                       [&](const std::shared_ptr<ContextEdge> &E) {
-                         return E->AllocTypes == (uint8_t)AllocationType::None;
-                       }));
-  assert(llvm::none_of(Node->CallerEdges,
-                       [&](const std::shared_ptr<ContextEdge> &E) {
-                         return E->AllocTypes == (uint8_t)AllocationType::None;
-                       }));
 
   if (VerifyNodes)
-    checkNode<DerivedCCG, FuncTy, CallTy>(Node);
+    checkNode<DerivedCCG, FuncTy, CallTy>(Node, /*CheckEdges=*/false);
 }
 
 void ModuleCallsiteContextGraph::updateAllocationCall(

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add a separate test for this or is it covered in upcoming changes?

Visited.clear();
for (auto &Entry : AllocationCallToContextNodeMap)
recursivelyRemoveNoneTypeCalleeEdges(Entry.second, Visited);
check();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is check() here? Do we need it to run on all builds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, this was unconditional from when I was debugging. It should be guarded by the VerifyCCG option like other invocations of this function. Fixed.

@@ -306,7 +306,7 @@ class CallsiteContextGraph {
// True if this node was effectively removed from the graph, in which case
// its context id set, caller edges, and callee edges should all be empty.
bool isRemoved() const {
assert(ContextIds.empty() ==
assert(!ContextIds.empty() ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An assert with an or condition is a little hard to reason about for me. For this to fail both LHS and RHS need to be false - context IDs is empty and at least one of CalleeEdges, CallerEdges is not empty. Can we rewrite this as two separate assertions to simplify? Also it seems that it is now inconsistent with the comment above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original assert was wrong, which showed up in testing after the reorganization here. The case that was wrong was when we were left with a single (non-removed) node in the graph, which would have non-empty context ids, but no callee or caller edges. I played around with restructuring the assertion but didn't come up with anything that looked clearer to me. E.g. here's another structure but it feels more convoluted to me personally:

assert(!(ContextIds.empty() &&
               (!CalleeEdges.empty() || !CallerEdges.empty()));

It's not inconsistent with the comment (it is still true that all 3 should be empty if it was removed) but I've added new comments to explain the case described above. Let me know if you have a preference of what to do for the assert structure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the following equivalent to what you want?

if (ContextIds.empty()) {
  assert(CalleeEdges.empty() && CallerEdges.empty() && "Context ids empty but at least one of callee and caller edges were not!");
}

This shouldn't cause non-assert builds to complain since we don't have any assert specific temporaries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is equivalent, it just means an extra check that is executed unnecessarily. Although hopefully the compiler would common it with the check on the return?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think there will be an extra check. With clang for non-assert builds (where NDEBUG is defined) the IR has an empty basic block [1] which should get optimized away in the generated code.

[1] https://godbolt.org/z/r5fceqeTK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, will go ahead and restructure like that.

@teresajohnson
Copy link
Contributor Author

Does it make sense to add a separate test for this or is it covered in upcoming changes?

This is essentially an NFC change (I thought about adding NFC to it but decided not since it changes enough about the internal ordering of the graph transformation, although it should have no impact on anything user-visible).

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -306,7 +306,7 @@ class CallsiteContextGraph {
// True if this node was effectively removed from the graph, in which case
// its context id set, caller edges, and callee edges should all be empty.
bool isRemoved() const {
assert(ContextIds.empty() ==
assert(!ContextIds.empty() ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the following equivalent to what you want?

if (ContextIds.empty()) {
  assert(CalleeEdges.empty() && CallerEdges.empty() && "Context ids empty but at least one of callee and caller edges were not!");
}

This shouldn't cause non-assert builds to complain since we don't have any assert specific temporaries.

@teresajohnson teresajohnson merged commit 082e7c4 into llvm:main Mar 27, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants