Skip to content

Conversation

mshockwave
Copy link
Member

@mshockwave mshockwave commented Sep 28, 2025

An alternative approach to #149732 , which sorts the DAG before dumping it. That approach runs a risk of altering the codegen result as we don't know if any of the downstream DAG users relies on the node ID, which was updated as part of the sorting.

The new method proposed by this PR does not update the node ID or any of the DAG's internal states: the newly added SelectionDAG::getTopologicallyOrderedNodes is a const member function that returns a list of all nodes in their topological order.

@llvmbot
Copy link
Member

llvmbot commented Sep 28, 2025

@llvm/pr-subscribers-llvm-selectiondag

Author: Min-Yih Hsu (mshockwave)

Changes

An alternative approach to #149732 , which sorts the DAG before dumping it. That approach runs a risk of altering the codegen result as we don't know if any of the downstream DAG users relies on the node ID, which was updated as part of the sorting.

The new method proposed by this PR does not update the node ID or any of the DAG's internal states: the newly added SelectionDAG::getTopologicallyOrderedNodes is a const member function that returns a list of all nodes in their topological order. The existing SelectionDAG::AssignTopologicalOrder was also modified to use this function as they share a nearly identical sorting algorithm.

The only downside I can think of is that while AssignTopologicalOrder sorts nodes in-place, getTopologicallyOrderedNodes uses auxiliary spaces to store nodes and metadata.


Full diff: https://github.com/llvm/llvm-project/pull/161097.diff

4 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/SelectionDAG.h (+7-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+49-80)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+12-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+15-10)
diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h
index d9d6f0bcdcb84..446239a007186 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -1959,6 +1959,10 @@ class SelectionDAG {
   LLVM_ABI SDValue makeEquivalentMemoryOrdering(LoadSDNode *OldLoad,
                                                 SDValue NewMemOp);
 
+  /// Get all the nodes in their topological order without modifying any states.
+  LLVM_ABI void getTopologicallyOrderedNodes(
+      SmallVectorImpl<const SDNode *> &SortedNodes) const;
+
   /// Topological-sort the AllNodes list and a
   /// assign a unique node id for each node in the DAG based on their
   /// topological order. Returns the number of nodes.
@@ -2009,7 +2013,9 @@ class SelectionDAG {
   /// function mirrors \c llvm::salvageDebugInfo.
   LLVM_ABI void salvageDebugInfo(SDNode &N);
 
-  LLVM_ABI void dump() const;
+  /// Dump the textual format of this DAG. Print nodes in sorted orders is \p
+  /// Sorted is true.
+  LLVM_ABI void dump(bool Sorted = false) const;
 
   /// In most cases this function returns the ABI alignment for a given type,
   /// except for illegal vector types where the alignment exceeds that of the
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 8fc7eabf90ea8..f1999fe34e88a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -12610,87 +12610,56 @@ void SelectionDAG::ReplaceAllUsesOfValuesWith(const SDValue *From,
 /// based on their topological order. It returns the maximum id and a vector
 /// of the SDNodes* in assigned order by reference.
 unsigned SelectionDAG::AssignTopologicalOrder() {
-  unsigned DAGSize = 0;
-
-  // SortedPos tracks the progress of the algorithm. Nodes before it are
-  // sorted, nodes after it are unsorted. When the algorithm completes
-  // it is at the end of the list.
-  allnodes_iterator SortedPos = allnodes_begin();
-
-  // Visit all the nodes. Move nodes with no operands to the front of
-  // the list immediately. Annotate nodes that do have operands with their
-  // operand count. Before we do this, the Node Id fields of the nodes
-  // may contain arbitrary values. After, the Node Id fields for nodes
-  // before SortedPos will contain the topological sort index, and the
-  // Node Id fields for nodes At SortedPos and after will contain the
-  // count of outstanding operands.
-  for (SDNode &N : llvm::make_early_inc_range(allnodes())) {
+  SmallVector<const SDNode *> SortedNodes(AllNodes.size());
+  getTopologicallyOrderedNodes(SortedNodes);
+
+  for (auto [Idx, ConstN] : enumerate(SortedNodes)) {
+    auto *N = const_cast<SDNode *>(ConstN);
+    N->setNodeId(Idx);
+    if (N->getIterator() != std::prev(allnodes_end()))
+      AllNodes.push_back(AllNodes.remove(N));
+  }
+
+  return SortedNodes.size();
+}
+
+void SelectionDAG::getTopologicallyOrderedNodes(
+    SmallVectorImpl<const SDNode *> &SortedNodes) const {
+  SortedNodes.clear();
+  // Node -> remaining number of outstanding operands.
+  DenseMap<const SDNode *, unsigned> RemainingOperands;
+
+  // Put nodes without any operands into SortedNodes first.
+  for (const SDNode &N : allnodes()) {
     checkForCycles(&N, this);
-    unsigned Degree = N.getNumOperands();
-    if (Degree == 0) {
-      // A node with no uses, add it to the result array immediately.
-      N.setNodeId(DAGSize++);
-      allnodes_iterator Q(&N);
-      if (Q != SortedPos)
-        SortedPos = AllNodes.insert(SortedPos, AllNodes.remove(Q));
-      assert(SortedPos != AllNodes.end() && "Overran node list");
-      ++SortedPos;
-    } else {
-      // Temporarily use the Node Id as scratch space for the degree count.
-      N.setNodeId(Degree);
-    }
-  }
-
-  // Visit all the nodes. As we iterate, move nodes into sorted order,
-  // such that by the time the end is reached all nodes will be sorted.
-  for (SDNode &Node : allnodes()) {
-    SDNode *N = &Node;
-    checkForCycles(N, this);
-    // N is in sorted position, so all its uses have one less operand
-    // that needs to be sorted.
-    for (SDNode *P : N->users()) {
-      unsigned Degree = P->getNodeId();
-      assert(Degree != 0 && "Invalid node degree");
-      --Degree;
-      if (Degree == 0) {
-        // All of P's operands are sorted, so P may sorted now.
-        P->setNodeId(DAGSize++);
-        if (P->getIterator() != SortedPos)
-          SortedPos = AllNodes.insert(SortedPos, AllNodes.remove(P));
-        assert(SortedPos != AllNodes.end() && "Overran node list");
-        ++SortedPos;
-      } else {
-        // Update P's outstanding operand count.
-        P->setNodeId(Degree);
-      }
-    }
-    if (Node.getIterator() == SortedPos) {
-#ifndef NDEBUG
-      allnodes_iterator I(N);
-      SDNode *S = &*++I;
-      dbgs() << "Overran sorted position:\n";
-      S->dumprFull(this); dbgs() << "\n";
-      dbgs() << "Checking if this is due to cycles\n";
-      checkForCycles(this, true);
-#endif
-      llvm_unreachable(nullptr);
-    }
-  }
-
-  assert(SortedPos == AllNodes.end() &&
-         "Topological sort incomplete!");
-  assert(AllNodes.front().getOpcode() == ISD::EntryToken &&
-         "First node in topological sort is not the entry token!");
-  assert(AllNodes.front().getNodeId() == 0 &&
-         "First node in topological sort has non-zero id!");
-  assert(AllNodes.front().getNumOperands() == 0 &&
-         "First node in topological sort has operands!");
-  assert(AllNodes.back().getNodeId() == (int)DAGSize-1 &&
-         "Last node in topologic sort has unexpected id!");
-  assert(AllNodes.back().use_empty() &&
-         "Last node in topologic sort has users!");
-  assert(DAGSize == allnodes_size() && "Node count mismatch!");
-  return DAGSize;
+    unsigned NumOperands = N.getNumOperands();
+    if (NumOperands == 0)
+      SortedNodes.push_back(&N);
+    else
+      // Record their total number of outstanding operands.
+      RemainingOperands[&N] = NumOperands;
+  }
+
+  // A node is pushed into SortedNodes when all of its operands (predecessors in
+  // the graph) are also in SortedNodes.
+  for (unsigned i = 0U; i < SortedNodes.size(); ++i) {
+    const SDNode *N = SortedNodes[i];
+    for (const SDNode *U : N->users()) {
+      unsigned &NumRemOperands = RemainingOperands[U];
+      assert(NumRemOperands && "Invalid number of remaining operands");
+      --NumRemOperands;
+      if (!NumRemOperands)
+        SortedNodes.push_back(U);
+    }
+  }
+
+  assert(SortedNodes.size() == AllNodes.size() && "Node count mismatch");
+  assert(SortedNodes.front()->getOpcode() == ISD::EntryToken &&
+         "First node in topological sort is not the entry token");
+  assert(SortedNodes.front()->getNumOperands() == 0 &&
+         "First node in topological sort has operands");
+  assert(SortedNodes.back()->use_empty() &&
+         "Last node in topologic sort has users");
 }
 
 /// AddDbgValue - Add a dbg_value SDNode. If SD is non-null that means the
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 4b2a00c2e2cfa..46dde33acb2da 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -1061,13 +1061,23 @@ static void DumpNodes(const SDNode *N, unsigned indent, const SelectionDAG *G) {
   N->dump(G);
 }
 
-LLVM_DUMP_METHOD void SelectionDAG::dump() const {
+LLVM_DUMP_METHOD void SelectionDAG::dump(bool Sorted) const {
   dbgs() << "SelectionDAG has " << AllNodes.size() << " nodes:\n";
 
-  for (const SDNode &N : allnodes()) {
+  auto dumpEachNode = [this](const SDNode &N) {
     if (!N.hasOneUse() && &N != getRoot().getNode() &&
         (!shouldPrintInline(N, this) || N.use_empty()))
       DumpNodes(&N, 2, this);
+  };
+
+  if (Sorted) {
+    SmallVector<const SDNode *> SortedNodes(AllNodes.size());
+    getTopologicallyOrderedNodes(SortedNodes);
+    for (const SDNode *N : SortedNodes)
+      dumpEachNode(*N);
+  } else {
+    for (const SDNode &N : allnodes())
+      dumpEachNode(N);
   }
 
   if (getRoot().getNode()) DumpNodes(getRoot().getNode(), 2, this);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index e61558c59bf0d..0604bd4ed8889 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -144,6 +144,11 @@ UseMBPI("use-mbpi",
         cl::init(true), cl::Hidden);
 
 #ifndef NDEBUG
+static cl::opt<bool>
+    DumpSortedDAG("dump-sorted-dags", cl::Hidden,
+                  cl::desc("Print the DAG with sorted nodes in debug dump"),
+                  cl::init(false));
+
 static cl::opt<std::string>
 FilterDAGBasicBlockName("filter-view-dags", cl::Hidden,
                         cl::desc("Only display the basic block whose name "
@@ -932,7 +937,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
   ISEL_DUMP(dbgs() << "\nInitial selection DAG: "
                    << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                    << "'\n";
-            CurDAG->dump());
+            CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
   if (TTI->hasBranchDivergence())
@@ -952,7 +957,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
   ISEL_DUMP(dbgs() << "\nOptimized lowered selection DAG: "
                    << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                    << "'\n";
-            CurDAG->dump());
+            CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
   if (TTI->hasBranchDivergence())
@@ -974,7 +979,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
   ISEL_DUMP(dbgs() << "\nType-legalized selection DAG: "
                    << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                    << "'\n";
-            CurDAG->dump());
+            CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
   if (TTI->hasBranchDivergence())
@@ -998,7 +1003,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
     ISEL_DUMP(dbgs() << "\nOptimized type-legalized selection DAG: "
                      << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                      << "'\n";
-              CurDAG->dump());
+              CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
     if (TTI->hasBranchDivergence())
@@ -1016,7 +1021,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
     ISEL_DUMP(dbgs() << "\nVector-legalized selection DAG: "
                      << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                      << "'\n";
-              CurDAG->dump());
+              CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
     if (TTI->hasBranchDivergence())
@@ -1032,7 +1037,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
     ISEL_DUMP(dbgs() << "\nVector/type-legalized selection DAG: "
                      << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                      << "'\n";
-              CurDAG->dump());
+              CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
     if (TTI->hasBranchDivergence())
@@ -1052,7 +1057,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
     ISEL_DUMP(dbgs() << "\nOptimized vector-legalized selection DAG: "
                      << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                      << "'\n";
-              CurDAG->dump());
+              CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
     if (TTI->hasBranchDivergence())
@@ -1072,7 +1077,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
   ISEL_DUMP(dbgs() << "\nLegalized selection DAG: "
                    << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                    << "'\n";
-            CurDAG->dump());
+            CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
   if (TTI->hasBranchDivergence())
@@ -1092,7 +1097,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
   ISEL_DUMP(dbgs() << "\nOptimized legalized selection DAG: "
                    << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                    << "'\n";
-            CurDAG->dump());
+            CurDAG->dump(DumpSortedDAG));
 
 #if !defined(NDEBUG) && LLVM_ENABLE_ABI_BREAKING_CHECKS
   if (TTI->hasBranchDivergence())
@@ -1116,7 +1121,7 @@ void SelectionDAGISel::CodeGenAndEmitDAG() {
   ISEL_DUMP(dbgs() << "\nSelected selection DAG: "
                    << printMBBReference(*FuncInfo->MBB) << " '" << BlockName
                    << "'\n";
-            CurDAG->dump());
+            CurDAG->dump(DumpSortedDAG));
 
   if (ViewSchedDAGs && MatchFilterBB)
     CurDAG->viewGraph("scheduler input for " + BlockName);

// Node Id fields for nodes At SortedPos and after will contain the
// count of outstanding operands.
for (SDNode &N : llvm::make_early_inc_range(allnodes())) {
SmallVector<const SDNode *> SortedNodes(AllNodes.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this vector have any compile time cost on large DAGs?

Copy link
Member Author

@mshockwave mshockwave Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the very least, it'll be slower to not reserving enough space for all nodes ahead of time. But other than that, yeah the heap allocation time might add up if we're going to create this vector every time we dump a DAG. One solution is allowing users of this API to pass in a pre-allocated (auxiliary) vector to avoid the problem we're discussing here, what do you think?

EDIT: I thought this is the dump API, but I think having a pre-allocated space might still be an option.

Copy link
Collaborator

@topperc topperc Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that this added a vector to the non-dumping path too. Directly related to the downside you listed

The only downside I can think of is that while AssignTopologicalOrder sorts nodes in-place, getTopologicallyOrderedNodes uses auxiliary spaces to store nodes and metadata.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can also leave AssignTopologicalOrder untouched and only use getTopologicallyOrderedNodes in SelectionDAG::dump

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've reverted the code of AssignTopologicalOrder to its original one.

@mshockwave mshockwave requested a review from topperc October 1, 2025 21:20
};

if (Sorted) {
SmallVector<const SDNode *> SortedNodes(AllNodes.size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is initializing AllNodes.size() elements in the vector so it will perform a memset. getTopologicallyOrderedNodes will clear the vector which will change the size, but not the capacity back to 0. Should we use reserve instead to avoid the memset?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It's updated now.

@mshockwave mshockwave requested a review from topperc October 3, 2025 18:26
Copy link
Collaborator

@topperc topperc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mshockwave mshockwave merged commit c2c2e4e into llvm:main Oct 3, 2025
9 checks passed
@mshockwave mshockwave deleted the patch/sdag-immutable-sort branch October 3, 2025 23:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants