Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

Without this patch, SmallDenseMap::grow has two separate code paths to
grow the bucket array. The code path to handle the small mode has its
own traversal over the bucket array. This patch simplifies this logic
as follows:

  1. Allocate a temporary instance of SmallDenseMap.
  2. Move valid key/value pairs to the temporary instance.
  3. Move LargeRep to *this.

Remarks:

  • This patch adds moveFromImpl to move key/value pairs.
    moveFromOldBuckets is updated to use the new helper function.

  • This patch adds a private constructor to SmallDenseMap that takes an
    exact number of buckets, accompanied by tag ExactBucketCount.

  • This patch adds a fast path to deallocateBuckets in case
    getLargeRep()->NumBuckets == 0, just like destroyAll. This path is
    used to destruct zombie instances after moves.

  • In somewhat rare cases, we "grow" from the small mode to the small
    mode when there are many tombstones in the inline storage. This is
    handled with another call to moveFrom.

Without this patch, SmallDenseMap::grow has two separate code paths to
grow the bucket array.  The code path to handle the small mode has its
own traversal over the bucket array.  This patch simplifies this logic
as follows:

1. Allocate a temporary instance of SmallDenseMap.
2. Move valid key/value pairs to the temporary instance.
3. Move LargeRep to *this.

Remarks:

- This patch adds moveFromImpl to move key/value pairs.
  moveFromOldBuckets is updated to use the new helper function.

- This patch adds a private constructor to SmallDenseMap that takes an
  exact number of buckets, accompanied by tag ExactBucketCount.

- This patch adds a fast path to deallocateBuckets in case
  getLargeRep()->NumBuckets == 0, just like destroyAll.  This path is
  used to destruct zombie instances after moves.

- In somewhat rare cases, we "grow" from the small mode to the small
  mode when there are many tombstones in the inline storage.  This is
  handled with another call to moveFrom.
@llvmbot
Copy link
Member

llvmbot commented Nov 13, 2025

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

Without this patch, SmallDenseMap::grow has two separate code paths to
grow the bucket array. The code path to handle the small mode has its
own traversal over the bucket array. This patch simplifies this logic
as follows:

  1. Allocate a temporary instance of SmallDenseMap.
  2. Move valid key/value pairs to the temporary instance.
  3. Move LargeRep to *this.

Remarks:

  • This patch adds moveFromImpl to move key/value pairs.
    moveFromOldBuckets is updated to use the new helper function.

  • This patch adds a private constructor to SmallDenseMap that takes an
    exact number of buckets, accompanied by tag ExactBucketCount.

  • This patch adds a fast path to deallocateBuckets in case
    getLargeRep()->NumBuckets == 0, just like destroyAll. This path is
    used to destruct zombie instances after moves.

  • In somewhat rare cases, we "grow" from the small mode to the small
    mode when there are many tombstones in the inline storage. This is
    handled with another call to moveFrom.


Full diff: https://github.com/llvm/llvm-project/pull/167829.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/DenseMap.h (+30-42)
diff --git a/llvm/include/llvm/ADT/DenseMap.h b/llvm/include/llvm/ADT/DenseMap.h
index d5b13e7731550..c4d139411503f 100644
--- a/llvm/include/llvm/ADT/DenseMap.h
+++ b/llvm/include/llvm/ADT/DenseMap.h
@@ -413,9 +413,7 @@ class DenseMapBase : public DebugEpochBase {
     return NextPowerOf2(NumEntries * 4 / 3 + 1);
   }
 
-  void moveFromOldBuckets(iterator_range<BucketT *> OldBuckets) {
-    initEmpty();
-
+  void moveFromImpl(iterator_range<BucketT *> OldBuckets) {
     // Insert all the old elements.
     const KeyT EmptyKey = KeyInfoT::getEmptyKey();
     const KeyT TombstoneKey = KeyInfoT::getTombstoneKey();
@@ -438,6 +436,14 @@ class DenseMapBase : public DebugEpochBase {
     }
   }
 
+  void moveFromOldBuckets(iterator_range<BucketT *> OldBuckets) {
+    initEmpty();
+    moveFromImpl(OldBuckets);
+  }
+
+  // Move key/value from Other to *this.  Other will be in a zombie state.
+  void moveFrom(DerivedT &Other) { moveFromImpl(Other.buckets()); }
+
   void copyFrom(const DerivedT &other) {
     this->destroyAll();
     derived().deallocateBuckets();
@@ -889,6 +895,12 @@ class SmallDenseMap
   /// a large bucket. This union will be discriminated by the 'Small' bit.
   AlignedCharArrayUnion<BucketT[InlineBuckets], LargeRep> storage;
 
+  struct ExactBucketCount {};
+  SmallDenseMap(unsigned NumBuckets, ExactBucketCount) {
+    allocateBuckets(NumBuckets);
+    this->BaseT::initEmpty();
+  }
+
 public:
   explicit SmallDenseMap(unsigned NumElementsToReserve = 0) {
     init(NumElementsToReserve);
@@ -1065,7 +1077,7 @@ class SmallDenseMap
   }
 
   void deallocateBuckets() {
-    if (Small)
+    if (Small || getLargeRep()->NumBuckets == 0)
       return;
 
     deallocate_buffer(getLargeRep()->Buckets,
@@ -1096,46 +1108,22 @@ class SmallDenseMap
     if (AtLeast > InlineBuckets)
       AtLeast = std::max<unsigned>(64, NextPowerOf2(AtLeast - 1));
 
-    if (Small) {
-      // First move the inline buckets into a temporary storage.
-      AlignedCharArrayUnion<BucketT[InlineBuckets]> TmpStorage;
-      BucketT *TmpBegin = reinterpret_cast<BucketT *>(&TmpStorage);
-      BucketT *TmpEnd = TmpBegin;
+    SmallDenseMap Tmp(AtLeast, ExactBucketCount{});
+    Tmp.moveFrom(*this);
 
-      // Loop over the buckets, moving non-empty, non-tombstones into the
-      // temporary storage. Have the loop move the TmpEnd forward as it goes.
-      const KeyT EmptyKey = KeyInfoT::getEmptyKey();
-      const KeyT TombstoneKey = KeyInfoT::getTombstoneKey();
-      for (BucketT &B : inlineBuckets()) {
-        if (!KeyInfoT::isEqual(B.getFirst(), EmptyKey) &&
-            !KeyInfoT::isEqual(B.getFirst(), TombstoneKey)) {
-          assert(size_t(TmpEnd - TmpBegin) < InlineBuckets &&
-                 "Too many inline buckets!");
-          ::new (&TmpEnd->getFirst()) KeyT(std::move(B.getFirst()));
-          ::new (&TmpEnd->getSecond()) ValueT(std::move(B.getSecond()));
-          ++TmpEnd;
-          B.getSecond().~ValueT();
-        }
-        B.getFirst().~KeyT();
-      }
-
-      // AtLeast == InlineBuckets can happen if there are many tombstones,
-      // and grow() is used to remove them. Usually we always switch to the
-      // large rep here.
-      allocateBuckets(AtLeast);
-      this->moveFromOldBuckets(llvm::make_range(TmpBegin, TmpEnd));
-      return;
+    if (Tmp.Small) {
+      // Use moveFrom in those rare cases where we stay in the small mode.  This
+      // can happen when we have many tombstones.
+      this->BaseT::initEmpty();
+      this->moveFrom(Tmp);
+      Tmp.Small = false;
+      Tmp.getLargeRep()->NumBuckets = 0;
+    } else {
+      Small = false;
+      NumTombstones = 0;
+      *getLargeRep() = std::move(*Tmp.getLargeRep());
+      Tmp.getLargeRep()->NumBuckets = 0;
     }
-
-    LargeRep OldRep = std::move(*getLargeRep());
-    getLargeRep()->~LargeRep();
-    allocateBuckets(AtLeast);
-
-    this->moveFromOldBuckets(OldRep.buckets());
-
-    // Free the old table.
-    deallocate_buffer(OldRep.Buckets, sizeof(BucketT) * OldRep.NumBuckets,
-                      alignof(BucketT));
   }
 
   // Plan how to shrink the bucket table.  Return:

@kazutakahirata kazutakahirata merged commit c78fb8d into llvm:main Nov 13, 2025
10 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_20251112_ADT_DenseMap_SmallDenseMap_grow branch November 13, 2025 16:38
@pcc
Copy link
Contributor

pcc commented Nov 14, 2025

LSan reports that this change caused memory leaks: https://lab.llvm.org/buildbot/#/builders/24/builds/14677/steps/12/logs/stdio

Can you please take a look?

@npanchen
Copy link
Contributor

LSan reports that this change caused memory leaks: https://lab.llvm.org/buildbot/#/builders/24/builds/14677/steps/12/logs/stdio

Can you please take a look?

also hit lsan problem with mlir tools. @kazutakahirata can you please take a look ?

@kazutakahirata
Copy link
Contributor Author

LSan reports that this change caused memory leaks: https://lab.llvm.org/buildbot/#/builders/24/builds/14677/steps/12/logs/stdio
Can you please take a look?

also hit lsan problem with mlir tools. @kazutakahirata can you please take a look ?

Thanks! I'll post a patch soon.

@kazutakahirata
Copy link
Contributor Author

also hit lsan problem with mlir tools. @kazutakahirata can you please take a look ?

Thanks! I'll post a patch soon.

I've posted #168011.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants