Skip to content

Conversation

@kazutakahirata
Copy link
Contributor

This patch moves the core logic of swapping to DenseMapBase::swap.

DenseMapBase::swap attempts to swap LHS and RHS fast with
metadata/pointer updates. This always succeeds in DenseMap. For
SmallDenseMap, it succeeds only if both LHS and RHS are in the large
mode.

If that fails, we swap LHS and RHS using three moves involving a
temporary instance. Even then, we try to move fast with
metadata/pointer updates where we can.

This patch is part of the effort outlined in #168255.

This patch moves the core logic of swapping to DenseMapBase::swap.

DenseMapBase::swap attempts to swap LHS and RHS fast with
metadata/pointer updates.  This always succeeds in DenseMap.  For
SmallDenseMap, it succeeds only if both LHS and RHS are in the large
mode.

If that fails, we swap LHS and RHS using three moves involving a
temporary instance.  Even then, we try to move fast with
metadata/pointer updates where we can.

This patch is part of the effort outlined in llvm#168255.
@llvmbot
Copy link
Member

llvmbot commented Nov 18, 2025

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

This patch moves the core logic of swapping to DenseMapBase::swap.

DenseMapBase::swap attempts to swap LHS and RHS fast with
metadata/pointer updates. This always succeeds in DenseMap. For
SmallDenseMap, it succeeds only if both LHS and RHS are in the large
mode.

If that fails, we swap LHS and RHS using three moves involving a
temporary instance. Even then, we try to move fast with
metadata/pointer updates where we can.

This patch is part of the effort outlined in #168255.


Full diff: https://github.com/llvm/llvm-project/pull/168486.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/DenseMap.h (+66-81)
diff --git a/llvm/include/llvm/ADT/DenseMap.h b/llvm/include/llvm/ADT/DenseMap.h
index a706f68fab81b..32877de3a24bd 100644
--- a/llvm/include/llvm/ADT/DenseMap.h
+++ b/llvm/include/llvm/ADT/DenseMap.h
@@ -363,7 +363,16 @@ class DenseMapBase : public DebugEpochBase {
   void swap(DerivedT &RHS) {
     this->incrementEpoch();
     RHS.incrementEpoch();
-    derived().swapImpl(RHS);
+    if (derived().maybeSwapFast(RHS.derived()))
+      return;
+
+    DerivedT &LHS = derived();
+    DerivedT Tmp;
+    Tmp.destroyAll();
+    Tmp.derived().kill();
+    Tmp.moveWithoutRehash(std::move(LHS));
+    LHS.moveWithoutRehash(std::move(RHS));
+    RHS.moveWithoutRehash(std::move(Tmp));
   }
 
 protected:
@@ -424,6 +433,43 @@ class DenseMapBase : public DebugEpochBase {
     return NextPowerOf2(NumEntries * 4 / 3 + 1);
   }
 
+  // Move buckets without rehash.
+  //
+  // Preconditions:
+  // - *this is killed.
+  // - RHS is valid.
+  //
+  // Post conditions:
+  // - *this is valid.
+  // - RHS is valid or killed (but not uninitialized).
+  void moveWithoutRehash(DerivedT &&RHS) {
+    if (derived().maybeMoveFast(std::move(RHS.derived())))
+      return;
+
+    derived().allocateBuckets(RHS.getNumBuckets());
+    setNumEntries(RHS.getNumEntries());
+    setNumTombstones(RHS.getNumTombstones());
+    DerivedT &LHS = derived();
+    iterator_range<BucketT *> LHSBuckets = LHS.buckets();
+    iterator_range<BucketT *> RHSBuckets = RHS.buckets();
+    assert(std::distance(LHSBuckets.begin(), LHSBuckets.end()) ==
+           std::distance(RHSBuckets.begin(), RHSBuckets.end()));
+    const KeyT EmptyKey = KeyInfoT::getEmptyKey();
+    const KeyT TombstoneKey = KeyInfoT::getTombstoneKey();
+    for (BucketT *L = LHSBuckets.begin(), *R = RHSBuckets.begin(),
+                 *E = LHSBuckets.end();
+         L != E; ++L, ++R) {
+      ::new (&L->getFirst()) KeyT(std::move(R->getFirst()));
+      R->getFirst().~KeyT();
+      if (!KeyInfoT::isEqual(L->getFirst(), EmptyKey) &&
+          !KeyInfoT::isEqual(L->getFirst(), TombstoneKey)) {
+        ::new (&L->getSecond()) ValueT(std::move(R->getSecond()));
+        R->getSecond().~ValueT();
+      }
+    }
+    RHS.derived().kill();
+  }
+
   // Move key/value from Other to *this.
   // Other is left in a valid but empty state.
   void moveFrom(DerivedT &Other) {
@@ -795,13 +841,6 @@ class DenseMap : public DenseMapBase<DenseMap<KeyT, ValueT, KeyInfoT, BucketT>,
   }
 
 private:
-  void swapImpl(DenseMap &RHS) {
-    std::swap(Buckets, RHS.Buckets);
-    std::swap(NumEntries, RHS.NumEntries);
-    std::swap(NumTombstones, RHS.NumTombstones);
-    std::swap(NumBuckets, RHS.NumBuckets);
-  }
-
   unsigned getNumEntries() const { return NumEntries; }
 
   void setNumEntries(unsigned Num) { NumEntries = Num; }
@@ -842,11 +881,16 @@ class DenseMap : public DenseMapBase<DenseMap<KeyT, ValueT, KeyInfoT, BucketT>,
                     static_cast<unsigned>(NextPowerOf2(MinNumBuckets - 1)));
   }
 
-  bool maybeMoveFast(DenseMap &&Other) {
-    swapImpl(Other);
+  bool maybeSwapFast(DenseMap &Other) {
+    std::swap(Buckets, Other.Buckets);
+    std::swap(NumEntries, Other.NumEntries);
+    std::swap(NumTombstones, Other.NumTombstones);
+    std::swap(NumBuckets, Other.NumBuckets);
     return true;
   }
 
+  bool maybeMoveFast(DenseMap &&Other) { return maybeSwapFast(Other); }
+
   // Plan how to shrink the bucket table.  Return:
   // - {false, 0} to reuse the existing bucket table
   // - {true, N} to reallocate a bucket table with N entries
@@ -941,77 +985,6 @@ class SmallDenseMap
   }
 
 private:
-  void swapImpl(SmallDenseMap &RHS) {
-    unsigned TmpNumEntries = RHS.NumEntries;
-    RHS.NumEntries = NumEntries;
-    NumEntries = TmpNumEntries;
-    std::swap(NumTombstones, RHS.NumTombstones);
-
-    const KeyT EmptyKey = KeyInfoT::getEmptyKey();
-    const KeyT TombstoneKey = KeyInfoT::getTombstoneKey();
-    if (Small && RHS.Small) {
-      // If we're swapping inline bucket arrays, we have to cope with some of
-      // the tricky bits of DenseMap's storage system: the buckets are not
-      // fully initialized. Thus we swap every key, but we may have
-      // a one-directional move of the value.
-      for (unsigned i = 0, e = InlineBuckets; i != e; ++i) {
-        BucketT *LHSB = &getInlineBuckets()[i],
-                *RHSB = &RHS.getInlineBuckets()[i];
-        bool hasLHSValue = (!KeyInfoT::isEqual(LHSB->getFirst(), EmptyKey) &&
-                            !KeyInfoT::isEqual(LHSB->getFirst(), TombstoneKey));
-        bool hasRHSValue = (!KeyInfoT::isEqual(RHSB->getFirst(), EmptyKey) &&
-                            !KeyInfoT::isEqual(RHSB->getFirst(), TombstoneKey));
-        if (hasLHSValue && hasRHSValue) {
-          // Swap together if we can...
-          std::swap(*LHSB, *RHSB);
-          continue;
-        }
-        // Swap separately and handle any asymmetry.
-        std::swap(LHSB->getFirst(), RHSB->getFirst());
-        if (hasLHSValue) {
-          ::new (&RHSB->getSecond()) ValueT(std::move(LHSB->getSecond()));
-          LHSB->getSecond().~ValueT();
-        } else if (hasRHSValue) {
-          ::new (&LHSB->getSecond()) ValueT(std::move(RHSB->getSecond()));
-          RHSB->getSecond().~ValueT();
-        }
-      }
-      return;
-    }
-    if (!Small && !RHS.Small) {
-      std::swap(*getLargeRep(), *RHS.getLargeRep());
-      return;
-    }
-
-    SmallDenseMap &SmallSide = Small ? *this : RHS;
-    SmallDenseMap &LargeSide = Small ? RHS : *this;
-
-    // First stash the large side's rep and move the small side across.
-    LargeRep TmpRep = std::move(*LargeSide.getLargeRep());
-    LargeSide.getLargeRep()->~LargeRep();
-    LargeSide.Small = true;
-    // This is similar to the standard move-from-old-buckets, but the bucket
-    // count hasn't actually rotated in this case. So we have to carefully
-    // move construct the keys and values into their new locations, but there
-    // is no need to re-hash things.
-    for (unsigned i = 0, e = InlineBuckets; i != e; ++i) {
-      BucketT *NewB = &LargeSide.getInlineBuckets()[i],
-              *OldB = &SmallSide.getInlineBuckets()[i];
-      ::new (&NewB->getFirst()) KeyT(std::move(OldB->getFirst()));
-      OldB->getFirst().~KeyT();
-      if (!KeyInfoT::isEqual(NewB->getFirst(), EmptyKey) &&
-          !KeyInfoT::isEqual(NewB->getFirst(), TombstoneKey)) {
-        ::new (&NewB->getSecond()) ValueT(std::move(OldB->getSecond()));
-        OldB->getSecond().~ValueT();
-      }
-    }
-
-    // The hard part of moving the small buckets across is done, just move
-    // the TmpRep into its new home.
-    SmallSide.Small = false;
-    new (SmallSide.getLargeRep()) LargeRep(std::move(TmpRep));
-  }
-
   unsigned getNumEntries() const { return NumEntries; }
 
   void setNumEntries(unsigned Num) {
@@ -1105,6 +1078,18 @@ class SmallDenseMap
                     static_cast<unsigned>(NextPowerOf2(MinNumBuckets - 1)));
   }
 
+  bool maybeSwapFast(SmallDenseMap &Other) {
+    if (Small || Other.Small)
+      return false;
+
+    unsigned Tmp = Other.NumEntries;
+    Other.NumEntries = NumEntries;
+    NumEntries = Tmp;
+    std::swap(NumTombstones, Other.NumTombstones);
+    std::swap(*getLargeRep(), *Other.getLargeRep());
+    return true;
+  }
+
   bool maybeMoveFast(SmallDenseMap &&Other) {
     if (Other.Small)
       return false;

@github-actions
Copy link

github-actions bot commented Nov 18, 2025

🐧 Linux x64 Test Results

  • 186344 tests passed
  • 4859 tests skipped

Copy link
Contributor

@s-barannikov s-barannikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just general comments, I'm not familiar with DenseMap implementation details


DerivedT &LHS = derived();
DerivedT Tmp;
Tmp.destroyAll();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you call destroyAll on an object just created?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the SmallDenseMap case, DerivedT Tmp; would initialize each inline bucket with EmptyKey. That needs to be destroyed. Some unit tests do check that even empty keys are constructed and destructed correctly.

As an optimization, we could have a special constructor that does not initialize inline buckets.

FWIW, destroyAll is a no-op for trivially destructible key/value pairs:

  void destroyAll() {
    // No need to iterate through the buckets if both KeyT and ValueT are
    // trivially destructible.
    if constexpr (std::is_trivially_destructible_v<KeyT> &&
                  std::is_trivially_destructible_v<ValueT>)
      return;

R->getSecond().~ValueT();
}
}
RHS.derived().kill();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't kill have a side effect of deallocating buckets? That's not what was happening before this patch.

RHS is already DerivedT, no need for derived().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to deallocateBuckets inside kill is a no-op here because this path is used in the small mode.

If RHS is in the large mode, we take the fast path:

    if (derived().maybeMoveFast(std::move(RHS)))
      return;

If RHS is in the small mode, we take the slow path ending with kill().

kill() here puts RHS in a known good state that is safe for destruction. Without kill(), RHS would remain a move-from object, and that's not safe for destruction, which involves destroyAll.

// Post conditions:
// - *this is valid.
// - RHS is valid or killed (but not uninitialized).
void moveWithoutRehash(DerivedT &&RHS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original function had two separate paths for LHS.Small && RHS.Small and LHS.Small != RHS.Small. Why has this changed? I suppose this was an optimization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are both handled here and DenseMapBase::swap via the Tmp = LHS; LHS = RHS; RHS = Tmp-style swap.

In the previous code:

  • LHS.Small != RHS.Small was handled literally in the Tmp = LHS; LHS = RHS; RHS = Tmp style.
  • LHS.Small && RHS.Small was handled by bucket-wise swap -- sort of like std::swap(LHS[I], RHS[I]) for each I.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants