Skip to content

Conversation

kazutakahirata
Copy link
Contributor

SmallSet supports two underlying data structures -- SmallVector and
std::set. SmallSetIterator supports them by maintaining a flag
IsSmall and a union of the two underlying iterators.

This patch essentially packages the flag and the union into
std::variant, eliminating a lot of code involving IsSmall.

With this change, we drop from the Rule of Five all the way down to
Rule of Zero.

SmallSet supports two underlying data structures -- SmallVector and
std::set.  SmallSetIterator supports them by maintaining a flag
IsSmall and a union of the two underlying iterators.

This patch essentially packages the flag and the union into
std::variant, eliminating a lot of code involving IsSmall.

With this change, we drop from the Rule of Five all the way down to
Rule of Zero.
@llvmbot
Copy link
Member

llvmbot commented Sep 6, 2025

@llvm/pr-subscribers-llvm-adt

Author: Kazu Hirata (kazutakahirata)

Changes

SmallSet supports two underlying data structures -- SmallVector and
std::set. SmallSetIterator supports them by maintaining a flag
IsSmall and a union of the two underlying iterators.

This patch essentially packages the flag and the union into
std::variant, eliminating a lot of code involving IsSmall.

With this change, we drop from the Rule of Five all the way down to
Rule of Zero.


Full diff: https://github.com/llvm/llvm-project/pull/157229.diff

1 Files Affected:

  • (modified) llvm/include/llvm/ADT/SmallSet.h (+10-76)
diff --git a/llvm/include/llvm/ADT/SmallSet.h b/llvm/include/llvm/ADT/SmallSet.h
index 0e90293352630..11d1d25ccb932 100644
--- a/llvm/include/llvm/ADT/SmallSet.h
+++ b/llvm/include/llvm/ADT/SmallSet.h
@@ -24,6 +24,7 @@
 #include <initializer_list>
 #include <set>
 #include <utility>
+#include <variant>
 
 namespace llvm {
 
@@ -37,92 +38,25 @@ class SmallSetIterator
   using SetIterTy = typename std::set<T, C>::const_iterator;
   using VecIterTy = typename SmallVector<T, N>::const_iterator;
 
-  /// Iterators to the parts of the SmallSet containing the data. They are set
-  /// depending on isSmall.
-  union {
-    SetIterTy SetIter;
-    VecIterTy VecIter;
-  };
-
-  bool IsSmall;
+  // A variant over the two possible iterators.
+  std::variant<VecIterTy, SetIterTy> Iter;
 
 public:
-  SmallSetIterator(SetIterTy SetIter) : SetIter(SetIter), IsSmall(false) {}
-
-  SmallSetIterator(VecIterTy VecIter) : VecIter(VecIter), IsSmall(true) {}
-
-  // Spell out destructor, copy/move constructor and assignment operators for
-  // MSVC STL, where set<T>::const_iterator is not trivially copy constructible.
-  ~SmallSetIterator() {
-    if (IsSmall)
-      VecIter.~VecIterTy();
-    else
-      SetIter.~SetIterTy();
-  }
-
-  SmallSetIterator(const SmallSetIterator &Other) : IsSmall(Other.IsSmall) {
-    if (IsSmall)
-      VecIter = Other.VecIter;
-    else
-      // Use placement new, to make sure SetIter is properly constructed, even
-      // if it is not trivially copy-able (e.g. in MSVC).
-      new (&SetIter) SetIterTy(Other.SetIter);
-  }
-
-  SmallSetIterator(SmallSetIterator &&Other) : IsSmall(Other.IsSmall) {
-    if (IsSmall)
-      VecIter = std::move(Other.VecIter);
-    else
-      // Use placement new, to make sure SetIter is properly constructed, even
-      // if it is not trivially copy-able (e.g. in MSVC).
-      new (&SetIter) SetIterTy(std::move(Other.SetIter));
-  }
-
-  SmallSetIterator& operator=(const SmallSetIterator& Other) {
-    // Call destructor for SetIter, so it gets properly destroyed if it is
-    // not trivially destructible in case we are setting VecIter.
-    if (!IsSmall)
-      SetIter.~SetIterTy();
-
-    IsSmall = Other.IsSmall;
-    if (IsSmall)
-      VecIter = Other.VecIter;
-    else
-      new (&SetIter) SetIterTy(Other.SetIter);
-    return *this;
-  }
-
-  SmallSetIterator& operator=(SmallSetIterator&& Other) {
-    // Call destructor for SetIter, so it gets properly destroyed if it is
-    // not trivially destructible in case we are setting VecIter.
-    if (!IsSmall)
-      SetIter.~SetIterTy();
-
-    IsSmall = Other.IsSmall;
-    if (IsSmall)
-      VecIter = std::move(Other.VecIter);
-    else
-      new (&SetIter) SetIterTy(std::move(Other.SetIter));
-    return *this;
-  }
+  SmallSetIterator(VecIterTy VecIter) : Iter(VecIter) {}
+  SmallSetIterator(SetIterTy SetIter) : Iter(SetIter) {}
 
   bool operator==(const SmallSetIterator &RHS) const {
-    if (IsSmall != RHS.IsSmall)
-      return false;
-    if (IsSmall)
-      return VecIter == RHS.VecIter;
-    return SetIter == RHS.SetIter;
+    return Iter == RHS.Iter;
   }
 
   SmallSetIterator &operator++() { // Preincrement
-    if (IsSmall)
-      ++VecIter;
-    else
-      ++SetIter;
+    std::visit([](auto &It) { ++It; }, Iter);
     return *this;
   }
 
-  const T &operator*() const { return IsSmall ? *VecIter : *SetIter; }
+  const T &operator*() const {
+    return std::visit([](const auto &It) -> const T & { return *It; }, Iter);
+  }
 };
 
 /// SmallSet - This maintains a set of unique values, optimizing for the case

@s-barannikov
Copy link
Contributor

s-barannikov commented Sep 6, 2025

std::variant can be quite heavy from both compile time and run time perspectives. Can you measure if there is any impact?

Copy link
Member

@kuhar kuhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the concern about an extra include, I haven't seen std::variant used in llvm/mlir but heard that some implementations are not very peformant. I think we'd want to proceed with caution in case this doesn't end up being just an NFC.

@tgymnich
Copy link
Member

tgymnich commented Sep 6, 2025

In my understanding the performance problems from std::variant stem from exception handling which shouldn't be an issue for the llvm code base.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@tgymnich tgymnich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the curious: https://godbolt.org/z/c9sMMn665. Could the overhead be caused bystd::visit on some toolchains?

@kazutakahirata
Copy link
Contributor Author

https://llvm-compile-time-tracker.com/compare.php?from=e0f00bd645bfd75a6717242cc42d63eeebb1db68&to=22587d6d1a9ae5a4e6805fa1dc964818bbbc6cb5&stat=instructions:u

std::variant should never be used in performance sensitive code.

@nikic Thank you for benchmarking! I've also verified that std::variant does indeed slow down even micro benchmarking. Closing.

@kazutakahirata kazutakahirata deleted the cleanup_20250905_ADT_SmallSet_std_variant branch September 6, 2025 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants