-
Notifications
You must be signed in to change notification settings - Fork 14.8k
[ADT] Simplify SmallSetIterator with std::variant (NFC) #157229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ADT] Simplify SmallSetIterator with std::variant (NFC) #157229
Conversation
SmallSet supports two underlying data structures -- SmallVector and std::set. SmallSetIterator supports them by maintaining a flag IsSmall and a union of the two underlying iterators. This patch essentially packages the flag and the union into std::variant, eliminating a lot of code involving IsSmall. With this change, we drop from the Rule of Five all the way down to Rule of Zero.
@llvm/pr-subscribers-llvm-adt Author: Kazu Hirata (kazutakahirata) ChangesSmallSet supports two underlying data structures -- SmallVector and This patch essentially packages the flag and the union into With this change, we drop from the Rule of Five all the way down to Full diff: https://github.com/llvm/llvm-project/pull/157229.diff 1 Files Affected:
diff --git a/llvm/include/llvm/ADT/SmallSet.h b/llvm/include/llvm/ADT/SmallSet.h
index 0e90293352630..11d1d25ccb932 100644
--- a/llvm/include/llvm/ADT/SmallSet.h
+++ b/llvm/include/llvm/ADT/SmallSet.h
@@ -24,6 +24,7 @@
#include <initializer_list>
#include <set>
#include <utility>
+#include <variant>
namespace llvm {
@@ -37,92 +38,25 @@ class SmallSetIterator
using SetIterTy = typename std::set<T, C>::const_iterator;
using VecIterTy = typename SmallVector<T, N>::const_iterator;
- /// Iterators to the parts of the SmallSet containing the data. They are set
- /// depending on isSmall.
- union {
- SetIterTy SetIter;
- VecIterTy VecIter;
- };
-
- bool IsSmall;
+ // A variant over the two possible iterators.
+ std::variant<VecIterTy, SetIterTy> Iter;
public:
- SmallSetIterator(SetIterTy SetIter) : SetIter(SetIter), IsSmall(false) {}
-
- SmallSetIterator(VecIterTy VecIter) : VecIter(VecIter), IsSmall(true) {}
-
- // Spell out destructor, copy/move constructor and assignment operators for
- // MSVC STL, where set<T>::const_iterator is not trivially copy constructible.
- ~SmallSetIterator() {
- if (IsSmall)
- VecIter.~VecIterTy();
- else
- SetIter.~SetIterTy();
- }
-
- SmallSetIterator(const SmallSetIterator &Other) : IsSmall(Other.IsSmall) {
- if (IsSmall)
- VecIter = Other.VecIter;
- else
- // Use placement new, to make sure SetIter is properly constructed, even
- // if it is not trivially copy-able (e.g. in MSVC).
- new (&SetIter) SetIterTy(Other.SetIter);
- }
-
- SmallSetIterator(SmallSetIterator &&Other) : IsSmall(Other.IsSmall) {
- if (IsSmall)
- VecIter = std::move(Other.VecIter);
- else
- // Use placement new, to make sure SetIter is properly constructed, even
- // if it is not trivially copy-able (e.g. in MSVC).
- new (&SetIter) SetIterTy(std::move(Other.SetIter));
- }
-
- SmallSetIterator& operator=(const SmallSetIterator& Other) {
- // Call destructor for SetIter, so it gets properly destroyed if it is
- // not trivially destructible in case we are setting VecIter.
- if (!IsSmall)
- SetIter.~SetIterTy();
-
- IsSmall = Other.IsSmall;
- if (IsSmall)
- VecIter = Other.VecIter;
- else
- new (&SetIter) SetIterTy(Other.SetIter);
- return *this;
- }
-
- SmallSetIterator& operator=(SmallSetIterator&& Other) {
- // Call destructor for SetIter, so it gets properly destroyed if it is
- // not trivially destructible in case we are setting VecIter.
- if (!IsSmall)
- SetIter.~SetIterTy();
-
- IsSmall = Other.IsSmall;
- if (IsSmall)
- VecIter = std::move(Other.VecIter);
- else
- new (&SetIter) SetIterTy(std::move(Other.SetIter));
- return *this;
- }
+ SmallSetIterator(VecIterTy VecIter) : Iter(VecIter) {}
+ SmallSetIterator(SetIterTy SetIter) : Iter(SetIter) {}
bool operator==(const SmallSetIterator &RHS) const {
- if (IsSmall != RHS.IsSmall)
- return false;
- if (IsSmall)
- return VecIter == RHS.VecIter;
- return SetIter == RHS.SetIter;
+ return Iter == RHS.Iter;
}
SmallSetIterator &operator++() { // Preincrement
- if (IsSmall)
- ++VecIter;
- else
- ++SetIter;
+ std::visit([](auto &It) { ++It; }, Iter);
return *this;
}
- const T &operator*() const { return IsSmall ? *VecIter : *SetIter; }
+ const T &operator*() const {
+ return std::visit([](const auto &It) -> const T & { return *It; }, Iter);
+ }
};
/// SmallSet - This maintains a set of unique values, optimizing for the case
|
std::variant can be quite heavy from both compile time and run time perspectives. Can you measure if there is any impact? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the concern about an extra include, I haven't seen std::variant
used in llvm/mlir but heard that some implementations are not very peformant. I think we'd want to proceed with caution in case this doesn't end up being just an NFC.
In my understanding the performance problems from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::variant should never be used in performance sensitive code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the curious: https://godbolt.org/z/c9sMMn665. Could the overhead be caused bystd::visit
on some toolchains?
@nikic Thank you for benchmarking! I've also verified that |
SmallSet supports two underlying data structures -- SmallVector and
std::set. SmallSetIterator supports them by maintaining a flag
IsSmall and a union of the two underlying iterators.
This patch essentially packages the flag and the union into
std::variant, eliminating a lot of code involving IsSmall.
With this change, we drop from the Rule of Five all the way down to
Rule of Zero.