-
Notifications
You must be signed in to change notification settings - Fork 15k
[NFC][clang][AST] Add compile-time dispatch for random access iterators in append() #162000
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-clang Author: Samarth Narang (snarang181) ChangesThis patch improves the Full diff: https://github.com/llvm/llvm-project/pull/162000.diff 1 Files Affected:
diff --git a/clang/include/clang/AST/ASTVector.h b/clang/include/clang/AST/ASTVector.h
index d5a04767ca133..ca30ec343492a 100644
--- a/clang/include/clang/AST/ASTVector.h
+++ b/clang/include/clang/AST/ASTVector.h
@@ -180,9 +180,21 @@ class ASTVector {
size_t capacity() const { return this->capacity_ptr() - Begin; }
/// append - Add the specified range to the end of the SmallVector.
- template<typename in_iter>
+ template <typename in_iter>
void append(const ASTContext &C, in_iter in_start, in_iter in_end) {
- size_type NumInputs = std::distance(in_start, in_end);
+ using size_type =
+ typename std::remove_reference_t<decltype(*this)>::size_type;
+ using iterator_category =
+ typename std::iterator_traits<in_iter>::iterator_category;
+
+ size_t NumInputs = 0;
+ constexpr bool is_random_access =
+ std::is_base_of_v<std::random_access_iterator_tag, iterator_category>;
+
+ if constexpr (is_random_access)
+ NumInputs = static_cast<size_type>(in_end - in_start);
+ else
+ NumInputs = static_cast<size_type>(std::distance(in_start, in_end));
if (NumInputs == 0)
return;
@@ -192,9 +204,11 @@ class ASTVector {
this->grow(C, this->size()+NumInputs);
// Copy the new elements over.
- // TODO: NEED To compile time dispatch on whether in_iter is a random access
- // iterator to use the fast uninitialized_copy.
- std::uninitialized_copy(in_start, in_end, this->end());
+ if constexpr (is_random_access)
+ std::uninitialized_copy_n(in_start, NumInputs, this->end());
+ else
+ std::uninitialized_copy(in_start, in_end, this->end());
+
this->setEnd(this->end() + NumInputs);
}
|
| // TODO: NEED To compile time dispatch on whether in_iter is a random access | ||
| // iterator to use the fast uninitialized_copy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment correct? Can you confirm that std::distance and std::uninitialized_copy do NOT perform compile time dispatch? How is std::uninitialized_copy_n faster here?
Update: I can confirm that the implementation of std::uninitialized_copy doesn't perform compile time dispatch, but I don't see how is std::uninitialized_copy_n faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zwuis,thanks for your review.
std::distance and std::uninitialized_copy both rely on iterator tag dispatch, which determines the implementation at overload resolution time rather than through explicit compile-time branching. While std::distance will use an O(1) path for random access iterators, that dispatch happens inside the library and isn’t guaranteed to inline cleanly into templated call sites like append(). Moreover, std::uninitialized_copy always performs an element-by-element loop regardless of iterator category, offering no benefit when the range size (NumInputs) is already known. Using std::uninitialized_copy_n with a precomputed element count avoids the per-iteration sentinel checks and enables the compiler to generate tighter loops.
|
Nice, thanks. I see some very minor improvement in wall-time. |
|
LGTM but please let other people take a look. |
|
Requesting others to review, thanks. |
|
Been a while since this has been up for review. Even though @zwuis has approved it, they'd be more comfortable with another review. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the changes here are basically correct, but ASTVector is not a common datatype (only used a handful of times) and needs significant refactoring to reduce code duplication (see the FIXME comment at the top of the file). So while the changes are reasonable, it seems like it may be adding more complexity for little value given that the compile time tracker found no significant performance changes one way or the other. But it's not that much complexity, either.
On balance, I'm inclined to accept. Does anyone want to make a case for why we shouldn't?
7a0f49e to
ac3615e
Compare
This patch improves the
append()implementation by introducing a compile-time dispatch between random access and non-random access iterators.