Skip to content

Conversation

gardell
Copy link

@gardell gardell commented Mar 28, 2024

I've been optimizing some code that streams large amounts of binary data using std::vector<uint8_t> and in the profiler I can see that the application spends a lot of time in ~vector(), specifically __clear().

For the std::vector<bool> specialization, __clear is not called when destructing the vector. I believe this is because bool is trivially destructible.

In this PR I propose a template specialization of __destroy_vector in the general vector<T> case that allows us to suppress the call to __clear if the element type in question is is_trivially_destructible.

Please let me know if I should clarify something. This is my first submission so I am sure I've gotten a lot of stuff wrong!

Regards

If the element type of a `vector` is `trivially_destructible` do not
call `__clear` when destroying the `vector`.
@gardell gardell requested a review from a team as a code owner March 28, 2024 12:52
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Mar 28, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 28, 2024

@llvm/pr-subscribers-libcxx

Author: Johan Gardell (gardell)

Changes

I've been optimizing some code that streams large amounts of binary data using std::vector&lt;uint8_t&gt; and in the profiler I can see that the application spends a lot of time in ~vector(), specifically __clear().

For the std::vector&lt;bool&gt; specialization, __clear is not called when destructing the vector. I believe this is because bool is trivially destructible.

In this PR I propose a template specialization of __destroy_vector in the general vector&lt;T&gt; case that allows us to suppress the call to __clear if the element type in question is is_trivially_destructible.

Please let me know if I should clarify something. This is my first submission so I am sure I've gotten a lot of stuff wrong!

Regards


Full diff: https://github.com/llvm/llvm-project/pull/86947.diff

2 Files Affected:

  • (modified) libcxx/include/vector (+10)
  • (added) libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp (+85)
diff --git a/libcxx/include/vector b/libcxx/include/vector
index 0908482600c533..e0df865366103a 100644
--- a/libcxx/include/vector
+++ b/libcxx/include/vector
@@ -347,6 +347,7 @@ template<class T, class charT> requires is-vector-bool-reference<T> // Since C++
 #include <__type_traits/is_allocator.h>
 #include <__type_traits/is_constructible.h>
 #include <__type_traits/is_nothrow_assignable.h>
+#include <__type_traits/is_trivially_destructible.h>
 #include <__type_traits/noexcept_move_assign_container.h>
 #include <__type_traits/type_identity.h>
 #include <__utility/exception_guard.h>
@@ -486,6 +487,15 @@ private:
   public:
     _LIBCPP_CONSTEXPR _LIBCPP_HIDE_FROM_ABI __destroy_vector(vector& __vec) : __vec_(__vec) {}
 
+    template <typename _Tp2 = _Tp, __enable_if_t<is_trivially_destructible<_Tp2>::value, int> = 0>
+    _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void operator()() {
+      if (__vec_.__begin_ != nullptr) {
+        __vec_.__annotate_delete();
+        __alloc_traits::deallocate(__vec_.__alloc(), __vec_.__begin_, __vec_.capacity());
+      }
+    }
+
+    template <typename _Tp2 = _Tp, __enable_if_t<!is_trivially_destructible<_Tp2>::value, int> = 0>
     _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void operator()() {
       if (__vec_.__begin_ != nullptr) {
         __vec_.__clear();
diff --git a/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp b/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp
new file mode 100644
index 00000000000000..5a5399e2fffc68
--- /dev/null
+++ b/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp
@@ -0,0 +1,85 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <vector>
+#include <cassert>
+
+struct int_alloc
+{
+    typedef int value_type;
+
+    template< class U >
+    struct rebind
+    {
+        typedef int_alloc other;
+    };
+
+    int* allocate(std::size_t n)
+    {
+        return new int[n];
+    }
+    void deallocate(int* p, std::size_t n)
+    {
+        for (std::size_t i = 0; i < n; ++i) {
+            assert(p[i] != 0);
+        }
+        delete p;
+    }
+};
+
+struct with_dtor
+{
+    with_dtor() = default;
+    with_dtor(const with_dtor &) = default;
+
+    int x = 42;
+
+    ~with_dtor() {
+        x = 0;
+    }
+};
+
+struct with_dtor_alloc
+{
+    typedef with_dtor value_type;
+
+    template< class U >
+    struct rebind
+    {
+        typedef with_dtor_alloc other;
+    };
+
+    with_dtor* allocate(std::size_t n)
+    {
+        return new with_dtor[n];
+    }
+    void deallocate(with_dtor* p, std::size_t n)
+    {
+        for (std::size_t i = 0; i < n; ++i) {
+            assert(p[i].x == 0);
+        }
+        delete[] p;
+    }
+};
+
+void tests()
+{
+    {
+        std::vector<with_dtor, with_dtor_alloc> v(5, with_dtor());
+    }
+    {
+        std::vector<int, int_alloc> v(5, 42);
+    }
+}
+
+int main(int, char**)
+{
+    tests();
+    return 0;
+}
+

@philnik777
Copy link
Contributor

When destroying a vector<uint8_t> there is no call to __clear(): https://godbolt.org/z/748P3f8xT. Could you give an example where this optimization applies?

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff de0abc0983d355bbd971c5c571ba4c209a0c63ea 6917d186bbb197356bdeaf663380976c8080aba1 -- libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp libcxx/include/vector
View the diff from clang-format here.
diff --git a/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp b/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp
index 5a5399e2ff..5eafe1191a 100644
--- a/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector/vector.cons/dtor_no_trivial_element_dtor_call.pass.cpp
@@ -9,77 +9,55 @@
 #include <vector>
 #include <cassert>
 
-struct int_alloc
-{
-    typedef int value_type;
-
-    template< class U >
-    struct rebind
-    {
-        typedef int_alloc other;
-    };
-
-    int* allocate(std::size_t n)
-    {
-        return new int[n];
-    }
-    void deallocate(int* p, std::size_t n)
-    {
-        for (std::size_t i = 0; i < n; ++i) {
-            assert(p[i] != 0);
-        }
-        delete p;
+struct int_alloc {
+  typedef int value_type;
+
+  template < class U >
+  struct rebind {
+    typedef int_alloc other;
+  };
+
+  int* allocate(std::size_t n) { return new int[n]; }
+  void deallocate(int* p, std::size_t n) {
+    for (std::size_t i = 0; i < n; ++i) {
+      assert(p[i] != 0);
     }
+    delete p;
+  }
 };
 
-struct with_dtor
-{
-    with_dtor() = default;
-    with_dtor(const with_dtor &) = default;
+struct with_dtor {
+  with_dtor()                 = default;
+  with_dtor(const with_dtor&) = default;
 
-    int x = 42;
+  int x = 42;
 
-    ~with_dtor() {
-        x = 0;
-    }
+  ~with_dtor() { x = 0; }
 };
 
-struct with_dtor_alloc
-{
-    typedef with_dtor value_type;
+struct with_dtor_alloc {
+  typedef with_dtor value_type;
 
-    template< class U >
-    struct rebind
-    {
-        typedef with_dtor_alloc other;
-    };
+  template < class U >
+  struct rebind {
+    typedef with_dtor_alloc other;
+  };
 
-    with_dtor* allocate(std::size_t n)
-    {
-        return new with_dtor[n];
-    }
-    void deallocate(with_dtor* p, std::size_t n)
-    {
-        for (std::size_t i = 0; i < n; ++i) {
-            assert(p[i].x == 0);
-        }
-        delete[] p;
+  with_dtor* allocate(std::size_t n) { return new with_dtor[n]; }
+  void deallocate(with_dtor* p, std::size_t n) {
+    for (std::size_t i = 0; i < n; ++i) {
+      assert(p[i].x == 0);
     }
+    delete[] p;
+  }
 };
 
-void tests()
-{
-    {
-        std::vector<with_dtor, with_dtor_alloc> v(5, with_dtor());
-    }
-    {
-        std::vector<int, int_alloc> v(5, 42);
-    }
+void tests() {
+  { std::vector<with_dtor, with_dtor_alloc> v(5, with_dtor()); }
+  { std::vector<int, int_alloc> v(5, 42); }
 }
 
-int main(int, char**)
-{
-    tests();
-    return 0;
+int main(int, char**) {
+  tests();
+  return 0;
 }
-

void deallocate(int* p, std::size_t n)
{
for (std::size_t i = 0; i < n; ++i) {
assert(p[i] != 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC this is testing UB. Perhaps this test file should be moved to libcxx/test/libcxx subdirectory.

@gardell
Copy link
Author

gardell commented Apr 4, 2024

So, it turns out rather embarrassingly that my CMake setup was passing -O2 to every file in the project, except the one with the large amount of ~vector that was compiled with -O0.

Apologies for taking your time!

@gardell gardell closed this Apr 4, 2024
@philnik777
Copy link
Contributor

So, it turns out rather embarrassingly that my CMake setup was passing -O2 to every file in the project, except the one with the large amount of ~vector that was compiled with -O0.

Apologies for taking your time!

No worries! I've done that often enough myself :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants