Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libc++][hardening] Don't trigger redundant checks in the fast mode. #77176

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

var-const
Copy link
Member

Sometimes we essentially check the same condition twice -- for example,
a class might check that an index into its vector member variable is
valid before accessing it, but vector::operator[] contains the same
check. These "redundant" checks allow catching errors closer to the
source and providing a better error message, but they also impose
additional overhead. Marking the "early" checks as redundant allows
ignoring them in the fast mode (while still getting a guaranteed trap)
while still getting better error messages in the extensive mode and
above. Introducing a separate wrapper macro allows making the concept of
redundant assertions orthogonal to assertion categories and retaining
the actual category of a check.

This is a follow-up to #75918,
specifically to this discussion.

Sometimes we essentially check the same condition twice -- for example,
a class might check that an index into its vector member variable is
valid before accessing it, but `vector::operator[]` contains the same
check. These "redundant" checks allow catching errors closer to the
source and providing a better error message, but they also impose
additional overhead. Marking the "early" checks as redundant allows
ignoring them in the fast mode (while still getting a guaranteed trap)
while still getting better error messages in the extensive mode and
above. Introducing a separate wrapper macro allows making the concept of
redundant assertions orthogonal to assertion categories and retaining
the actual category of a check.

This is a follow-up to llvm#75918,
specifically to [this discussion](llvm#75918 (comment)).
@var-const var-const requested a review from a team as a code owner January 6, 2024 04:17
@llvmbot llvmbot added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Jan 6, 2024
@var-const var-const added the hardening Issues related to the hardening effort label Jan 6, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 6, 2024

@llvm/pr-subscribers-libcxx

Author: Konstantin Varlamov (var-const)

Changes

Sometimes we essentially check the same condition twice -- for example,
a class might check that an index into its vector member variable is
valid before accessing it, but vector::operator[] contains the same
check. These "redundant" checks allow catching errors closer to the
source and providing a better error message, but they also impose
additional overhead. Marking the "early" checks as redundant allows
ignoring them in the fast mode (while still getting a guaranteed trap)
while still getting better error messages in the extensive mode and
above. Introducing a separate wrapper macro allows making the concept of
redundant assertions orthogonal to assertion categories and retaining
the actual category of a check.

This is a follow-up to #75918,
specifically to this discussion.


Full diff: https://github.com/llvm/llvm-project/pull/77176.diff

11 Files Affected:

  • (modified) libcxx/include/__config (+16)
  • (modified) libcxx/include/__filesystem/directory_iterator.h (+2-1)
  • (modified) libcxx/include/__iterator/next.h (+3-2)
  • (modified) libcxx/include/__iterator/prev.h (+3-2)
  • (modified) libcxx/include/__mdspan/layout_left.h (+3-2)
  • (modified) libcxx/include/__mdspan/layout_right.h (+3-2)
  • (modified) libcxx/include/__mdspan/layout_stride.h (+3-2)
  • (modified) libcxx/include/__ranges/chunk_by_view.h (+11-6)
  • (modified) libcxx/include/__ranges/drop_while_view.h (+5-4)
  • (modified) libcxx/include/__ranges/filter_view.h (+3-2)
  • (modified) libcxx/include/regex (+2-1)
diff --git a/libcxx/include/__config b/libcxx/include/__config
index 082c73e672c749..b20e8abed0939c 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -290,6 +290,18 @@
 //   user input.
 //
 // - `_LIBCPP_ASSERT_UNCATEGORIZED` -- for assertions that haven't been properly classified yet.
+//
+// In addition to these categories, `_LIBCPP_REDUNDANT_ASSERTION` should be used to wrap assertions that duplicate other
+// assertions (for example, a range view might check that its `optional` data member holds a value before dereferencing
+// it, but this is already checked by `optional` itself). Redundant assertions incur an additional performance overhead
+// and don't provide any extra security benefit, but catching an error earlier allows halting the program closer to the
+// root cause and giving the user an error message that contains more context. Due to these tradeoffs, redundant
+// assertions are disabled in the fast mode but are enabled in the extensive mode and above. Thus, going back to the
+// example above, if a view attempts to dereference an empty optional member variable:
+// - in the fast mode, the program will only perform one check and will trap inside the optional (with an error
+//   amounting to "Attempting to dereference an empty optional");
+// - in the extensive mode, the program will normally perform two checks (in the non-error case), and if the optional is
+//   empty, it will trap inside the view (with an error like "`foo_view` doesn't have a valid predicate").
 
 // clang-format off
 #  define _LIBCPP_HARDENING_MODE_NONE      (1 << 1)
@@ -331,6 +343,7 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                 _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                 _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)            _LIBCPP_ASSUME(expression)
+#    define _LIBCPP_REDUNDANT_ASSERTION(expression)                      _LIBCPP_ASSUME(expression)
 
 // Extensive hardening mode checks.
 
@@ -344,6 +357,7 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_ASSERT_COMPATIBLE_ALLOCATOR(expression, message)     _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)            _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                 _LIBCPP_ASSERT(expression, message)
+#    define _LIBCPP_REDUNDANT_ASSERTION(expression)                      expression
 // Disabled checks.
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                 _LIBCPP_ASSUME(expression)
 
@@ -360,6 +374,7 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                  _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                  _LIBCPP_ASSERT(expression, message)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)             _LIBCPP_ASSERT(expression, message)
+#    define _LIBCPP_REDUNDANT_ASSERTION(expression)                       expression
 
 // Disable all checks if hardening is not enabled.
 
@@ -374,6 +389,7 @@ _LIBCPP_HARDENING_MODE_DEBUG
 #    define _LIBCPP_ASSERT_PEDANTIC(expression, message)                  _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_INTERNAL(expression, message)                  _LIBCPP_ASSUME(expression)
 #    define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message)             _LIBCPP_ASSUME(expression)
+#    define _LIBCPP_REDUNDANT_ASSERTION(expression)                       _LIBCPP_ASSUME(expression)
 
 #  endif // _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_FAST
 // clang-format on
diff --git a/libcxx/include/__filesystem/directory_iterator.h b/libcxx/include/__filesystem/directory_iterator.h
index 5287a4d8b055fd..9b55513c87e508 100644
--- a/libcxx/include/__filesystem/directory_iterator.h
+++ b/libcxx/include/__filesystem/directory_iterator.h
@@ -74,7 +74,8 @@ class directory_iterator {
 
   _LIBCPP_HIDE_FROM_ABI const directory_entry& operator*() const {
     // Note: this check duplicates a check in `__dereference()`.
-    _LIBCPP_ASSERT_NON_NULL(__imp_, "The end iterator cannot be dereferenced");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_NON_NULL(__imp_, "The end iterator cannot be dereferenced"));
     return __dereference();
   }
 
diff --git a/libcxx/include/__iterator/next.h b/libcxx/include/__iterator/next.h
index 21d3688ad9eb60..b08881e1313e59 100644
--- a/libcxx/include/__iterator/next.h
+++ b/libcxx/include/__iterator/next.h
@@ -29,8 +29,9 @@ inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX17 _InputIter
 next(_InputIter __x, typename iterator_traits<_InputIter>::difference_type __n = 1) {
   // Calling `advance` with a negative value on a non-bidirectional iterator is a no-op in the current implementation.
   // Note that this check duplicates the similar check in `std::advance`.
-  _LIBCPP_ASSERT_PEDANTIC(__n >= 0 || __has_bidirectional_iterator_category<_InputIter>::value,
-                          "Attempt to next(it, n) with negative n on a non-bidirectional iterator");
+  _LIBCPP_REDUNDANT_ASSERTION( //
+      _LIBCPP_ASSERT_PEDANTIC(__n >= 0 || __has_bidirectional_iterator_category<_InputIter>::value,
+                              "Attempt to next(it, n) with negative n on a non-bidirectional iterator"));
 
   std::advance(__x, __n);
   return __x;
diff --git a/libcxx/include/__iterator/prev.h b/libcxx/include/__iterator/prev.h
index 2f0e6a088edb36..2590ea27d2b9b9 100644
--- a/libcxx/include/__iterator/prev.h
+++ b/libcxx/include/__iterator/prev.h
@@ -29,8 +29,9 @@ inline _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX17 _InputIter
 prev(_InputIter __x, typename iterator_traits<_InputIter>::difference_type __n = 1) {
   // Calling `advance` with a negative value on a non-bidirectional iterator is a no-op in the current implementation.
   // Note that this check duplicates the similar check in `std::advance`.
-  _LIBCPP_ASSERT_PEDANTIC(__n <= 0 || __has_bidirectional_iterator_category<_InputIter>::value,
-                          "Attempt to prev(it, n) with a positive n on a non-bidirectional iterator");
+  _LIBCPP_REDUNDANT_ASSERTION( //
+      _LIBCPP_ASSERT_PEDANTIC(__n <= 0 || __has_bidirectional_iterator_category<_InputIter>::value,
+                              "Attempt to prev(it, n) with a positive n on a non-bidirectional iterator"));
   std::advance(__x, -__n);
   return __x;
 }
diff --git a/libcxx/include/__mdspan/layout_left.h b/libcxx/include/__mdspan/layout_left.h
index fd644fa0c53226..74fc4398b742c2 100644
--- a/libcxx/include/__mdspan/layout_left.h
+++ b/libcxx/include/__mdspan/layout_left.h
@@ -152,8 +152,9 @@ class layout_left::mapping {
     // return a value exceeding required_span_size(), which is used to know how large an allocation one needs
     // Thus, this is a canonical point in multi-dimensional data structures to make invalid element access checks
     // However, mdspan does check this on its own, so for now we avoid double checking in hardened mode
-    _LIBCPP_ASSERT_UNCATEGORIZED(__mdspan_detail::__is_multidimensional_index_in(__extents_, __idx...),
-                                 "layout_left::mapping: out of bounds indexing");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(__mdspan_detail::__is_multidimensional_index_in(__extents_, __idx...),
+                                            "layout_left::mapping: out of bounds indexing"));
     array<index_type, extents_type::rank()> __idx_a{static_cast<index_type>(__idx)...};
     return [&]<size_t... _Pos>(index_sequence<_Pos...>) {
       index_type __res = 0;
diff --git a/libcxx/include/__mdspan/layout_right.h b/libcxx/include/__mdspan/layout_right.h
index 8e64d07dd52309..18b4c4c223cb4a 100644
--- a/libcxx/include/__mdspan/layout_right.h
+++ b/libcxx/include/__mdspan/layout_right.h
@@ -151,8 +151,9 @@ class layout_right::mapping {
     // return a value exceeding required_span_size(), which is used to know how large an allocation one needs
     // Thus, this is a canonical point in multi-dimensional data structures to make invalid element access checks
     // However, mdspan does check this on its own, so for now we avoid double checking in hardened mode
-    _LIBCPP_ASSERT_UNCATEGORIZED(__mdspan_detail::__is_multidimensional_index_in(__extents_, __idx...),
-                                 "layout_right::mapping: out of bounds indexing");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(__mdspan_detail::__is_multidimensional_index_in(__extents_, __idx...),
+                                            "layout_right::mapping: out of bounds indexing"));
     return [&]<size_t... _Pos>(index_sequence<_Pos...>) {
       index_type __res = 0;
       ((__res = static_cast<index_type>(__idx) + __extents_.extent(_Pos) * __res), ...);
diff --git a/libcxx/include/__mdspan/layout_stride.h b/libcxx/include/__mdspan/layout_stride.h
index 77934bfa11d9de..5a3ab6aa795353 100644
--- a/libcxx/include/__mdspan/layout_stride.h
+++ b/libcxx/include/__mdspan/layout_stride.h
@@ -284,8 +284,9 @@ class layout_stride::mapping {
     // return a value exceeding required_span_size(), which is used to know how large an allocation one needs
     // Thus, this is a canonical point in multi-dimensional data structures to make invalid element access checks
     // However, mdspan does check this on its own, so for now we avoid double checking in hardened mode
-    _LIBCPP_ASSERT_UNCATEGORIZED(__mdspan_detail::__is_multidimensional_index_in(__extents_, __idx...),
-                                 "layout_stride::mapping: out of bounds indexing");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(__mdspan_detail::__is_multidimensional_index_in(__extents_, __idx...),
+                                            "layout_stride::mapping: out of bounds indexing"));
     return [&]<size_t... _Pos>(index_sequence<_Pos...>) {
       return ((static_cast<index_type>(__idx) * __strides_[_Pos]) + ... + index_type(0));
     }(make_index_sequence<sizeof...(_Indices)>());
diff --git a/libcxx/include/__ranges/chunk_by_view.h b/libcxx/include/__ranges/chunk_by_view.h
index c5b3240a7d0bed..262dc4ef1cf90a 100644
--- a/libcxx/include/__ranges/chunk_by_view.h
+++ b/libcxx/include/__ranges/chunk_by_view.h
@@ -67,8 +67,10 @@ class _LIBCPP_ABI_2023_OVERLAPPING_SUBOBJECT_FIX_TAG chunk_by_view
 
   _LIBCPP_HIDE_FROM_ABI constexpr iterator_t<_View> __find_next(iterator_t<_View> __current) {
     // Note: this duplicates a check in `optional` but provides a better error message.
-    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __pred_.__has_value(), "Trying to call __find_next() on a chunk_by_view that does not have a valid predicate.");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+            __pred_.__has_value(),
+            "Trying to call __find_next() on a chunk_by_view that does not have a valid predicate."));
     auto __reversed_pred = [this]<class _Tp, class _Up>(_Tp&& __x, _Up&& __y) -> bool {
       return !std::invoke(*__pred_, std::forward<_Tp>(__x), std::forward<_Up>(__y));
     };
@@ -82,8 +84,10 @@ class _LIBCPP_ABI_2023_OVERLAPPING_SUBOBJECT_FIX_TAG chunk_by_view
     // Attempting to decrement a begin iterator is a no-op (`__find_prev` would return the same argument given to it).
     _LIBCPP_ASSERT_PEDANTIC(__current != ranges::begin(__base_), "Trying to call __find_prev() on a begin iterator.");
     // Note: this duplicates a check in `optional` but provides a better error message.
-    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __pred_.__has_value(), "Trying to call __find_prev() on a chunk_by_view that does not have a valid predicate.");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+            __pred_.__has_value(),
+            "Trying to call __find_prev() on a chunk_by_view that does not have a valid predicate."));
 
     auto __first = ranges::begin(__base_);
     reverse_view __reversed{subrange{__first, __current}};
@@ -113,8 +117,9 @@ class _LIBCPP_ABI_2023_OVERLAPPING_SUBOBJECT_FIX_TAG chunk_by_view
 
   _LIBCPP_HIDE_FROM_ABI constexpr __iterator begin() {
     // Note: this duplicates a check in `optional` but provides a better error message.
-    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __pred_.__has_value(), "Trying to call begin() on a chunk_by_view that does not have a valid predicate.");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+            __pred_.__has_value(), "Trying to call begin() on a chunk_by_view that does not have a valid predicate."));
 
     auto __first = ranges::begin(__base_);
     if (!__cached_begin_.__has_value()) {
diff --git a/libcxx/include/__ranges/drop_while_view.h b/libcxx/include/__ranges/drop_while_view.h
index b367f735c1417e..9271b834fbfaf0 100644
--- a/libcxx/include/__ranges/drop_while_view.h
+++ b/libcxx/include/__ranges/drop_while_view.h
@@ -67,10 +67,11 @@ class _LIBCPP_ABI_2023_OVERLAPPING_SUBOBJECT_FIX_TAG drop_while_view
 
   _LIBCPP_HIDE_FROM_ABI constexpr auto begin() {
     // Note: this duplicates a check in `optional` but provides a better error message.
-    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __pred_.__has_value(),
-        "drop_while_view needs to have a non-empty predicate before calling begin() -- did a previous "
-        "assignment to this drop_while_view fail?");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+            __pred_.__has_value(),
+            "drop_while_view needs to have a non-empty predicate before calling begin() -- did a previous "
+            "assignment to this drop_while_view fail?"));
     if constexpr (_UseCache) {
       if (!__cached_begin_.__has_value()) {
         __cached_begin_.__emplace(ranges::find_if_not(__base_, std::cref(*__pred_)));
diff --git a/libcxx/include/__ranges/filter_view.h b/libcxx/include/__ranges/filter_view.h
index ecb78eee381008..0d36757464ccdd 100644
--- a/libcxx/include/__ranges/filter_view.h
+++ b/libcxx/include/__ranges/filter_view.h
@@ -84,8 +84,9 @@ class _LIBCPP_ABI_2023_OVERLAPPING_SUBOBJECT_FIX_TAG filter_view : public view_i
 
   _LIBCPP_HIDE_FROM_ABI constexpr __iterator begin() {
     // Note: this duplicates a check in `optional` but provides a better error message.
-    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-        __pred_.__has_value(), "Trying to call begin() on a filter_view that does not have a valid predicate.");
+    _LIBCPP_REDUNDANT_ASSERTION( //
+        _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+            __pred_.__has_value(), "Trying to call begin() on a filter_view that does not have a valid predicate."));
     if constexpr (_UseCache) {
       if (!__cached_begin_.__has_value()) {
         __cached_begin_.__emplace(ranges::find_if(__base_, std::ref(*__pred_)));
diff --git a/libcxx/include/regex b/libcxx/include/regex
index b575a267583b5f..51c83c244fd4d2 100644
--- a/libcxx/include/regex
+++ b/libcxx/include/regex
@@ -4731,7 +4731,8 @@ _OutputIter match_results<_BidirectionalIterator, _Allocator>::format(
     const char_type* __fmt_last,
     regex_constants::match_flag_type __flags) const {
   // Note: this duplicates a check in `vector::operator[]` but provides a better error message.
-  _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(ready(), "match_results::format() called when not ready");
+  _LIBCPP_REDUNDANT_ASSERTION(
+      _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(ready(), "match_results::format() called when not ready"));
   if (__flags & regex_constants::format_sed) {
     for (; __fmt_first != __fmt_last; ++__fmt_first) {
       if (*__fmt_first == '&')

libcxx/include/__config Outdated Show resolved Hide resolved
@@ -331,6 +343,7 @@ _LIBCPP_HARDENING_MODE_DEBUG
# define _LIBCPP_ASSERT_PEDANTIC(expression, message) _LIBCPP_ASSUME(expression)
# define _LIBCPP_ASSERT_INTERNAL(expression, message) _LIBCPP_ASSUME(expression)
# define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message) _LIBCPP_ASSUME(expression)
# define _LIBCPP_REDUNDANT_ASSERTION(expression) _LIBCPP_ASSUME(expression)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If _LIBCPP_ASSUME actually did something at all, this would pessimize the code by discarding the assume information. We could fix it with something like:

#if fast-mode
# define _LIBCPP_REDUNDANT_ASSERTION(do_assert, expression) _LIBCPP_ASSUME(expression)
#elif extensive-mode
# define _LIBCPP_REDUNDANT_ASSERTION(do_assert, expression) do_assert(expression)
#elif ...
  // ...
#endif

Then call it as _LIBCPP_REDUNDANT_ASSERTION(_LIBCPP_ASSERT_NON_NULL, __imp_ != nullptr, "The end iterator cannot be dereferenced").

But this is really ugly. I would be OK with dropping the assume information since we don't use it at all right now.

However, there's currently a bug in how this is written. I think this needs to be

#if fast-mode
# define _LIBCPP_REDUNDANT_ASSERTION(expression) ((void)0)
#elif extensive-mode
# define _LIBCPP_REDUNDANT_ASSERTION(expression) expression
#elif ...
  // ...
#endif

We should add a test for this similar to libcxx/test/libcxx/assertions/single_expression.pass.cpp (maybe in the same file).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point, added a TODO and updated the test.

libcxx/include/__mdspan/layout_left.h Outdated Show resolved Hide resolved
libcxx/include/__mdspan/layout_left.h Show resolved Hide resolved
@ldionne ldionne added this to the LLVM 18.0.X Release milestone Jan 16, 2024
@philnik777
Copy link
Contributor

Does this actually improve the performance anywhere? AFAICT all we gain is avoiding the messages being generated in the binary, but I'm not convinced that's much of a problem. e.g. in the std::next case, there doesn't seem to be any overhead as-is with any optimizations enabled: https://godbolt.org/z/768E1YMsf

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@philnik777

Does this actually improve the performance anywhere? AFAICT all we gain is avoiding the messages being generated in the binary, but I'm not convinced that's much of a problem. e.g. in the std::next case, there doesn't seem to be any overhead as-is with any optimizations enabled: https://godbolt.org/z/768E1YMsf

I think that's a great question. I think it will depend on the circumstances, and I definitely expect that in some cases the compiler won't be able to do as good a job as we'd like with de-duplicating assertions. The std::next example you showed is not very challenging to de-duplicate, but other checks could be more challenging depending on inlining behavior, for example.

Apart from the question of "does this make the code faster", I think this macro does provide semantic value because it gives us a tool to say "this assertion exists, but you can rely on another one to have the same effect at the end of the day". I suspect this will resolve a lot of discussions like "should I add an assertion here? it's technically already checked elsewhere so we could avoid checking twice". If we have this tool, this becomes a non-issue. Our policy can be to simply add assertions when they make sense from the semantic point of view, and use this macro to ensure that codegen is still optimal.

I'd like this to land in LLVM 18. @philnik777 if you think this necessitates more discussion, please let us know and we can set up some time to address it in LLVM 19. I'd even be OK with reverting in LLVM 19 in case we find out that this is definitely not the way to go, but barring new information this seems like a good direction to me.

@@ -350,6 +363,9 @@ _LIBCPP_HARDENING_MODE_DEBUG
# define _LIBCPP_ASSERT_PEDANTIC(expression, message) _LIBCPP_ASSUME(expression)
# define _LIBCPP_ASSERT_INTERNAL(expression, message) _LIBCPP_ASSUME(expression)
# define _LIBCPP_ASSERT_UNCATEGORIZED(expression, message) _LIBCPP_ASSUME(expression)
// TODO(hardening): if `_LIBCPP_ASSUME` becomes functional again (it's currently a no-op), this would essentially
// discard the assumption.
# define _LIBCPP_REDUNDANT_ASSERTION(expression) ((void)0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# define _LIBCPP_REDUNDANT_ASSERTION(expression) ((void)0)
# define _LIBCPP_REDUNDANT_ASSERTION(assertion) ((void)0)

To make it clear that this is an assertion, not the boolean expression itself. This makes the usage of the macro clearer. And similarly below.

@philnik777
Copy link
Contributor

@philnik777

Does this actually improve the performance anywhere? AFAICT all we gain is avoiding the messages being generated in the binary, but I'm not convinced that's much of a problem. e.g. in the std::next case, there doesn't seem to be any overhead as-is with any optimizations enabled: https://godbolt.org/z/768E1YMsf

I think that's a great question. I think it will depend on the circumstances, and I definitely expect that in some cases the compiler won't be able to do as good a job as we'd like with de-duplicating assertions. The std::next example you showed is not very challenging to de-duplicate, but other checks could be more challenging depending on inlining behavior, for example.

Apart from the question of "does this make the code faster", I think this macro does provide semantic value because it gives us a tool to say "this assertion exists, but you can rely on another one to have the same effect at the end of the day". I suspect this will resolve a lot of discussions like "should I add an assertion here? it's technically already checked elsewhere so we could avoid checking twice". If we have this tool, this becomes a non-issue. Our policy can be to simply add assertions when they make sense from the semantic point of view, and use this macro to ensure that codegen is still optimal.

I'd like this to land in LLVM 18. @philnik777 if you think this necessitates more discussion, please let us know and we can set up some time to address it in LLVM 19. I'd even be OK with reverting in LLVM 19 in case we find out that this is definitely not the way to go, but barring new information this seems like a good direction to me.

I'd definitely like to discuss this more, since I don't think this is a good direction. This feels to me a lot like addressing a what-if scenario that doesn't really happen. Sure, my example was basic, but it was there to show that these checks can easily be elided by the compiler in a lot (all in this patch?) of cases. We avoid premature optimizations in the code base, and this definitely feels like one to me. I've seen this a couple of times recently, and I don't think we want to change the stance on this. IMO the patch author has to provide evidence that their optimization actually improves the code gen, and not state that it might do. It is already possible to customize the assertion handler, so avoiding the size increase due to more static data doesn't seem like a valid reason to me.

@ldionne ldionne removed this from the LLVM 18.0.X Release milestone Jan 22, 2024
@ldionne
Copy link
Member

ldionne commented Jan 22, 2024

@philnik777 Sounds good, we can discuss this more. I removed the PR from the LLVM 18 milestone, I think it's acceptable not to ship this one in that release. I'd love to know your opinion about what I said regarding the semantic meaning of this macro, though. Do you think that's unnecessary since we should be able to always blindly add assertions and assume that the compiler will de-duplicate when needed?

@philnik777
Copy link
Contributor

@philnik777 Sounds good, we can discuss this more. I removed the PR from the LLVM 18 milestone, I think it's acceptable not to ship this one in that release. I'd love to know your opinion about what I said regarding the semantic meaning of this macro, though. Do you think that's unnecessary since we should be able to always blindly add assertions and assume that the compiler will de-duplicate when needed?

Assuming you mean by "blindly" add them when it improves the messages, then yes. I see where you are coming from w.r.t. avoiding discussions, but I don't think these kinds of checks will add much of a performance regression (compared to checking it once) if any at all, and if there are any we can address them as we go. If we find such cases, I'd also rather have a solution to inline the assertions so they can be elided by the compiler if e.g. the user explicitly checks for them as well. In general I'd expect them to be just as costly checking them once as checking them twice (% binary size).

@ldionne ldionne self-requested a review January 23, 2024 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hardening Issues related to the hardening effort libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants