[V4] Small object optimizations#102
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
0440145 to
ba4412d
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (3)
benchmark/lib/macros.hpp (1)
24-32: Optional: take the callable by forwarding reference.
auto make_argscopies the callable on each call. It's harmless here (registration-time, called once per benchmark), butauto &&make_args+std::forwardis the more idiomatic shape for a generic callback helper and avoids surprising copies if someone later passes a stateful functor.♻️ Proposed refactor
-inline void bench_thread_args(benchmark::Benchmark *bench, auto make_args) { - unsigned hw = std::max(1U, std::thread::hardware_concurrency()); - for (unsigned t : {1U, 2U, 4U, 6U, 8U, 12U, 16U, 24U, 32U, 48U, 64U, 96U}) { - if (t > hw) { - return; - } - make_args(bench, t); - } -} +inline void bench_thread_args(benchmark::Benchmark *bench, auto &&make_args) { + unsigned hw = std::max(1U, std::thread::hardware_concurrency()); + for (unsigned t : {1U, 2U, 4U, 6U, 8U, 12U, 16U, 24U, 32U, 48U, 64U, 96U}) { + if (t > hw) { + return; + } + std::forward<decltype(make_args)>(make_args)(bench, t); + } +}Side note (pre-existing, carried over from
common.hpp): on machines whosehardware_concurrency()falls between candidates (e.g. 10, 20, 128 cores), the benchmark never runs at the actual hardware maximum. Probably worth a follow-up to clamp/appendhwto the schedule, but out of scope for this PR.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmark/lib/macros.hpp` around lines 24 - 32, The bench_thread_args helper should accept the callable as a forwarding reference to avoid accidental copies: change the parameter from auto make_args to a templated forwarding reference (e.g., MakeArgs&& make_args) and invoke it with std::forward<MakeArgs>(make_args)(bench, t) inside the loop; update the function template signature to template<class MakeArgs> inline void bench_thread_args(benchmark::Benchmark *bench, MakeArgs&& make_args) so stateful functors are preserved and not copied.src/core/promise.cxx (1)
507-512: Reuse thesmall_trivially_copyableconcept from:opsto avoid drift.This static_assert open-codes exactly the predicate already defined as
small_trivially_copyableinsrc/core/ops.cxx(lines 25-27). If the storage policy ever changes (e.g., the2 * sizeof(void*)threshold or an added alignment check), this site will silently disagree with the actual storage rules instore_as_t/fwd_fn, breaking the noexcept invariant this assert is meant to guard.Consider exposing the concept from the
:opspartition and using it directly:♻️ Proposed refactor
- // Required for noexcept specifier to be correct: each stored type must be either a - // reference (original behavior) or a small trivially-copyable value (value-storage opt). - static_assert( - (std::is_reference_v<Fn> || (std::is_trivially_copyable_v<Fn> && sizeof(Fn) <= 2 * sizeof(void *))) && - (... && (std::is_reference_v<Args> || - (std::is_trivially_copyable_v<Args> && sizeof(Args) <= 2 * sizeof(void *))))); + // Required for noexcept specifier to be correct: each stored type must be either a + // reference (original behavior) or a small trivially-copyable value (value-storage opt). + static_assert((std::is_reference_v<Fn> || small_trivially_copyable<Fn>) && + (... && (std::is_reference_v<Args> || small_trivially_copyable<Args>)));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/promise.cxx` around lines 507 - 512, The static_assert in promise.cxx currently duplicates the predicate used for storage sizing; replace the open-coded predicate with the shared concept small_trivially_copyable (as defined in ops) so the noexcept check stays in sync with the actual storage policy. Update the static_assert to use small_trivially_copyable<Fn> && (... && small_trivially_copyable<Args>) (or the appropriate template syntax for the concept) and ensure the header/partition that defines small_trivially_copyable is included so this site and the implementations used by store_as_t and fwd_fn share the same definition.src/core/ops.cxx (1)
25-34: Consider also constraining alignment.
small_trivially_copyableonly checks size. A type likestruct alignas(64) tag{};is trivially copyable andsizeof == 64(so just barely passes on a 32-byte threshold isn't relevant, but) — more importantly, on 32-bit targets2 * sizeof(void*) == 8, which would still admit types withalignof > 8and forcepkg's overall alignment up. Addingalignof(T) <= alignof(std::max_align_t)(oralignof(void*)) would keeppkg's layout tight. Optional — may not matter for the actual call sites in this PR.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/ops.cxx` around lines 25 - 34, small_trivially_copyable currently only checks size and trivial copyability, which can allow types with large alignment (e.g., alignas(64)) and inadvertently increase enclosing object alignment; update the concept used by store_as_t (small_trivially_copyable and thus store_as_t) to also require alignof(std::remove_cvref_t<T>) <= alignof(std::max_align_t) (or <= alignof(void*)) so only small, low-alignment trivially copyable types are treated as stored-by-value; adjust the concept definition to apply the alignment check to std::remove_cvref_t<T> so store_as_t behavior is correct for reference and cv-qualified types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@benchmark/lib/macros.hpp`:
- Around line 24-32: The bench_thread_args helper should accept the callable as
a forwarding reference to avoid accidental copies: change the parameter from
auto make_args to a templated forwarding reference (e.g., MakeArgs&& make_args)
and invoke it with std::forward<MakeArgs>(make_args)(bench, t) inside the loop;
update the function template signature to template<class MakeArgs> inline void
bench_thread_args(benchmark::Benchmark *bench, MakeArgs&& make_args) so stateful
functors are preserved and not copied.
In `@src/core/ops.cxx`:
- Around line 25-34: small_trivially_copyable currently only checks size and
trivial copyability, which can allow types with large alignment (e.g.,
alignas(64)) and inadvertently increase enclosing object alignment; update the
concept used by store_as_t (small_trivially_copyable and thus store_as_t) to
also require alignof(std::remove_cvref_t<T>) <= alignof(std::max_align_t) (or <=
alignof(void*)) so only small, low-alignment trivially copyable types are
treated as stored-by-value; adjust the concept definition to apply the alignment
check to std::remove_cvref_t<T> so store_as_t behavior is correct for reference
and cv-qualified types.
In `@src/core/promise.cxx`:
- Around line 507-512: The static_assert in promise.cxx currently duplicates the
predicate used for storage sizing; replace the open-coded predicate with the
shared concept small_trivially_copyable (as defined in ops) so the noexcept
check stays in sync with the actual storage policy. Update the static_assert to
use small_trivially_copyable<Fn> && (... && small_trivially_copyable<Args>) (or
the appropriate template syntax for the concept) and ensure the header/partition
that defines small_trivially_copyable is included so this site and the
implementations used by store_as_t and fwd_fn share the same definition.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2b116f4b-f2c6-440f-84f7-2147533d0da9
📒 Files selected for processing (12)
benchmark/lib/CMakeLists.txtbenchmark/lib/common.hppbenchmark/lib/macros.hppbenchmark/src/baremetal/fib.cppbenchmark/src/libfork/fib.cppbenchmark/src/libfork/uts.cppbenchmark/src/serial/fib.cppbenchmark/src/serial/uts.cppsrc/core/execute.cxxsrc/core/ops.cxxsrc/core/promise.cxxsrc/core/thread_locals.cxx
💤 Files with no reviewable changes (7)
- benchmark/src/libfork/uts.cpp
- benchmark/src/libfork/fib.cpp
- benchmark/src/baremetal/fib.cpp
- benchmark/src/serial/fib.cpp
- benchmark/lib/CMakeLists.txt
- benchmark/src/serial/uts.cpp
- benchmark/lib/common.hpp
Summary by CodeRabbit
Refactor
Documentation