[V4] Benchmark suite 1/N#105
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
82e1275 to
3490792
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (5)
benchmark/lib/bench.hpp (1)
28-42: 💤 Low valueConsider documenting the formattable requirement.
The
std::formatcall at line 36 requires bothExpectedand the result type to be formattable. While the current benchmark types (e.g.,std::int64_t, customresultwith formatter) satisfy this, future users may encounter cryptic compilation errors if they use non-formattable types.📝 Suggested documentation
Add a comment above the function template:
+// Requires: Expected must be formattable with std::format and equality-comparable with result type template <typename Expected, typename Check, typename Fn> void bench(benchmark::State &state, std::int64_t threads, const Expected &expected, Check check, Fn fn) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark/lib/bench.hpp` around lines 28 - 42, Add a short comment above the bench template documenting that the code uses std::format on the expected and the function result, so both the Expected template parameter and the Fn return type (std::invoke_result_t<Fn>) must be formattable (i.e., have a std::formatter specialization for char) to avoid cryptic compile errors; mention the specific symbols involved (bench, Expected, Fn, and the std::format call that formats result and expected) so future users know the requirement.benchmark/src/serial/primes.cpp (1)
12-18: 💤 Low valueConsider starting the loop at 2 for efficiency.
The loop currently starts at
i=1, but sinceis_prime(1)always returnsfalse, this wastes one primality check per benchmark iteration. Starting ati=2would be slightly more efficient without changing correctness.⚡ Proposed optimization
run_primes(state, [](std::int64_t lim) { std::int64_t count = 0; - for (std::int64_t i = 1; i < lim; ++i) { + for (std::int64_t i = 2; i < lim; ++i) { count += is_prime(i) ? 1 : 0; } return count; });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark/src/serial/primes.cpp` around lines 12 - 18, The loop passed into run_primes currently iterates i from 1 to lim and calls is_prime(1) unnecessarily; update the lambda's loop to start at i = 2 (while keeping lim and count semantics and returning count) so it skips the trivial non-prime 1 and avoids the wasted primality check in the function passed to run_primes.benchmark/lib/matmul.hpp (1)
67-69: 💤 Low valuePrefer
std::absorstd::fabsfor clarity.The manual absolute value computation can be replaced with the standard library function.
Suggested simplification
- float diff = (A[i] - B[i]) / A[i]; - if (diff < 0) { - diff = -diff; - } + float diff = std::fabs((A[i] - B[i]) / A[i]);Note: This suggestion can be combined with the previous comment's fix for a more robust solution.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark/lib/matmul.hpp` around lines 67 - 69, Replace the manual sign-flip on variable diff with the standard library absolute function: use std::abs (or std::fabs for floating types) instead of if (diff < 0) diff = -diff; and ensure the appropriate header (<cmath> or <cstdlib>) is included so the compiler resolves std::abs/std::fabs for the type of diff.docs/benchmarks/index.md (1)
37-46: ⚡ Quick winPrefer relative links for in-repo paths.
These links point to
github.com/.../tree/main/..., which can drift from the docs version/branch and breaks fork portability. Use repo-relative links so docs stay branch-correct and self-contained.Suggested edit
-- [`benchmark/lib/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/lib) +- [`benchmark/lib/`](../../../benchmark/lib/) contains shared kernels, input sizes, and correctness helpers. -- [`benchmark/src/libfork/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/libfork) +- [`benchmark/src/libfork/`](../../../benchmark/src/libfork/) contains libfork coroutine and scheduler benchmarks. -- [`benchmark/src/serial/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/serial) +- [`benchmark/src/serial/`](../../../benchmark/src/serial/) contains single-threaded baselines. -- [`benchmark/src/openmp/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/openmp) +- [`benchmark/src/openmp/`](../../../benchmark/src/openmp/) contains OpenMP tasking comparisons where present. -- [`benchmark/src/baremetal/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/baremetal) +- [`benchmark/src/baremetal/`](../../../benchmark/src/baremetal/) contains low-level coroutine or data-structure baselines.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/benchmarks/index.md` around lines 37 - 46, The GitHub absolute links in docs/benchmarks/index.md (the list items referencing [`benchmark/lib/`], [`benchmark/src/libfork/`], [`benchmark/src/serial/`], [`benchmark/src/openmp/`], and [`benchmark/src/baremetal/`]) should be replaced with repo-relative links (e.g. ./benchmark/lib/ and ./benchmark/src/libfork/ etc.) so the docs remain branch-aware and portable; update each markdown link target to the corresponding relative path while keeping the link text unchanged.benchmark/src/serial/strassen.cpp (1)
42-49: 💤 Low valueConsider validating that
nis even or a power of 2.The recursive division at Line 49 (
unsigned m = n / 2) assumesnis even. Ifnis odd, integer division will truncate and the sub-blocks won't cover the full matrix, leading to incorrect results.If the benchmark framework guarantees power-of-2 sizes, document this assumption. Otherwise, add validation:
🛡️ Proposed validation
void strassen(float const *A, unsigned sa, float const *B, unsigned sb, float *C, unsigned sc, unsigned n) { + // Strassen requires power-of-2 dimensions or even n at minimum + if (n % 2 != 0 && n > strassen_cutoff) { + // Fall back to naive for odd sizes + naive_multiply(A, sa, B, sb, C, sc, n); + return; + } + if (n <= strassen_cutoff) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@benchmark/src/serial/strassen.cpp` around lines 42 - 49, The strassen function assumes n is even when computing m = n / 2; add validation at the start of strassen (before the strassen_cutoff check or immediately after) to ensure n is valid (either assert/document that n is a power-of-two or explicitly check n % 2 == 0), and if odd either pad inputs to the next even/power-of-two size or return an error/convert to naive_multiply; reference the strassen function, the n parameter, strassen_cutoff, and naive_multiply when implementing the chosen validation/handling approach.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@benchmark/lib/matmul.hpp`:
- Around line 63-75: The function matmul_max_relative_error divides by A[i]
without guarding against zero/near-zero; change the computation to use a safe
denominator by comparing std::abs(A[i]) with a small epsilon (e.g., 1e-8f) and
divide by std::max(std::abs(A[i]), epsilon), or when |A[i]| < epsilon use
absolute error (std::abs(A[i]-B[i])) instead; update the diff calculation and
comparisons in matmul_max_relative_error to compute diff = std::abs(A[i]-B[i]) /
std::max(std::abs(A[i]), epsilon) (or the absolute fallback) so you avoid
huge/undefined relative errors for tiny reference values.
In `@benchmark/lib/nqueens.hpp`:
- Around line 17-21: The lookup table nqueens_answers is declared with 28
entries but only 21 values are provided, causing indices 21–27 to be
zero-initialized and yield incorrect results when accessed via
.at(state.range(0)). Fix by making the table length match the provided data
(change the constexpr std::array size to 21) or by populating the remaining
seven entries with the correct N-queens counts; alternatively add runtime
validation around the .at(state.range(0)) call to ensure state.range(0) is
within the supported index range before accessing nqueens_answers.
In `@benchmark/src/serial/matmul.cpp`:
- Around line 12-17: The recursive split truncates odd n via m = n / 2 causing
incorrect results; before splitting in the recursive matmul code, guard
odd-sized matrices by either (a) falling back to the scalar/basecase
routine—call matmul_basecase_multiply<Add>(A, B, R, n, s) when (n % 2 != 0) or
(b) validate/enforce power-of-two inputs with an explicit check/assert and
error, instead of proceeding with m = n / 2; update the block that currently
computes unsigned m = n / 2 to perform this guard using the existing
matmul_basecase and matmul_basecase_multiply symbols and the A, B, R, n, s
parameters.
In `@benchmark/src/serial/quicksort.cpp`:
- Around line 38-43: The quicksort implementation always recurses on the right
partition (quicksort(pivot + 1, last)) which can cause O(n) stack depth for
degenerate pivots; change the control flow to always recurse on the smaller
partition and loop on the larger one: after partition(first,last) returns pivot,
compare sizes of left (pivot - first) and right (last - (pivot + 1)) and call
quicksort on the smaller side, then update first/last to the bounds of the
larger side and continue the while loop (use quicksort_basecase as the base case
threshold). Ensure you still call partition(first,last) each iteration and only
recurse once per loop.
---
Nitpick comments:
In `@benchmark/lib/bench.hpp`:
- Around line 28-42: Add a short comment above the bench template documenting
that the code uses std::format on the expected and the function result, so both
the Expected template parameter and the Fn return type
(std::invoke_result_t<Fn>) must be formattable (i.e., have a std::formatter
specialization for char) to avoid cryptic compile errors; mention the specific
symbols involved (bench, Expected, Fn, and the std::format call that formats
result and expected) so future users know the requirement.
In `@benchmark/lib/matmul.hpp`:
- Around line 67-69: Replace the manual sign-flip on variable diff with the
standard library absolute function: use std::abs (or std::fabs for floating
types) instead of if (diff < 0) diff = -diff; and ensure the appropriate header
(<cmath> or <cstdlib>) is included so the compiler resolves std::abs/std::fabs
for the type of diff.
In `@benchmark/src/serial/primes.cpp`:
- Around line 12-18: The loop passed into run_primes currently iterates i from 1
to lim and calls is_prime(1) unnecessarily; update the lambda's loop to start at
i = 2 (while keeping lim and count semantics and returning count) so it skips
the trivial non-prime 1 and avoids the wasted primality check in the function
passed to run_primes.
In `@benchmark/src/serial/strassen.cpp`:
- Around line 42-49: The strassen function assumes n is even when computing m =
n / 2; add validation at the start of strassen (before the strassen_cutoff check
or immediately after) to ensure n is valid (either assert/document that n is a
power-of-two or explicitly check n % 2 == 0), and if odd either pad inputs to
the next even/power-of-two size or return an error/convert to naive_multiply;
reference the strassen function, the n parameter, strassen_cutoff, and
naive_multiply when implementing the chosen validation/handling approach.
In `@docs/benchmarks/index.md`:
- Around line 37-46: The GitHub absolute links in docs/benchmarks/index.md (the
list items referencing [`benchmark/lib/`], [`benchmark/src/libfork/`],
[`benchmark/src/serial/`], [`benchmark/src/openmp/`], and
[`benchmark/src/baremetal/`]) should be replaced with repo-relative links (e.g.
./benchmark/lib/ and ./benchmark/src/libfork/ etc.) so the docs remain
branch-aware and portable; update each markdown link target to the corresponding
relative path while keeping the link text unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a982437-d99b-4984-8243-f36c29870e86
📒 Files selected for processing (63)
benchmark/lib/CMakeLists.txtbenchmark/lib/bench.hppbenchmark/lib/fib.hppbenchmark/lib/fold.hppbenchmark/lib/heat.hppbenchmark/lib/integrate.hppbenchmark/lib/knapsack.hppbenchmark/lib/macros.hppbenchmark/lib/mandelbrot.hppbenchmark/lib/matmul.hppbenchmark/lib/nqueens.hppbenchmark/lib/primes.hppbenchmark/lib/quicksort.hppbenchmark/lib/scan.hppbenchmark/lib/skynet.hppbenchmark/lib/uts.hppbenchmark/src/baremetal/fib.cppbenchmark/src/libfork/fib.cppbenchmark/src/libfork/fold.cppbenchmark/src/libfork/switch_io_pool.cppbenchmark/src/libfork/switch_random.cppbenchmark/src/libfork/uts.cppbenchmark/src/openmp/fib.cppbenchmark/src/openmp/uts.cppbenchmark/src/serial/CMakeLists.txtbenchmark/src/serial/fib.cppbenchmark/src/serial/heat.cppbenchmark/src/serial/integrate.cppbenchmark/src/serial/knapsack.cppbenchmark/src/serial/mandelbrot.cppbenchmark/src/serial/matmul.cppbenchmark/src/serial/nqueens.cppbenchmark/src/serial/primes.cppbenchmark/src/serial/quicksort.cppbenchmark/src/serial/scan.cppbenchmark/src/serial/skynet.cppbenchmark/src/serial/strassen.cppbenchmark/src/serial/uts.cppdocs/ChangeLog.mddocs/benchmarks.mddocs/benchmarks/benchmarks/fib.mddocs/benchmarks/benchmarks/fold.mddocs/benchmarks/benchmarks/heat.mddocs/benchmarks/benchmarks/index.mddocs/benchmarks/benchmarks/integrate.mddocs/benchmarks/benchmarks/knapsack.mddocs/benchmarks/benchmarks/mandelbrot.mddocs/benchmarks/benchmarks/matmul.mddocs/benchmarks/benchmarks/mergesort.mddocs/benchmarks/benchmarks/nqueens.mddocs/benchmarks/benchmarks/primes.mddocs/benchmarks/benchmarks/quicksort.mddocs/benchmarks/benchmarks/scan.mddocs/benchmarks/benchmarks/skynet.mddocs/benchmarks/benchmarks/strassen.mddocs/benchmarks/benchmarks/switch-io-pool.mddocs/benchmarks/benchmarks/switch-random.mddocs/benchmarks/benchmarks/uts.mddocs/benchmarks/index.mddocs/index.mddocs/installation.mddocs/quickstart.mdzensical.toml
💤 Files with no reviewable changes (1)
- docs/benchmarks.md
New format for benchmarks + early serial sketches of benchmarks and very very rough docs
Summary by CodeRabbit
New Features
Documentation