[V4] Benchmark suite 1/N by ConorWilliams · Pull Request #105 · ConorWilliams/libfork

ConorWilliams · 2026-05-02T14:06:53Z

New format for benchmarks + early serial sketches of benchmarks and very very rough docs

Summary by CodeRabbit

New Features
- Added eleven new benchmark algorithms (heat diffusion, numerical integration, knapsack, Mandelbrot set, matrix multiplication, N-queens, prime counting, quicksort, prefix scan, Skynet, and Strassen), plus enhanced reference implementations.
Documentation
- Added comprehensive benchmark documentation with individual pages for each benchmark, improved navigation, and detailed guidance on benchmark structure and scaling characteristics.

coderabbitai · 2026-05-02T14:07:01Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 085e8a0d-d0d3-4880-a17e-fa0f2ae07211

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (5)

benchmark/lib/bench.hpp (1)

28-42: 💤 Low value

Consider documenting the formattable requirement.

The std::format call at line 36 requires both Expected and the result type to be formattable. While the current benchmark types (e.g., std::int64_t, custom result with formatter) satisfy this, future users may encounter cryptic compilation errors if they use non-formattable types.
📝 Suggested documentation

Add a comment above the function template:
+// Requires: Expected must be formattable with std::format and equality-comparable with result type
 template <typename Expected, typename Check, typename Fn>
 void bench(benchmark::State &state, std::int64_t threads, const Expected &expected, Check check, Fn fn) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmark/lib/bench.hpp` around lines 28 - 42, Add a short comment above the
bench template documenting that the code uses std::format on the expected and
the function result, so both the Expected template parameter and the Fn return
type (std::invoke_result_t<Fn>) must be formattable (i.e., have a std::formatter
specialization for char) to avoid cryptic compile errors; mention the specific
symbols involved (bench, Expected, Fn, and the std::format call that formats
result and expected) so future users know the requirement.

benchmark/src/serial/primes.cpp (1)

12-18: 💤 Low value

Consider starting the loop at 2 for efficiency.

The loop currently starts at i=1, but since is_prime(1) always returns false, this wastes one primality check per benchmark iteration. Starting at i=2 would be slightly more efficient without changing correctness.

⚡ Proposed optimization

 run_primes(state, [](std::int64_t lim) {
   std::int64_t count = 0;
-  for (std::int64_t i = 1; i < lim; ++i) {
+  for (std::int64_t i = 2; i < lim; ++i) {
     count += is_prime(i) ? 1 : 0;
   }
   return count;
 });

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmark/src/serial/primes.cpp` around lines 12 - 18, The loop passed into
run_primes currently iterates i from 1 to lim and calls is_prime(1)
unnecessarily; update the lambda's loop to start at i = 2 (while keeping lim and
count semantics and returning count) so it skips the trivial non-prime 1 and
avoids the wasted primality check in the function passed to run_primes.
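
The claim that starting at 2 does not change the count can be checked directly. This is a sketch under assumptions: the trial-division `is_prime` below is a hypothetical stand-in (the real one in `benchmark/lib/primes.hpp` may differ), and `count_primes` is an illustrative helper rather than the lambda passed to `run_primes`.

```cpp
#include <cstdint>

// Hypothetical trial-division test; assumed to return false for n < 2,
// matching the review's statement that is_prime(1) is always false.
constexpr bool is_prime(std::int64_t n) {
  if (n < 2) {
    return false;
  }
  for (std::int64_t d = 2; d * d <= n; ++d) {
    if (n % d == 0) {
      return false;
    }
  }
  return true;
}

// Counts primes in the half-open range [first, lim).
constexpr std::int64_t count_primes(std::int64_t first, std::int64_t lim) {
  std::int64_t count = 0;
  for (std::int64_t i = first; i < lim; ++i) {
    count += is_prime(i) ? 1 : 0;
  }
  return count;
}

// Starting at 1 or at 2 yields the same count, checked at compile time.
static_assert(count_primes(1, 100) == count_primes(2, 100));
```

The saving is a single `is_prime` call per benchmark iteration, which is consistent with the "low value" rating given to this nitpick.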

benchmark/lib/matmul.hpp (1)

67-69: 💤 Low value

Prefer std::abs or std::fabs for clarity.

The manual absolute value computation can be replaced with the standard library function.

Suggested simplification

-    float diff = (A[i] - B[i]) / A[i];
-    if (diff < 0) {
-      diff = -diff;
-    }
+    float diff = std::fabs((A[i] - B[i]) / A[i]);

Note: This suggestion can be combined with the previous comment's fix for a more robust solution.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmark/lib/matmul.hpp` around lines 67 - 69, Replace the manual sign-flip
on variable diff with the standard library absolute function: use std::abs (or
std::fabs for floating types) instead of if (diff < 0) diff = -diff; and ensure
the appropriate header (<cmath> or <cstdlib>) is included so the compiler
resolves std::abs/std::fabs for the type of diff.
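
Combining this nitpick with the zero-denominator concern raised in the inline comment on the same function might look like the sketch below. It is an assumption-laden illustration: `max_relative_error_sketch` is a hypothetical name, the pointer-plus-length signature is guessed, and the `1e-8f` epsilon is the example value from the review, not a tuned tolerance.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Hypothetical variant of matmul_max_relative_error. A is treated as the
// reference and B as the value under test.
inline float max_relative_error_sketch(const float *A, const float *B,
                                       std::size_t n, float epsilon = 1e-8f) {
  float max_err = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    // Clamp the denominator so a zero or near-zero reference value cannot
    // produce a division by zero (inf or NaN).
    float denom = std::max(std::fabs(A[i]), epsilon);
    // std::fabs replaces the manual sign flip.
    float diff = std::fabs(A[i] - B[i]) / denom;
    max_err = std::max(max_err, diff);
  }
  return max_err;
}
```

Note that with a clamped denominator a mismatch against a zero reference still yields a very large (but finite) error; switching to absolute error below the epsilon, as the review also suggests, is the alternative when that is undesirable.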

docs/benchmarks/index.md (1)

37-46: ⚡ Quick win

Prefer relative links for in-repo paths.

These links point to github.com/.../tree/main/..., which can drift from the docs version/branch and breaks fork portability. Use repo-relative links so docs stay branch-correct and self-contained.

Suggested edit

-- [`benchmark/lib/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/lib)
+- [`benchmark/lib/`](../../../benchmark/lib/)
   contains shared kernels, input sizes, and correctness helpers.
-- [`benchmark/src/libfork/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/libfork)
+- [`benchmark/src/libfork/`](../../../benchmark/src/libfork/)
   contains libfork coroutine and scheduler benchmarks.
-- [`benchmark/src/serial/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/serial)
+- [`benchmark/src/serial/`](../../../benchmark/src/serial/)
   contains single-threaded baselines.
-- [`benchmark/src/openmp/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/openmp)
+- [`benchmark/src/openmp/`](../../../benchmark/src/openmp/)
   contains OpenMP tasking comparisons where present.
-- [`benchmark/src/baremetal/`](https://github.com/conorwilliams/libfork/tree/main/benchmark/src/baremetal)
+- [`benchmark/src/baremetal/`](../../../benchmark/src/baremetal/)
   contains low-level coroutine or data-structure baselines.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/benchmarks/index.md` around lines 37 - 46, The GitHub absolute links in
docs/benchmarks/index.md (the list items referencing [`benchmark/lib/`],
[`benchmark/src/libfork/`], [`benchmark/src/serial/`],
[`benchmark/src/openmp/`], and [`benchmark/src/baremetal/`]) should be replaced
with repo-relative links (e.g. ../../../benchmark/lib/ and
../../../benchmark/src/libfork/ etc., as in the suggested edit) so the docs
remain branch-aware and portable; update each markdown link target to the
corresponding relative path while keeping the link text unchanged.

benchmark/src/serial/strassen.cpp (1)

42-49: 💤 Low value

Consider validating that n is even or a power of 2.

The recursive division at Line 49 (unsigned m = n / 2) assumes n is even. If n is odd, integer division will truncate and the sub-blocks won't cover the full matrix, leading to incorrect results.

If the benchmark framework guarantees power-of-2 sizes, document this assumption. Otherwise, add validation:

🛡️ Proposed validation

 void strassen(float const *A, unsigned sa, float const *B, unsigned sb, float *C, unsigned sc, unsigned n) {
 
+  // Strassen requires power-of-2 dimensions or even n at minimum
+  if (n % 2 != 0 && n > strassen_cutoff) {
+    // Fall back to naive for odd sizes
+    naive_multiply(A, sa, B, sb, C, sc, n);
+    return;
+  }
+
   if (n <= strassen_cutoff) {

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmark/src/serial/strassen.cpp` around lines 42 - 49, The strassen
function assumes n is even when computing m = n / 2; add validation at the start
of strassen (before the strassen_cutoff check or immediately after) to ensure n
is valid (either assert/document that n is a power-of-two or explicitly check n
% 2 == 0), and if odd either pad inputs to the next even/power-of-two size or
return an error/convert to naive_multiply; reference the strassen function, the
n parameter, strassen_cutoff, and naive_multiply when implementing the chosen
validation/handling approach.
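
The precondition behind this comment (and the matching inline comment on `benchmark/src/serial/matmul.cpp`) is that every size seen above the cutoff must be even, which is weaker than requiring a power of two. A minimal sketch of that check, assuming a hypothetical helper name `halves_cleanly` and a caller-supplied cutoff standing in for `strassen_cutoff`:

```cpp
// Hypothetical helper: returns true when repeatedly computing m = n / 2
// never truncates before the recursion reaches the cutoff.
constexpr bool halves_cleanly(unsigned n, unsigned cutoff) {
  while (n > cutoff) {
    if (n % 2 != 0) {
      return false;  // an odd size here would drop a row and a column
    }
    n /= 2;
  }
  return true;
}

// 96 halves to 48, which is already at or below a cutoff of 64.
static_assert(halves_cleanly(96, 64));
// 100 -> 50 -> 25: 25 is odd and still above a cutoff of 16.
static_assert(!halves_cleanly(100, 16));
```

A check like this could back an assertion where the benchmark sizes are chosen, leaving the recursive kernel itself free of per-call validation.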

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmark/lib/matmul.hpp`:
- Around line 63-75: The function matmul_max_relative_error divides by A[i]
without guarding against zero/near-zero; change the computation to use a safe
denominator by comparing std::abs(A[i]) with a small epsilon (e.g., 1e-8f) and
divide by std::max(std::abs(A[i]), epsilon), or when |A[i]| < epsilon use
absolute error (std::abs(A[i]-B[i])) instead; update the diff calculation and
comparisons in matmul_max_relative_error to compute diff = std::abs(A[i]-B[i]) /
std::max(std::abs(A[i]), epsilon) (or the absolute fallback) so you avoid
huge/undefined relative errors for tiny reference values.

In `@benchmark/lib/nqueens.hpp`:
- Around line 17-21: The lookup table nqueens_answers is declared with 28
entries but only 21 values are provided, causing indices 21–27 to be
zero-initialized and yield incorrect results when accessed via
.at(state.range(0)). Fix by making the table length match the provided data
(change the constexpr std::array size to 21) or by populating the remaining
seven entries with the correct N-queens counts; alternatively add runtime
validation around the .at(state.range(0)) call to ensure state.range(0) is
within the supported index range before accessing nqueens_answers.

In `@benchmark/src/serial/matmul.cpp`:
- Around line 12-17: The recursive split truncates odd n via m = n / 2 causing
incorrect results; before splitting in the recursive matmul code, guard
odd-sized matrices by either (a) falling back to the scalar/basecase
routine—call matmul_basecase_multiply<Add>(A, B, R, n, s) when (n % 2 != 0) or
(b) validate/enforce power-of-two inputs with an explicit check/assert and
error, instead of proceeding with m = n / 2; update the block that currently
computes unsigned m = n / 2 to perform this guard using the existing
matmul_basecase and matmul_basecase_multiply symbols and the A, B, R, n, s
parameters.

In `@benchmark/src/serial/quicksort.cpp`:
- Around line 38-43: The quicksort implementation always recurses on the right
partition (quicksort(pivot + 1, last)) which can cause O(n) stack depth for
degenerate pivots; change the control flow to always recurse on the smaller
partition and loop on the larger one: after partition(first,last) returns pivot,
compare sizes of left (pivot - first) and right (last - (pivot + 1)) and call
quicksort on the smaller side, then update first/last to the bounds of the
larger side and continue the while loop (use quicksort_basecase as the base case
threshold). Ensure you still call partition(first,last) each iteration and only
recurse once per loop.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a982437-d99b-4984-8243-f36c29870e86

📥 Commits

Reviewing files that changed from the base of the PR and between 3d896e9 and f3d8367.

📒 Files selected for processing (63)

benchmark/lib/CMakeLists.txt
benchmark/lib/bench.hpp
benchmark/lib/fib.hpp
benchmark/lib/fold.hpp
benchmark/lib/heat.hpp
benchmark/lib/integrate.hpp
benchmark/lib/knapsack.hpp
benchmark/lib/macros.hpp
benchmark/lib/mandelbrot.hpp
benchmark/lib/matmul.hpp
benchmark/lib/nqueens.hpp
benchmark/lib/primes.hpp
benchmark/lib/quicksort.hpp
benchmark/lib/scan.hpp
benchmark/lib/skynet.hpp
benchmark/lib/uts.hpp
benchmark/src/baremetal/fib.cpp
benchmark/src/libfork/fib.cpp
benchmark/src/libfork/fold.cpp
benchmark/src/libfork/switch_io_pool.cpp
benchmark/src/libfork/switch_random.cpp
benchmark/src/libfork/uts.cpp
benchmark/src/openmp/fib.cpp
benchmark/src/openmp/uts.cpp
benchmark/src/serial/CMakeLists.txt
benchmark/src/serial/fib.cpp
benchmark/src/serial/heat.cpp
benchmark/src/serial/integrate.cpp
benchmark/src/serial/knapsack.cpp
benchmark/src/serial/mandelbrot.cpp
benchmark/src/serial/matmul.cpp
benchmark/src/serial/nqueens.cpp
benchmark/src/serial/primes.cpp
benchmark/src/serial/quicksort.cpp
benchmark/src/serial/scan.cpp
benchmark/src/serial/skynet.cpp
benchmark/src/serial/strassen.cpp
benchmark/src/serial/uts.cpp
docs/ChangeLog.md
docs/benchmarks.md
docs/benchmarks/benchmarks/fib.md
docs/benchmarks/benchmarks/fold.md
docs/benchmarks/benchmarks/heat.md
docs/benchmarks/benchmarks/index.md
docs/benchmarks/benchmarks/integrate.md
docs/benchmarks/benchmarks/knapsack.md
docs/benchmarks/benchmarks/mandelbrot.md
docs/benchmarks/benchmarks/matmul.md
docs/benchmarks/benchmarks/mergesort.md
docs/benchmarks/benchmarks/nqueens.md
docs/benchmarks/benchmarks/primes.md
docs/benchmarks/benchmarks/quicksort.md
docs/benchmarks/benchmarks/scan.md
docs/benchmarks/benchmarks/skynet.md
docs/benchmarks/benchmarks/strassen.md
docs/benchmarks/benchmarks/switch-io-pool.md
docs/benchmarks/benchmarks/switch-random.md
docs/benchmarks/benchmarks/uts.md
docs/benchmarks/index.md
docs/index.md
docs/installation.md
docs/quickstart.md
zensical.toml

💤 Files with no reviewable changes (1)

docs/benchmarks.md

ConorWilliams changed the base branch from main to modules May 2, 2026 14:07

Conor added 14 commits May 10, 2026 20:51

strassen

0d46b5d

skynet

0e38d94

scan

9a7c305

quicksort

e95d0e4

primes

c458703

nqueens

7f9ebad

matmul

43236d7

mandel

3443615

knapsack

f336dc1

integrate

c666a08

heat

d7d19f5

cmake

d952820

warnings

77ef144

format

3490792

ConorWilliams force-pushed the v4-bench-serial branch from 82e1275 to 3490792 Compare May 10, 2026 19:52

Conor added 6 commits May 10, 2026 21:01

include fixed

b56789d

doc benchmarks

c5e0074

move

59d1533

sub folder

d8ed9d7

order tweaks

1e7f4da

some stubs

da70180

ConorWilliams changed the title ~~[V4] Benchmarks (serial)~~ [V4] Benchmarks suite May 11, 2026

Conor added 6 commits May 11, 2026 18:52

order

433f202

nameing

fc83643

icons

75080c8

icon

e321324

common benchmark utils

22f3e80

tidy up

aea774f

Conor added 7 commits May 11, 2026 21:47

simplify

cfee685

refactor

eb606c2

tmp

d4496bd

notes

69cfdea

full refactor

96089e2

clean ups

820cbb8

tidy

83c1c9e

ConorWilliams changed the title ~~[V4] Benchmarks suite~~ [V4] Benchmarks suite 1/N May 11, 2026

ConorWilliams changed the title ~~[V4] Benchmarks suite 1/N~~ [V4] Benchmark suite 1/N May 11, 2026

dry

f3d8367

coderabbitai Bot reviewed May 11, 2026

Comment thread benchmark/lib/matmul.hpp

Comment thread benchmark/lib/nqueens.hpp Outdated

Comment thread benchmark/src/serial/matmul.cpp

Comment thread benchmark/src/serial/quicksort.cpp

comments

33c9c82

ConorWilliams merged commit b2a4543 into modules May 11, 2026
7 of 10 checks passed

ConorWilliams merged 35 commits into modules from v4-bench-serial
