Skip to content

Build on GH runners for RP5 benchmarks#123

Merged
recmo merged 18 commits into
mainfrom
dzejkop/build-for-benchmark
Jun 27, 2025
Merged

Build on GH runners for RP5 benchmarks#123
recmo merged 18 commits into
mainfrom
dzejkop/build-for-benchmark

Conversation

@Dzejkop
Copy link
Copy Markdown
Collaborator

@Dzejkop Dzejkop commented Jun 3, 2025

No description provided.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Jun 5, 2025

CodSpeed Walltime Performance Report

Merging #123 will degrade performances by 49.21%

Comparing dzejkop/build-for-benchmark (3046382) with main (0b1fd8f)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

⚡ 7 improvements
❌ 1 regressions
✅ 22 untouched benchmarks
🆕 4 new benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark BASE HEAD Change
simd_mul 1.8 µs 3.6 µs -49.21%
🆕 montgomery_square_interleaved_3 N/A 3.2 µs N/A
🆕 montgomery_square_interleaved_4 N/A 2.5 µs N/A
🆕 montgomery_square_log_interleaved_3 N/A 3.2 µs N/A
🆕 montgomery_square_log_interleaved_4 N/A 2.5 µs N/A
sbox 1.8 µs 1.4 µs +21.81%
sbox_8 2.5 µs 2.1 µs +20.12%
reduce 2.2 µs 1.9 µs +12.15%
reduce_1 2 µs 1.7 µs +15.42%
reduce_1_partial 2.2 µs 1.8 µs +27.35%
reduce_add_rc 2.4 µs 2 µs +19.51%
reduce_partial 2 µs 1.7 µs +17.12%

@Dzejkop Dzejkop force-pushed the dzejkop/build-for-benchmark branch from 51093d4 to 3046382 Compare June 5, 2025 15:25
@Dzejkop Dzejkop marked this pull request as ready for review June 5, 2025 15:26
@Dzejkop Dzejkop requested review from Copilot and recmo June 5, 2025 15:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors benchmark path handling in bench.rs, extracts Raspberry Pi 5 benchmarks out of the CI workflow, and introduces a dedicated scheduled benchmark workflow.

  • Simplify file path construction and improve error context in bench.rs
  • Remove the RPi5 benchmark job from the main CI (.github/workflows/ci.yml)
  • Add a standalone benchmark.yml to schedule and run RPi5 benchmarks on a self-hosted runner

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
noir-r1cs/benches/bench.rs Switch to relative paths, add anyhow::Context, and streamline read calls
.github/workflows/ci.yml Delete the embedded Raspberry Pi 5 benchmark job
.github/workflows/benchmark.yml Add a new workflow for scheduled RPi5 benchmarks
Comments suppressed due to low confidence (1)

noir-r1cs/benches/bench.rs:22

  • [nitpick] The expect message is generic; include the file path or more detail (e.g., .expect(&format!("Failed to read proof scheme from {}", path.display()))) to aid debugging.
.expect("Reading proof scheme");

Comment thread .github/workflows/benchmark.yml
Comment thread .github/workflows/benchmark.yml
Comment thread noir-r1cs/benches/bench.rs
jobs:
build:
name: Build benchmark binaries
runs-on: ubuntu-22.04-arm
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to be explicit about what the compilation target is. If you compile this machine, which processors will the resulting executable run on? Which fancy instructions does it use? It may crash on a different processor.
Minor to this is optimization: microarchitecture (pipelines etc.) differ between processors and this can be quite a substantial difference in performance. We already see this in a ~20% performance difference between two skyscraper algorithms, with the fastest one being different on an M3 than on the RPi5!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simd_mul regression in codspeed is actually exactly what you'd expect due to microachitectural mismatch. That could be the explanation.

But it's also not a huge concern: a big part of the reason to use assembly is to avoid depending on the compiler in this regard.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I aggree, but cargo codspeed doesn't support setting the target triple right now. I'll add an issue for that, but let's leave it they way it is for now

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it's a bit more involved to build the benchmarks without it (wrangling the benchmark binaries + handling the walltime/instrumentation modes)

@Dzejkop Dzejkop force-pushed the dzejkop/build-for-benchmark branch from 3046382 to 329106e Compare June 6, 2025 17:44
@recmo recmo merged commit 653e9cf into main Jun 27, 2025
5 of 7 checks passed
dcbuild3r pushed a commit that referenced this pull request May 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants