Build on GH runners for RP5 benchmarks #123
CodSpeed Walltime Performance Report
Merging #123 will degrade performances by 49.21%

| | Benchmark | BASE | HEAD | Change |
|---|---|---|---|---|
| ❌ | simd_mul | 1.8 µs | 3.6 µs | -49.21% |
| 🆕 | montgomery_square_interleaved_3 | N/A | 3.2 µs | N/A |
| 🆕 | montgomery_square_interleaved_4 | N/A | 2.5 µs | N/A |
| 🆕 | montgomery_square_log_interleaved_3 | N/A | 3.2 µs | N/A |
| 🆕 | montgomery_square_log_interleaved_4 | N/A | 2.5 µs | N/A |
| ⚡ | sbox | 1.8 µs | 1.4 µs | +21.81% |
| ⚡ | sbox_8 | 2.5 µs | 2.1 µs | +20.12% |
| ⚡ | reduce | 2.2 µs | 1.9 µs | +12.15% |
| ⚡ | reduce_1 | 2 µs | 1.7 µs | +15.42% |
| ⚡ | reduce_1_partial | 2.2 µs | 1.8 µs | +27.35% |
| ⚡ | reduce_add_rc | 2.4 µs | 2 µs | +19.51% |
| ⚡ | reduce_partial | 2 µs | 1.7 µs | +17.12% |
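The Change column can be read as a ratio of timings. The sketch below assumes the figure is computed as `base / head - 1`, which is an inference from the numbers in the table (it matches the sign and rough size of every row) and not a formula confirmed by CodSpeed. The function name `relative_change` is made up for illustration, and because the table shows rounded timings the results only approximate the reported percentages.

```rust
/// Relative change between two timings, ASSUMING the convention
/// `base / head - 1` (inferred from the table, not a documented formula).
/// Negative values mean HEAD is slower than BASE.
fn relative_change(base_us: f64, head_us: f64) -> f64 {
    base_us / head_us - 1.0
}

fn main() {
    // Rounded timings copied from the report, in microseconds.
    let rows = [
        ("simd_mul", 1.8, 3.6),
        ("sbox", 1.8, 1.4),
        ("reduce", 2.2, 1.9),
    ];
    for (name, base, head) in rows {
        println!("{name}: {:+.2}%", 100.0 * relative_change(base, head));
    }
}
```

Under this reading, a benchmark that takes twice as long on HEAD shows up as roughly -50%, not -100%, which is how the simd_mul row should be interpreted.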
51093d4 to 3046382
Pull Request Overview
This PR refactors benchmark path handling in bench.rs, extracts Raspberry Pi 5 benchmarks out of the CI workflow, and introduces a dedicated scheduled benchmark workflow.
- Simplify file path construction and improve error context in `bench.rs`
- Remove the RPi5 benchmark job from the main CI (`.github/workflows/ci.yml`)
- Add a standalone `benchmark.yml` to schedule and run RPi5 benchmarks on a self-hosted runner
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| noir-r1cs/benches/bench.rs | Switch to relative paths, add anyhow::Context, and streamline read calls |
| .github/workflows/ci.yml | Delete the embedded Raspberry Pi 5 benchmark job |
| .github/workflows/benchmark.yml | Add a new workflow for scheduled RPi5 benchmarks |
Comments suppressed due to low confidence (1)
noir-r1cs/benches/bench.rs:22
- [nitpick] The expect message is generic; include the file path or more detail (e.g., `.expect(&format!("Failed to read proof scheme from {}", path.display()))`) to aid debugging.

    .expect("Reading proof scheme");
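The suggestion above can be sketched with the standard library alone. This is a minimal illustration, not the code in `bench.rs`: the helper name `read_with_path_context` and the file name in `main` are hypothetical, and the PR itself uses `anyhow::Context`, which attaches the same kind of message with less ceremony.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads a file and, on failure, returns an error whose message names the path.
/// (Hypothetical helper for illustration; not part of the PR.)
fn read_with_path_context(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path).map_err(|e| {
        io::Error::new(
            e.kind(),
            format!("Failed to read proof scheme from {}: {}", path.display(), e),
        )
    })
}

fn main() {
    // Hypothetical file name, chosen so the error path is exercised.
    let path = Path::new("missing-proof-scheme.bin");
    match read_with_path_context(path) {
        Ok(bytes) => println!("read {} bytes", bytes.len()),
        Err(e) => eprintln!("{e}"),
    }
}
```

Keeping the original `ErrorKind` while extending the message means callers can still match on the kind, and a failing benchmark run reports which file was missing.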
    jobs:
      build:
        name: Build benchmark binaries
        runs-on: ubuntu-22.04-arm
It's better to be explicit about what the compilation target is. If you compile on this machine, which processors will the resulting executable run on? Which fancy instructions does it use? It may crash on a different processor.
Secondary to this is optimization: microarchitectures (pipelines etc.) differ between processors, and this can make quite a substantial difference in performance. We already see this in a ~20% performance difference between two skyscraper algorithms, with the fastest one being different on an M3 than on the RPi5!
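One way to make the question "which instructions does this binary assume?" concrete is to have the binary report what it was compiled for. The sketch below is an illustration under assumptions, not something in this PR: the function name `compiled_features` and the short list of features it checks are chosen for the example. It only reflects compile-time settings (for instance whatever `-C target-cpu` or `-C target-feature` was passed to rustc), not what the host CPU supports at run time.

```rust
/// Returns which of a few AArch64-relevant target features were enabled
/// at compile time. The list is illustrative, not exhaustive.
fn compiled_features() -> Vec<&'static str> {
    let mut features = Vec::new();
    if cfg!(target_feature = "neon") {
        features.push("neon");
    }
    if cfg!(target_feature = "aes") {
        features.push("aes");
    }
    if cfg!(target_feature = "sha2") {
        features.push("sha2");
    }
    if cfg!(target_feature = "sve") {
        features.push("sve");
    }
    features
}

fn main() {
    // Architecture and OS the binary was built for, fixed at compile time.
    println!("arch: {}", std::env::consts::ARCH);
    println!("os: {}", std::env::consts::OS);
    println!("compile-time features: {:?}", compiled_features());
}
```

Printing this at the start of a benchmark run would show whether binaries built on the GitHub runner and binaries built on the RPi5 were compiled with the same assumptions.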
The simd_mul regression in CodSpeed is actually exactly what you'd expect from a microarchitectural mismatch. That could be the explanation.
But it's also not a huge concern: a big part of the reason to use assembly is to avoid depending on the compiler in this regard.
I agree, but cargo codspeed doesn't support setting the target triple right now. I'll add an issue for that, but let's leave it the way it is for now.
And it's a bit more involved to build the benchmarks without it (wrangling the benchmark binaries + handling the walltime/instrumentation modes)
3046382 to 329106e