Better Shuttle Benchmarks by dylanjwolff · Pull Request #189 · awslabs/shuttle

dylanjwolff · 2025-08-01T01:15:27Z

This PR improves the benchmarking suite of Shuttle.

(A) The benchmarks can now be run without vector-clock overhead by enabling the bench-no-vector-clocks feature. This feature is now enabled in the benchmarking GitHub Action.

(B) The benchmarks have also been expanded and parameterized by the number of tasks and the total number of events across all threads. More specifically:

Creation and Startup: create.rs has been added to give a rough gauge of thread creation time and Shuttle startup overhead.
Parameterization: The existing benchmarks in counter.rs and lock.rs have been parameterized and set to run in a "narrow" (5 tasks) and a "wide" (100 tasks) configuration. The total number of events/operations stays constant (10000 events) across these configurations; the number of events per task is computed from the number of tasks and the total number of events.
Scaling: An additional set of benchmark groups has been added to show scaling as the number of tasks and events increases. The sampling rate and warm-up time have been greatly reduced for these so that the many new configurations (3*5*2=30) can be run in a reasonable amount of time. As such, statistical comparisons on individual data points from these scaling benchmarks are not particularly informative. Unfortunately, Criterion doesn't have an automatic way to visualize trends, so the reports generated for these are not easy to interpret. In the future, it might be useful to generate custom reports from the raw data instead.
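As a rough illustration of the parameterization described above, here is a minimal sketch of how the per-task event count could be derived from the task count and a fixed event total. This is not the code from counter.rs or lock.rs: the names events_per_task and run_counter are hypothetical, std threads stand in for Shuttle's own thread and sync types, and plain integer division is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Number of events each task performs so that the total stays constant.
/// Assumption: plain integer division, so any remainder is dropped.
fn events_per_task(total_events: usize, tasks: usize) -> usize {
    assert!(tasks > 0, "need at least one task");
    total_events / tasks
}

/// Hypothetical counter workload: `tasks` threads each increment a shared
/// counter `events_per_task(total_events, tasks)` times.
/// Returns the final counter value.
fn run_counter(tasks: usize, total_events: usize) -> usize {
    let per_task = events_per_task(total_events, tasks);
    let counter = Arc::new(AtomicUsize::new(0));

    // Spawn all tasks first, then join them, so they can run concurrently.
    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_task {
                    counter.fetch_add(1, Ordering::SeqCst);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    counter.load(Ordering::SeqCst)
}
```

Under these assumptions the "narrow" configuration gives 2000 events per task (10000 / 5) and the "wide" configuration gives 100 (10000 / 100). If a task count does not divide the total evenly (say, 1024 tasks and 10000 events), integer division would yield 9 events per task and 9216 events overall, so the total would only be approximately constant. The PR description does not say how such a remainder is handled.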

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

sarsko · 2025-08-01T01:31:00Z

Do we want an "even wider" bench (e.g., 1000 or 10000 tasks)? 100 is just not that many.

sarsko · 2025-08-01T01:35:11Z

Just noting that bench-no-vector-clocks is fine by me (rumor has it that I suggested it), but it is also kinda bad style. Features are supposed to be additive, and this breaks that. We should keep the feature separate from the rest, with a comment explaining why, and also note that this feature should never ever exist in a Cargo.toml (or other config file), i.e., it should only be enabled as we do here, as part of a cargo ... command.
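To make the additivity concern concrete, here is a minimal sketch of a feature that removes behavior when enabled. It is not Shuttle's actual implementation: the Clock type and its tick method are hypothetical stand-ins for vector-clock bookkeeping, and only the feature name comes from this PR.

```rust
/// Hypothetical per-task logical clock, standing in for vector-clock
/// bookkeeping. Not Shuttle's real data structure.
#[derive(Debug, Default, PartialEq)]
struct Clock(Vec<u32>);

impl Clock {
    /// Record one event for `task`.
    fn tick(&mut self, task: usize) {
        // Default build: do the bookkeeping.
        #[cfg(not(feature = "bench-no-vector-clocks"))]
        {
            if self.0.len() <= task {
                self.0.resize(task + 1, 0);
            }
            self.0[task] += 1;
        }
        // Feature enabled: the bookkeeping is compiled out entirely,
        // so turning the feature ON takes functionality AWAY.
        #[cfg(feature = "bench-no-vector-clocks")]
        {
            let _ = task;
        }
    }
}
```

Because Cargo unifies features across a dependency graph, any crate that enabled such a feature in its manifest would silently switch the bookkeeping off for every other user of the library in the same build. That is why the comment above asks for it to be enabled only on the command line.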

dylanjwolff · 2025-08-01T03:57:25Z

> Do we want an "even wider" bench (e.g., 1000 or 10000 tasks)? 100 is just not that many.

I added an instance of the scaling benchmark with 1024 tasks. I'd guess that any issues with scaling threads should be visible at that size.

> Just noting that bench-no-vector-clocks is fine by me (rumor has it that I suggested it), but it is also kinda bad style. Features are supposed to be additive, and this breaks that. We should keep the feature separate from the rest, with a comment explaining why, and also note that this feature should never ever exist in a Cargo.toml (or other config file), i.e., it should only be enabled as we do here, as part of a cargo ... command.

Yeah, it's a bit unfortunate, but I suspect it mostly won't be noticed or used. I've added a comment to Cargo.toml.

sarsko reviewed Aug 1, 2025

Comment thread .github/workflows/tests.yml

dylanjwolff force-pushed the better-benchmarks branch 2 times, most recently from 17fa1b5 to 2a476c2 on August 1, 2025 03:54

dylanjwolff force-pushed the better-benchmarks branch from 2a476c2 to 7fdd45a on August 4, 2025 21:02

sarsko reviewed Aug 5, 2025

Comment thread shuttle/src/sync/once.rs

Feature for benchmarking w/out vector clocks; More diverse benchmarks

0b07402

dylanjwolff force-pushed the better-benchmarks branch from 7fdd45a to 0b07402 on August 5, 2025 14:28

dylanjwolff mentioned this pull request Aug 5, 2025

Persistent Vec to avoid alloc'ing runnable tasks on each schedule #191

Merged

sarsko approved these changes Aug 6, 2025

sarsko merged commit 39581cb into awslabs:main Aug 6, 2025
5 checks passed

sarsko merged 1 commit into awslabs:main from dylanjwolff:better-benchmarks
