fix(bench): eliminate per-sample scheduler setup cost in history benchmarks #55
Merged
… setup cost

Seeding N tasks was inside `iter_custom`, so it ran 20× per configuration. For `history_size=5000` that meant 100k task completions just for setup, exceeding Criterion's measurement time and causing "unable to complete" warnings. Move the build outside `bench_with_input` so setup runs once per size.
…te warnings

Criterion's default 5s measurement window is too tight for CI runners: with `sample_size=20`, each sample must finish in ≤250ms. On a 2-core GitHub Actions runner the async overhead per sample can exceed this, triggering "unable to complete N samples in 5.0s". Setting 30s gives Criterion a comfortable window without making CI unreasonably slow.
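The figures quoted in the two commit messages above can be checked with a few lines of arithmetic. This is a minimal sketch that, like the commit message, treats the measurement window as split evenly across samples; the helper names `per_sample_budget` and `setup_completions` are hypothetical and not part of taskmill or Criterion.

```rust
use std::time::Duration;

/// Time available to each sample if the measurement window is divided
/// evenly (a simplification, matching the commit message's reasoning).
fn per_sample_budget(measurement_time: Duration, sample_size: u32) -> Duration {
    measurement_time / sample_size
}

/// Task completions spent on setup when seeding runs once per sample
/// instead of once per history size.
fn setup_completions(sample_size: u64, history_size: u64) -> u64 {
    sample_size * history_size
}
```

With 20 samples, a 5s window leaves 250ms per sample and a 30s window leaves 1.5s. Seeding 5000 tasks in each of 20 samples accounts for the 100k setup completions.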
Switch all benches from iter() to iter_custom(), constructing the scheduler/store outside the timed region so only the measured workload is counted. Also adds pprof flamegraph support to the dep_chain_submit group and a new profile_dep_chain example for one-shot timing breakdowns.
Merged
deepjoy pushed a commit that referenced this pull request on Mar 19, 2026
## 🤖 New release

* `taskmill`: 0.5.0 -> 0.5.1 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.5.1](v0.5.0...v0.5.1) - 2026-03-19

### Fixed

- *(bench)* eliminate per-sample scheduler setup cost in history benchmarks ([#55](#55))
- *(bench)* remove premature cancellation token call in history benchmark setup ([#54](#54))
- *(ci)* bootstrap _benchmarks branch on first push to main ([#53](#53))
- *(ci)* restore stderr capture for benchmark output on main ([#51](#51))
- *(ci)* exclude lib target from cargo bench to fix benchmark CI ([#49](#49))

### Other

- decompose internal god objects into focused, single-responsibility modules ([#56](#56))
- eliminate stringly-typed history status and DRY violations ([#52](#52))

</blockquote>

</p></details>

---

This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Summary
- Move `build_scheduler_with_history` outside `bench_with_input` so the scheduler is seeded once per `history_size`, not once per Criterion sample.
- Cache the `critcmp` binary between CI runs to avoid recompiling it on every workflow execution.
Problem
Each history benchmark called `build_scheduler_with_history(n)` inside `iter_custom`, so Criterion's 20 samples for `history_size=5000` triggered 100 000 task completions of setup work per benchmark group. This blew past Criterion's default measurement-time budget and produced "unable to complete N samples" warnings in CI.
Solution
Build the scheduler once with `rt.block_on(...)` before `bench_with_input`, clone the `TaskStore` handle, and pass it into the closure. Only the actual query (`history` / `history_stats` / `history_by_type`) is measured in the loop.
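The shape of this change can be illustrated without Criterion or taskmill. The sketch below uses only the standard library: `Store`, `build_store_with_history`, `timed_query` and `run_samples` are hypothetical stand-ins, not the PR's actual code. `timed_query` mimics the contract of an `iter_custom` routine (it receives an iteration count and returns the elapsed time), and a counter records how often the expensive seeding step runs.

```rust
use std::hint::black_box;
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a cheaply cloneable store handle
/// (not taskmill's real `TaskStore`).
#[derive(Clone)]
struct Store {
    history: Arc<Vec<u64>>,
}

impl Store {
    /// Stand-in for the query being measured.
    fn history_len(&self) -> usize {
        self.history.len()
    }
}

/// Stand-in for the expensive seeding step; bumps `setup_calls` on each run.
fn build_store_with_history(n: u64, setup_calls: &mut u32) -> Store {
    *setup_calls += 1;
    Store { history: Arc::new((0..n).collect()) }
}

/// Mimics an `iter_custom` routine: runs `iters` iterations and returns
/// only the time spent in the query itself.
fn timed_query(store: &Store, iters: u64) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(store.history_len());
    }
    start.elapsed()
}

/// Builds the store once, then takes `samples` timings against a clone
/// of the handle. Returns the number of setup runs and the timings.
fn run_samples(history_size: u64, samples: u32) -> (u32, Vec<Duration>) {
    let mut setup_calls = 0;
    // Setup happens here, outside every timed region: once per size.
    let store = build_store_with_history(history_size, &mut setup_calls);
    // Cloning copies the `Arc` pointer, not the seeded data.
    let handle = store.clone();
    let timings = (0..samples).map(|_| timed_query(&handle, 100)).collect();
    (setup_calls, timings)
}
```

Calling `run_samples(5000, 20)` seeds the store once and yields 20 timings. Moving the `build_store_with_history` call inside `timed_query` would instead seed once per sample, which is the cost this PR removes.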