fix(bench): eliminate per-sample scheduler setup cost in history benchmarks#55

Merged: deepjoy merged 5 commits into main from improve-bench on Mar 19, 2026
Conversation

@deepjoy (Owner) commented Mar 19, 2026

Summary

  • Move build_scheduler_with_history outside bench_with_input so the
    scheduler is seeded once per history_size, not once per Criterion sample.
  • Cache the critcmp binary between CI runs to avoid recompiling it on
    every workflow execution.

Problem

Each history benchmark called build_scheduler_with_history(n) inside
iter_custom, so Criterion's 20 samples for history_size=5000 triggered
100,000 task completions of setup work per benchmark group. This blew past
Criterion's default measurement-time budget and produced "unable to complete
N samples" warnings in CI.

Solution

Build the scheduler once with rt.block_on(...) before bench_with_input,
clone the TaskStore handle, and pass it into the closure. Only the actual
query (history / history_stats / history_by_type) is measured in the loop.
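A minimal sketch of that restructuring in plain Rust. The `TaskStore` and `build_scheduler_with_history` names follow the PR, but the `Arc`-backed store and the `Instant` timing loop below are stand-ins for the real async scheduler and Criterion's `iter_custom`, not the crate's actual code:

```rust
use std::sync::Arc;
use std::time::{Duration, Instant};

// Stand-in for the real store: cloning the handle is a cheap Arc
// refcount bump, unlike re-seeding the scheduler.
#[derive(Clone)]
struct TaskStore(Arc<Vec<u64>>);

impl TaskStore {
    // The measured query, analogous to the benchmark's `history` call.
    fn history(&self) -> usize {
        self.0.len()
    }
}

// Expensive setup: the real version completes `n` tasks on a Tokio
// runtime via `rt.block_on(...)`.
fn build_scheduler_with_history(n: usize) -> TaskStore {
    TaskStore(Arc::new((0..n as u64).collect()))
}

// Shape of the fixed benchmark: build once per history_size, then time
// only the query inside the sampling loop (mimicking `iter_custom`).
fn bench_history(history_size: usize, samples: u32) -> Duration {
    let store = build_scheduler_with_history(history_size); // once, outside
    let mut total = Duration::ZERO;
    for _ in 0..samples {
        let store = store.clone(); // cheap handle clone per sample
        let start = Instant::now();
        let len = store.history(); // only this span is measured
        total += start.elapsed();
        assert_eq!(len, history_size);
    }
    total
}
```

Before the fix, the `build_scheduler_with_history` call sat inside the per-sample loop, so its cost was multiplied by Criterion's sample count.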

deepjoy added 3 commits March 18, 2026 19:18
… setup cost

Seeding N tasks was inside iter_custom, so it ran 20× per configuration.
For history_size=5000 that meant 100k task completions just for setup,
blowing Criterion's measurement time and causing "unable to complete" warnings.
Move the build outside bench_with_input so setup runs once per size.
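The blow-up described above is plain multiplication; `setup_completions` below is a hypothetical helper to make the quoted numbers concrete, not part of the crate:

```rust
/// Total task completions spent on seeding: one build per Criterion
/// sample under the old layout, one build per configuration after the
/// fix. (Hypothetical helper for illustration only.)
fn setup_completions(builds: u64, history_size: u64) -> u64 {
    builds * history_size
}
```

Under the old layout, 20 samples at history_size=5000 means `setup_completions(20, 5_000)` = 100,000 completions of pure setup; after hoisting the build it drops to 5,000.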
…te warnings

Criterion's default 5s measurement window is too tight for CI runners:
with sample_size=20, each sample must finish in ≤250ms. On a 2-core
GitHub Actions runner the async overhead per sample can exceed this,
triggering "unable to complete N samples in 5.0s". Setting 30s gives
Criterion a comfortable window without making CI unreasonably slow.
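The per-sample budget math behind that choice, as a back-of-envelope sketch (Criterion's actual scheduling of iterations within the measurement window is more involved; `per_sample_budget` is a hypothetical helper):

```rust
use std::time::Duration;

/// Rough per-sample time budget: measurement window / sample count.
fn per_sample_budget(window: Duration, sample_size: u32) -> Duration {
    window / sample_size
}
```

With the default 5 s window and sample_size=20 the budget works out to 250 ms per sample; widening the window to 30 s raises it to 1.5 s, which is the headroom the commit above relies on.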
@github-actions (Contributor) commented

Benchmark Comparison

group                                       current
-----                                       -------
backoff_delay/constant                      1.00     50.4±0.03ns        ? ?/sec
backoff_delay/exponential                   1.00    219.5±0.92ns        ? ?/sec
backoff_delay/exponential_jitter            1.00    386.4±1.02ns        ? ?/sec
backoff_delay/linear                        1.00     82.0±0.23ns        ? ?/sec
batch_submit_1000                           1.00     38.6±2.08ms        ? ?/sec
byte_progress/byte_reporting_500            1.00    637.5±7.66ms        ? ?/sec
byte_progress/noop_500                      1.00   601.0±10.99ms        ? ?/sec
byte_progress_snapshot_100_tasks            1.00    191.0±2.63ms        ? ?/sec
concurrency_scaling/1                       1.00   723.7±15.12ms        ? ?/sec
concurrency_scaling/2                       1.00   592.0±12.14ms        ? ?/sec
concurrency_scaling/4                       1.00    593.1±9.96ms        ? ?/sec
concurrency_scaling/8                       1.00   596.9±10.46ms        ? ?/sec
count_by_tags/100                           1.00    120.4±5.05µs        ? ?/sec
count_by_tags/1000                          1.00    195.3±8.16µs        ? ?/sec
count_by_tags/5000                          1.00   697.6±20.41µs        ? ?/sec
dep_chain_dispatch/10                       1.00     26.9±0.45ms        ? ?/sec
dep_chain_dispatch/25                       1.00     60.9±1.23ms        ? ?/sec
dep_chain_dispatch/50                       1.00    124.0±2.35ms        ? ?/sec
dep_chain_submit/10                         1.00     11.5±0.30ms        ? ?/sec
dep_chain_submit/200                        1.00   537.2±11.58ms        ? ?/sec
dep_chain_submit/50                         1.00     51.3±2.81ms        ? ?/sec
dep_fan_in_dispatch/10                      1.00     24.2±0.29ms        ? ?/sec
dep_fan_in_dispatch/100                     1.00    143.4±3.98ms        ? ?/sec
dep_fan_in_dispatch/50                      1.00     76.8±1.77ms        ? ?/sec
dispatch_and_complete_1000                  1.00  1192.5±15.93ms        ? ?/sec
dispatch_group_scaling/1                    1.00   633.7±11.57ms        ? ?/sec
dispatch_group_scaling/10                   1.00   630.4±10.49ms        ? ?/sec
dispatch_group_scaling/100                  1.00   635.0±11.45ms        ? ?/sec
dispatch_group_scaling/50                   1.00   636.6±12.57ms        ? ?/sec
dispatch_no_groups_500                      1.00   590.2±12.87ms        ? ?/sec
dispatch_one_group_500                      1.00   635.5±12.33ms        ? ?/sec
dispatch_permanent_failure_500              1.00   542.1±11.54ms        ? ?/sec
history_by_type/100                         1.00  1108.5±14.94µs        ? ?/sec
history_by_type/1000                        1.00  1160.7±23.89µs        ? ?/sec
history_by_type/5000                        1.00  1197.5±26.51µs        ? ?/sec
history_query/100                           1.00    667.0±9.64µs        ? ?/sec
history_query/1000                          1.00    683.1±9.82µs        ? ?/sec
history_query/5000                          1.00    719.5±7.83µs        ? ?/sec
history_stats/100                           1.00    133.2±2.88µs        ? ?/sec
history_stats/1000                          1.00    328.5±2.11µs        ? ?/sec
history_stats/5000                          1.00   1205.7±5.88µs        ? ?/sec
mixed_priority_dispatch_500                 1.00   600.3±11.59ms        ? ?/sec
peek_next/100                               1.00     46.7±0.99ms        ? ?/sec
peek_next/1000                              1.00    194.9±5.01ms        ? ?/sec
peek_next/5000                              1.00   862.9±14.70ms        ? ?/sec
query_by_tags/100                           1.00  1376.2±98.56µs        ? ?/sec
query_by_tags/1000                          1.00     12.5±1.16ms        ? ?/sec
query_by_tags/5000                          1.00     63.8±6.42ms        ? ?/sec
retryable_dead_letter/constant              1.00    293.2±7.54ms        ? ?/sec
retryable_dead_letter/exponential           1.00    296.4±6.74ms        ? ?/sec
retryable_dead_letter/exponential_jitter    1.00    293.8±8.51ms        ? ?/sec
retryable_dead_letter/linear                1.00    296.1±6.36ms        ? ?/sec
submit_1000_tasks                           1.00    175.1±5.40ms        ? ?/sec
submit_dedup_hit_1000                       1.00    225.2±7.15ms        ? ?/sec
submit_with_tags/0                          1.00     91.4±3.26ms        ? ?/sec
submit_with_tags/10                         1.00    221.3±7.19ms        ? ?/sec
submit_with_tags/20                         1.00   352.2±11.02ms        ? ?/sec
submit_with_tags/5                          1.00    156.6±5.08ms        ? ?/sec
tag_values/100                              1.00    126.3±5.06µs        ? ?/sec
tag_values/1000                             1.00    188.4±6.36µs        ? ?/sec
tag_values/5000                             1.00   536.8±22.89µs        ? ?/sec

Switch all benches from iter() to iter_custom(), constructing the
scheduler/store outside the timed region so only the measured workload
is counted. Also adds pprof flamegraph support to the dep_chain_submit
group and a new profile_dep_chain example for one-shot timing breakdowns.
@deepjoy deepjoy enabled auto-merge (squash) March 19, 2026 03:57
@deepjoy deepjoy merged commit 6f2ba74 into main Mar 19, 2026
1 of 2 checks passed
@github-actions github-actions Bot mentioned this pull request Mar 19, 2026
deepjoy pushed a commit that referenced this pull request Mar 19, 2026
## 🤖 New release

* `taskmill`: 0.5.0 -> 0.5.1 (✓ API compatible changes)

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [0.5.1](v0.5.0...v0.5.1) - 2026-03-19

### Fixed

- *(bench)* eliminate per-sample scheduler setup cost in history
benchmarks ([#55](#55))
- *(bench)* remove premature cancellation token call in history
benchmark setup ([#54](#54))
- *(ci)* bootstrap _benchmarks branch on first push to main
([#53](#53))
- *(ci)* restore stderr capture for benchmark output on main
([#51](#51))
- *(ci)* exclude lib target from cargo bench to fix benchmark CI
([#49](#49))

### Other

- decompose internal god objects into focused, single-responsibility
modules ([#56](#56))
- eliminate stringly-typed history status and DRY violations
([#52](#52))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
