ci(bench): bump bench shard timeout_minutes to 120, fixes stuck publish #161
The 45/60/45 caps from PR #153 still aren't enough for the slowest shards (lazy, fast). Log inspection of a cancelled job (job 76419168123, sha f3a6dad) shows linear progress at ~9s per criterion iteration with no stalls. The bench suite simply has more combinations (level × scenario × codec_side × stream_variant × {compress, decompress}) than fit in 45 min:

~11 levels × ~5 scenarios × 2 sides × 2 stream variants × 2 ops ≈ 440 iterations × 9s ≈ 66 min for the worst-case shard

A uniform 120 min cap (the GH-hosted limit is 360 min) gives ~50% headroom on the slowest shards and unblocks the publish chain so the dev/bench dashboard stays current.
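The estimate above can be reproduced with a few lines of Python. This is only a sketch: the counts are the approximate values quoted in the description, and the constant and function names are made up for illustration, not taken from the repository.

```python
# Rough worst-case runtime estimate for one bench shard.
# All counts are the approximate figures quoted in the PR description;
# the names below are hypothetical and exist only for this sketch.
LEVELS = 11            # ~11 levels
SCENARIOS = 5          # ~5 scenarios
CODEC_SIDES = 2        # 2 codec sides
STREAM_VARIANTS = 2    # 2 stream variants
OPS = 2                # compress, decompress
SECONDS_PER_ITERATION = 9  # ~9s per criterion iteration, read from the job log


def estimated_iterations() -> int:
    """Number of benchmark iterations in the worst-case shard."""
    return LEVELS * SCENARIOS * CODEC_SIDES * STREAM_VARIANTS * OPS


def estimated_minutes() -> float:
    """Estimated wall-clock time for the worst-case shard, in minutes."""
    return estimated_iterations() * SECONDS_PER_ITERATION / 60


def unused_fraction(cap_minutes: float) -> float:
    """Share of the timeout cap left unused; negative means the cap is too low."""
    return 1 - estimated_minutes() / cap_minutes


if __name__ == "__main__":
    print(f"iterations: {estimated_iterations()}")
    print(f"estimated runtime: {estimated_minutes():.0f} min")
    for cap in (45, 60, 120):
        print(f"cap {cap:>3} min -> unused share {unused_fraction(cap):+.0%}")
```

Under these assumptions the 45 and 60 min caps both come out negative, while the 120 min cap leaves a bit under half of the budget unused, which is consistent with the "~50% headroom" figure.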
Walkthrough
The pull request increases the GitHub Actions CI timeout for three benchmark targets.
Pull request overview
Updates the CI benchmark workflow to prevent benchmark shard cancellations that block benchmark-aggregate and benchmark-pages, keeping the dev/bench/ dashboard up to date.
Changes:
- Increased the per-target benchmark shard timeout configuration to a uniform 120 minutes in the `bench-matrix` target inventory.
- Ensures the `benchmark` job's `timeout-minutes: ${{ matrix.bench.timeout_minutes }}` has sufficient headroom for slow shards (`lazy`, `fast`) across all targets.
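As a rough illustration of the change, the following Python sketch checks a target inventory for the uniform cap. The target names, the `timeout_minutes` field, and the 360 min hosted-runner limit come from this PR; the JSON layout, the `name` key, and the `check_timeouts` helper are assumptions made for the example and may not match the real `targets.json`.

```python
import json

# Hypothetical inventory. Only the target names and the timeout_minutes
# field are taken from the PR; the surrounding JSON shape is assumed.
TARGETS_JSON = """
[
  {"name": "x86_64-gnu",  "timeout_minutes": 120},
  {"name": "i686-gnu",    "timeout_minutes": 120},
  {"name": "x86_64-musl", "timeout_minutes": 120}
]
"""

GH_HOSTED_LIMIT_MINUTES = 360  # GH-hosted runner cap quoted in the PR


def check_timeouts(raw: str, expected: int = 120) -> list[str]:
    """Return one message per target whose timeout deviates from `expected`
    or exceeds the hosted-runner limit; an empty list means all is well."""
    problems = []
    for target in json.loads(raw):
        name = target.get("name", "<unnamed>")
        timeout = target.get("timeout_minutes")
        if timeout != expected:
            problems.append(f"{name}: timeout_minutes is {timeout}, expected {expected}")
        elif timeout > GH_HOSTED_LIMIT_MINUTES:
            problems.append(f"{name}: {timeout} exceeds the {GH_HOSTED_LIMIT_MINUTES} min limit")
    return problems


if __name__ == "__main__":
    issues = check_timeouts(TARGETS_JSON)
    print("all targets OK" if not issues else "\n".join(issues))
```

A check like this could run locally before pushing; it is not part of the PR itself.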
Summary
Raises bench shard `timeout_minutes` from 45/60/45 (per-target) to a uniform 120 min in `.github/workflows/ci.yml` (`bench-matrix` job, inside `targets.json`).
Why
PR #153 already bumped timeouts from 25/30 to 45/60/45, but slow shards (`lazy`, `fast`) still cancel at the cap. Latest example: run 26003106160 (sha bbc9db4, PR #159 merge):

- Bench x86_64-gnu / lazy: started 21:54:12Z, completed 22:39:29Z → 45m17s ⇒ hit cap, cancelled
- Bench x86_64-gnu / fast and Bench x86_64-musl / lazy cancelled the same way

Cancelled shards cascade: `benchmark-aggregate` needs all shards → `benchmark-pages` skipped → dashboard at `dev/bench/` stays stale.
Tests aren't cycling, they're just slow
Verified from the log of a previously cancelled shard (job 76419168123, sha f3a6dad):

- `(level, scenario, codec_side, stream_variant)` combinations

So 45 min was structurally insufficient. 120 min gives ~50% headroom on the slowest shards. The GH-hosted runner cap is 360 min, so we're well within limits.
Acceptance
- All targets (`x86_64-gnu`, `i686-gnu`, `x86_64-musl`) carry `timeout_minutes: 120`
- `benchmark-aggregate` succeeds → `benchmark-pages` runs → dashboard publishes

Test plan
Closes #160