fix: use 1s benchtime for CLI helper benchmarks to eliminate false regressions by Copilot · Pull Request #35763 · github/gh-aw

Copilot · 2026-05-29T18:11:56Z

The daily-cli-performance workflow was reporting a false +45.7% regression in BenchmarkFindIncludesInContent (3,516 ns/op vs historical 2,413 ns/op). Both values were artifacts of measurement noise; the function's true steady-state cost is ~160 ns/op.
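For reference, the reported percentage follows directly from the two ns/op figures quoted above. The snippet below is only a sketch of that arithmetic; `percentChange` is a hypothetical helper name, not something taken from the workflow itself.

```go
package main

import "fmt"

// percentChange returns the relative change from baseline to current, in percent.
// Hypothetical helper for illustration; the real workflow's comparison logic is not shown in this PR.
func percentChange(baseline, current float64) float64 {
	return (current - baseline) / baseline * 100
}

func main() {
	// Figures quoted in the PR description: historical 2,413 ns/op, current 3,516 ns/op.
	fmt.Printf("%+.1f%%\n", percentChange(2413, 3516)) // prints +45.7%
}
```

Because both inputs were themselves noise-dominated, the computed percentage is arithmetically right but says little about the function's actual cost.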

Root cause

The bench-performance target ran the CLI helper benchmarks with -benchtime=3x (3 iterations). For a ~160 ns/op function, b.Loop() setup overhead and OS scheduling jitter dominate at that sample count, inflating FindIncludesInContent measurements ~22× and ExtractWorkflowNameFromFile measurements ~1.8×:

| benchtime | `FindIncludesInContent` | `ExtractWorkflowNameFromFile` |
| --- | --- | --- |
| `3x` (before) | ~3,366–3,523 ns/op 📈 | ~12,136 ns/op 📈 |
| `1s` (after) | ~159–160 ns/op ✅ | ~6,779 ns/op ✅ |
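One way to picture the inflation is a simple amortization model: a fixed per-run cost spread over N iterations adds overhead/N to every reported ns/op. The sketch below assumes that model. The 9,618 ns overhead is not a measured value; it is back-solved from the table so that 3 iterations reproduce ~3,366 ns/op, and `measuredNsPerOp` is a hypothetical name.

```go
package main

import "fmt"

// measuredNsPerOp models the ns/op a benchmark would report if a fixed
// per-run overhead were amortized over n iterations.
// This is an illustrative model, not how the Go testing package computes results.
func measuredNsPerOp(overheadNs, perOpNs float64, n int) float64 {
	return (overheadNs + perOpNs*float64(n)) / float64(n)
}

func main() {
	const perOpNs = 160.0     // steady-state cost quoted in the PR
	const overheadNs = 9618.0 // assumption: back-solved from the 3x row, not measured

	for _, n := range []int{3, 1000, 7000000} {
		fmt.Printf("n=%d -> %.1f ns/op\n", n, measuredNsPerOp(overheadNs, perOpNs, n))
	}
}
```

Under this model the overhead term is negligible by the time the iteration count reaches the millions, which is consistent with the `1s` row. Real jitter is not a constant, so the model only explains the direction and rough size of the effect.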

Fix

Changed -benchtime=3x → -benchtime=1s for the CLI helper benchmark line in Makefile. The workflow benchmarks in pkg/workflow (slow, on the order of 100 ms or more each) retain 3x to keep CI runtime bounded.

-go test -bench='Benchmark(ExtractWorkflowNameFromFile|FindIncludesInContent)$$' \
-    -benchmem -benchtime=3x -run=^$$ ./pkg/cli >> bench_performance.txt
+go test -bench='Benchmark(ExtractWorkflowNameFromFile|FindIncludesInContent)$$' \
+    -benchmem -benchtime=1s -run=^$$ ./pkg/cli >> bench_performance.txt

With 1s, FindIncludesInContent accumulates ~7M iterations, yielding measurements that vary by less than 1% across runs.
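The effect of a time-based budget can be reproduced outside `go test` with `testing.Benchmark`, which uses the default 1s benchtime when called from an ordinary program. The sketch below is a stand-in, not the repository's benchmark: `countIncludes`, the `@include` marker, and the sample content are hypothetical, and the classic `b.N` loop is used instead of `b.Loop()` so that it also builds on older Go toolchains.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// countIncludes is a hypothetical stand-in for a fast helper such as
// FindIncludesInContent; the real function's signature is not shown in this PR.
func countIncludes(content string) int {
	return strings.Count(content, "@include")
}

// sink keeps the result live so the compiler cannot discard the call.
var sink int

// runBenchmark times countIncludes with the testing package's default
// time budget (1s), letting the framework choose the iteration count.
func runBenchmark() testing.BenchmarkResult {
	content := strings.Repeat("line of text\n@include shared.md\n", 8)
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = countIncludes(content)
		}
	})
}

func main() {
	r := runBenchmark()
	fmt.Printf("%d iterations, %d ns/op\n", r.N, r.NsPerOp())
}
```

The iteration count and ns/op printed here depend on the machine, so no expected output is given. The point is only that a time budget lets the framework pick an iteration count far larger than 3 for a fast function.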

…gressions

The bench-performance Makefile target was using -benchtime=3x (3 iterations) for the CLI helper benchmarks (FindIncludesInContent, ExtractWorkflowNameFromFile). With only 3 iterations, b.Loop() overhead and OS scheduling jitter dominate the measurement for fast functions, inflating results ~22x (3366 ns/op vs true 160 ns/op).

This caused the daily-cli-performance CI workflow to report a false regression:

Historical: 2413 ns/op (also inflated by 3x timing)
Current: 3516 ns/op (inflated; true value is ~160 ns/op)

Fix: change the CLI helper benchmark line to -benchtime=1s, which runs for at least 1 second (~7M iterations for FindIncludesInContent), giving stable and accurate ns/op measurements consistent across runs.

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>

Copilot

Pull request overview

Fixes false-positive regression alerts in the daily CLI performance workflow by switching the CLI helper benchmarks from a fixed 3-iteration count to a 1-second duration, which provides stable measurements for sub-microsecond functions.

Changes:

Update bench-performance target's CLI helper benchmark invocation from -benchtime=3x to -benchtime=1s.

Show a summary per file

| File | Description |
| --- | --- |
| Makefile | Increases sample size for `ExtractWorkflowNameFromFile` and `FindIncludesInContent` benchmarks to eliminate measurement noise. |

Copilot's findings

Files reviewed: 1/1 changed files
Comments generated: 0

Initial plan

4ad10b1

Copilot AI assigned Copilot and gh-aw-bot May 29, 2026

Copilot started work on behalf of gh-aw-bot May 29, 2026 18:12

Copilot AI linked an issue May 29, 2026 that may be closed by this pull request

[performance] Regression in FindIncludesInContent: +45.7% slower #35741

Closed

Copilot AI changed the title ~~[WIP] Fix performance regression in FindIncludesInContent~~ fix: use 1s benchtime for CLI helper benchmarks to eliminate false regressions May 29, 2026

Copilot finished work on behalf of gh-aw-bot May 29, 2026 18:32

Copilot AI requested a review from gh-aw-bot May 29, 2026 18:32

github-actions Bot mentioned this pull request May 29, 2026

[aw] No-Op Runs #35753

Closed

pelikhan marked this pull request as ready for review May 29, 2026 19:12

Copilot AI review requested due to automatic review settings May 29, 2026 19:12

pelikhan merged commit 84a7a88 into main May 29, 2026

pelikhan deleted the copilot/performance-regression-findincludesincontent branch May 29, 2026 19:12

Copilot AI reviewed May 29, 2026

pelikhan merged 2 commits into main from copilot/performance-regression-findincludesincontent

4 participants
