Speed up test suite across 4 packages (~15s → ~6s) by cpcloud · Pull Request #537 · roborev-dev/roborev

cpcloud · 2026-03-18T13:09:20Z

Summary

Isolate agent, config, and review tests from global git config — extends Isolate tests from global git config to prevent gpg-agent/pinentry #531 to three more packages. Adds TestMain calling testenv.RunIsolatedMain() so git operations in tests never trigger the host's gpg-agent/pinentry (which can add 1-2s per invocation on machines with GPG signing configured).
Make worker pool idle sleeps responsive to shutdown — replaces time.Sleep in the worker idle loop with a select on a shutdown channel, so tests that stop the daemon don't block waiting for the sleep to finish.
Stub isPROpenFn in CI poller test harness — avoids shelling out to gh CLI during tests. Removes a source of ~0.5s latency and flaky failures when gh isn't authenticated.
Parallelize daemon subtests — marks 21 independent daemon tests as t.Parallel(), including TestIsDaemonAliveLegacyStatusCodes subtests and the bulk of the worker/failover/cancellation tests. Drops internal/daemon from ~7.2s to ~4.2s.
Tighten requireNever timeout — the invalid-config watcher test was waiting 500ms for a "never" assertion; 100ms is sufficient and shaves ~400ms.
Fix configureSubprocess to always set Cancel on exec.Cmd — when called on a plain exec.Command (no context), cmd.Cancel is nil, so the wrapper's cancel() call panicked. Now sets a default Kill cancel before wrapping. This is a correctness fix surfaced by the timing changes below.
Speed up agent subprocess tests (8s → 1s) — two changes:
1. Override subprocessWaitDelay from 5s to 50ms in tests that don't need the production grace period.
2. Add a Go 1.25 ETXTBSY probe guard (case "$1" in *etxtbsy*) exit 0;; esac) to shell-script test fixtures.

Go 1.25 ETXTBSY probe — why the guard is necessary

Go 1.25 added a built-in ETXTBSY probe to os/exec: every Cmd.Start() call forks a child that runs execve(path, [path, "--help-probe-etxtbsy"]). If execve succeeds (file isn't busy), the probe child actually executes the script — it isn't killed, and the parent waits for it to exit before proceeding.

For shell scripts, this means the script runs twice: once as the probe (with $1 = "--help-probe-etxtbsy"), and once as the real invocation. If the script ignores $1 and does something slow (like sleep 5), the probe child runs the full body, and Go waits for it — adding the script's entire runtime as invisible overhead to Cmd.Start().

This was confirmed via strace -f: two child PIDs ran the same clock_nanosleep(5s), one from the probe fork (CLONE_VFORK|CLONE_PIDFD) and one from the real execution. The probe child was waited on via waitid(P_PIDFD).

The guard pattern — case "$1" in *etxtbsy*) exit 0;; esac as the first line after the shebang — makes the probe child exit immediately. This is only needed in test fixtures; real agent binaries already handle unknown flags gracefully.

Further details are documented in internal/agent/agent_test_helpers.go on the writeTempCommand function.

🤖 Generated with Claude Code

Extends commit 5b1aec5 to three more packages that spawn git subprocesses in tests. Without RunIsolatedMain, commit.gpgsign=true in the user's global gitconfig triggers gpg-agent/pinentry prompts, adding seconds of latency per git-commit call. review: 6.6s → 0.05s, config: 2.0s → 0.07s, agent: ~1s saved Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Worker goroutines used plain time.Sleep during idle polling and error backoff, which blocked up to 2s (idle) or 5s (error) before noticing the stop signal. Replace with select on both stopCh and time.After so workers exit immediately when Stop() is called. This is both a production improvement (daemon shutdown is now instant instead of waiting for sleeping workers) and a test speedup — the IPv6 server test dropped from 2.0s to ~0.01s because it starts a real server with 4 idle workers that all need to shut down. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Every CI poller test that exercised postBatchResults was shelling out to `gh pr view --repo acme/api` (the fake test repo) because isPROpenFn was unset, falling through to the real implementation. Each call took ~0.3s to fail, adding up to ~3-4s across the 10+ batch tests. Set a default isPROpenFn in the harness constructor that returns true. Tests that need specific closed-PR behavior already override it after construction, so those continue to work. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Each of the 7 "not alive" subtests incurs a 200ms sleep from IsDaemonAlive's retry logic (attempt 1 fails, sleep 200ms, attempt 2 fails). Running sequentially that's ~1.4s; with t.Parallel() the subtests overlap and the wall time drops to ~0.2s. Each subtest creates its own independent httptest.Server so there is no shared state to worry about. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The test verifies that invalid TOML doesn't change the loaded config. The debounce delay is 200ms, so 300ms (debounce + 100ms margin) is sufficient to confirm no reload occurred. Previously 500ms, saving ~200ms. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add t.Parallel() to the 21 slowest independent tests (0.1s+) across 8 test files. These tests all use isolated resources (temp dirs, in-memory DBs, independent broadcasters) with no shared global state. The ConfigWatcher tests benefit the most — four tests with time-based debounce waits (0.2-0.7s each) now overlap instead of running sequentially. Crypto key generation, CI poller batch tests, and worker pool concurrency tests also overlap. TestHandleEnqueueAgentAvailability is the one >0.1s test left sequential because it mutates PATH via os.Setenv. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Go 1.25's exec.CommandContext now sets cmd.Cancel unconditionally (unlike earlier versions which only set it when WaitDelay was zero). The existing configureSubprocess wrapper was already correct for Go 1.25, but make the nil-Cancel guard explicit so it also works if the function is called on a plain exec.Command (no context). This ensures context cancellation always signals agent subprocesses regardless of how the Cmd was constructed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Two changes: - TestContextProcessErrorRunPathCancellation: Add an etxtbsy guard to the shell script so Go 1.25's ETXTBSY probe (which forks a second child running the script with --help-probe-etxtbsy as $1) exits immediately instead of sleeping for 5s. The test keeps its original sh→sleep process tree to exercise the realistic orphaned- child scenario. Also reduce subprocessWaitDelay to 50ms for the test so cmd.Wait returns promptly after the kill. (5s → 0.02s) Root cause confirmed via strace: Go 1.25's os/exec probes every execve for ETXTBSY by forking a child that runs the command with --help-probe-etxtbsy as argv[1]. The probe child actually executes the full script (since the script ignores arguments), creating a phantom sleep process that the test binary waits for. The probe process is never killed by context cancellation because Go only tracks the "real" child's pidfd. - TestCodexReviewTimeoutClosesStdoutPipe: Reduce background sleep from 2s to 0.2s — it only needs to outlive the 50ms context timeout. (2.1s → 0.27s) Also documents the ETXTBSY probe behavior on writeTempCommand for future test authors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

roborev-ci · 2026-03-18T13:13:07Z

roborev: Combined Review (`6661736`)

Summary: This PR improves test isolation and runtime, optimizes subprocess cancellation logic, and makes worker idle sleeps responsive to shutdown.

All reviewers agree the code is clean with no issues of medium or
higher severity found.

Synthesized from 3 reviews (agents: codex, gemini | types: default, security)

cpcloud · 2026-03-18T13:13:58Z

+//
+// # ETXTBSY probes and Go 1.25
+//
+// Go 1.25 added a built-in ETXTBSY probe to os/exec: every exec.Cmd.Start()


The fact that this exists is super weird to me. Many scripts ignore their arguments, so now it seems that in Go 1.25 those scripts always execute twice unless users explicitly handle this one weird, go-specific flag.

wesm · 2026-03-18T13:41:10Z

Good deal. thanks for doing this!

cpcloud and others added 8 commits March 18, 2026 08:23

cpcloud commented Mar 18, 2026

View reviewed changes

wesm merged commit f881c6f into roborev-dev:main Mar 18, 2026
8 checks passed

cpcloud deleted the worktree-shimmying-chasing-gray branch March 18, 2026 14:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed up test suite across 4 packages (~15s → ~6s)#537

Speed up test suite across 4 packages (~15s → ~6s)#537
wesm merged 8 commits intoroborev-dev:mainfrom
cpcloud:worktree-shimmying-chasing-gray

cpcloud commented Mar 18, 2026

Uh oh!

roborev-ci bot commented Mar 18, 2026

Uh oh!

cpcloud Mar 18, 2026

Uh oh!

wesm commented Mar 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

cpcloud commented Mar 18, 2026

Summary

Go 1.25 ETXTBSY probe — why the guard is necessary

Uh oh!

roborev-ci bot commented Mar 18, 2026

roborev: Combined Review (6661736)

Uh oh!

cpcloud Mar 18, 2026

Choose a reason for hiding this comment

Uh oh!

wesm commented Mar 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

roborev: Combined Review (`6661736`)