Harden timing-sensitive tests: replace Process.sleep with proper synchronization by ppsplus-bradh · Pull Request #43 · guess/claude_code

ppsplus-bradh · 2026-03-27T17:01:04Z

Summary

Replace Process.sleep calls across 3 test files with deterministic synchronization, eliminating a class of flaky test failures on fast or slow CI runners.

Motivation

Several tests use fixed Process.sleep(50-100) calls to wait for async operations (request delivery, stream cleanup, process shutdown, supervisor restarts). These are brittle — too short on slow runners, unnecessarily slow on fast ones. The codebase already has MockCLI.poll_until/2 for proper async polling; these tests just weren't using it.

Note on timeouts

MockCLI.poll_until/2 has a default timeout of 5 seconds and a polling interval of 10ms. Both can be overridden via options (timeout:, interval:) to shorten the overall polling period where appropriate. If the condition is never met, the test fails with a raised exception rather than hanging indefinitely.

Changes

`test/claude_code/session_test.exs`

Test	Was	Now
"stream cleanup removes request" (adapter)	`Process.sleep(50)` x2	Direct `:sys.get_state` assert (sync call guarantees state) + `MockCLI.poll_until` for cleanup
"discards raw message for unknown request_id"	`Process.sleep(50)`	Synchronous `:sys.get_state` call
"logs warning for unparseable raw message"	`Process.sleep(50)`	Synchronous `:sys.get_state` call
"stream cleanup removes request" (streaming)	`Process.sleep(100)` + `Process.sleep(50)`	Direct `:sys.get_state` assert + `MockCLI.poll_until` for cleanup
"handles multiple concurrent queries"	`Process.sleep(100)`	`MockCLI.poll_until` on `map_size(state.requests)`
"handles sequential streaming queries"	`Process.sleep(100)`	`MockCLI.poll_until` on `map_size(state.requests)`

`test/claude_code_test.exs`

Test	Was	Now
"returns false for stopped session"	`Process.sleep(100)`	`Process.monitor` + `assert_receive {:DOWN, ...}`
"stops a running session"	`Process.sleep(100)`	`Process.monitor` + `assert_receive {:DOWN, ...}`

`test/claude_code/supervisor_test.exs`

Test	Was	Now
"restarts an existing session"	`Process.sleep(50)`	`MockCLI.poll_until` for new pid
"restarts crashed sessions automatically"	`Process.sleep(100)`	`MockCLI.poll_until` for new pid
"handles multiple session crashes independently"	`Process.sleep(100)`	`MockCLI.poll_until` for count + pid change

Additionally, two supervisor tests were corrected after validation revealed their premises were no longer accurate:

Test	Was	Now	Reason
"handles session startup errors gracefully"	`Process.sleep(100)` + `assert length(children) <= 1`	Renamed to "handles session startup with empty config", asserts `count == 1`	Empty config `[]` is valid — `api_key` is now optional (defaults to `ANTHROPIC_API_KEY` env var). The child starts successfully and stays alive; it never crashes. The original test assumed a missing `api_key` would cause a crash, which is no longer the case.
"handles invalid session config"	`Process.sleep(50)` + `assert count >= 0`	Renamed to "starts session with empty config", asserts `count == 1` and `Process.alive?(pid)`	Same reason — testing the inverse (child survives) validates current behavior.

Test plan

All 1509 tests pass
mix quality passes (compile, format, credo, dialyzer)
Zero Process.sleep calls remain in the 3 modified files
Code review verified polling conditions are correct and no no-op polls remain
CHANGELOG.md updated

Eliminates shell escaping entirely by spawning the CLI binary directly via Port.open with native :args, :env, and :cd options instead of building a concatenated command string for /bin/sh -c.

Prevents leaking sensitive host environment (SSH keys, database URLs, cloud credentials) to the CLI subprocess. Filters by CLI-recognized prefixes (ANTHROPIC_, CLAUDE_CODE_, CLAUDE_, VERTEX_REGION_), an explicit allowlist of non-namespaced CLI vars, and essential system vars (PATH, HOME, etc.). User-provided :env bypasses the filter.

The test expected exactly {:unhealthy, :provisioning} but on fast runners the adapter can resolve (and fail) before the assertion, landing in :not_connected. Accept either state since both are valid unhealthy states during startup without a real CLI.

Direct spawn_executable resolves faster than sh -c, making it more likely that the adapter fails before the stream request is queued. Accept both :stream_init_error and :stream_error in session_test.exs and session_adapter_test.exs.

ppsplus-bradh · 2026-03-27T17:09:05Z

CI Note: The test failure (health/1 during provisioning in port_integration_test.exs) is a pre-existing race condition that is already fixed in #42. This PR is dependent on #42 merging first.

Once #42 is merged, this branch will need to be rebased onto the updated main for CI to pass.

If #42 is rejected or closed, the fix for that test can be brought into this PR directly.

guess

LGTM - we can merge after getting in #42

Add `allowed_env` option that accepts a list of environment variable names to pass through from the system environment to the CLI, beyond the built-in allowlist. Unlike `env` (key-value pairs), `allowed_env` takes only keys — values are read from System.get_env() at spawn time. This enables applications to forward specific env vars (e.g. DATABASE_URL, custom config) without hardcoding values in the `env` map, while still benefiting from the security filtering that excludes RELEASE_*, SSH keys, and other sensitive process-level vars. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Runs `mix format` after every Write or Edit tool use on Elixir files, ensuring code is always formatted before it reaches git. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Complete the env control surface with two new options: - `filter_env` (boolean, default true) — when true, applies the built-in allowlist (ANTHROPIC_*, CLAUDE_*, PATH, HOME, etc.). When false, passes all system env vars through unfiltered. - `disallowed_env` (list of strings) — keys to exclude from the CLI environment. Works in both filtered and unfiltered modes. Combined with the existing `allowed_env` and `env` options, this gives users full control over what reaches the CLI: # Filtered (default): built-in allowlist + extras filter_env: true, allowed_env: ["DATABASE_URL"] # Unfiltered: everything minus exclusions filter_env: false, disallowed_env: ["RELEASE_COOKIE", "SECRET_KEY"] # Explicit overrides always win regardless of mode env: %{"FORCE_THIS" => "value"} Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove filter_env, allowed_env, disallowed_env, and all filtering infrastructure from this PR. These will be submitted as a separate PR to keep the shell escaping refactor focused. build_env now passes System.get_env() through unfiltered, matching the pre-refactor behavior. The spawn_executable change is the sole focus of this PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…hronization Replace 13 Process.sleep calls across 3 test files with deterministic synchronization: - session_test.exs: Use MockCLI.poll_until to wait for request map state changes instead of fixed 50-100ms sleeps - claude_code_test.exs: Use Process.monitor + assert_receive {:DOWN} instead of sleeping after stop - supervisor_test.exs: Use MockCLI.poll_until to wait for supervisor child restarts instead of fixed sleeps. Tighten meaningless ">= 0" assertion to "in [0, 1]" Fixes #5

guess · 2026-03-28T19:43:40Z

@ppsplus-bradh can you rebase this please 🙏

ppsplus-bradh added 6 commits March 27, 2026 09:04

Refactor port spawning to use direct spawn_executable instead of sh -c

5f33513

Eliminates shell escaping entirely by spawning the CLI binary directly via Port.open with native :args, :env, and :cd options instead of building a concatenated command string for /bin/sh -c.

retrigger CI

8ca98e6

Update CHANGELOG with spawn_executable refactor and env filtering

1e103c0

Fix race conditions in provisioning error tests

ee47e21

Direct spawn_executable resolves faster than sh -c, making it more likely that the adapter fails before the stream request is queued. Accept both :stream_init_error and :stream_error in session_test.exs and session_adapter_test.exs.

guess approved these changes Mar 28, 2026

View reviewed changes

ppsplus-bradh and others added 7 commits March 28, 2026 10:22

style: apply mix format to allowed_env changes

ccbed13

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chore: add PostToolUse hook for auto mix format on .ex/.exs files

4fd06c0

Runs `mix format` after every Write or Edit tool use on Elixir files, ensuring code is always formatted before it reaches git. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: update changelog with full env control surface

30b932c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: combine filter+reject into single Enum.filter for Credo

8a97958

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ci: re-trigger upstream CI after fork validation

13004c9

guess mentioned this pull request Mar 28, 2026

Refactor: direct spawn_executable (remove shell escaping) #42

Merged

5 tasks

ppsplus-bradh and others added 2 commits March 28, 2026 13:51

ppsplus-bradh force-pushed the fix/harden-timing-sensitive-tests branch from f055520 to 67cb465 Compare March 28, 2026 18:58

ci: trigger CI run

f9b059d

guess approved these changes Mar 28, 2026

View reviewed changes

guess merged commit f086ebe into guess:main Mar 28, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Harden timing-sensitive tests: replace Process.sleep with proper synchronization#43

Harden timing-sensitive tests: replace Process.sleep with proper synchronization#43
guess merged 16 commits intoguess:mainfrom
ppsplus-bradh:fix/harden-timing-sensitive-tests

ppsplus-bradh commented Mar 27, 2026

Uh oh!

ppsplus-bradh commented Mar 27, 2026

Uh oh!

guess left a comment

Uh oh!

guess commented Mar 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ppsplus-bradh commented Mar 27, 2026

Summary

Motivation

Note on timeouts

Changes

test/claude_code/session_test.exs

test/claude_code_test.exs

test/claude_code/supervisor_test.exs

Test plan

Uh oh!

ppsplus-bradh commented Mar 27, 2026

Uh oh!

guess left a comment

Choose a reason for hiding this comment

Uh oh!

guess commented Mar 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

`test/claude_code/session_test.exs`

`test/claude_code_test.exs`

`test/claude_code/supervisor_test.exs`