Harden timing-sensitive tests: replace Process.sleep with proper synchronization #43
Merged
guess merged 16 commits into guess:main on Mar 28, 2026
Eliminates shell escaping entirely by spawning the CLI binary directly via Port.open with native :args, :env, and :cd options instead of building a concatenated command string for /bin/sh -c.
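A minimal sketch of the direct-spawn approach, not the adapter's actual code. The `SpawnSketch` module and its `run/4` helper are hypothetical names; the `Port.open/2` call with `{:spawn_executable, path}` and the `:args`, `:env`, and `:cd` options is the standard API. It assumes a Unix-like system with `echo` on the `PATH`.

```elixir
defmodule SpawnSketch do
  # Hypothetical helper for illustration only.
  # Spawns `executable` directly, with no intermediate /bin/sh, so
  # arguments are passed verbatim and never parsed by a shell.
  def run(executable, args, env, cd) do
    path =
      System.find_executable(executable) ||
        raise "executable not found: #{executable}"

    port =
      Port.open({:spawn_executable, path}, [
        :binary,
        :exit_status,
        :stderr_to_stdout,
        args: args,
        # The :env option expects charlist pairs.
        env: Enum.map(env, fn {k, v} -> {String.to_charlist(k), String.to_charlist(v)} end),
        cd: cd
      ])

    collect(port, [])
  end

  # Accumulate output until the OS process reports its exit status.
  defp collect(port, acc) do
    receive do
      {^port, {:data, data}} -> collect(port, [acc, data])
      {^port, {:exit_status, status}} -> {IO.iodata_to_binary(acc), status}
    end
  end
end
```

Calling `SpawnSketch.run("echo", ["a; echo injected", "$HOME"], %{}, File.cwd!())` prints the arguments literally: the `;` and `$HOME` reach `echo` untouched because no shell ever interprets them.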
Prevents leaking sensitive host environment (SSH keys, database URLs, cloud credentials) to the CLI subprocess. Filters by CLI-recognized prefixes (ANTHROPIC_, CLAUDE_CODE_, CLAUDE_, VERTEX_REGION_), an explicit allowlist of non-namespaced CLI vars, and essential system vars (PATH, HOME, etc.). User-provided :env bypasses the filter.
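The prefix-based filter described above could be sketched roughly as follows. `EnvFilterSketch` is a hypothetical module; the prefixes come from the commit message, while the system-variable list is an assumed stand-in for "PATH, HOME, etc." and the explicit allowlist of non-namespaced CLI vars is omitted because the message does not enumerate it.

```elixir
defmodule EnvFilterSketch do
  # Prefixes taken from the commit message.
  @prefixes ["ANTHROPIC_", "CLAUDE_CODE_", "CLAUDE_", "VERTEX_REGION_"]
  # Assumption: a minimal stand-in for the "essential system vars".
  @system_vars ["PATH", "HOME"]

  # Keeps only recognized system env vars, then merges user-provided
  # env on top, since user-provided :env bypasses the filter.
  def filter(system_env, user_env \\ %{}) do
    system_env
    |> Enum.filter(fn {key, _value} -> allowed?(key) end)
    |> Map.new()
    |> Map.merge(user_env)
  end

  defp allowed?(key) do
    key in @system_vars or String.starts_with?(key, @prefixes)
  end
end
```

With this shape, a host variable such as `DATABASE_URL` is dropped unless the caller passes it explicitly in the user env map.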
The test expected exactly {:unhealthy, :provisioning} but on fast runners
the adapter can resolve (and fail) before the assertion, landing in
:not_connected. Accept either state since both are valid unhealthy
states during startup without a real CLI.
Direct spawn_executable resolves faster than sh -c, making it more likely that the adapter fails before the stream request is queued. Accept both :stream_init_error and :stream_error in session_test.exs and session_adapter_test.exs.
CI Note: The test failure ( Once #42 is merged, this branch will need to be rebased onto the updated main for CI to pass. If #42 is rejected or closed, the fix for that test can be brought into this PR directly.
Add `allowed_env` option that accepts a list of environment variable names to pass through from the system environment to the CLI, beyond the built-in allowlist.

Unlike `env` (key-value pairs), `allowed_env` takes only keys; values are read from System.get_env() at spawn time.

This enables applications to forward specific env vars (e.g. DATABASE_URL, custom config) without hardcoding values in the `env` map, while still benefiting from the security filtering that excludes RELEASE_*, SSH keys, and other sensitive process-level vars.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runs `mix format` after every Write or Edit tool use on Elixir files, ensuring code is always formatted before it reaches git. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Complete the env control surface with two new options:

- `filter_env` (boolean, default true): when true, applies the
  built-in allowlist (ANTHROPIC_*, CLAUDE_*, PATH, HOME, etc.).
  When false, passes all system env vars through unfiltered.
- `disallowed_env` (list of strings): keys to exclude from the
  CLI environment. Works in both filtered and unfiltered modes.

Combined with the existing `allowed_env` and `env` options, this
gives users full control over what reaches the CLI:

    # Filtered (default): built-in allowlist + extras
    filter_env: true, allowed_env: ["DATABASE_URL"]

    # Unfiltered: everything minus exclusions
    filter_env: false, disallowed_env: ["RELEASE_COOKIE", "SECRET_KEY"]

    # Explicit overrides always win regardless of mode
    env: %{"FORCE_THIS" => "value"}
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
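One way the four options could compose, as a sketch under stated assumptions rather than the code from this commit. `BuildEnvSketch`, its abbreviated built-in lists, and the choice that `disallowed_env` beats `allowed_env` when both name the same key are all assumptions; the commit message only fixes the defaults and the rule that `env` overrides always win.

```elixir
defmodule BuildEnvSketch do
  # Assumption: abbreviated stand-ins for the built-in allowlist.
  @prefixes ["ANTHROPIC_", "CLAUDE_"]
  @system_vars ["PATH", "HOME"]

  def build_env(system_env, opts \\ []) do
    filter? = Keyword.get(opts, :filter_env, true)
    allowed = Keyword.get(opts, :allowed_env, [])
    disallowed = Keyword.get(opts, :disallowed_env, [])
    overrides = Keyword.get(opts, :env, %{})

    system_env
    # Filtered mode keeps built-ins plus allowed_env; unfiltered keeps all.
    |> Enum.filter(fn {key, _} -> (not filter?) or builtin?(key) or key in allowed end)
    # Exclusions apply in both modes (assumed to win over allowed_env).
    |> Enum.reject(fn {key, _} -> key in disallowed end)
    |> Map.new()
    # Explicit overrides are merged last, so they always win.
    |> Map.merge(overrides)
  end

  defp builtin?(key) do
    key in @system_vars or String.starts_with?(key, @prefixes)
  end
end
```

Merging `env` last is what makes the overrides win in either mode: a key in `env` is set even if it was filtered out or listed in `disallowed_env`.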
Remove filter_env, allowed_env, disallowed_env, and all filtering infrastructure from this PR. These will be submitted as a separate PR to keep the shell escaping refactor focused. build_env now passes System.get_env() through unfiltered, matching the pre-refactor behavior. The spawn_executable change is the sole focus of this PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…hronization
Replace 13 Process.sleep calls across 3 test files with deterministic
synchronization:

- session_test.exs: Use MockCLI.poll_until to wait for request map
  state changes instead of fixed 50-100ms sleeps
- claude_code_test.exs: Use Process.monitor + assert_receive {:DOWN}
  instead of sleeping after stop
- supervisor_test.exs: Use MockCLI.poll_until to wait for supervisor
  child restarts instead of fixed sleeps. Tighten meaningless
  ">= 0" assertion to "in [0, 1]"

Fixes #5
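The `Process.monitor` + `assert_receive {:DOWN}` pattern named above can be illustrated with a standalone ExUnit script. This is a sketch, not a test from the repository: the spawned process and the `:stop` message are stand-ins for the real session and its stop call.

```elixir
ExUnit.start()

defmodule StopSyncSketchTest do
  use ExUnit.Case, async: true

  test "waits for the process to exit instead of sleeping" do
    # Stand-in for a session process that shuts down asynchronously.
    pid =
      spawn(fn ->
        receive do
          :stop -> :ok
        end
      end)

    # Monitor first, so the :DOWN message cannot be missed.
    ref = Process.monitor(pid)
    send(pid, :stop)

    # Returns as soon as the process is gone; fails after 1s otherwise.
    assert_receive {:DOWN, ^ref, :process, ^pid, :normal}, 1_000
    refute Process.alive?(pid)
  end
end
```

Unlike a fixed sleep, the assertion completes the moment the `:DOWN` message arrives, and the timeout only bounds the failure case.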
f055520 to 67cb465
@ppsplus-bradh can you rebase this please 🙏
guess approved these changes on Mar 28, 2026
## Summary

Replace `Process.sleep` calls across 3 test files with deterministic synchronization, eliminating a class of flaky test failures on fast or slow CI runners.

## Motivation

Several tests use fixed `Process.sleep(50-100)` calls to wait for async operations (request delivery, stream cleanup, process shutdown, supervisor restarts). These are brittle: too short on slow runners, unnecessarily slow on fast ones. The codebase already has `MockCLI.poll_until/2` for proper async polling; these tests just weren't using it.

## Note on timeouts

`MockCLI.poll_until/2` has a default timeout of 5 seconds and a polling interval of 10ms. Both can be overridden via options (`timeout:`, `interval:`) to shorten the overall polling period where appropriate. If the condition is never met, the test fails with a raised exception rather than hanging indefinitely.

## Changes

### `test/claude_code/session_test.exs`

| Before | After |
| --- | --- |
| `Process.sleep(50)` x2 | `:sys.get_state` assert (sync call guarantees state) + `MockCLI.poll_until` for cleanup |
| `Process.sleep(50)` | `:sys.get_state` call |
| `Process.sleep(50)` | `:sys.get_state` call |
| `Process.sleep(100)` + `Process.sleep(50)` | `:sys.get_state` assert + `MockCLI.poll_until` for cleanup |
| `Process.sleep(100)` | `MockCLI.poll_until` on `map_size(state.requests)` |
| `Process.sleep(100)` | `MockCLI.poll_until` on `map_size(state.requests)` |

### `test/claude_code_test.exs`

| Before | After |
| --- | --- |
| `Process.sleep(100)` | `Process.monitor` + `assert_receive {:DOWN, ...}` |
| `Process.sleep(100)` | `Process.monitor` + `assert_receive {:DOWN, ...}` |

### `test/claude_code/supervisor_test.exs`

| Before | After |
| --- | --- |
| `Process.sleep(50)` | `MockCLI.poll_until` for new pid |
| `Process.sleep(100)` | `MockCLI.poll_until` for new pid |
| `Process.sleep(100)` | `MockCLI.poll_until` for count + pid change |

Additionally, two supervisor tests were corrected after validation revealed their premises were no longer accurate:

| Before | After | Notes |
| --- | --- | --- |
| `Process.sleep(100)` + `assert length(children) <= 1` | `count == 1` | `[]` is valid: `api_key` is now optional (defaults to `ANTHROPIC_API_KEY` env var). The child starts successfully and stays alive; it never crashes. The original test assumed a missing `api_key` would cause a crash, which is no longer the case. |
| `Process.sleep(50)` + `assert count >= 0` | `count == 1` and `Process.alive?(pid)` | |

## Test plan

- `mix quality` passes (compile, format, credo, dialyzer)
- No `Process.sleep` calls remain in the 3 modified files
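For readers unfamiliar with the polling helper described under the note on timeouts, here is a rough sketch of how such a function might behave. It is not the project's `MockCLI.poll_until/2`: the `PollSketch` module name, the argument order, and the error message are assumptions, and only the 5 second default timeout, the 10ms interval, the `timeout:` and `interval:` options, and the raise-on-timeout behavior come from the description.

```elixir
defmodule PollSketch do
  # Hypothetical stand-in for a poll_until helper.
  # Re-checks `condition` every `interval` ms until it returns a truthy
  # value, and raises once `timeout` ms have elapsed.
  def poll_until(condition, opts \\ []) when is_function(condition, 0) do
    timeout = Keyword.get(opts, :timeout, 5_000)
    interval = Keyword.get(opts, :interval, 10)
    deadline = System.monotonic_time(:millisecond) + timeout
    loop(condition, interval, deadline)
  end

  defp loop(condition, interval, deadline) do
    cond do
      condition.() ->
        :ok

      System.monotonic_time(:millisecond) >= deadline ->
        raise "condition not met before timeout"

      true ->
        Process.sleep(interval)
        loop(condition, interval, deadline)
    end
  end
end
```

A test would then wait on observable state, for example `PollSketch.poll_until(fn -> map_size(:sys.get_state(pid).requests) == 0 end)`, where `pid` and the `requests` field are placeholders for the session under test. The short sleep inside the loop only paces the polling; the test's duration tracks how quickly the condition becomes true.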