Refactor: direct spawn_executable (remove shell escaping) #42
guess merged 14 commits into guess:main
Conversation
|
The shell escaping was stuck in my brain. 🤷♂️ |
The env filtering was a bonus (also stuck in my brain). |
|
Quick question - does this still allow passing additional env vars via the 'env' option without them being filtered? I definitely agree with avoiding leaking env vars from the parent Beam process. I hit that issue just a few hours ago so this is perfectly timed! 🙏 |
Yes, sir, it does! https://github.com/guess/claude_code/pull/42/changes#diff-4efd7a87a453012236311f25c8f78325c50fec63c901258131124c0bb7c3a4c7L499 The user_env is still merged in as it was previously. |
Eliminates shell escaping entirely by spawning the CLI binary directly via Port.open with native :args, :env, and :cd options instead of building a concatenated command string for /bin/sh -c.
Prevents leaking sensitive host environment (SSH keys, database URLs, cloud credentials) to the CLI subprocess. Filters by CLI-recognized prefixes (ANTHROPIC_, CLAUDE_CODE_, CLAUDE_, VERTEX_REGION_), an explicit allowlist of non-namespaced CLI vars, and essential system vars (PATH, HOME, etc.). User-provided :env bypasses the filter.
The test expected exactly {:unhealthy, :provisioning} but on fast runners
the adapter can resolve (and fail) before the assertion, landing in
:not_connected. Accept either state since both are valid unhealthy
states during startup without a real CLI.
a4df881 to 1e103c0
|
CI failure seems to be another race condition in the test suite. Reviewing the test suite for any other potential/similar conditions, and may open a new PR depending on scope to address the race conditions. |
Direct spawn_executable resolves faster than sh -c, making it more likely that the adapter fails before the stream request is queued. Accept both :stream_init_error and :stream_error in session_test.exs and session_adapter_test.exs.
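A race-tolerant check along the lines of this commit might look like the sketch below. It is only an illustration: the module `RaceTolerantSketch` and the function `accepted_failure?/1` are hypothetical names, and the real tests in session_test.exs and session_adapter_test.exs may match on a different message shape. Only the two error tags come from the commit message.

```elixir
defmodule RaceTolerantSketch do
  # Both tags mean "the adapter failed"; which one is seen depends on
  # whether the failure happened before or after the stream request was queued.
  @accepted [:stream_init_error, :stream_error]

  @doc "Returns true when `tag` is one of the accepted failure tags."
  def accepted_failure?(tag) when is_atom(tag), do: tag in @accepted
end
```

In an ExUnit test this would typically be used as `assert RaceTolerantSketch.accepted_failure?(tag)` instead of asserting on one exact tag.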
|
Addressed two race conditions introduced by the fact that spawn_executable resolves faster than sh -c. Investigation did uncover some other potentially brittle tests. Will create a separate PR for consideration on addressing those. Also, if there's any issue with this PR being dual purpose, let me know and I can separate the two tasks (refactoring out the sh -c and filtering env vars) so they may be considered individually. |
|
wow thanks, @ppsplus-bradh! looking at this, i'm wondering if it's worth automatically allowing sdk-known cli env vars by default? maybe it'll be better for the user to specify which env vars they want to leak from the parent beam process? the way it is now we'd have to maintain this list as they add or remove supported env vars. i think the system-critical ones are good to have & i'm on the fence about the prefixed ones. what do you think? we could potentially have
[
  "HTTP_PROXY",
  {:prefix, "CLAUDE_CODE_"},
  ...
]
And if people don't want anything to leak they can just set it to open to whatever you think would be best, just thinking out loud :) |
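For illustration, a list that mixes exact names with `{:prefix, ...}` tuples, as proposed in the comment above, could be matched roughly like this. This is a sketch under assumptions, not code from the PR: `PassthroughSketch`, `allowed?/2`, and `filter/2` are hypothetical names.

```elixir
defmodule PassthroughSketch do
  @doc "Returns true when `key` matches any rule (exact name or `{:prefix, p}`)."
  def allowed?(key, rules) do
    Enum.any?(rules, fn
      {:prefix, prefix} -> String.starts_with?(key, prefix)
      name when is_binary(name) -> key == name
    end)
  end

  @doc "Keeps only the entries of `env` whose key matches a rule."
  def filter(env, rules) do
    Map.filter(env, fn {key, _value} -> allowed?(key, rules) end)
  end
end
```

With this shape, an empty rule list matches nothing, so no parent variable would pass through.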
|
:passthrough_env with a sensible default sounds like a good plan with enough flexibility to cater for all reasonable use cases. 👍 I have cases where I want it to pass through some non-Claude env vars and some cases where I need to block some. |
Add `allowed_env` option that accepts a list of environment variable names to pass through from the system environment to the CLI, beyond the built-in allowlist. Unlike `env` (key-value pairs), `allowed_env` takes only keys — values are read from System.get_env() at spawn time. This enables applications to forward specific env vars (e.g. DATABASE_URL, custom config) without hardcoding values in the `env` map, while still benefiting from the security filtering that excludes RELEASE_*, SSH keys, and other sensitive process-level vars. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runs `mix format` after every Write or Edit tool use on Elixir files, ensuring code is always formatted before it reaches git. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Complete the env control surface with two new options:
- `filter_env` (boolean, default true) — when true, applies the
built-in allowlist (ANTHROPIC_*, CLAUDE_*, PATH, HOME, etc.).
When false, passes all system env vars through unfiltered.
- `disallowed_env` (list of strings) — keys to exclude from the
CLI environment. Works in both filtered and unfiltered modes.
Combined with the existing `allowed_env` and `env` options, this
gives users full control over what reaches the CLI:
# Filtered (default): built-in allowlist + extras
filter_env: true, allowed_env: ["DATABASE_URL"]
# Unfiltered: everything minus exclusions
filter_env: false, disallowed_env: ["RELEASE_COOKIE", "SECRET_KEY"]
# Explicit overrides always win regardless of mode
env: %{"FORCE_THIS" => "value"}
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
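The merge order described in this commit message can be sketched as follows. This is a simplified illustration, not the PR's implementation: the module name, the shortened built-in lists, and the choice to apply `disallowed_env` after `allowed_env` are assumptions.

```elixir
defmodule EnvControlSketch do
  # Shortened stand-ins for the built-in allowlist (assumed, not the real list).
  @builtin_prefixes ["ANTHROPIC_", "CLAUDE_"]
  @builtin_names ["PATH", "HOME"]

  @doc "Builds the CLI env from `system_env` (a string-keyed map) and options."
  def build_env(system_env, opts \\ []) do
    filter? = Keyword.get(opts, :filter_env, true)
    allowed = Keyword.get(opts, :allowed_env, [])
    disallowed = Keyword.get(opts, :disallowed_env, [])
    overrides = Keyword.get(opts, :env, %{})

    system_env
    # Filtered mode keeps built-ins plus extras; unfiltered mode keeps everything.
    |> Map.filter(fn {key, _value} -> not filter? or builtin?(key) or key in allowed end)
    # Exclusions apply in both modes.
    |> Map.drop(disallowed)
    # Explicit overrides are merged last, so they always win.
    |> Map.merge(overrides)
  end

  defp builtin?(key) do
    key in @builtin_names or String.starts_with?(key, @builtin_prefixes)
  end
end
```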
|
Hey guys. This is Claude. Brad asked me to outline our thinking around a more comprehensive and flexible env var filtering system.
Environment Variable Control
The env filtering introduced earlier in this PR (built-in allowlist that reduced 91 vars down to 9) was a good security default, but it locked users into a single mode. We've expanded it into a complete control surface with three complementary options that follow the SDK's existing naming conventions (
Two operating modes for different use cases
Filtered mode (
Unfiltered mode (
Notably, if a user simply sets
Both paths then merge identically:
The merge order is intentional:
Examples:
# Default: secure minimum, just add DATABASE_URL
ClaudeCode.start_link(allowed_env: ["DATABASE_URL"])
# Everything except secrets (matches pre-filter behavior + exclusions)
ClaudeCode.start_link(filter_env: false, disallowed_env: ["RELEASE_COOKIE", "GITHUB_SSH_KEY"])
# Pre-filter behavior exactly (no filtering at all)
ClaudeCode.start_link(filter_env: false)
# Filtered + force a specific override regardless of filtering
ClaudeCode.start_link(
  allowed_env: ["MY_CONFIG"],
  disallowed_env: ["CLAUDE_CODE_SOME_FLAG"],
  env: %{"FORCED_VAR" => "always_set"}
) |
|
And I swear, I'll break stuff like this apart next time. 🤣 Single concern PRs. |
|
Looking into this further, I think we should keep the behaviour the same (inherit all env), since that is also what the Python SDK is doing, with the exception of filtering out
I think splitting this into 4 options will get confusing and a single additional option could support 99% of use-cases:
# everything gets inherited (current behaviour)
ClaudeCode.start_link()
# nothing gets inherited from parent
ClaudeCode.start_link(inherit_env: [])
# only these get inherited from parent
ClaudeCode.start_link(inherit_env: ["CLAUDE_CODE_SOME_FLAG"])
And we still have |
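The three `inherit_env` cases in the comment above could be resolved with something like the sketch below. It is an assumption-laden illustration: `InheritEnvSketch` and `inherited/2` are hypothetical, and `:all` is used here only as a stand-in for "option not given".

```elixir
defmodule InheritEnvSketch do
  @doc """
  `:all` (assumed default) inherits everything from the parent; a list
  inherits only the named variables, so `[]` inherits nothing.
  """
  def inherited(system_env, :all), do: system_env
  def inherited(system_env, names) when is_list(names), do: Map.take(system_env, names)
end
```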
|
@ppsplus-bradh We could also just remove the env var filtering from this PR, keeping the behaviour the same so we can merge this and #43 in. Then we can make a sep. focused PR for the env filtering changes? |
Yeah... My head was going there. On it. |
Remove filter_env, allowed_env, disallowed_env, and all filtering infrastructure from this PR. These will be submitted as a separate PR to keep the shell escaping refactor focused. build_env now passes System.get_env() through unfiltered, matching the pre-refactor behavior. The spawn_executable change is the sole focus of this PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@guess 2 PRs now. Separate concerns. |

Summary
- Replace `/bin/sh -c` command string invocation with direct `Port.open({:spawn_executable, ...})` using native `:args`, `:env`, and `:cd` port options
- Remove `build_shell_command/4`, `shell_escape/1`, and `@shell_safe_pattern`: shell escaping is no longer needed
Motivation
`open_cli_port/4` previously spawned the CLI by building a single shell command string:
This required every component (env values, paths, arguments) to be shell-escaped via `shell_escape/1`. That function has already had one bug (#38) where the trigger list missed characters like `!`, `#`, `<`, `>`, `?`, `[`, `]`, `*`, `~`, and tab. Even after the safelist fix (#39), hand-rolled shell escaping remains a maintenance burden and a source of correctness risk.
Erlang's `Port.open/2` with `{:spawn_executable, path}` passes arguments directly to `execvp(3)`; no shell is involved. This is the approach recommended by the Erlang Security WG and what Elixir's own `System.cmd/3` uses internally.
| Before (`sh -c`) | After (`:spawn_executable`) |
| --- | --- |
| | `{:args, ["--flag", "value"]}`, passed directly |
| `KEY='escaped_val'` prefix in shell string | `{:env, [{~c"KEY", ~c"val"}]}`, set by Erlang runtime |
| `cd /path &&` prefix in shell string | `{:cd, ~c"/path"}`, set by Erlang runtime |
Changes
`open_cli_port/4` is rewritten to spawn the CLI binary directly:
Deleted: `build_shell_command/4`, `shell_escape/1`, `@shell_safe_pattern`
`prepare_env/1` returns `[{charlist(), charlist()}]` for Erlang's native `:env` port option.
MCP tool DSL: `tool/3` replaced with `tool/2`. Description moves inside the block:
Related
`description/0` not exported (test issue discovered during this work)
Test plan
- `mix compile --warnings-as-errors` clean
- `mix format --check-formatted` clean
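As a rough illustration of the direct-spawn approach described in the Summary and Changes sections, the sketch below opens a port with native `:args`, `:env`, and `:cd` options and converts the env map to charlist pairs. It is not the PR's `open_cli_port/4` or `prepare_env/1`: the module `DirectSpawnSketch`, the function signatures, and the extra `:binary` and `:exit_status` options are assumptions.

```elixir
defmodule DirectSpawnSketch do
  @doc "Converts a string-keyed env map to the charlist pairs that `:env` expects."
  def prepare_env(env) do
    Enum.map(env, fn {key, value} ->
      {String.to_charlist(key), String.to_charlist(value)}
    end)
  end

  @doc "Spawns `executable` directly (no shell involved) and returns the port."
  def open(executable, args, env, cwd) do
    Port.open({:spawn_executable, String.to_charlist(executable)}, [
      :binary,
      :exit_status,
      # Arguments are passed as a list, so no quoting or escaping is needed.
      {:args, args},
      {:env, prepare_env(env)},
      {:cd, String.to_charlist(cwd)}
    ])
  end
end
```

Because no shell parses the arguments, a call such as `DirectSpawnSketch.open("/bin/echo", ["it's $HOME; *"], %{"KEY" => "val"}, "/tmp")` hands the string to the program unchanged, with no expansion of `$HOME` or `*`.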