Skip to content

Refactor: direct spawn_executable (remove shell escaping)#42

Merged
guess merged 14 commits intoguess:mainfrom
ppsplus-bradh:refactor/direct-spawn-executable
Mar 28, 2026
Merged

Refactor: direct spawn_executable (remove shell escaping)#42
guess merged 14 commits intoguess:mainfrom
ppsplus-bradh:refactor/direct-spawn-executable

Conversation

@ppsplus-bradh
Copy link
Copy Markdown
Contributor

@ppsplus-bradh ppsplus-bradh commented Mar 27, 2026

Summary

  • Replace /bin/sh -c command string invocation with direct Port.open({:spawn_executable, ...}) using native :args, :env, and :cd port options
  • Delete build_shell_command/4, shell_escape/1, and @shell_safe_pattern — shell escaping is no longer needed
  • MCP tool DSL: move description from positional argument to inside the block

Motivation

open_cli_port/4 previously spawned the CLI by building a single shell command string:

cd /path && KEY1='val1' KEY2='val2' exec /path/to/claude --flag arg

This required every component (env values, paths, arguments) to be shell-escaped via shell_escape/1. That function has already had one bug (#38) where the trigger list missed characters like !, #, <, >, ?, [, ], *, ~, and tab. Even after the safelist fix (#39), hand-rolled shell escaping remains a maintenance burden and a source of correctness risk.

Erlang's Port.open/2 with {:spawn_executable, path} passes arguments directly to execvp(3) — no shell is involved. This is the approach recommended by the Erlang Security WG and what Elixir's own System.cmd/3 uses internally.

Concern Before (sh -c) After (direct)
Arguments Concatenated + escaped string {:args, ["--flag", "value"]} — passed directly
Env vars KEY='escaped_val' prefix in shell string {:env, [{~c"KEY", ~c"val"}]} — set by Erlang runtime
Working dir cd /path && prefix in shell string {:cd, ~c"/path"} — set by Erlang runtime
Special chars Must be escaped for shell interpretation Not interpreted — passed as-is
Process tree sh → exec → CLI CLI directly

Changes

open_cli_port/4 — rewritten to spawn the CLI binary directly:

defp open_cli_port(executable, args, state, opts) do
  exe_path = executable |> String.to_charlist() |> :os.find_executable()
  if !exe_path, do: raise "CLI executable not found: #{executable}"

  env_list = prepare_env(state)
  port_opts = maybe_add_cd([{:args, args}, {:env, env_list}, :binary, :exit_status, :stderr_to_stdout], opts)
  port = Port.open({:spawn_executable, exe_path}, port_opts)
  {:ok, port}
end

Deleted: build_shell_command/4, shell_escape/1, @shell_safe_pattern

prepare_env/1 — returns [{charlist(), charlist()}] for Erlang's native :env port option.

MCP tool DSLtool/3 replaced with tool/2. Description moves inside the block:

# Before
tool :add, "Add two numbers" do ... end

# After
tool :add do
  description "Add two numbers"
  ...
end

Related

Test plan

@ppsplus-bradh ppsplus-bradh marked this pull request as ready for review March 27, 2026 04:03
@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

The shell escaping was stuck in my brain. 🤷‍♂️

@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

The shell escaping was stuck in my brain. 🤷‍♂️

The env filtering was a bonus (also stuck in my brain).

@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

From smoke testing, FWIW.
image

@col
Copy link
Copy Markdown
Contributor

col commented Mar 27, 2026

Quick question - does this still allow passing additional env vars via the 'env' option without them being filtered?

I definitely agree with avoiding leaking env vars from the parent Beam process. I hit that issue just a few hours ago so this is perfectly timed! 🙏

@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

Quick question - does this still allow passing additional env vars via the 'env' option without them being filtered?

I definitely agree with avoiding leaking env vars from the parent Beam process. I hit that issue just a few hours ago so this is perfectly timed! 🙏

Yes, sir, it does! https://github.com/guess/claude_code/pull/42/changes#diff-4efd7a87a453012236311f25c8f78325c50fec63c901258131124c0bb7c3a4c7L499 The user_env is still merged in as it was previously.

Eliminates shell escaping entirely by spawning the CLI binary directly
via Port.open with native :args, :env, and :cd options instead of
building a concatenated command string for /bin/sh -c.
Prevents leaking sensitive host environment (SSH keys, database URLs,
cloud credentials) to the CLI subprocess. Filters by CLI-recognized
prefixes (ANTHROPIC_, CLAUDE_CODE_, CLAUDE_, VERTEX_REGION_), an
explicit allowlist of non-namespaced CLI vars, and essential system
vars (PATH, HOME, etc.). User-provided :env bypasses the filter.
The test expected exactly {:unhealthy, :provisioning} but on fast runners
the adapter can resolve (and fail) before the assertion, landing in
:not_connected. Accept either state since both are valid unhealthy
states during startup without a real CLI.
@ppsplus-bradh ppsplus-bradh force-pushed the refactor/direct-spawn-executable branch from a4df881 to 1e103c0 Compare March 27, 2026 14:07
@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

CI failure seems to be another race condition in the test suite. Reviewing the test suite for any other potential/similar conditions, and may open a new PR depending on scope to address the race conditions.

Direct spawn_executable resolves faster than sh -c, making it more
likely that the adapter fails before the stream request is queued.
Accept both :stream_init_error and :stream_error in session_test.exs
and session_adapter_test.exs.
@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

Addressed two race conditions introduced by the fact that spawn_executable resolves faster than sh -c. Investigation did uncover some other potentially brittle tests. Will create a separate PR for consideration on addressing those. Also, if there's any issue with this PR being dual purpose, let me know and I can separate the two tasks (refactoring out the sh -c and filtering env vars) so they may be considered individually.

@guess
Copy link
Copy Markdown
Owner

guess commented Mar 28, 2026

wow thanks, @ppsplus-bradh !

looking at this, i'm wondering if it's worth automatically allowing sdk-known cli env vars by default? maybe it'll be better for the user to specify which env vars they want to leak from the parent beam process?

the way it is now we'd have to maintain this list as they add or remove supported env vars. i think the system-critical ones are good to have & i'm on the fence about the prefixed ones. what do you think?

we could potentially have :env and :passthrough_env or something like that. then we can set a list as a sensible default, like:

[
  "HTTP_PROXY",
  {:prefix, "CLAUDE_CODE_"},
  ...
]

And if people don't want anything to leak they can just set it to []

open to whatever you think would be best, just thinking out loud :)

@col
Copy link
Copy Markdown
Contributor

col commented Mar 28, 2026

:passthrough_env with a sensible default sounds like a good plan with enough flexibility to cater for all reasonable use cases. 👍

I have cases where I want it to pass through some non-Claude env vars and some cases where I need to block some.

ppsplus-bradh and others added 5 commits March 28, 2026 10:22
Add `allowed_env` option that accepts a list of environment variable
names to pass through from the system environment to the CLI, beyond
the built-in allowlist. Unlike `env` (key-value pairs), `allowed_env`
takes only keys — values are read from System.get_env() at spawn time.

This enables applications to forward specific env vars (e.g.
DATABASE_URL, custom config) without hardcoding values in the `env`
map, while still benefiting from the security filtering that excludes
RELEASE_*, SSH keys, and other sensitive process-level vars.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runs `mix format` after every Write or Edit tool use on Elixir files,
ensuring code is always formatted before it reaches git.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Complete the env control surface with two new options:

- `filter_env` (boolean, default true) — when true, applies the
  built-in allowlist (ANTHROPIC_*, CLAUDE_*, PATH, HOME, etc.).
  When false, passes all system env vars through unfiltered.

- `disallowed_env` (list of strings) — keys to exclude from the
  CLI environment. Works in both filtered and unfiltered modes.

Combined with the existing `allowed_env` and `env` options, this
gives users full control over what reaches the CLI:

  # Filtered (default): built-in allowlist + extras
  filter_env: true, allowed_env: ["DATABASE_URL"]

  # Unfiltered: everything minus exclusions
  filter_env: false, disallowed_env: ["RELEASE_COOKIE", "SECRET_KEY"]

  # Explicit overrides always win regardless of mode
  env: %{"FORCE_THIS" => "value"}

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

ppsplus-bradh commented Mar 28, 2026

Hey guys. This is Claude. Brad asked me to outline our thinking around a more comprehensive and flexible env var filtering system.

Environment Variable Control

The env filtering introduced earlier in this PR (built-in allowlist that reduced 91 vars down to 9) was a good security default, but it locked users into a single mode. We've expanded it into a complete control surface with three complementary options that follow the SDK's existing naming conventions (allowed_tools/disallowed_toolsallowed_env/disallowed_env):

Option Type Default Purpose
filter_env boolean true Toggle built-in allowlist filtering on/off
allowed_env [String.t()] [] Additional keys to pass through (additive to built-in allowlist)
disallowed_env [String.t()] [] Keys to exclude (works in both modes)
env %{String.t() => String.t()} %{} Explicit key-value overrides (always applied, highest priority)

Two operating modes for different use cases

Filtered mode (filter_env: true, default) — start from a secure minimum and add what you need:

System.get_env()  (91 vars in a typical BEAM release)
  → filter: key in (built-in allowlist OR allowed_env) AND key not in disallowed_env  (→ ~9 + extras)
  → Map.new()

Unfiltered mode (filter_env: false) — start from everything and remove what you don't want:

System.get_env()  (91 vars)
  → reject: key in disallowed_env  (→ 91 minus exclusions)
  → Map.new()

Notably, if a user simply sets filter_env: false with no other options, the system behaves exactly as it did before this PR — all system env vars pass through to the CLI unchanged. This makes the filtering opt-in by default with zero breaking changes for existing users who want the previous behavior.

Both paths then merge identically:

  → Map.merge(sdk_env_vars())        # CLAUDE_CODE_ENTRYPOINT, SDK_VERSION
  → Map.merge(user_env)              # explicit :env key-value overrides (highest priority)
  → maybe_put_api_key()              # ANTHROPIC_API_KEY from :api_key option
  → maybe_put_file_checkpointing()   # optional checkpoint flag

The merge order is intentional — :env overrides always win. Even if you disallowed_env a key, you can still force it through via :env with an explicit value. disallowed_env filters the inherited system environment, not the user's explicit intent.

Examples:

# Default: secure minimum, just add DATABASE_URL
ClaudeCode.start_link(allowed_env: ["DATABASE_URL"])

# Everything except secrets (matches pre-filter behavior + exclusions)
ClaudeCode.start_link(filter_env: false, disallowed_env: ["RELEASE_COOKIE", "GITHUB_SSH_KEY"])

# Pre-filter behavior exactly (no filtering at all)
ClaudeCode.start_link(filter_env: false)

# Filtered + force a specific override regardless of filtering
ClaudeCode.start_link(
  allowed_env: ["MY_CONFIG"],
  disallowed_env: ["CLAUDE_CODE_SOME_FLAG"],
  env: %{"FORCED_VAR" => "always_set"}
)

@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

Alright, @guess and @col. Let me know what you think of that. Comment above.

@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

And I swear, I'll break stuff like this apart next time. 🤣 Single concern PRs.

@guess
Copy link
Copy Markdown
Owner

guess commented Mar 28, 2026

Looking into this further, I think we should keep the behaviour the same (inherit all env), since that is also what the Python SDK is doing, with the exception of filtering out CLAUDECODE variable. So maybe we should follow suit and do that as well by default.

I think splitting this into 4 options will get confusing and a single additional option could support 99% of use-cases:

# everything gets inherited (current behaviour)
ClaudeCode.start_link()

# nothing gets inherited from parent
ClaudeCode.start_link(inherit_env: [])

# only these get inherited from parent
ClaudeCode.start_link(inherit_env: ["CLAUDE_CODE_SOME_FLAG"])

And we still have :env to be able to pass in whatever environment overrides on top of these options.

@guess
Copy link
Copy Markdown
Owner

guess commented Mar 28, 2026

@ppsplus-bradh We could also just remove the env var filtering from this PR, keeping the behaviour the same so we can merge this and #43 in. Then we can make a sep. focused PR for the env filtering changes?

@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

@ppsplus-bradh We could also just remove the env var filtering from this PR, keeping the behaviour the same so we can merge this and #43 in. Then we can make a sep. focused PR for the env filtering changes?

Yeah... My head was going there. On it.

Remove filter_env, allowed_env, disallowed_env, and all filtering
infrastructure from this PR. These will be submitted as a separate
PR to keep the shell escaping refactor focused.

build_env now passes System.get_env() through unfiltered, matching
the pre-refactor behavior. The spawn_executable change is the sole
focus of this PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ppsplus-bradh
Copy link
Copy Markdown
Contributor Author

@guess 2 PRs now. Separate concerns.

@ppsplus-bradh ppsplus-bradh changed the title Refactor: direct spawn_executable + env var filtering Refactor: direct spawn_executable (remove shell escaping) Mar 28, 2026
@guess guess merged commit e896ca3 into guess:main Mar 28, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants