Skip to content

refactor: split bash tool safety logic into testable modules#258

Merged
emal-avala merged 3 commits into
mainfrom
refactor/bash-tool-hardening
May 1, 2026
Merged

refactor: split bash tool safety logic into testable modules#258
emal-avala merged 3 commits into
mainfrom
refactor/bash-tool-hardening

Conversation

@emal-avala
Copy link
Copy Markdown
Member

Summary

  • Move the Bash tool's safety helpers out of the 700-line bash.rs
    flat-file into a directory module tools/bash/ with six focused
    sub-modules. Each helper is now individually testable.
  • bash.rs becomes a thin orchestrator that imports the helpers; the
    public BashTool API is unchanged so tools/registry.rs and the
    existing sandbox_integration integration tests resolve via
    tools::bash::BashTool exactly as before.
  • Add 35+ new unit tests across the helpers plus one end-to-end test
    proving sed -i in a forbidden directory is refused before exec.

The new modules:

  • bash_security.rsclassify_destructive(cmd) -> DestructivenessLevel
    consolidating the historical inline destructive-pattern, piped-base,
    and chained-segment scans into one place.
  • command_semantics.rs — coarse Effect classification (ReadOnly,
    Mutating, Network, Privileged); a single command can have
    multiple effects, and unknown binaries default to Mutating.
  • read_only_validation.rsvalidate_read_only(cmd, profile),
    used when the active permission profile (Plan or Deny) should
    refuse mutating shell commands before exec.
  • sandbox_decision.rsdecide(cmd, ctx) -> SandboxDecision,
    the single place that says "sandbox / prompt / refuse / run freely"
    for a given command.
  • sed_edit_parser.rs — parses sed -i invocations (BSD and GNU
    forms) and extracts the file targets.
  • sed_validation.rs — routes those targets through the same
    PermissionChecker path as FileEdit so a sed -i cannot touch
    a directory FileEdit would refuse.

This is a behaviour-preserving refactor. All 87 existing bash unit
tests continue to pass; the sandbox integration tests are unchanged
and still pass.

Test Plan

  • cargo build --workspace
  • cargo test -p agent-code-lib bash — 87 passed (was 87)
  • cargo test -p agent-code-lib --test sandbox_integration — new
    sed_inplace_in_forbidden_dir_is_refused plus existing tests
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • cargo fmt --check -p agent-code-lib clean

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f77830e352

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +122 to +126
if tok == "-n" || tok == "-E" || tok == "-r" || tok == "-s" || tok == "--posix" {
continue;
}
if tok.starts_with('-') {
continue;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Parse clustered sed flags that include -i

parse_sed_segment only marks in-place edits when it sees -i or -iSUFFIX as a standalone token, then drops any other short-option token via tok.starts_with('-'). This misses common clustered forms like sed -Ei 's/x/y/' .git/config, so validate_sed_edits returns no violation and the protected-path check is bypassed. I verified the CLI semantics locally (sed --help exposes -i[SUFFIX] and -E, and sed -Ei ... file performs in-place edits), so this gap allows real writes to .git/ to slip through this new guard.

Useful? React with 👍 / 👎.

@emal-avala
Copy link
Copy Markdown
Member Author

Addressed codex review finding: parse_sed_segment now decomposes clustered short-flag tokens (-Ei, -nEi, -Eie, -ie .bak, -iSUFFIX, etc.) so any i in the cluster triggers the in-place path and validate_sed_edits no longer misses these forms. Added 8 new tests covering each form plus an integration test mirroring sed_inplace_in_forbidden_dir_is_refused with sed -Ei .git/config.

@emal-avala
Copy link
Copy Markdown
Member Author

Addressed second-round codex findings: plugged FileEdit-gate bypasses via non-sed writers (cp/mv/tee/dd/install/ln/rsync/truncate), redirections (>, >>, &>, 2>), recursive shell payloads (bash -c, eval), inline interpreters (python -c, node -e, perl -e, ruby -e), the sed --expression=/-eSCRIPT/-fFILE attached forms, plus the dangerouslyDisableSandbox-skips-destructive-validation hole and read_only_validation treating "echo foo > file" as read-only. New protected_paths module reuses PROTECTED_DIRS and BLOCKED_WRITE_PATHS — see commit 2db9808.

emal-avala added 3 commits May 1, 2026 13:49
Move the Bash tool's destructive-pattern detection, command-effect
classification, sandbox-decision logic, and sed -i parsing into named
sub-modules under tools/bash/ so each helper can be unit-tested in
isolation.

bash.rs becomes a thin orchestrator that imports the helpers from the
new directory module. The public BashTool API is unchanged; tools/
registry and the existing sandbox_integration test still resolve via
tools::bash::BashTool.

New modules:
  - bash_security: classify_destructive(cmd) -> DestructivenessLevel
  - command_semantics: classify(cmd) -> Vec<Effect>
  - read_only_validation: validate_read_only(cmd, profile)
  - sandbox_decision: decide(cmd, ctx) -> SandboxDecision
  - sed_edit_parser: parse_sed_edits(cmd) -> Vec<SedEdit>
  - sed_validation: validate_sed_edits routes sed -i through the
    same FileEdit permission gate so protected directories stay
    protected from both surfaces.

Tests:
  - 5+ unit tests per new module (35+ total)
  - End-to-end test: sed -i in .git/ is refused before exec and the
    file on disk stays untouched.

All 87 existing bash unit tests plus the sandbox integration tests
continue to pass; behaviour is unchanged.
The parser previously only matched a standalone '-i' or '-iSUFFIX'
token, then dropped any other short-option token via the catch-all
'tok.starts_with("-")' arm. That missed common clustered forms
('-Ei', '-nEi', '-Eie', '-ie .bak'), so 'validate_sed_edits' never
inspected those calls and they bypassed the FileEdit permission
gate.

Decompose short-option tokens character-by-character so any 'i' in
the cluster triggers the in-place path. Anything trailing 'i' is
treated as the GNU inline backup suffix; otherwise the next token is
considered as a separate BSD-style suffix when it looks like one.
…irections

Second-round review surfaced six P1 bypasses around the FileEdit
permission gate that the bash tool routed around. The gate is the
load-bearing security boundary for `.git/`, `.husky/`, and
`node_modules/`, and it was skippable via:

* `sed --expression=ARG` / `sed -eSCRIPT` / `sed -fFILE` attached forms
  — previously fell through to the generic short-cluster decomposer,
  which silently swallowed the script characters as boolean flags.
* Non-sed writers — `cp`, `mv`, `tee`, `dd of=`, `install`, `ln`,
  `rsync`, `truncate`, plus shell redirections (`>`, `>>`, `&>`,
  `2>>`). The historical inline check covered only `>`, `tee `, and
  `mv `, with a bonus operator-precedence bug that meant `mv` matched
  any path on the line.
* Interpreter shells — `bash -c`, `sh -c`, `zsh -c`, `eval` payloads
  were re-evaluated by the shell but never re-validated. Re-parse
  recursively (depth-limited to 3) and apply the same gate.
* Inline-source interpreters — `python -c`, `node -e`, `perl -e`,
  `ruby -e`, etc. cannot be parsed semantically; conservatively scan
  the literal payload for any protected-path substring and refuse.
* System paths — same writer/redirection coverage, applied to the
  existing `BLOCKED_WRITE_PATHS` list (`/etc/`, `/usr/`, `/bin/`,
  `/sbin/`, `/boot/`, `/sys/`, `/proc/`).
* `dangerouslyDisableSandbox: true` skipped destructive-pattern
  validation entirely. The flag now only chooses sandbox-wrapped vs
  raw-host execution; destructive and protected-path validation always
  run. Mirrors the same fix in `sandbox_decision::decide`.

P2 polish: `validate_read_only` previously classified `echo foo > file`
as read-only because `echo` is on the read-only list. Now treats any
output redirection (`>`, `>>`, `&>`, `2>`, process substitution
`>(...)`) as Mutating regardless of the underlying command name, while
still allowing `2>&1` fd-duplications.

The new `protected_paths` module reuses `permissions::PROTECTED_DIRS`
and `bash::BLOCKED_WRITE_PATHS` so adding a new protected entry still
updates every surface.
@emal-avala emal-avala force-pushed the refactor/bash-tool-hardening branch from 2db9808 to 2616cdb Compare May 1, 2026 20:54
@emal-avala
Copy link
Copy Markdown
Member Author

Rebased onto main; conflicts resolved across:

  • crates/lib/src/tools/bash/mod.rs — three conflict regions:
    • BLOCKED_WRITE_PATHS const: kept this PR's pub(crate) system-only list and dropped .agent/team-memory/ from the array (it now belongs in the centralized handler).
    • validate_input body: kept this PR's call to protected_paths::check (replaces main's per-path substring loop).
    • #[cfg(test)] block: union of both — main's test_blocks_redirection_to_team_memory plus this PR's dangerously_disable_sandbox_does_not_skip_protected_path_check, protected_dir_writes_via_writers_refused, and system_path_writes_via_writers_refused.
  • crates/lib/src/tools/bash/protected_paths.rs — extended ensure_not_protected and ensure_no_protected_substring to delegate team-memory checks to permissions::is_team_memory_write_target (project-root-less fallback), so the team-memory invariant carries through the unified protected-path handler instead of the dropped BLOCKED_WRITE_PATHS entry.

Verified locally: cargo check --all-targets, cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings, and cargo test --all-targets all pass — modulo the three pre-existing bwrap_* sandbox-integration failures that depend on user-namespace permissions in this environment.

@emal-avala emal-avala merged commit 6601b7c into main May 1, 2026
10 of 14 checks passed
@emal-avala emal-avala deleted the refactor/bash-tool-hardening branch May 1, 2026 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant