fix: security hardening from aardvark/codex scanner findings by johntmyers · Pull Request #352 · NVIDIA/OpenShell

johntmyers · 2026-03-16T07:16:49Z

Summary

Remediates validated security findings from the aardvark/codex automated scanner (PRs #337-#349). The original PRs were closed due to provenance issues and re-implemented cleanly here.

Closes #350

Findings Status

PR	Severity	Title	Status
#342	Critical	L7 REST parser overread enables request smuggling	Fixed
#345	Critical	OPA policy matches attacker-controlled cmdline paths	Fixed
#347	Critical	Symlink following in read_write chown enables privilege escalation	Fixed
#341	High	TLS secret volume readable by sandbox user (0644 default)	Fixed
#344	High	Provider CRUD RPCs return plaintext credentials	Fixed
#339	High	Forward proxy bypasses L7 method/path enforcement	Fixed + e2e test
#346	Low	drop_privileges no-op when process user unset	Fixed (also supersedes #337)
#337	High	Missing process identity skips privilege dropping	Superseded by #346
#340	High	No validation on discovered policy without baseline	Deferred — has UX/product impact on image-baked policies, needs separate design discussion
#348	Low	Server binds 0.0.0.0 by default	Skipped — low security value (mTLS enforced), high breakage risk (Helm chart, env var rename), not worth the churn
#338	Critical	No application-layer auth in dual-auth/edge mode	Deferred — large change (new module, interceptor, WS tunnel), warrants its own PR
#343	-	Require sandbox identity metadata for provider env RPC	Rejected — security theater (self-asserted header, no crypto binding), breaks all existing callers
#349	-	Block direct inet sockets in proxy mode	Rejected — destructive (breaks proxy mode entirely, sandbox processes need inet sockets to reach the proxy)

Changes

openshell-sandbox: byte-at-a-time HTTP/1.1 header parsing to prevent request smuggling; remove cmdline_paths from OPA binary matching; symlink/non-dir guard before chown on read_write paths; non-root fallback in drop_privileges; deny forward proxy for L7-configured endpoints
openshell-server: TLS secret volume defaultMode 0400; redact provider credentials in CRUD responses
e2e: regression test for forward proxy L7 bypass

Testing

mise run pre-commit — all checks pass
mise run test — all unit tests pass (250 openshell-sandbox, 146 openshell-server)
New e2e test forward_proxy_rejects_l7_configured_endpoint added

Checklist

Follows conventional commit format
All pre-commit checks pass
All existing tests pass
Tests updated for changed behavior (OPA cmdline, provider credential redaction)
New e2e test for forward proxy L7 bypass
No secrets or credentials in changes

The HTTP/1.1 parser used a 1024-byte read buffer that could capture bytes from a pipelined second request. Those overflow bytes were forwarded upstream as body overflow without L7 policy evaluation, enabling request smuggling that bypasses per-request method/path enforcement. Replace multi-byte read with byte-at-a-time read_u8 that stops exactly at the CRLFCRLF header terminator. Add regression test proving two pipelined requests are parsed independently. Refs: #350

Kubernetes secret volumes default to 0644, allowing the unprivileged sandbox user to read the mTLS client private key via the Landlock baseline /etc read set. A compromised sandbox could use the key to impersonate the control-plane client. Set defaultMode to 256 (octal 0400, owner-read only) on both the default and custom pod template paths. The supervisor reads TLS materials as root before forking, so this does not affect normal operation. Refs: #350

/proc/pid/cmdline is fully attacker-controlled (argv[0] can be set to any string via execve) and had no integrity verification, unlike exec.path and ancestors which are kernel-managed and get TOFU/SHA256 checks. The Rego policy used cmdline_paths as a grant-access signal, allowing any sandboxed process to claim the identity of an allowed binary and bypass network restrictions. Remove the cmdline exact-match rule and exclude cmdline_paths from the glob-match rule. Only exec.path and exec.ancestors (from /proc/pid/exe) are now used for binary identity. cmdline_paths remain in the OPA input for deny-reason diagnostics only. Refs: #350

The supervisor runs as root and calls chown on each read_write path. Since chown follows symlinks, a malicious container image could place a symlink (e.g. /sandbox -> /etc/shadow) to trick the supervisor into transferring ownership of arbitrary files to the sandbox user. Add symlink_metadata (lstat) check before chown to reject symlinks and non-directory entries. The TOCTOU window is not exploitable because no untrusted child process has been forked yet at this point. Refs: #350

Provider CRUD RPCs (create, get, list, update) returned full Provider objects including plaintext credentials (API keys, secrets). Any authenticated client -- including sandbox workloads running untrusted code -- could read credentials for all providers. Add redact_provider_credentials helper that clears the credentials map before returning. Internal server paths (inference routing, sandbox env injection) read from the store directly and are unaffected. Update tests to verify redaction and assert persistence via direct store reads. Refs: #350

drop_privileges silently returned Ok(()) when both run_as_user and run_as_group were None, even when running as root. In local/dev mode policies are loaded from disk without passing through the server-side ensure_sandbox_process_identity normalization, so child processes could retain root and all capabilities (SYS_ADMIN, NET_ADMIN). When running as root with no process identity configured, fall back to sandbox:sandbox instead of no-oping. Non-root runtimes are unaffected. Refs: #350

The forward proxy path only performed L4 (endpoint) and allowed_ips checks. If an endpoint had L7 rules (method/path restrictions), a sandboxed process could bypass them by using HTTP_PROXY with plain http:// requests instead of CONNECT tunneling, since L7 inspection only runs in the CONNECT path. Add a guard in handle_forward_proxy that queries the endpoint's L7 config and returns 403 if any L7 rules are present, forcing traffic through the CONNECT path where per-request inspection happens. Refs: #350

Verifies that the forward proxy path (plain http:// via HTTP_PROXY) returns 403 for endpoints with L7 rules configured, preventing sandboxed processes from bypassing per-request method/path enforcement by avoiding the CONNECT tunnel. Refs: #350

Update OPA tests to verify cmdline_paths no longer grant access (regression tests for the security fix). Fix formatting in TLS volume test. Refs: #350

Provider CRUD gRPC responses no longer include credential values. Update three e2e tests to assert credentials are empty in responses. Provider functionality is verified by existing e2e tests that check env var injection into sandboxes (which read from the store directly). Refs: #350

CI runs as root but has no 'sandbox' user. The security fix correctly errors when running as root with no process identity and no fallback user available -- this is the intended behavior (refuse to run as root). Update tests to expect that error in root-without-sandbox-user environments instead of unconditionally asserting Ok. Refs: #350

The symlink guard incorrectly rejected all non-directory entries. Character devices like /dev/null are legitimate read_write paths used in sandbox policies. Only reject symlinks, which are the actual attack vector for the chown privilege escalation. Refs: #350

johntmyers requested review from drew and pimlock March 16, 2026 07:33

johntmyers added e2e labels Mar 16, 2026

drew force-pushed the main branch from 9a39d92 to 079c8f8 Compare March 16, 2026 13:39

johntmyers added 13 commits March 16, 2026 06:47

fix: update tests for cmdline path and TLS volume changes

04525fe

Update OPA tests to verify cmdline_paths no longer grant access (regression tests for the security fix). Fix formatting in TLS volume test. Refs: #350

chore: reformat for Rust 1.94 assert! macro style

9950592

johntmyers force-pushed the fix/l7-rest-parser-overread-smuggling branch from bd4edc1 to d536278 Compare March 16, 2026 13:47

drew force-pushed the fix/l7-rest-parser-overread-smuggling branch from d536278 to dc3de5c Compare March 16, 2026 13:55

johntmyers force-pushed the fix/l7-rest-parser-overread-smuggling branch from dc3de5c to d536278 Compare March 16, 2026 14:11

pimlock approved these changes Mar 16, 2026

View reviewed changes

johntmyers merged commit 647b794 into main Mar 16, 2026
17 of 18 checks passed

johntmyers deleted the fix/l7-rest-parser-overread-smuggling branch March 16, 2026 14:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: security hardening from aardvark/codex scanner findings#352

fix: security hardening from aardvark/codex scanner findings#352
johntmyers merged 13 commits intomainfrom
fix/l7-rest-parser-overread-smuggling

johntmyers commented Mar 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

johntmyers commented Mar 16, 2026

Summary

Findings Status

Changes

Testing

Checklist

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants