Add kernel-level sandbox via nono-py for scan command#712
Merged
christophetd merged 17 commits into s.obregoso/v3 on Apr 10, 2026
Defense-in-depth against archive extraction vulnerabilities (path traversal, zip bombs) that led to CVE-2022-23530, CVE-2022-23531, CVE-2026-22870, CVE-2026-22871.
- New guarddog/sandbox.py wrapping nono-py: filesystem restrictions and network blocking via CapabilitySet
- --sandbox (default) / --no-sandbox CLI flag on scan command
- Fail-safe: exits if sandbox cannot be set up on the platform
- Local scans: full sandbox (network blocked, filesystem restricted) before extraction and analysis
- Remote scans: two-phase approach. Download + metadata analysis run unsandboxed (need network), then sandbox is applied before source code analysis (YARA/Semgrep)
- New download_package() on PackageScanner for the two-phase flow
- Scope: scan command only; verify deferred to follow-up
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 513309fa58
- Extract archives in a sandboxed subprocess (python -m guarddog.sandbox) that applies nono before calling safe_extract. This closes the TOCTOU gap for remote scans: extraction is now fully sandboxed regardless of scan type.
- Simplify _scan_remote_sandboxed: monkey-patch scanner._extract_archive to route through the sandboxed subprocess, then delegate to scan_remote. No more duplicated merge/risk-formatting logic.
- Remove download_package() from PackageScanner (no longer needed).
- Move sandbox imports to top-level in cli.py.
- Add DEBUG logging of full sandbox capability set.
- Handle nested .gem extraction in subprocess.
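The monkey-patching approach described in this commit can be sketched roughly as follows. This is a minimal illustration only: DemoScanner, sandboxed_extract, and scan_remote_sandboxed are hypothetical stand-ins, not guarddog's actual classes or signatures, and the real replacement would launch the sandboxed subprocess instead of returning a string.

```python
class DemoScanner:
    """Hypothetical stand-in for a package scanner (not guarddog's real class)."""

    def _extract_archive(self, archive_path, target_dir):
        # Default behaviour: extraction happens in the current process.
        return f"in-process extraction of {archive_path}"

    def scan_remote(self, archive_path, target_dir):
        # The normal scan flow calls the extraction hook, then would analyze.
        return self._extract_archive(archive_path, target_dir)


def sandboxed_extract(archive_path, target_dir):
    # Placeholder: the PR routes this through `python -m guarddog.sandbox`.
    return f"sandboxed subprocess extraction of {archive_path}"


def scan_remote_sandboxed(scanner, archive_path, target_dir):
    # Override the hook on this instance only, then delegate to the
    # unmodified scan flow so no result-merging logic is duplicated.
    scanner._extract_archive = sandboxed_extract
    return scanner.scan_remote(archive_path, target_dir)
```

Because the function is assigned on the instance rather than the class, it is not bound as a method, which is why sandboxed_extract takes no self argument; other scanner instances keep the default extraction.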
- Allow parent directory when scan_path is a file (nono-py expects directories)
- Add [project] table to pyproject.toml so uv can install the package
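A minimal sketch of the file-versus-directory handling, using only the standard library. The helper name sandbox_dir_for is an assumption made for illustration; the actual code in guarddog/sandbox.py may be structured differently.

```python
import os


def sandbox_dir_for(scan_path: str) -> str:
    """Return the directory that should be granted to the sandbox.

    Hypothetical helper: if scan_path is a directory it is used as-is,
    otherwise its parent directory is used, since the sandbox is assumed
    to accept directories only.
    """
    real = os.path.realpath(scan_path)
    return real if os.path.isdir(real) else os.path.dirname(real)
```

One consequence worth keeping in mind is that allowing the parent directory also exposes the file's siblings to the sandboxed process.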
Three issues with the sandboxed extraction subprocess:
- Writable paths must exist before allow_path (nono-py requirement)
- Remove dry-run QueryContext validation that gave false denials
- Set subprocess CWD to temp dir so tarsafe's os.getcwd() works under the sandbox (the inherited CWD was outside allowed paths)
- Don't pass archive as scan_path in subprocess since the temp dir already provides access
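The CWD fix can be illustrated with the standard library alone. This sketch assumes a hypothetical helper run_in_dir; it does not call nono-py and the child process here only reports its working directory rather than extracting anything.

```python
import os
import subprocess
import sys


def run_in_dir(workdir: str) -> str:
    """Start a child Python process with workdir as its CWD (illustrative only)."""
    # The directory is created up front, mirroring the requirement that
    # writable paths exist before they are handed to the sandbox.
    os.makedirs(workdir, exist_ok=True)
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getcwd())"],
        cwd=workdir,  # child starts inside the allowed directory
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Setting cwd explicitly means any os.getcwd() call inside the child resolves to a path the sandbox is expected to allow, instead of whatever directory the parent happened to be started from.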
sobregosodd reviewed Apr 10, 2026
sobregosodd approved these changes Apr 10, 2026
Previously only archive extraction ran in a sandbox for remote packages. Now the main process also enters a sandbox (network blocked, filesystem restricted) before running YARA/Semgrep analysis.
- Add guarddog package directory to sandbox read paths so YARA/Semgrep rule files are accessible when running from source (outside sys.prefix)
- Pass tmpdir as writable in remote sandboxed scans so temp dir cleanup and Semgrep temp writes work correctly
- Remove leftover test code (touch /tmp/pwned)
Use mkdtemp + realpath instead of TemporaryDirectory context manager. On macOS /var symlinks to /private/var, and nono doesn't resolve symlinks, so cleanup via the symlink path was blocked by the sandbox.
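A minimal sketch of the mkdtemp + realpath pattern, assuming hypothetical helper names make_scan_tmpdir and cleanup_scan_tmpdir (the PR does not show the actual function names):

```python
import os
import shutil
import tempfile


def make_scan_tmpdir() -> str:
    """Create a temp dir and return its symlink-free path."""
    # On macOS, mkdtemp() typically returns a path under /var/..., which is
    # a symlink to /private/var/...; realpath() yields the resolved form.
    return os.path.realpath(tempfile.mkdtemp(prefix="scan-"))


def cleanup_scan_tmpdir(path: str) -> None:
    # Cleanup uses the same resolved path that was granted to the sandbox.
    shutil.rmtree(path, ignore_errors=True)
```

Unlike the TemporaryDirectory context manager, this leaves cleanup to the caller, so it would normally sit in a try/finally block.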
Metadata rules (e.g. unclaimed_maintainer_email_domain) need network access for DNS checks. Split the analyze() call so metadata runs in phase 1 (unsandboxed) and source code analysis runs in phase 2 (sandboxed).
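The two-phase ordering can be sketched as below. The three callables are hypothetical placeholders for the metadata analysis, the sandbox setup, and the source code analysis; the real analyze() split in guarddog is not shown in this conversation.

```python
from typing import Any, Callable, Dict


def scan_remote_two_phase(
    run_metadata_rules: Callable[[], Any],
    enter_sandbox: Callable[[], None],
    run_sourcecode_rules: Callable[[], Any],
) -> Dict[str, Any]:
    """Run network-dependent rules first, then sandbox, then analyze source."""
    # Phase 1: metadata rules may need DNS or other network access.
    results = {"metadata": run_metadata_rules()}
    # From here on the process is assumed to have no network access and a
    # restricted filesystem view.
    enter_sandbox()
    # Phase 2: untrusted package contents are only parsed inside the sandbox.
    results["sourcecode"] = run_sourcecode_rules()
    return results
```

The ordering matters because the sandbox applies to the whole process from that point on, so anything needing the network has to finish before enter_sandbox() is called.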
- Default (no flag): use sandbox if available, warn if not
- --sandbox: force sandbox, hard-fail if unavailable
- --no-sandbox: skip sandbox entirely
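The three-state flag semantics above can be sketched as a small decision function. The name should_sandbox and the use of None for "no flag given" are assumptions for illustration, not the actual cli.py implementation.

```python
import logging
from typing import Optional

log = logging.getLogger("sandbox-demo")


def should_sandbox(flag: Optional[bool], available: bool) -> bool:
    """Decide whether to sandbox.

    flag is None when no flag was given, True for --sandbox,
    False for --no-sandbox (hypothetical encoding).
    """
    if flag is False:
        return False  # --no-sandbox: skip entirely
    if flag is True:
        if not available:
            # --sandbox: the user asked for it explicitly, so fail hard.
            raise SystemExit("sandbox requested but not available on this platform")
        return True
    if not available:
        # Default: degrade gracefully, but tell the user.
        log.warning("sandbox not available on this platform; continuing without it")
        return False
    return True
```

Keeping the decision in one pure function makes the matrix of flag and platform combinations easy to unit test without touching the real sandbox.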
nono-py is a required dependency. ImportError means a broken install, not an unsupported platform. is_available() now only checks nono.is_supported().
sobregosodd approved these changes Apr 10, 2026
christophetd added a commit that referenced this pull request Apr 10, 2026
Summary
- Add a kernel-level sandbox (via nono-py) to the scan command as defense-in-depth against archive extraction vulnerabilities (path traversal, zip bombs) that led to CVE-2022-23530, CVE-2022-23531, CVE-2026-22870, CVE-2026-22871
- --sandbox is on by default; --no-sandbox to disable. Fail-safe: exits if nono can't set up the sandbox on the platform
- Archives are extracted in a sandboxed subprocess (python -m guarddog.sandbox) with network blocked and filesystem restricted
- Handle nested .gem archive extraction in the subprocess
- Scope: scan command only; verify deferred to follow-up
New files
- guarddog/sandbox.py -- is_available(), apply_sandbox(), extract_sandboxed(), and subprocess entry point
- tests/core/test_sandbox.py -- 9 unit tests
How sandboxing works
Demo 1: RCE in the unsandboxed code path
Simulates a vulnerability in the --no-sandbox code path:
Demo 2: RCE during source code analysis
Simulates a vulnerability in YARA analysis (e.g. a crafted file exploiting a parser bug):