Add kernel-level sandbox via nono-py for scan command #712

Merged
christophetd merged 17 commits into s.obregoso/v3 from christophe.tafanidereeper/nono-sandbox
Apr 10, 2026

Conversation

Contributor

@christophetd christophetd commented Apr 10, 2026

Summary

  • Adds kernel-level sandboxing (via nono-py) to the scan command as defense-in-depth against archive extraction vulnerabilities (path traversal, zip bombs) that led to CVE-2022-23530, CVE-2022-23531, CVE-2026-22870, CVE-2026-22871
  • --sandbox is on by default; pass --no-sandbox to disable it. Fail-safe: the command exits if nono cannot set up the sandbox on the platform
  • Local scans (directory or archive): full sandbox applied to the main process (network blocked, filesystem restricted to scanned path + temp dir) before extraction and analysis
  • Remote scans: three-phase approach:
    1. Download + metadata analysis run unsandboxed (need network for package registry and DNS/email checks)
    2. Archive extraction runs in a sandboxed subprocess (python -m guarddog.sandbox) with network blocked and filesystem restricted
    3. Source code analysis (YARA/Semgrep) runs in the main process after a sandbox is applied (network blocked, filesystem restricted to extracted files)
  • Handles nested .gem archive extraction in the subprocess
  • Scope: scan command only; verify deferred to follow-up
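
The ordering of the three remote-scan phases listed above can be sketched as follows. This is a minimal illustration rather than guarddog's actual code: `scan_remote_three_phase` and the four callables it receives are hypothetical stand-ins, injected so the sequence can be shown without depending on guarddog or nono-py.

```python
from typing import Callable


def scan_remote_three_phase(
    download_and_check_metadata: Callable[[], str],
    extract_in_sandboxed_subprocess: Callable[[str], str],
    apply_sandbox: Callable[[str], None],
    analyze_sources: Callable[[str], dict],
) -> dict:
    """Run the phases in the order the summary describes.

    All four callables are hypothetical placeholders, not guarddog APIs.
    """
    # Phase 1: needs network access, so it runs before any sandbox is applied.
    archive_path = download_and_check_metadata()

    # Phase 2: the untrusted archive is unpacked in a separate process
    # that is sandboxed on its own.
    extracted_dir = extract_in_sandboxed_subprocess(archive_path)

    # Phase 3: restrict the main process, then analyze the extracted files.
    # Assumption: the sandbox cannot be lifted once applied, so it goes last.
    apply_sandbox(extracted_dir)
    return analyze_sources(extracted_dir)
```

Passing the phases in as callables keeps the ordering explicit: anything that needs the network has to happen inside the first callable, because nothing after `apply_sandbox` can reach it.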

New files

  • guarddog/sandbox.py -- is_available(), apply_sandbox(), extract_sandboxed(), and subprocess entry point
  • tests/core/test_sandbox.py -- 9 unit tests

How sandboxing works

| Scan type | Extraction | Metadata analysis | Source code analysis | Network |
|---|---|---|---|---|
| Local dir | N/A | N/A | Main process (sandboxed) | Blocked |
| Local archive | Main process (sandboxed) | N/A | Main process (sandboxed) | Blocked |
| Remote package | Sandboxed subprocess | Unsandboxed | Main process (sandboxed) | Blocked after metadata |

Demo 1: RCE in the unsandboxed code path

Simulates a vulnerability in the --no-sandbox code path:

diff --git a/guarddog/cli.py b/guarddog/cli.py
index a28b5e5..cb9d5ef 100644
--- a/guarddog/cli.py
+++ b/guarddog/cli.py
@@ -254,6 +254,7 @@ def _scan(
                 )
             else:
                 result |= scanner.scan_remote(identifier, version, rule_param)
+                import subprocess; subprocess.run(["touch", "/tmp/pwned"])
 
     except Exception as e:
         log.error(f"Error occurred while scanning target {identifier}: '{e}'\n")

$ git apply demo1.diff
$ uv run guarddog npm scan requests && cat /tmp/pwned
cat: /tmp/pwned: No such file or directory   # --no-sandbox path not reached

$ uv run guarddog npm scan requests --no-sandbox && cat /tmp/pwned
# no error -- file was created

Demo 2: RCE during source code analysis

Simulates a vulnerability in YARA analysis (e.g. a crafted file exploiting a parser bug):

diff --git a/guarddog/analyzer/analyzer.py b/guarddog/analyzer/analyzer.py
index 78a36a6..289e5e3 100644
--- a/guarddog/analyzer/analyzer.py
+++ b/guarddog/analyzer/analyzer.py
@@ -376,6 +376,7 @@ class Analyzer:
             dict[str]: map from each IOC rule and their corresponding output
         """
         log.debug(f"Running yara rules against directory '{path}'")
+        import subprocess; subprocess.run(["touch", "/tmp/pwned-during-analysis"])
 
         all_rules = self.yara_ruleset
         if rules is not None:

$ git apply demo2.diff
$ uv run guarddog pypi scan requests
$ ls /tmp/pwned-during-analysis
ls: /tmp/pwned-during-analysis: No such file or directory   # blocked by sandbox

Defense-in-depth against archive extraction vulnerabilities (path
traversal, zip bombs) that led to CVE-2022-23530, CVE-2022-23531,
CVE-2026-22870, CVE-2026-22871.

- New guarddog/sandbox.py wrapping nono-py: filesystem restrictions
  and network blocking via CapabilitySet
- --sandbox (default) / --no-sandbox CLI flag on scan command
- Fail-safe: exits if sandbox cannot be set up on the platform
- Local scans: full sandbox (network blocked, filesystem restricted)
  before extraction and analysis
- Remote scans: two-phase approach. Download + metadata analysis run
  unsandboxed (need network), then sandbox is applied before source
  code analysis (YARA/Semgrep)
- New download_package() on PackageScanner for the two-phase flow
- Scope: scan command only; verify deferred to follow-up
@christophetd christophetd requested a review from a team as a code owner April 10, 2026 15:12

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 513309fa58

Comment thread guarddog/cli.py Outdated
- Extract archives in a sandboxed subprocess (python -m guarddog.sandbox)
  that applies nono before calling safe_extract. This closes the TOCTOU
  gap for remote scans: extraction is now fully sandboxed regardless of
  scan type.
- Simplify _scan_remote_sandboxed: monkey-patch scanner._extract_archive
  to route through the sandboxed subprocess, then delegate to scan_remote.
  No more duplicated merge/risk-formatting logic.
- Remove download_package() from PackageScanner (no longer needed).
- Move sandbox imports to top-level in cli.py.
- Add DEBUG logging of full sandbox capability set.
- Handle nested .gem extraction in subprocess.
- Allow parent directory when scan_path is a file (nono-py expects directories)
- Add [project] table to pyproject.toml so uv can install the package
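
The monkey-patching approach mentioned in the commit above (temporarily routing `scanner._extract_archive` through another function, then delegating to `scan_remote`) might look roughly like this. It is a sketch under assumptions: `patched_attribute` is a hypothetical helper, and the PR's real `_scan_remote_sandboxed` may be structured differently.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def patched_attribute(obj: Any, name: str, replacement: Callable) -> Iterator[None]:
    """Temporarily override obj.<name> on the instance, then restore it."""
    # Remember whether the instance itself (not its class) defined the name,
    # so the original lookup behavior can be restored exactly.
    had_instance_attr = name in vars(obj)
    previous = vars(obj).get(name)

    # The replacement is stored on the instance, so it is called as a plain
    # function and does not receive `self`.
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        if had_instance_attr:
            setattr(obj, name, previous)
        else:
            # Fall back to the class-level method again.
            delattr(obj, name)
```

Inside `with patched_attribute(scanner, "_extract_archive", sandboxed_extract):`, a call to `scanner.scan_remote(...)` would pick up the replacement wherever it calls `self._extract_archive(...)`, which avoids duplicating the rest of the scan logic. Here `sandboxed_extract` is a placeholder name for whatever function launches the subprocess.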
Fixes for issues with the sandboxed extraction subprocess:

- Writable paths must exist before allow_path (nono-py requirement)
- Remove dry-run QueryContext validation that gave false denials
- Set subprocess CWD to temp dir so tarsafe's os.getcwd() works
  under the sandbox (the inherited CWD was outside allowed paths)
- Don't pass archive as scan_path in subprocess since the temp dir
  already provides access
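
Two of the fixes above (creating writable paths before they are allowed, and starting the subprocess with its CWD inside the temp dir) can be illustrated with the standard library alone. This is a minimal sketch: `run_in_dir` is a hypothetical helper, and no sandbox is actually applied here.

```python
import os
import subprocess
import sys


def run_in_dir(code: str, work_dir: str) -> str:
    """Run a Python snippet in a child process whose CWD is work_dir."""
    # Create the directory first: the commit notes that writable paths
    # must exist before they can be allowed in the sandbox.
    os.makedirs(work_dir, exist_ok=True)

    completed = subprocess.run(
        [sys.executable, "-c", code],
        cwd=work_dir,  # the child starts inside the allowed directory
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()
```

With the CWD set this way, a call such as `os.getcwd()` inside the child resolves to a path the sandbox already permits, rather than to the parent's inherited working directory.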
Comment thread pyproject.toml Outdated
Previously only archive extraction ran in a sandbox for remote packages.
Now the main process also enters a sandbox (network blocked, filesystem
restricted) before running YARA/Semgrep analysis.
- Add guarddog package directory to sandbox read paths so YARA/Semgrep
  rule files are accessible when running from source (outside sys.prefix)
- Pass tmpdir as writable in remote sandboxed scans so temp dir cleanup
  and Semgrep temp writes work correctly
- Remove leftover test code (touch /tmp/pwned)
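
How the read and write path lists might be assembled is sketched below, combining the points above with the earlier note that a file `scan_path` requires allowing its parent directory. `package_dir`, `build_paths`, and the returned dictionary shape are hypothetical and do not mirror nono-py's API.

```python
import os
from types import ModuleType


def package_dir(module: ModuleType) -> str:
    """Directory containing a module's source file, with symlinks resolved."""
    return os.path.dirname(os.path.realpath(module.__file__))


def build_paths(scan_path: str, tmp_dir: str, package: ModuleType) -> dict:
    """Collect read-only and writable directories for a sandbox policy."""
    # If the target is a file (e.g. an archive), fall back to its parent,
    # since the sandbox is assumed to accept directories only.
    scan_dir = scan_path if os.path.isdir(scan_path) else os.path.dirname(scan_path)
    return {
        # The package directory is readable so bundled rule files can load
        # even when the package runs from a source checkout.
        "read": [os.path.realpath(scan_dir), package_dir(package)],
        # The temp dir is writable so cleanup and temp writes succeed.
        "write": [os.path.realpath(tmp_dir)],
    }
```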
Use mkdtemp + realpath instead of TemporaryDirectory context manager.
On macOS /var symlinks to /private/var, and nono doesn't resolve
symlinks, so cleanup via the symlink path was blocked by the sandbox.
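
The mkdtemp + realpath pattern described above could be written as follows. This is a sketch, with `resolved_temp_dir` as a hypothetical name; it only shows the path handling and applies no sandbox.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def resolved_temp_dir(prefix: str = "scan-") -> Iterator[str]:
    """Yield a temp dir whose path contains no symlinks; remove it on exit."""
    # realpath turns e.g. /var/... into /private/var/... on macOS, so the
    # same string is used for the sandbox allow-list and for cleanup.
    path = os.path.realpath(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        shutil.rmtree(path, ignore_errors=True)
```

Resolving the path once, immediately after creation, means every later use (allowing it in the sandbox, extracting into it, deleting it) refers to the same canonical location.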
Metadata rules (e.g. unclaimed_maintainer_email_domain) need network
access for DNS checks. Split the analyze() call so metadata runs in
phase 1 (unsandboxed) and source code analysis runs in phase 2
(sandboxed).
- Default (no flag): use sandbox if available, warn if not
- --sandbox: force sandbox, hard-fail if unavailable
- --no-sandbox: skip sandbox entirely
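
The three flag states above amount to a small decision function. The sketch below assumes the flag arrives as `None` (no flag), `True` (--sandbox) or `False` (--no-sandbox); `should_sandbox` and the message texts are hypothetical, not taken from guarddog.

```python
import logging
from typing import Optional

log = logging.getLogger(__name__)


def should_sandbox(flag: Optional[bool], available: bool) -> bool:
    """Decide whether to apply the sandbox for a scan.

    flag: None = no flag given, True = --sandbox, False = --no-sandbox.
    available: whether the platform supports the sandbox.
    """
    if flag is False:
        # --no-sandbox: skip entirely, regardless of platform support.
        return False
    if available:
        return True
    if flag is True:
        # --sandbox was forced but cannot be honored: hard-fail.
        raise SystemExit("sandbox requested but not supported on this platform")
    # Default: degrade gracefully, but tell the user.
    log.warning("sandbox not available on this platform; scanning without it")
    return False
```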
nono-py is a required dependency. ImportError means a broken install,
not an unsupported platform. is_available() now only checks
nono.is_supported().
Comment thread guarddog/cli.py
@christophetd christophetd merged commit acf7b4e into s.obregoso/v3 Apr 10, 2026
1 check passed
@christophetd christophetd deleted the christophe.tafanidereeper/nono-sandbox branch April 10, 2026 20:11
2 participants