
feat: layered prompt injection security scanning #239

Merged
ralphbean merged 8 commits into main from feat-security-scanning on Apr 20, 2026

@waynesun09
Contributor

Summary

  • Adds defense-in-depth security scanning pipeline with two execution paths for the fullsend CLI runner
  • Path A (GHA pre-step): fullsend scan input|output|context|url commands scan EVENT_PAYLOAD before sandbox creation, with optional LLM Guard ML detection
  • Path B (sandbox-internal): pre-agent context file scanning, runtime Claude Code hooks (Tirith, SSRF, secret redaction), post-agent output scanning
  • Extends harness YAML schema with SecurityConfig — secure by default (omitting the block enables all scanners with fail_mode: closed)
  • Embeds Python hook scripts via go:embed with JSONL audit logging and trace ID correlation

New packages/files

  • internal/security/ — Scanner interface, Pipeline, UnicodeNormalizer, ContextInjectionScanner, SSRFValidator, SecretRedactor, LLMGuardScanner, trace ID generation, hook embedding
  • internal/security/hooks/ — 3 Python hook scripts (ssrf_pretool, secret_redact_posttool, tirith_check)
  • internal/cli/scan.go — fullsend scan CLI subcommands
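
The Scanner interface and Pipeline listed above could look roughly like the following. This is a minimal sketch, not the PR's code: the `Finding` type, the `Scan` signature, the `Run` method, and the two toy scanners are assumptions made for illustration. Only the names `Scanner` and `Pipeline` and the closed/open fail modes come from the description.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Finding is a hypothetical record of a single scanner hit.
type Finding struct {
	Scanner string
	Detail  string
}

// Scanner is an assumed shape for the interface named in this PR.
type Scanner interface {
	Name() string
	Scan(text string) ([]Finding, error)
}

// Pipeline runs scanners in order. FailMode mirrors the harness
// fail_mode setting: anything other than "open" is treated as "closed".
type Pipeline struct {
	Scanners []Scanner
	FailMode string
}

// Run collects findings from every scanner and reports whether the
// input should be blocked. In closed mode, findings and scanner errors
// both block; in open mode they are only reported.
func (p Pipeline) Run(text string) ([]Finding, bool) {
	closed := p.FailMode != "open"
	var findings []Finding
	blocked := false
	for _, s := range p.Scanners {
		found, err := s.Scan(text)
		if err != nil {
			// A scanner failure is recorded as a finding of its own.
			found = []Finding{{Scanner: s.Name(), Detail: "scanner error: " + err.Error()}}
		}
		findings = append(findings, found...)
		if len(found) > 0 && closed {
			blocked = true
		}
	}
	return findings, blocked
}

// phraseScanner is a toy scanner that flags one lowercase phrase,
// matched case-insensitively.
type phraseScanner struct{ phrase string }

func (s phraseScanner) Name() string { return "phrase" }

func (s phraseScanner) Scan(text string) ([]Finding, error) {
	if strings.Contains(strings.ToLower(text), s.phrase) {
		return []Finding{{Scanner: s.Name(), Detail: "matched " + s.phrase}}, nil
	}
	return nil, nil
}

// brokenScanner always errors, to exercise the failure path.
type brokenScanner struct{}

func (brokenScanner) Name() string { return "broken" }

func (brokenScanner) Scan(string) ([]Finding, error) {
	return nil, errors.New("model unavailable")
}

func main() {
	p := Pipeline{Scanners: []Scanner{phraseScanner{"ignore previous instructions"}, brokenScanner{}}}
	findings, blocked := p.Run("Please IGNORE previous instructions.")
	fmt.Println(len(findings), blocked)
}
```

Treating a scanner error as a blocking finding in closed mode is one way to keep a degraded scanner from silently passing input through.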

Modified files

  • internal/harness/harness.go — SecurityConfig schema types, validation, helper methods
  • internal/cli/run.go — pre-agent scan (step 7d), post-agent output scan (step 8e), hook bootstrap, trace ID injection
  • internal/cli/root.go — wire scan command

Dependencies

Draft: Depends on PR #231 (sandbox runner) merging first. This branch is based on PR 231's commit 1383726.

Implements story 6 (issue #129).

Test plan

  • go test ./... — all 10 packages pass
  • go vet ./internal/... — clean
  • go build ./cmd/fullsend/ — binary builds
  • E2E test with known-bad payload → verify host scan blocks
  • E2E test with security enabled → verify hooks fire and findings logged
  • E2E test with fail_mode: open → verify warnings but no blocking

@waynesun09 waynesun09 force-pushed the feat-security-scanning branch 2 times, most recently from b1b3991 to 57591ad Compare April 16, 2026 19:26
@github-actions

github-actions Bot commented Apr 16, 2026

Site preview

Preview: https://39c2c0d4-site.fullsend-ai.workers.dev

Commit: 3e49da87615320dd5f6636c77c9fecc9c1b7ecfe

@waynesun09 waynesun09 marked this pull request as ready for review April 16, 2026 20:12
@waynesun09
Contributor Author

Heads up @maruiz93 — this PR adds a new base sandbox Containerfile at images/sandbox/Containerfile that extracts the common runtime layer (Claude Code, rsync, tirith) out of the experiment-specific image.

The experiment Containerfile at experiments/runner-hello-world/experiment/Containerfile now extends this base via ARG BASE_IMAGE, and run-experiment.sh builds the base first then the experiment image on top.

Key changes to the image setup:

  • Claude Code installed via the official curl -fsSL https://claude.ai/install.sh | bash (replaces npm approach, no Node.js dependency needed)
  • tirith v0.2.12 baked in with sha256 checksum verification for supply chain safety
  • TIRITH_REQUIRED=1 env var set by the harness so the hook fails closed if the binary is unexpectedly missing

This builds on the sandbox infrastructure from #231 — would appreciate your review on the image layering approach.

Add defense-in-depth security scanning pipeline with two execution paths:

Path A (GHA pre-step): fullsend scan input/output/context/url commands
scan EVENT_PAYLOAD before sandbox creation, with LLM Guard ML detection.

Path B (sandbox-internal): pre-agent context file scanning after repo
copy, runtime PreToolUse/PostToolUse hooks (Tirith, SSRF, secret
redaction), and post-agent output scanning with secret redaction.

Changes:
- Add internal/security package: Scanner interface, Pipeline,
  UnicodeNormalizer, ContextInjectionScanner, SSRFValidator,
  SecretRedactor, LLMGuardScanner, trace ID generation
- Add SecurityConfig to harness schema (secure by default, *bool for
  nil=enabled pattern, fail_mode closed/open, escalation config)
- Embed Python hook scripts via go:embed (ssrf_pretool,
  secret_redact_posttool, tirith_check) with JSONL audit logging
- Wire fullsend scan subcommands into CLI root
- Add pre-agent scan step (7d) and post-agent output scan (8e) to run.go
- Bootstrap security hooks and settings.json in sandbox
- Generate and propagate trace IDs for finding correlation
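
The "*bool for nil=enabled" pattern mentioned in the list above can be sketched as follows. The PR exports a helper named `harness.BoolDefault`; its signature is not shown in this conversation, so the two-argument form here is an assumption, as are the `SecurityConfig` field names and the two methods.

```go
package main

import "fmt"

// SecurityConfig is a reduced, hypothetical version of the harness block.
// A nil Enabled pointer means the key was omitted from the YAML.
type SecurityConfig struct {
	Enabled  *bool
	FailMode string
}

// BoolDefault returns def when v is nil, otherwise the pointed-to value.
// The signature is assumed; only the name appears in the PR.
func BoolDefault(v *bool, def bool) bool {
	if v == nil {
		return def
	}
	return *v
}

// IsEnabled treats an omitted setting as enabled (secure by default).
func (c SecurityConfig) IsEnabled() bool {
	return BoolDefault(c.Enabled, true)
}

// EffectiveFailMode treats an omitted fail mode as "closed".
func (c SecurityConfig) EffectiveFailMode() string {
	if c.FailMode == "" {
		return "closed"
	}
	return c.FailMode
}

func main() {
	var omitted SecurityConfig
	off := false
	disabled := SecurityConfig{Enabled: &off, FailMode: "open"}
	fmt.Println(omitted.IsEnabled(), omitted.EffectiveFailMode())
	fmt.Println(disabled.IsEnabled(), disabled.EffectiveFailMode())
}
```

A plain `bool` field cannot distinguish "omitted" from "set to false", which is why a pointer is needed for a default of true.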

Implements story 6 (issue #129).

Signed-off-by: Wayne Sun <gsun@redhat.com>

The test file contains a private key fixture used to verify the
SecretRedactor works correctly. This follows the same pattern as
internal/layers/secrets_test.go which is already excluded.

Signed-off-by: Wayne Sun <gsun@redhat.com>

Address review findings from security audit of the scanning pipeline:

Critical:
- LLM Guard fail-open chain converted to fail-closed (Python exception,
  Go JSON unmarshal error now block instead of passing)
- TraceID validated before shell interpolation in buildScanContextCommand

High:
- SSRF URL extraction uses regex instead of strings.Fields to catch URLs
  in markdown/JSON contexts
- Secret redactor uses ReplaceAll to catch duplicate occurrences
- GITHUB_OUTPUT uses multiline delimiter syntax to prevent injection
- All three Python hooks enforce 10 MB stdin size limit
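
To illustrate why the URL extraction moved from `strings.Fields` to a regex, here is a small sketch. The pattern and both function names are assumptions for illustration, not the ones used in ssrf.go.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlPattern is an illustrative pattern: it stops at whitespace and at
// the brackets and quotes that typically wrap URLs in markdown or JSON.
var urlPattern = regexp.MustCompile(`https?://[^\s<>"'()\[\]{}]+`)

// extractByFields is the naive approach: split on whitespace and keep
// tokens that start with a URL scheme.
func extractByFields(text string) []string {
	var urls []string
	for _, tok := range strings.Fields(text) {
		if strings.HasPrefix(tok, "http://") || strings.HasPrefix(tok, "https://") {
			urls = append(urls, tok)
		}
	}
	return urls
}

// extractByRegex finds URLs anywhere in the text, including inside
// markdown links and JSON string values.
func extractByRegex(text string) []string {
	return urlPattern.FindAllString(text, -1)
}

func main() {
	text := `see [docs](http://169.254.169.254/latest) and {"u":"https://example.com/a"}`
	fmt.Println(len(extractByFields(text)), extractByRegex(text))
}
```

In this input neither whitespace-delimited token starts with a scheme, so the field-based version finds nothing while the regex finds both URLs. A pattern this simple can still capture trailing punctuation such as a sentence-ending period.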

Medium:
- scanOutputFiles walks subdirectories recursively
- buildScanContextCommand generates -iname args from ScannableFiles map
- Findings file permissions tightened from 0644 to 0600

Low:
- TraceID regex tightened to strict UUID v4 format
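
A strict UUID v4 check of the kind described above might look like this. The regex, `validTraceID`, and `newTraceID` are illustrative assumptions; the PR's actual pattern and generator are not shown here.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"regexp"
)

// uuidV4 accepts only lowercase, hyphenated UUIDs whose version nibble
// is 4 and whose variant nibble is 8, 9, a, or b.
var uuidV4 = regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)

// validTraceID reports whether id is safe to treat as a trace ID.
// Anything containing shell metacharacters fails the match.
func validTraceID(id string) bool {
	return uuidV4.MatchString(id)
}

// newTraceID builds a random version 4 UUID from crypto/rand bytes.
func newTraceID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

func main() {
	id, err := newTraceID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id, validTraceID(id))
	fmt.Println(validTraceID(`x"; rm -rf / #`))
}
```

Validating against a closed character set before interpolating the value into a shell command removes the need to reason about quoting for that value.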

Infrastructure:
- Add images/sandbox/Containerfile as production base sandbox image with
  Claude Code (official installer), rsync, and tirith v0.2.12 (sha256
  verified)
- Experiment Containerfile now extends base via ARG BASE_IMAGE
- run-experiment.sh builds base image then experiment image
- tirith_check.py fails closed when TIRITH_REQUIRED=1 (set by harness
  when tirith is enabled)
- Export harness.BoolDefault for cross-package use

Signed-off-by: Wayne Sun <gsun@redhat.com>

Install Python, llm-guard[onnxruntime], and pre-download the
DeBERTa-v3 prompt injection model at image build time so scans
inside the sandbox have no cold-start latency.

Update comments in llmguard.go and harness.go to reflect that
LLM Guard now runs in both Path A (GHA pre-step) and Path B
(sandbox) when the base image is used.

Signed-off-by: Wayne Sun <gsun@redhat.com>

Pre-tool hooks (SSRF, tirith) now fail closed on parse errors instead
of silently allowing tool calls through. Tirith treats any non-zero
exit code as a block when output is unparseable.

Supply chain hardening:
- Pin llm-guard==0.3.14 in Containerfile
- Use crypto/rand for GitHub Output delimiter instead of predictable
  timestamp
- Reject newlines in writeGitHubOutput values

Secret redaction improvements:
- Reduce mask visibility from prefix[6]+suffix[4] to prefix[4] only
- Scan first 10MB of oversized input instead of skipping entirely
- Wrap redact_text in try/except with logging on failure

Other:
- Quote TIRITH_FAIL_ON value in shell command defensively
- Fix trace ID fallback to valid UUID v4 format
- Remove undocumented FULLSEND_SKIP_LLM_GUARD env var bypass

Signed-off-by: Wayne Sun <gsun@redhat.com>

Contributor

@ralphbean ralphbean left a comment

Security review — 3 high-severity findings related to fail-open behavior under degraded conditions.

Comment thread internal/security/ssrf.go
Comment thread internal/security/hooks/tirith_check.py
Comment thread internal/security/llmguard.go
@waynesun09
Contributor Author

Addressed all three findings from @ralphbean's security review in two commits:

Commit bc7790d — direct fixes for the 3 high-severity findings

  • ssrf.go: Default resolveDNS to true in Scan() and ValidateRedirectChain() to prevent DNS rebinding bypasses
  • ssrf_pretool.py: Add socket.getaddrinfo() DNS resolution check to the Python hook for parity with the Go scanner
  • tirith_check.py: Apply TIRITH_REQUIRED guard to TimeoutExpired and generic Exception branches (not just FileNotFoundError)
  • llmguard.go: Add Required field that fails closed when Python is unavailable in sandbox context
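
The Required behavior in the last item can be pictured with the sketch below: a missing interpreter is an error when the scanner is required, and a warning on stderr otherwise. The function `ensureInterpreter` and its messages are hypothetical; only the fail-closed versus fail-open split comes from the PR.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// ensureInterpreter reports whether the named binary is on PATH.
// When it is missing and required is true, it returns an error so the
// caller blocks; when required is false, it warns on stderr and lets
// the caller skip the scan.
func ensureInterpreter(name string, required bool) (bool, error) {
	if _, err := exec.LookPath(name); err != nil {
		if required {
			return false, fmt.Errorf("%s not found and scanner is required: failing closed", name)
		}
		fmt.Fprintf(os.Stderr, "warning: %s not found, skipping scan (fail-open)\n", name)
		return false, nil
	}
	return true, nil
}

func main() {
	const missing = "no-such-interpreter-for-demo"
	_, err := ensureInterpreter(missing, true)
	fmt.Println("required:", err != nil)
	_, err = ensureInterpreter(missing, false)
	fmt.Println("optional:", err != nil)
}
```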

Commit a96f208 — hardening from follow-up review (3 agents: Gemini, security-focused, code quality)

  • llmguard.go: NewLLMGuardScanner constructor now accepts required bool param (was dead code before — field was never set); LLM_GUARD_REQUIRED=1 env var propagated to inline Python so ImportError also fails closed; warning writes to stderr instead of stdout
  • scan.go: CLI scan url --resolve-dns defaults to true for consistency with Scan()/ValidateRedirectChain()
  • ssrf_pretool.py: 2s timeout on socket.getaddrinfo() to prevent hook blocking on slow/malicious DNS
  • tirith_check.py: Exception message sanitized to type(e).__name__ only (prevents leaking env details to agent); module docstring updated to reflect conditional fail-open/fail-closed behavior
  • scanner_test.go: 8 new tests covering DNS resolution paths (localhost, NXDOMAIN, Scan default, redirect chain) and LLMGuardScanner.Required (fail-closed, fail-open, constructor propagation)

TOCTOU DNS rebinding (filed separately)

The DNS rebinding TOCTOU race (the hook resolves DNS at check time, but Claude Code's tool performs its own lookup at request time) is filed as a separate issue. In the OpenShell sandbox, network policies block connections to private IPs at the supervisor level regardless of DNS, so the hook provides defense in depth and an audit trail, while OpenShell remains the authoritative enforcement layer. See the linked issue for details.

Address @ralphbean's security review (3 high-severity findings):

- ssrf.go: default resolveDNS to true in Scan() and ValidateRedirectChain()
  to prevent DNS rebinding bypasses (e.g. metadata.attacker.com → 169.254.169.254)
- ssrf_pretool.py: add socket.getaddrinfo() DNS resolution check to the
  Python hook for parity with the Go scanner
- tirith_check.py: apply TIRITH_REQUIRED guard to TimeoutExpired and generic
  Exception branches, not just FileNotFoundError
- llmguard.go: add Required field that fails closed when Python is unavailable
  (for sandbox context where missing Python indicates tampering); log warning
  when failing open in Path A

Signed-off-by: Wayne Sun <gsun@redhat.com>

…dening

- llmguard.go: add required param to NewLLMGuardScanner constructor so
  callers can enable fail-closed mode; set LLM_GUARD_REQUIRED=1 env var
  so the inline Python script also fails closed on ImportError; write
  warning to stderr instead of stdout
- scan.go: update NewLLMGuardScanner call with required=false for Path A;
  change scan url --resolve-dns default to true for consistency with
  Scan()/ValidateRedirectChain()
- ssrf_pretool.py: add 2s timeout on socket.getaddrinfo() to prevent
  hook blocking on slow/malicious DNS
- tirith_check.py: sanitize exception message to type name only; update
  module docstring to reflect conditional fail-open/fail-closed behavior
- scanner_test.go: add tests for DNS resolution (localhost, NXDOMAIN,
  Scan default, redirect chain), LLMGuardScanner.Required (fail-closed,
  fail-open, constructor propagation)

Signed-off-by: Wayne Sun <gsun@redhat.com>

…mage

Add gitleaks (v8.30.1), pre-commit (v4.5.1), and gitlint (v0.19.1)
to the base sandbox Containerfile so all agent images inherit them.
These are universal tools needed by post-scripts and agent workflows.

Add ENV block for OpenShell TLS proxy CA propagation as a workaround
for NVIDIA/OpenShell#886 — the sandbox proxy re-signs traffic with
its own CA but does not auto-configure git/curl/pip/node to trust it.
Sets GIT_SSL_CAINFO, SSL_CERT_FILE, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE,
and NODE_EXTRA_CA_CERTS to /etc/openshell-tls/ca-bundle.pem. Remove
after upstream fixes CA propagation.

Signed-off-by: Wayne Sun <gsun@redhat.com>

@ralphbean ralphbean added this pull request to the merge queue Apr 20, 2026
Merged via the queue into main with commit 452f1fa Apr 20, 2026
3 of 4 checks passed
@ralphbean ralphbean deleted the feat-security-scanning branch April 20, 2026 20:34
2 participants