feat: layered prompt injection security scanning#239
Conversation
b1b3991 to
57591ad
Compare
Site previewPreview: https://39c2c0d4-site.fullsend-ai.workers.dev Commit: |
|
Heads up @maruiz93 — this PR adds a new base sandbox Containerfile at The experiment Containerfile at Key changes to the image setup:
This builds on the sandbox infrastructure from #231 — would appreciate your review on the image layering approach. |
Add defense-in-depth security scanning pipeline with two execution paths: Path A (GHA pre-step): fullsend scan input/output/context/url commands scan EVENT_PAYLOAD before sandbox creation, with LLM Guard ML detection. Path B (sandbox-internal): pre-agent context file scanning after repo copy, runtime PreToolUse/PostToolUse hooks (Tirith, SSRF, secret redaction), and post-agent output scanning with secret redaction. Changes: - Add internal/security package: Scanner interface, Pipeline, UnicodeNormalizer, ContextInjectionScanner, SSRFValidator, SecretRedactor, LLMGuardScanner, trace ID generation - Add SecurityConfig to harness schema (secure by default, *bool for nil=enabled pattern, fail_mode closed/open, escalation config) - Embed Python hook scripts via go:embed (ssrf_pretool, secret_redact_posttool, tirith_check) with JSONL audit logging - Wire fullsend scan subcommands into CLI root - Add pre-agent scan step (7d) and post-agent output scan (8e) to run.go - Bootstrap security hooks and settings.json in sandbox - Generate and propagate trace IDs for finding correlation Implements story 6 (issue #129). Signed-off-by: Wayne Sun <gsun@redhat.com>
The test file contains a private key fixture used to verify the SecretRedactor works correctly. This follows the same pattern as internal/layers/secrets_test.go which is already excluded. Signed-off-by: Wayne Sun <gsun@redhat.com>
Address review findings from security audit of the scanning pipeline: Critical: - LLM Guard fail-open chain converted to fail-closed (Python exception, Go JSON unmarshal error now block instead of passing) - TraceID validated before shell interpolation in buildScanContextCommand High: - SSRF URL extraction uses regex instead of strings.Fields to catch URLs in markdown/JSON contexts - Secret redactor uses ReplaceAll to catch duplicate occurrences - GITHUB_OUTPUT uses multiline delimiter syntax to prevent injection - All three Python hooks enforce 10 MB stdin size limit Medium: - scanOutputFiles walks subdirectories recursively - buildScanContextCommand generates -iname args from ScannableFiles map - Findings file permissions tightened from 0644 to 0600 Low: - TraceID regex tightened to strict UUID v4 format Infrastructure: - Add images/sandbox/Containerfile as production base sandbox image with Claude Code (official installer), rsync, and tirith v0.2.12 (sha256 verified) - Experiment Containerfile now extends base via ARG BASE_IMAGE - run-experiment.sh builds base image then experiment image - tirith_check.py fails closed when TIRITH_REQUIRED=1 (set by harness when tirith is enabled) - Export harness.BoolDefault for cross-package use Signed-off-by: Wayne Sun <gsun@redhat.com>
Install Python, llm-guard[onnxruntime], and pre-download the DeBERTa-v3 prompt injection model at image build time so scans inside the sandbox have no cold-start latency. Update comments in llmguard.go and harness.go to reflect that LLM Guard now runs in both Path A (GHA pre-step) and Path B (sandbox) when the base image is used. Signed-off-by: Wayne Sun <gsun@redhat.com>
Pre-tool hooks (SSRF, tirith) now fail closed on parse errors instead of silently allowing tool calls through. Tirith treats any non-zero exit code as a block when output is unparseable. Supply chain hardening: - Pin llm-guard==0.3.14 in Containerfile - Use crypto/rand for GitHub Output delimiter instead of predictable timestamp - Reject newlines in writeGitHubOutput values Secret redaction improvements: - Reduce mask visibility from prefix[6]+suffix[4] to prefix[4] only - Scan first 10MB of oversized input instead of skipping entirely - Wrap redact_text in try/except with logging on failure Other: - Quote TIRITH_FAIL_ON value in shell command defensively - Fix trace ID fallback to valid UUID v4 format - Remove undocumented FULLSEND_SKIP_LLM_GUARD env var bypass Signed-off-by: Wayne Sun <gsun@redhat.com>
ab897a6 to
714dca7
Compare
ralphbean
left a comment
There was a problem hiding this comment.
Security review — 3 high-severity findings related to fail-open behavior under degraded conditions.
|
Addressed all three findings from @ralphbean's security review in two commits: Commit
|
Address @ralphbean's security review (3 high-severity findings): - ssrf.go: default resolveDNS to true in Scan() and ValidateRedirectChain() to prevent DNS rebinding bypasses (e.g. metadata.attacker.com → 169.254.169.254) - ssrf_pretool.py: add socket.getaddrinfo() DNS resolution check to the Python hook for parity with the Go scanner - tirith_check.py: apply TIRITH_REQUIRED guard to TimeoutExpired and generic Exception branches, not just FileNotFoundError - llmguard.go: add Required field that fails closed when Python is unavailable (for sandbox context where missing Python indicates tampering); log warning when failing open in Path A Signed-off-by: Wayne Sun <gsun@redhat.com>
…dening - llmguard.go: add required param to NewLLMGuardScanner constructor so callers can enable fail-closed mode; set LLM_GUARD_REQUIRED=1 env var so the inline Python script also fails closed on ImportError; write warning to stderr instead of stdout - scan.go: update NewLLMGuardScanner call with required=false for Path A; change scan url --resolve-dns default to true for consistency with Scan()/ValidateRedirectChain() - ssrf_pretool.py: add 2s timeout on socket.getaddrinfo() to prevent hook blocking on slow/malicious DNS - tirith_check.py: sanitize exception message to type name only; update module docstring to reflect conditional fail-open/fail-closed behavior - scanner_test.go: add tests for DNS resolution (localhost, NXDOMAIN, Scan default, redirect chain), LLMGuardScanner.Required (fail-closed, fail-open, constructor propagation) Signed-off-by: Wayne Sun <gsun@redhat.com>
…mage Add gitleaks (v8.30.1), pre-commit (v4.5.1), and gitlint (v0.19.1) to the base sandbox Containerfile so all agent images inherit them. These are universal tools needed by post-scripts and agent workflows. Add ENV block for OpenShell TLS proxy CA propagation as a workaround for NVIDIA/OpenShell#886 — the sandbox proxy re-signs traffic with its own CA but does not auto-configure git/curl/pip/node to trust it. Sets GIT_SSL_CAINFO, SSL_CERT_FILE, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, and NODE_EXTRA_CA_CERTS to /etc/openshell-tls/ca-bundle.pem. Remove after upstream fixes CA propagation. Signed-off-by: Wayne Sun <gsun@redhat.com>
a96f208 to
3e49da8
Compare
Summary
fullsend scan input|output|context|urlcommands scanEVENT_PAYLOADbefore sandbox creation, with optional LLM Guard ML detectionSecurityConfig— secure by default (omitting the block enables all scanners withfail_mode: closed)go:embedwith JSONL audit logging and trace ID correlationNew packages/files
internal/security/— Scanner interface, Pipeline, UnicodeNormalizer, ContextInjectionScanner, SSRFValidator, SecretRedactor, LLMGuardScanner, trace ID generation, hook embeddinginternal/security/hooks/— 3 Python hook scripts (ssrf_pretool, secret_redact_posttool, tirith_check)internal/cli/scan.go—fullsend scanCLI subcommandsModified files
internal/harness/harness.go— SecurityConfig schema types, validation, helper methodsinternal/cli/run.go— pre-agent scan (step 7d), post-agent output scan (step 8e), hook bootstrap, trace ID injectioninternal/cli/root.go— wire scan commandDependencies
Implements story 6 (issue #129).
Test plan
go test ./...— all 10 packages passgo vet ./internal/...— cleango build ./cmd/fullsend/— binary buildsfail_mode: open→ verify warnings but no blocking