feat: layered prompt injection security scanning by waynesun09 · Pull Request #239 · fullsend-ai/fullsend

waynesun09 · 2026-04-15T01:11:19Z

Summary

Adds a defense-in-depth security scanning pipeline with two execution paths for the fullsend CLI runner
Path A (GHA pre-step): fullsend scan input|output|context|url commands scan EVENT_PAYLOAD before sandbox creation, with optional LLM Guard ML detection
Path B (sandbox-internal): pre-agent context file scanning, runtime Claude Code hooks (Tirith, SSRF, secret redaction), post-agent output scanning
Extends harness YAML schema with SecurityConfig — secure by default (omitting the block enables all scanners with fail_mode: closed)
Embeds Python hook scripts via go:embed with JSONL audit logging and trace ID correlation
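To illustrate the "secure by default" behavior described above, here is a minimal sketch of the nil-means-enabled pattern. The struct layout, field names, and method names are assumptions for illustration only; the PR confirms just that SecurityConfig uses *bool fields, that a BoolDefault helper exists in the harness package, and that omitting the block enables scanning with fail_mode: closed.

```go
package main

import "fmt"

// SecurityConfig is a hypothetical, trimmed-down stand-in for the harness
// schema type. The real type in internal/harness may look different.
type SecurityConfig struct {
	Enabled  *bool  // nil means "not specified", which is treated as enabled
	FailMode string // "closed" or "open"; empty is treated as "closed"
}

// BoolDefault returns *p when it is set and def otherwise.
// The signature is an assumption; only the name appears in the PR.
func BoolDefault(p *bool, def bool) bool {
	if p == nil {
		return def
	}
	return *p
}

// IsEnabled reports whether scanning is on. A nil config (block omitted
// from the YAML) counts as enabled.
func (c *SecurityConfig) IsEnabled() bool {
	if c == nil {
		return true
	}
	return BoolDefault(c.Enabled, true)
}

// EffectiveFailMode returns "closed" unless the config explicitly says otherwise.
func (c *SecurityConfig) EffectiveFailMode() string {
	if c == nil || c.FailMode == "" {
		return "closed"
	}
	return c.FailMode
}

func main() {
	var omitted *SecurityConfig // security block missing from the harness YAML
	off := false
	disabled := &SecurityConfig{Enabled: &off, FailMode: "open"}

	fmt.Println(omitted.IsEnabled(), omitted.EffectiveFailMode())
	fmt.Println(disabled.IsEnabled(), disabled.EffectiveFailMode())
}
```

Using a pointer keeps "unset" distinguishable from an explicit false, so forgetting the block cannot silently turn scanning off.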

New packages/files

internal/security/ — Scanner interface, Pipeline, UnicodeNormalizer, ContextInjectionScanner, SSRFValidator, SecretRedactor, LLMGuardScanner, trace ID generation, hook embedding
internal/security/hooks/ — 3 Python hook scripts (ssrf_pretool, secret_redact_posttool, tirith_check)
internal/cli/scan.go — fullsend scan CLI subcommands
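For readers unfamiliar with the layout, the Scanner interface and Pipeline listed above could be sketched roughly as follows. Everything here (method names, the Finding type, the FailClosed flag, the two toy scanners) is a hypothetical illustration and not the code in internal/security. It assumes that scanner errors are recorded as findings, and that findings block only when the fail mode is closed, matching the fail_mode: open behavior in the test plan.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Finding is a hypothetical record of one scanner hit.
type Finding struct {
	Scanner string
	Detail  string
}

// Scanner is a hypothetical version of the interface named in the PR.
type Scanner interface {
	Name() string
	Scan(input string) ([]Finding, error)
}

// Pipeline runs every scanner in order and applies the fail mode at the end.
type Pipeline struct {
	Scanners   []Scanner
	FailClosed bool
}

// Run returns all findings plus a verdict. A scanner error is recorded as a
// finding so that, in closed mode, a broken scanner blocks instead of passing.
func (p Pipeline) Run(input string) ([]Finding, bool) {
	var findings []Finding
	for _, s := range p.Scanners {
		found, err := s.Scan(input)
		if err != nil {
			found = append(found, Finding{Scanner: s.Name(), Detail: "scanner error: " + err.Error()})
		}
		findings = append(findings, found...)
	}
	return findings, p.FailClosed && len(findings) > 0
}

// phraseScanner is a toy scanner that flags one fixed phrase.
type phraseScanner struct{ phrase string }

func (s phraseScanner) Name() string { return "phrase" }

func (s phraseScanner) Scan(input string) ([]Finding, error) {
	if strings.Contains(strings.ToLower(input), s.phrase) {
		return []Finding{{Scanner: s.Name(), Detail: "matched " + s.phrase}}, nil
	}
	return nil, nil
}

// brokenScanner always fails, standing in for an unavailable ML model.
type brokenScanner struct{}

func (brokenScanner) Name() string { return "broken" }

func (brokenScanner) Scan(string) ([]Finding, error) {
	return nil, errors.New("model unavailable")
}

func main() {
	scanners := []Scanner{phraseScanner{"ignore previous instructions"}, brokenScanner{}}
	for _, closed := range []bool{true, false} {
		findings, blocked := Pipeline{Scanners: scanners, FailClosed: closed}.Run("Please IGNORE previous instructions.")
		fmt.Printf("failClosed=%v findings=%d blocked=%v\n", closed, len(findings), blocked)
	}
}
```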

Modified files

internal/harness/harness.go — SecurityConfig schema types, validation, helper methods
internal/cli/run.go — pre-agent scan (step 7d), post-agent output scan (step 8e), hook bootstrap, trace ID injection
internal/cli/root.go — wire scan command

Dependencies

Draft: Depends on PR #231 (sandbox runner) merging first. This branch is based on PR 231's commit 1383726.

Implements story 6 (issue #129).

Test plan

go test ./... — all 10 packages pass
go vet ./internal/... — clean
go build ./cmd/fullsend/ — binary builds
E2E test with known-bad payload → verify host scan blocks
E2E test with security enabled → verify hooks fire and findings logged
E2E test with fail_mode: open → verify warnings but no blocking

github-actions · 2026-04-16T19:27:14Z

Site preview

Preview: https://39c2c0d4-site.fullsend-ai.workers.dev

Commit: 3e49da87615320dd5f6636c77c9fecc9c1b7ecfe

waynesun09 · 2026-04-16T20:12:14Z

Heads up @maruiz93 — this PR adds a new base sandbox Containerfile at images/sandbox/Containerfile that extracts the common runtime layer (Claude Code, rsync, tirith) out of the experiment-specific image.

The experiment Containerfile at experiments/runner-hello-world/experiment/Containerfile now extends this base via ARG BASE_IMAGE, and run-experiment.sh builds the base first then the experiment image on top.

Key changes to the image setup:

Claude Code installed via the official curl -fsSL https://claude.ai/install.sh | bash (replaces npm approach, no Node.js dependency needed)
tirith v0.2.12 baked in with sha256 checksum verification for supply chain safety
TIRITH_REQUIRED=1 env var set by the harness so the hook fails closed if the binary is unexpectedly missing

This builds on the sandbox infrastructure from #231 — would appreciate your review on the image layering approach.

Add defense-in-depth security scanning pipeline with two execution paths:

Path A (GHA pre-step): fullsend scan input/output/context/url commands scan EVENT_PAYLOAD before sandbox creation, with LLM Guard ML detection.

Path B (sandbox-internal): pre-agent context file scanning after repo copy, runtime PreToolUse/PostToolUse hooks (Tirith, SSRF, secret redaction), and post-agent output scanning with secret redaction.

Changes:
- Add internal/security package: Scanner interface, Pipeline, UnicodeNormalizer, ContextInjectionScanner, SSRFValidator, SecretRedactor, LLMGuardScanner, trace ID generation
- Add SecurityConfig to harness schema (secure by default, *bool for nil=enabled pattern, fail_mode closed/open, escalation config)
- Embed Python hook scripts via go:embed (ssrf_pretool, secret_redact_posttool, tirith_check) with JSONL audit logging
- Wire fullsend scan subcommands into CLI root
- Add pre-agent scan step (7d) and post-agent output scan (8e) to run.go
- Bootstrap security hooks and settings.json in sandbox
- Generate and propagate trace IDs for finding correlation

Implements story 6 (issue #129).

Signed-off-by: Wayne Sun <gsun@redhat.com>
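As an illustration of the trace ID handling mentioned in this commit, the sketch below generates a random UUID v4 and validates IDs against a strict pattern before they are used anywhere sensitive. The function names and the exact regular expression are assumptions; later commits in this PR say only that the trace ID regex was tightened to strict UUID v4 format and that the ID is validated before shell interpolation.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"regexp"
)

// uuidV4 matches canonical lowercase UUID v4 strings only. Anything else,
// including shell metacharacters or a trailing newline, is rejected.
var uuidV4 = regexp.MustCompile(`^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$`)

// NewTraceID returns a random UUID v4 string (hypothetical helper name).
func NewTraceID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// ValidTraceID reports whether id is safe to embed in a command line or log key.
func ValidTraceID(id string) bool {
	return uuidV4.MatchString(id)
}

func main() {
	id, err := NewTraceID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id, ValidTraceID(id))
	fmt.Println(ValidTraceID("x; rm -rf /"))
}
```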

The test file contains a private key fixture used to verify that the SecretRedactor works correctly. This follows the same pattern as internal/layers/secrets_test.go, which is already excluded.

Signed-off-by: Wayne Sun <gsun@redhat.com>

Address review findings from security audit of the scanning pipeline:

Critical:
- LLM Guard fail-open chain converted to fail-closed (Python exception, Go JSON unmarshal error now block instead of passing)
- TraceID validated before shell interpolation in buildScanContextCommand

High:
- SSRF URL extraction uses regex instead of strings.Fields to catch URLs in markdown/JSON contexts
- Secret redactor uses ReplaceAll to catch duplicate occurrences
- GITHUB_OUTPUT uses multiline delimiter syntax to prevent injection
- All three Python hooks enforce 10 MB stdin size limit

Medium:
- scanOutputFiles walks subdirectories recursively
- buildScanContextCommand generates -iname args from ScannableFiles map
- Findings file permissions tightened from 0644 to 0600

Low:
- TraceID regex tightened to strict UUID v4 format

Infrastructure:
- Add images/sandbox/Containerfile as production base sandbox image with Claude Code (official installer), rsync, and tirith v0.2.12 (sha256 verified)
- Experiment Containerfile now extends base via ARG BASE_IMAGE
- run-experiment.sh builds base image then experiment image
- tirith_check.py fails closed when TIRITH_REQUIRED=1 (set by harness when tirith is enabled)
- Export harness.BoolDefault for cross-package use

Signed-off-by: Wayne Sun <gsun@redhat.com>
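The SSRF extraction fix can be illustrated with a small comparison. Splitting on whitespace leaves a URL glued to surrounding markdown or JSON punctuation, while a regular expression pulls it out cleanly. The pattern and function name below are assumptions for illustration, not the expression used in ssrf.go.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlPattern is a deliberately simple, hypothetical pattern: an http(s)
// scheme followed by anything that is not whitespace, a quote, or a bracket.
var urlPattern = regexp.MustCompile(`https?://[^\s"'<>()\[\]{}]+`)

// ExtractURLs returns every URL-looking substring in text.
func ExtractURLs(text string) []string {
	return urlPattern.FindAllString(text, -1)
}

func main() {
	text := `see [docs](http://169.254.169.254/latest/meta-data/) and {"url":"https://example.com/a"}`

	// Whitespace splitting yields tokens such as "[docs](http://...)",
	// which a URL parser would not recognize as a URL.
	for _, tok := range strings.Fields(text) {
		fmt.Printf("field: %s\n", tok)
	}

	// The regex isolates both URLs despite the surrounding punctuation.
	for _, u := range ExtractURLs(text) {
		fmt.Printf("url:   %s\n", u)
	}
}
```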

Install Python, llm-guard[onnxruntime], and pre-download the DeBERTa-v3 prompt injection model at image build time so scans inside the sandbox have no cold-start latency. Update comments in llmguard.go and harness.go to reflect that LLM Guard now runs in both Path A (GHA pre-step) and Path B (sandbox) when the base image is used. Signed-off-by: Wayne Sun <gsun@redhat.com>

Pre-tool hooks (SSRF, tirith) now fail closed on parse errors instead of silently allowing tool calls through. Tirith treats any non-zero exit code as a block when output is unparseable.

Supply chain hardening:
- Pin llm-guard==0.3.14 in Containerfile
- Use crypto/rand for GitHub Output delimiter instead of predictable timestamp
- Reject newlines in writeGitHubOutput values

Secret redaction improvements:
- Reduce mask visibility from prefix[6]+suffix[4] to prefix[4] only
- Scan first 10MB of oversized input instead of skipping entirely
- Wrap redact_text in try/except with logging on failure

Other:
- Quote TIRITH_FAIL_ON value in shell command defensively
- Fix trace ID fallback to valid UUID v4 format
- Remove undocumented FULLSEND_SKIP_LLM_GUARD env var bypass

Signed-off-by: Wayne Sun <gsun@redhat.com>

ralphbean

Security review — 3 high-severity findings related to fail-open behavior under degraded conditions.

waynesun09 · 2026-04-17T18:13:05Z

Addressed all three findings from @ralphbean's security review in two commits:

Commit `bc7790d` — direct fixes for the 3 high-severity findings

ssrf.go: Default resolveDNS to true in Scan() and ValidateRedirectChain() to prevent DNS rebinding bypasses
ssrf_pretool.py: Add socket.getaddrinfo() DNS resolution check to the Python hook for parity with the Go scanner
tirith_check.py: Apply TIRITH_REQUIRED guard to TimeoutExpired and generic Exception branches (not just FileNotFoundError)
llmguard.go: Add Required field that fails closed when Python is unavailable in sandbox context

Commit `a96f208` — hardening from follow-up review (3 agents: Gemini, security-focused, code quality)

llmguard.go: NewLLMGuardScanner constructor now accepts required bool param (was dead code before — field was never set); LLM_GUARD_REQUIRED=1 env var propagated to inline Python so ImportError also fails closed; warning writes to stderr instead of stdout
scan.go: CLI scan url --resolve-dns defaults to true for consistency with Scan()/ValidateRedirectChain()
ssrf_pretool.py: 2s timeout on socket.getaddrinfo() to prevent hook blocking on slow/malicious DNS
tirith_check.py: Exception message sanitized to type(e).__name__ only (prevents leaking env details to agent); module docstring updated to reflect conditional fail-open/fail-closed behavior
scanner_test.go: 8 new tests covering DNS resolution paths (localhost, NXDOMAIN, Scan default, redirect chain) and LLMGuardScanner.Required (fail-closed, fail-open, constructor propagation)

TOCTOU DNS rebinding (filed separately)

The DNS rebinding TOCTOU race (hook resolves DNS at check time, but Claude Code's tool performs its own lookup at request time) is filed as a separate issue. In the OpenShell sandbox, network policies block connections to private IPs at the supervisor level regardless of DNS, so the hook provides defense-in-depth + audit trail while OpenShell is the authoritative enforcement. See the linked issue for details.

@ralphbean

Address @ralphbean's security review (3 high-severity findings):
- ssrf.go: default resolveDNS to true in Scan() and ValidateRedirectChain() to prevent DNS rebinding bypasses (e.g. metadata.attacker.com → 169.254.169.254)
- ssrf_pretool.py: add socket.getaddrinfo() DNS resolution check to the Python hook for parity with the Go scanner
- tirith_check.py: apply TIRITH_REQUIRED guard to TimeoutExpired and generic Exception branches, not just FileNotFoundError
- llmguard.go: add Required field that fails closed when Python is unavailable (for sandbox context where missing Python indicates tampering); log warning when failing open in Path A

Signed-off-by: Wayne Sun <gsun@redhat.com>

…dening

- llmguard.go: add required param to NewLLMGuardScanner constructor so callers can enable fail-closed mode; set LLM_GUARD_REQUIRED=1 env var so the inline Python script also fails closed on ImportError; write warning to stderr instead of stdout
- scan.go: update NewLLMGuardScanner call with required=false for Path A; change scan url --resolve-dns default to true for consistency with Scan()/ValidateRedirectChain()
- ssrf_pretool.py: add 2s timeout on socket.getaddrinfo() to prevent hook blocking on slow/malicious DNS
- tirith_check.py: sanitize exception message to type name only; update module docstring to reflect conditional fail-open/fail-closed behavior
- scanner_test.go: add tests for DNS resolution (localhost, NXDOMAIN, Scan default, redirect chain), LLMGuardScanner.Required (fail-closed, fail-open, constructor propagation)

Signed-off-by: Wayne Sun <gsun@redhat.com>

…mage

Add gitleaks (v8.30.1), pre-commit (v4.5.1), and gitlint (v0.19.1) to the base sandbox Containerfile so all agent images inherit them. These are universal tools needed by post-scripts and agent workflows.

Add ENV block for OpenShell TLS proxy CA propagation as a workaround for NVIDIA/OpenShell#886: the sandbox proxy re-signs traffic with its own CA but does not auto-configure git/curl/pip/node to trust it. Sets GIT_SSL_CAINFO, SSL_CERT_FILE, REQUESTS_CA_BUNDLE, CURL_CA_BUNDLE, and NODE_EXTRA_CA_CERTS to /etc/openshell-tls/ca-bundle.pem. Remove after upstream fixes CA propagation.

Signed-off-by: Wayne Sun <gsun@redhat.com>

waynesun09 force-pushed the feat-security-scanning branch 2 times, most recently from b1b3991 to 57591ad Compare April 16, 2026 19:26

github-actions Bot deployed to site-preview April 16, 2026 19:27 View deployment

github-actions Bot deployed to site-preview April 16, 2026 20:09 View deployment

waynesun09 marked this pull request as ready for review April 16, 2026 20:12

github-actions Bot deployed to site-preview April 16, 2026 20:17 View deployment

github-actions Bot deployed to site-preview April 16, 2026 20:26 View deployment

waynesun09 added 5 commits April 17, 2026 08:57

waynesun09 force-pushed the feat-security-scanning branch from ab897a6 to 714dca7 Compare April 17, 2026 12:58

github-actions Bot deployed to site-preview April 17, 2026 12:59 View deployment

ralphbean reviewed Apr 17, 2026


Comment thread internal/security/ssrf.go

Comment thread internal/security/hooks/tirith_check.py

Comment thread internal/security/llmguard.go

github-actions Bot deployed to site-preview April 17, 2026 18:12 View deployment

waynesun09 mentioned this pull request Apr 17, 2026

SSRF: TOCTOU DNS rebinding race between hook validation and tool execution #265

Open

waynesun09 added 3 commits April 17, 2026 19:10

waynesun09 force-pushed the feat-security-scanning branch from a96f208 to 3e49da8 Compare April 20, 2026 20:12

github-actions Bot deployed to site-preview April 20, 2026 20:12 View deployment

ralphbean approved these changes Apr 20, 2026


ralphbean added this pull request to the merge queue Apr 20, 2026

Merged via the queue into main with commit 452f1fa Apr 20, 2026
3 of 4 checks passed

ralphbean deleted the feat-security-scanning branch April 20, 2026 20:34

ralphbean merged 8 commits into main from feat-security-scanning
