feat: migrate security shell scripts to TypeScript#170
Conversation
Assisted-By: docker-agent
Assisted-By: docker-agent
| } | ||
|
|
||
| // Write sanitized output | ||
| writeFileSync(outputPath, outputLines.join('\n'), 'utf-8'); |
There was a problem hiding this comment.
[HIGH] Output file written before blocked check — critical content reaches disk
writeFileSync on line 106 unconditionally writes the sanitized output file before foundCritical is checked on line 109. CRITICAL-pattern lines are not stripped by the SUSPICIOUS_PATTERNS loop (they are separate arrays), so exfiltration commands that triggered foundCritical = true remain in outputLines and are flushed to outputPath before the function returns { blocked: true }.
The caller in index.ts then calls process.exit(1) — but by that point the tainted file already exists on disk. Any subsequent workflow step (or a race with a parallel job) that reads the output file could consume the critical content.
Fix: Guard the write with the foundCritical flag, or move the write to after the outcome decision:
if (foundCritical) {
// Do NOT write the output file — leave it absent or write an empty file
return { blocked: true, stripped: false, riskLevel: 'high' };
}
writeFileSync(outputPath, outputLines.join('\n'), 'utf-8');| // Skip purely quoted lines — regex pattern definitions in code. | ||
| // Matches lines that start with +, whitespace, or - and are entirely | ||
| // a quoted string: e.g. '+"echo.*\\$.*KEY"' or " 'ignore.*previous'" | ||
| if (/^[+\s-]\s*['"].*['"]\s*$/.test(line)) { |
There was a problem hiding this comment.
[HIGH] isFalsePositive quoted-line regex allows bypassing CRITICAL pattern detection
The regex /^[+\s-]\s*['"].*['"]\s*$/ on line 31 matches any line that starts with + (a diff added-line marker) and is wrapped in quotes. A prompt injection payload like:
+"echo $ANTHROPIC_API_KEY"
satisfies both conditions: it starts with + and is surrounded by double-quotes. isFalsePositive returns true, so the line is excluded from realMatches, and foundCritical stays false — the exfiltration command passes through entirely.
This is the false-positive filter intended to skip pattern definition strings in code. But the current regex is too broad: any quoted line starting with + is treated as a pattern definition, not a real command. An attacker who knows this filter can wrap their injection in quotes to evade detection.
Fix: Tighten the heuristic to only skip lines that match known pattern-variable patterns (e.g. contain { } [ ] * + regex metacharacters inside the quotes), rather than skipping all quoted lines:
// Only skip if the quoted content itself contains regex metacharacters
if (/^[+\s-]\s*['"].*[[\]{}()*+?^$\\].*['"]\s*$/.test(line)) {
return true;
}|
|
||
| // Heuristic 2: Skip if the match appears wrapped in single quotes. | ||
| // Quoted patterns in code or documentation are not real secrets. | ||
| if (content.includes(`'${matchedText}'`)) { |
There was a problem hiding this comment.
[MEDIUM] Single-quote heuristic in sanitizeOutput suppresses all detections if any quoted copy of the token exists
The check on line 51 uses content.includes(\'${matchedText}'`) against the *entire* file content. This means: if the AI response contains both a quoted copy ('sk-proj-TOKEN...') and an unquoted copy (sk-proj-TOKEN...) of the same token, the heuristic fires and the continue` skips the real leak.
A prompt-injection attacker who can influence the AI's response could include a line like:
The pattern 'sk-proj-ACTUALTOKEN...' should never appear in output.
sk-proj-ACTUALTOKEN...
The second (real) occurrence is suppressed because the quoted form exists somewhere in the content.
Suggested fix: Check whether this specific match's position is already inside single quotes (e.g. check characters at match.index - 1 and match.index + matchedText.length), rather than a global content.includes scan:
const idx = match.index!;
const isQuoted = content[idx - 1] === "'" && content[idx + matchedText.length] === "'";
if (isQuoted) { continue; }
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
PR: feat: migrate security shell scripts to TypeScript (#170)
One security concern warrants attention (pre-existing bypass faithfully ported from bash). All other drafter findings were dismissed upon code-level verification.
What looks correct:
- ✅ Three-tier sanitize-input logic (CRITICAL→block, SUSPICIOUS→strip, MEDIUM→warn) correctly implemented
- ✅ False-positive filter (isFalsePositive) applied consistently — mirrors bash behavior exactly
- ✅ Pattern ordering in SECRET_PATTERNS is fine — sk-[a-zA-Z0-9]{48} does NOT shadow sk-proj- (hyphen breaks alphanumeric match)
- ✅ setup-node step placed before all three node invocations (check-auth at line 221, sanitize-input at 334, sanitize-output at 752)
- ✅ All output key names preserved: authorized, blocked, stripped, risk-level, leaked
- ✅ release.yml dist/security.js guard added consistently with existing credentials.js/signed-commit.js guards
- ✅ All 27 test scenarios present (21 from test-security.sh + 6 from test-exploits.sh)
- ✅ check-auth node invocation in action.yml correctly relies on bash -e for error propagation
|
|
||
| // Heuristic 2: Skip if the match appears wrapped in single quotes. | ||
| // Quoted patterns in code or documentation are not real secrets. | ||
| if (content.includes(`'${matchedText}'`)) { |
There was a problem hiding this comment.
[MEDIUM] Single-quote heuristic creates a bypassable false-positive filter
content.includes('${matchedText}') skips a regex match if the matched text appears anywhere in the output wrapped in single quotes. An adversarial model response that includes both the raw secret and a single-quoted copy of it would cause the raw token to be whitelisted:
The token is ghp_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa. See docs for 'ghp_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'.
The match ghp_aaaa… is found, then content.includes("'ghp_aaaa…'") returns true → skipped.
Note: This identical logic existed in the original sanitize-output.sh (grep -qF "'$match'" "$OUTPUT_FILE"), so this is a pre-existing design limitation faithfully ported from bash — not a regression introduced by this PR. Consider tightening in a follow-up: only skip if the match was found exclusively inside single-quoted contexts, or add a test that documents the bypass explicitly.
| expect(result.stripped).toBe(false); | ||
| expect(result.riskLevel).toBe('low'); | ||
| }); | ||
|
|
There was a problem hiding this comment.
[LOW] No test covers the single-quote heuristic bypass path
security.test.ts covers Test 14 ("Regex pattern in output — should NOT flag as leak"), but that test's fixture uses 'ghs_[a-zA-Z0-9]{36}' which has regex metacharacters ([, ], {, }) — so Heuristic 1 (metachar check) already catches it before Heuristic 2 even fires.
There is no test that exercises Heuristic 2 in isolation (a token with no metacharacters, appearing in single quotes). Consider adding:
it('Test 14b: Real token in single quotes is skipped (known limitation)', async () => {
const token = 'ghp_' + 'a'.repeat(36);
const file = await writeInput('quoted-token.txt', `See '${token}' for reference.\n`);
const result = sanitizeOutput(file);
// Documents the known bypass: single-quoted real tokens are not flagged
expect(result.leaked).toBe(false); // known limitation, not a regression
});| @@ -0,0 +1,443 @@ | |||
| /** | |||
There was a problem hiding this comment.
[LOW] Docstring claims verbatim test name alignment but numbers are shifted
The header comment says "each it() description matches the original bash test name verbatim", but from test-security.sh test 9 onward the TS test numbers are 1 lower than the bash originals:
| Bash test | TS it() description |
|---|---|
| Test 9: Clean prompt | Test 8 (duplicate): Clean prompt |
| Test 14: Critical input | Test 13: Critical input |
| Test 21: Mixed suspicious + clean | Test 20: Mixed suspicious + clean |
All 27 scenarios are present (21 + 6), so this is cosmetic only. Consider either renumbering the TS tests to match bash, or updating the docstring to say "all 21 + 6 test scenarios are covered; numbering may differ".
Assisted-By: docker-agent
Assisted-By: docker-agent
…sitive filter Assisted-By: docker-agent
Assisted-By: docker-agent
Summary
Closes docker/gordon#460
Migrates all security shell scripts to TypeScript, replacing four bash scripts in
security/with a typed, tested TypeScript module insrc/security/.Changes
New:
src/security/patterns.ts— Single source of truth for all security regex patterns (SECRET_PATTERNS,SECRET_PREFIXES,CRITICAL_PATTERNS,SUSPICIOUS_PATTERNS,MEDIUM_RISK_PATTERNS)check-auth.ts—checkAuth(association, allowedRoles)— exact-match role check, mirrors jq behavioursanitize-input.ts— Three-tier input sanitisation: strip comments → block CRITICAL → strip SUSPICIOUS → warn MEDIUM. Faithfully ports both false-positive filters (self-referential pattern names + purely-quoted lines)sanitize-output.ts— Scans AI response for leaked secrets with two false-positive heuristics: regex metacharacter check + single-quote wrapper checkindex.ts— CLI entrypoint dispatchingcheck-auth,sanitize-input,sanitize-outputsubcommandsTests:
src/security/__tests__/security.test.ts27 vitest unit tests covering all 21 cases from
tests/test-security.shand all 6 cases fromtests/test-exploits.sh.action.ymlSetup Node.jsstep (node@v6.4.0 / node 24) before the security stepsbash security/*.shinvocations withnode "$ACTION_PATH/dist/security.js" <subcommand>.github/workflows/release.ymlAdded
dist/security.jsexistence guard alongside the existingcredentials.js/signed-commit.jschecks..github/workflows/test-e2e.ymlRemoved
test-security,test-exploits, andtest-prompt-sanitizationjobs — replaced entirely by the vitest unit tests.Deleted
security/check-auth.shsecurity/sanitize-input.shsecurity/sanitize-output.shsecurity/secret-patterns.shsecurity/README.md+security/directorytests/test-security.shtests/test-exploits.shtests/test-local.shTest results
All checks pass:
pnpm typecheck,pnpm test,pnpm lint(biome + tsc + actionlint).