fix(security): close 3 LLM prompt injection vectors in commit analyzer (GLOOK-6)#45
Conversation
Captures the design for hardening the commit analyzer against PI-05 (invisible Unicode in diffs), PI-06 (malicious git user.name), and PI-07 (poisoned analyzer-system.txt prompt). Four-layer defense: sanitize invisible chars from diff/message, sanitize authorName, wrap diff in <untrusted_data> tags, and add a CI grep that blocks invisible Unicode from landing in prompts/*.txt. Co-Authored-By: Claude <noreply@anthropic.com>
5 bite-sized tasks: sanitize.ts module + tests (TDD), analyzer.ts hardening + integration tests (TDD), system-prompt rule on both variants, CI grep for prompts/, local smoke. Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Claude <noreply@anthropic.com>
…K-6) PI-05/06 mitigation. Author name, commit message, and diff all flow through the sanitizers before reaching the LLM. The diff is also wrapped in <untrusted_data> tags so the system prompt rule (added in the next commit) can instruct the model to ignore directives within. Co-Authored-By: Claude <noreply@anthropic.com>
…s (GLOOK-6) System-level rule applied to both analyzer prompts (normal + ai-confirmed variant). Combined with the input sanitization in analyzer.ts, this is the third layer of defense against PI-05/06. Co-Authored-By: Claude <noreply@anthropic.com>
PI-07 mitigation. A perl one-liner scans every prompt template for the character ranges that carry hidden LLM instructions. If a future PR adds any such character to a prompt file, the build fails before merge. Co-Authored-By: Claude <noreply@anthropic.com>
msogin
left a comment
There was a problem hiding this comment.
Good security hardening overall — the four-layer defense (input sanitization + <untrusted_data> tagging + system prompt rule + CI gate) is the right approach and the implementation is clean. Two Important findings need to be addressed before merge; the rest are suggestions and a question.
Must fix before merge:
commit.repois unsanitized and sits outside the<untrusted_data>tags — it bypasses both defense layers (see inline)- The C0 control-char test has no C0 chars in its input — only exercises pass-through, not stripping (see inline)
Should fix:
3. CI grep is missing the C0 range that stripInvisible() covers (see inline)
4. Mid-file import in sanitize.test.ts (see inline)
5. commit.author (GitHub login) is unsanitized — low risk since GitHub constrains it to [a-zA-Z0-9-], but worth a comment explaining the reasoning
Question:
6. Should the plan doc be merged to main? (see inline)
| const safeAuthor = sanitizeAuthorName(commit.authorName); | ||
| const safeMessage = stripInvisible(commit.message); | ||
| const safeDiff = stripInvisible(commit.diff || '(no diff available)'); | ||
| return `Repository: ${commit.repo} |
There was a problem hiding this comment.
Must fix — commit.repo is unsanitized and outside <untrusted_data> tags.
All other attacker-influenced fields (authorName, message, diff) are sanitized before reaching the LLM. commit.repo is not — and it appears on the Repository: line above the <untrusted_data> block, so the system prompt rule doesn't cover it either.
While repo names come from the GitHub API (lower risk than a git config field), the inconsistency is a gap in the claimed coverage. Fix:
export function buildAnalyzerUserMessage(commit: CommitData): string {
const safeRepo = stripInvisible(commit.repo); // add this
const safeAuthor = sanitizeAuthorName(commit.authorName);
const safeMessage = stripInvisible(commit.message);
const safeDiff = stripInvisible(commit.diff || '(no diff available)');
return `Repository: ${safeRepo}
...| const safeMessage = stripInvisible(commit.message); | ||
| const safeDiff = stripInvisible(commit.diff || '(no diff available)'); | ||
| return `Repository: ${commit.repo} | ||
| Author: ${safeAuthor} (@${commit.author}) |
There was a problem hiding this comment.
commit.author (GitHub login) is also unsanitized. GitHub constrains logins to [a-zA-Z0-9-] so the practical risk is low, but a comment explaining that assumption would prevent a future reviewer from thinking it was accidentally omitted:
// commit.author is a GitHub login — constrained to [a-zA-Z0-9-] by GitHub, no sanitization needed
return `Repository: ${safeRepo}
Author: ${safeAuthor} (@${commit.author})| - run: npm ci | ||
| - name: Detect invisible Unicode in prompts | ||
| run: | | ||
| if perl -CSD -ne 'exit 1 if /[\x{E0020}-\x{E007F}\x{200B}-\x{200F}\x{202A}-\x{202E}\x{2060}\x{FEFF}]/' prompts/*.txt; then |
There was a problem hiding this comment.
Should fix — CI grep is missing the C0 control char range.
stripInvisible() removes C0 control chars (U+0001–U+001F, excluding \t \n \r) from runtime inputs, but this CI gate on prompts/*.txt doesn't scan for them. A prompt file containing a C0 control char would pass CI but get stripped at runtime — an inconsistency.
The C0 range was likely omitted to avoid false positives from \t and \n that legitimately appear in prompt files. If so, add a comment saying so. Otherwise, extend the pattern:
'exit 1 if /[\x{E0020}-\x{E007F}\x{200B}-\x{200F}\x{202A}-\x{202E}\x{2060}\x{FEFF}\x{0001}-\x{0008}\x{000B}\x{000C}\x{000E}-\x{001F}]/'
# \x{0009}(tab), \x{000A}(LF), \x{000D}(CR) excluded to avoid false positives in prompt files| }); | ||
|
|
||
| it('strips C0 control chars (U+0001 — U+001F) except \\t \\n \\r', () => { | ||
| expect(stripInvisible('abcd')).toBe('abcd'); |
There was a problem hiding this comment.
Must fix — this test doesn't actually test C0 stripping.
The test is labeled "strips C0 control chars (U+0001 — U+001F) except \t \n \r" but the input 'abcd' contains no C0 characters. It only verifies that printable chars pass through — which is already covered by the "passes through normal printable ASCII" test above.
Fix:
it('strips C0 control chars (U+0001 — U+001F) except \\t \\n \\r', () => {
expect(stripInvisible('a\x01b\x07c')).toBe('abc'); // strips SOH, BEL
expect(stripInvisible('a\tb\nc')).toBe('a\tb\nc'); // preserves tab, newline
});| }); | ||
| }); | ||
|
|
||
| import { buildAnalyzerUserMessage } from '@/lib/analyzer'; |
There was a problem hiding this comment.
Should fix — import should be at the top of the file.
This import { buildAnalyzerUserMessage } statement appears mid-file, after the sanitizeAuthorName describe block. ESLint's import/first rule flags imports that appear after non-import statements. Move it to the top alongside the existing sanitize import:
import { stripInvisible, sanitizeAuthorName } from '@/lib/sanitize';
import { buildAnalyzerUserMessage } from '@/lib/analyzer';| @@ -0,0 +1,447 @@ | |||
| # GLOOK-6 Analyzer Prompt Injection — Implementation Plan | |||
There was a problem hiding this comment.
Question — should this plan doc be merged to main?
A few concerns:
- The regex example in this plan source (around line 156) has embedded invisible Unicode characters in the pattern — the plan doc itself contains the attack vector the PR protects against. If the CI grep were to scan
docs/this file would fail it. - The doc contains absolute local machine paths (e.g.
/Users/msogin/Desktop/claudecode/glooker/...) that have no meaning outside the original dev machine. - The implementation checklist is fully completed — its value ended when the plan was executed.
The spec doc (docs/superpowers/specs/...) has clear ongoing reference value. The plan doc's purpose was to guide agentic execution, which is now done. Recommend either removing it from this PR or keeping it out of main.
Must-fix:
- 🟡 commit.repo was unsanitized AND sat outside the <untrusted_data>
tags — bypassed both defense layers. Now flows through stripInvisible()
in buildAnalyzerUserMessage. Lower practical risk (repo names come from
GitHub API, more constrained than git config), but the inconsistency
was a real gap in the claimed coverage.
- 🟡 C0 control-char test only asserted pass-through of printable text;
the input had literal control bytes that didn't survive the read tool's
rendering. Replaced with explicit \x01/\x07/\x1F escape sequences plus
a separate assertion that \t/\n/\r are preserved.
Should-fix:
- 🔵 CI grep extended to cover the C0 control-char range so it matches
the runtime stripInvisible() character set. \t/\n/\r intentionally
excluded (legitimately appear in prompt files); rationale documented
in the workflow comment.
- 🔵 buildAnalyzerUserMessage import moved to the top of the test file
alongside the existing sanitize import. ESLint import/first compliance.
- 🔵 Added a one-line comment in analyzer.ts explaining why commit.author
(GitHub login) is not sanitized: it's constrained to [a-zA-Z0-9-] by
GitHub.
Plan doc:
- 🟣 Reviewer flagged that the plan markdown contained literal invisible
Unicode characters (zero-width, bidi, C0 controls) in test fixture
examples — making the plan itself a future attack vector if CI ever
extends to scan docs/. Scrubbed all of them, replacing with explicit
\u{...} and \xXX escape sequences. The plan now satisfies the same
CI gate as prompts/.
Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Implements GLOOK-6: closes three LLM prompt-injection vectors in the commit analyzer.
Spec:
docs/superpowers/specs/2026-05-20-glook-6-analyzer-prompt-injection-design.mdPlan:
docs/superpowers/plans/2026-05-20-glook-6-analyzer-prompt-injection.mdVectors closed
git user.namecarrying instructions throughcommit.authorName.prompts/analyzer-system.txt(which would permanently poison the system prompt for every commit analyzed forever).The analyzer's response feeds
complexity/type/impact_summary/risk_level/maybe_aiintodeveloper_stats, which feeds the impact score — flipping a single field cascades into the entire metric chain. Defense matters.Four-layer defense
stripInvisible()removes invisible chars fromcommit.diffandcommit.messagebefore the LLM sees them.sanitizeAuthorName()restrictscommit.authorNameto printable ASCII, capped at 80 chars.<untrusted_data>tags wrap the diff in the user message. System prompts on both analyzer variants instruct the model to ignore any directives that appear inside those tags.test.ymlfails the build ifprompts/*.txtever contains invisible Unicode — blocks PI-07 at merge time.Smoke test against a synthetic payload
After
buildAnalyzerUserMessage:All invisible carriers stripped; diff wrapped in tags. The remaining ASCII fragment of the author payload is the visible-ASCII residual that the spec explicitly defers (mitigated only by the system prompt rule, which only applies inside the
<untrusted_data>tags).Files
src/lib/sanitize.ts—stripInvisible()+sanitizeAuthorName()pure helpers.src/lib/__tests__/unit/sanitize.test.ts— 13 unit tests + 5 integration tests againstbuildAnalyzerUserMessage.src/lib/analyzer.ts— extracted user-message construction into exported purebuildAnalyzerUserMessage(); applies sanitizers; wraps diff in<untrusted_data>tags.prompts/analyzer-system.txt+prompts/analyzer-system-ai-confirmed.txt— one rule instructing the model to ignore directives inside<untrusted_data>tags. Same text in both variants (the analyzer routes between them based oncommit.aiCoAuthored; hardening only one would leave the other exposed)..github/workflows/test.yml— new step betweennpm ciandnpm testthat fails on invisible Unicode inprompts/*.txt.<untrusted_data>wrapping.Tests
npm test→ 64 suites / 674 tests / 8 snapshots, all green.npx tsc --noEmit -p tsconfig.json→ clean.prompts/*.txt→ OK.Out of scope (documented in spec)
Test plan
🤖 Generated with Claude Code