WP-B: Security Hardening - Prompt Injection Defense & Audit Logging by Steffen025 · Pull Request #43 · Steffen025/pai-opencode

Steffen025 · 2026-03-06T18:58:35Z

Summary

Implements comprehensive security hardening for PAI-OpenCode v3.0 (WP-B).

Changes

New Files

.opencode/plugins/lib/injection-patterns.ts — 6 categories of prompt injection patterns (~30 total)
.opencode/plugins/lib/sanitizer.ts — Input normalization pipeline
docs/architecture/adr/ADR-011-security-hardening.md — Architectural decision record

Modified Files

.opencode/plugins/adapters/types.ts — Added 6 new DANGEROUS_PATTERNS
.opencode/plugins/handlers/security-validator.ts — Full refactor with audit logging

Security Improvements

Before WP-B	After WP-B
5 inline injection patterns	30 categorized patterns
No input sanitization	4-step normalization pipeline
Only checks args.content	Scans all 9 text fields
No audit trail	JSONL audit log
15 dangerous patterns	21 dangerous patterns

Testing

All files pass Biome checks
No console.log in new files
Type-safe exports for all new modules

Summary by CodeRabbit

New Features
- Erweiterte Erkennung moderner Angriffsvektoren (obfuskierte RCE, Befehls‑Substitution, Umgebungs‑Exfiltration, SSH/Schlüssel‑Lecks) mit kategorisierten Mustern.
- Umfangreiche Eingabesanitierung (Base64‑Decodierung, Unicode‑Normalisierung, Obfuskationsauflösung, HTML‑Strip) und mehrfeld‑Scanning aller relevanten Texteingaben.
- Redigierte Befehlsvorschauen und asynchrones Audit‑Logging (append‑only JSONL).
Bug Fixes / Verhalten
- Fehlertolerantes (fail‑open) Verhalten: Fehler protokolliert, Abläufe bleiben erlaubt.
Documentation
- Neue Architektur‑ADR und Sicherheits‑Hardening‑Plan ergänzt.

Plan documents: - Current state analysis (what security-validator.ts already covers) - 4 blocks: injection-patterns.ts, sanitizer.ts, types.ts extensions, ADR-011 - 6 injection categories: instruction override, role hijacking, system prompt extraction, safety bypass, context separators, MCP tool injection - Sanitizer pipeline: base64 decode, unicode normalize, spacing collapse, HTML strip - Security audit log: JSONL at MEMORY/STATE/security-audit.jsonl - New DANGEROUS_PATTERNS: base64+exec, env exfiltration, Python/Node RCE - File manifest with exact line count estimates - ISC preview criteria for verification

Implements comprehensive security improvements: - lib/injection-patterns.ts: 6 categories of injection patterns (~30 total) * instruction_override, role_hijacking, system_prompt_extraction * safety_bypass, context_separator, mcp_tool_injection - lib/sanitizer.ts: Input normalization pipeline * decodeBase64Payloads, normalizeUnicode, collapseObfuscatedSpacing * stripHtmlTags, sanitizeForSecurityCheck - adapters/types.ts: 6 new DANGEROUS_PATTERNS * base64+exec, env exfiltration, Python/Node RCE, SSH recon - handlers/security-validator.ts: Full refactor * Uses new libraries, multi-field scanning, security audit logging * Logs to MEMORY/STATE/security-audit.jsonl - ADR-011: Documents security hardening decisions Closes WP-B (PR #B)

coderabbitai · 2026-03-06T18:58:57Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Fügt ein modulares Security‑Hardening hinzu: neue Sanitizer‑Pipeline, katalogisierte Injection‑Pattern‑Bibliothek, erweitertes Multi‑Feld‑Scanning in der Security‑Validator‑Logik, asynchrones JSONL‑Audit‑Logging und Erweiterung von DANGEROUS_PATTERNS um moderne Regex‑Vektoren.

Changes

Cohort / File(s)	Summary
Injection Patterns Library `.opencode/plugins/lib/injection-patterns.ts`	Neues Modul mit kategorisierten, vorcompilierten RegExp‑Arrays (`instruction_override`, `role_hijacking`, `system_prompt_extraction`, `safety_bypass`, `context_separator`, `mcp_tool_injection`, `pii_credential_leak`), Export von `ALL_INJECTION_PATTERNS`, Typen (`InjectionCategory`, `InjectionMatch`) und `detectInjections()`‑Funktion; liefert eine Match‑Struktur mit Positionen und redigiertem Text.
Input Sanitization Pipeline `.opencode/plugins/lib/sanitizer.ts`	Neues Sanitizer‑Modul: `decodeBase64Payloads`, `normalizeUnicode`, `collapseObfuscatedSpacing`, `stripHtmlTags`, `sanitizeForSecurityCheck` und exportierte `INJECTION_SCAN_FIELDS`; orchestriert Vorverarbeitung vor Pattern‑Matching und entobfuskierende Transforms.
Security Validator Enhancement `.opencode/plugins/handlers/security-validator.ts`	Umfassende Überarbeitung: Multi‑Field‑Scanning über `INJECTION_SCAN_FIELDS`, Nutzung von `sanitizeForSecurityCheck` und `detectInjections`, asynchrones JSONL‑Audit‑Logging (`MEMORY/STATE/security-audit.jsonl`), `SecurityAuditEntry` Interface, `logSecurityEvent()` Helper; redigiert Secrets in Previews und implementiert Fail‑Open‑Verhalten mit Audit‑Einträgen bei allen Entscheidungspunkten.
Dangerous Patterns Extension `.opencode/plugins/adapters/types.ts`	Erweiterung von `DANGEROUS_PATTERNS` um mehrere neue Regex‑Muster für obfuskierte RCE (Base64, command substitution/pipe), env‑Exfiltration, Python/Node one‑liner RCE und SSH/Identity‑Exposure.
Docs: ADR & Epic `docs/architecture/adr/ADR-011-security-hardening.md`, `docs/architecture/adr/README.md`, `docs/epic/WP-B-SECURITY-HARDENING-PLAN.md`	Neue ADR und Epic‑Dokumentation beschreibt Architektur, Sanitization‑Pipeline, Pattern‑Kategorien, Audit‑Format (JSONL), Fail‑Open‑Rationale und Verifikationspläne; ADR‑Index ergänzt.
Runtime artifact (logging) `MEMORY/STATE/security-audit.jsonl` (usage)	Neuer Audit‑Log‑Pfad wird asynchron im JSONL‑Format angehängt; Logeinträge mit `SecurityAuditEntry`‑Feldsatz (timestamp, tool, action, reason, pattern?, category?, commandPreview?).

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Validator as SecurityValidator
    participant Sanitizer as Sanitizer
    participant Detector as InjectionDetector
    participant Logger as AuditLogger

    Client->>Validator: validateSecurity(args)
    Validator->>Validator: extractCommand(args)

    alt no command extracted
        Validator->>Logger: logSecurityEvent({action:"allowed", reason:"no_command"})
        Logger-->>Validator: ack
        Validator-->>Client: allow
    else command present
        Validator->>Sanitizer: sanitizeForSecurityCheck(field)
        Sanitizer->>Sanitizer: decodeBase64Payloads -> normalizeUnicode
        Sanitizer->>Sanitizer: collapseObfuscatedSpacing -> stripHtmlTags
        Sanitizer-->>Validator: sanitizedContent

        Validator->>Detector: detectInjections(sanitizedContent)
        Detector-->>Validator: [InjectionMatch...]

        alt injection detected
            Validator->>Logger: logSecurityEvent({action:"blocked", category, pattern, commandPreview})
            Logger-->>Validator: ack
            Validator-->>Client: block
        else requires confirmation
            Validator->>Logger: logSecurityEvent({action:"confirmed", pattern, commandPreview})
            Logger-->>Validator: ack
            Validator-->>Client: request_confirmation
        else all clear
            Validator->>Logger: logSecurityEvent({action:"allowed", reason:"all_checks_passed", commandPreview})
            Logger-->>Validator: ack
            Validator-->>Client: allow
        end
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇
Ich hopp' durchs Bytefeld, such' nach dunklem Licht,
entwirr' versteckte Zeilen, bring' Transparenz ans Licht.
Ich dekodier', ich glätt', ich schütz' vor falschem Tun,
schreib' leise in das Log — so kann man’s später ruh'n.
Hopp! Ein sichrer Sprung, die Warnung bleibt im Nu.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	Der PR-Titel bezieht sich auf die Hauptänderungen des Changeset: Security Hardening mit Prompt Injection Defense und Audit Logging, was genau den Inhalten der neuen Module, Pattern-Library und Logging-Funktionalität entspricht.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feature/wp-b-security-hardening

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.opencode/plugins/handlers/security-validator.ts (1)

177-192: ⚠️ Potential issue | 🔴 Critical

Injection-Checks werden für Inputs ohne extrahierbaren Command komplett übersprungen.

In Line 179 erfolgt allow vor checkAllFieldsForInjection(...). Damit greift der neue Multi-Field-Scan nur für Fälle mit extractCommand(...)-Treffer.

🛠️ Vorschlag (Reihenfolge korrigieren)

-		const command = extractCommand(input);
-
-		if (!command) {
-			fileLog(`No command extracted from input`, "warn");
-			// No command to validate - allow by default
-			await logSecurityEvent({
-				timestamp: new Date().toISOString(),
-				tool: input.tool,
-				action: "allowed",
-				reason: "No command extracted",
-			});
-			return {
-				action: "allow",
-				reason: "No command to validate",
-			};
-		}
+		const injectionResult = input.args
+			? checkAllFieldsForInjection(input.args)
+			: null;
+		if (injectionResult) {
+			// existing block-path for injection detection
+			// (moved here so it runs regardless of command extraction)
+		}
+
+		const command = extractCommand(input);
+		if (!command) {
+			fileLog(`No command extracted from input`, "warn");
+			void logSecurityEvent({
+				timestamp: new Date().toISOString(),
+				tool: input.tool,
+				action: "allowed",
+				reason: "No command extracted",
+			});
+			return {
+				action: "allow",
+				reason: "No command to validate",
+			};
+		}

Also applies to: 216-246

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/plugins/handlers/security-validator.ts around lines 177 - 192, The
early-return path that allows inputs when extractCommand(input) returns falsy
skips the injection scanner; move or add invocation of
checkAllFieldsForInjection(input, ...) before the allow return in the
extractCommand branch so inputs without an extracted command still undergo
injection checks, and if checkAllFieldsForInjection reports an injection, call
logSecurityEvent(...) and return { action: "deny", reason: "injection detected"
} instead of allowing; apply the same fix to the analogous block that currently
allows before scanning (the block around the other early-return mentioned) so
both code paths use checkAllFieldsForInjection and logSecurityEvent
consistently.

🧹 Nitpick comments (3)

.opencode/plugins/lib/sanitizer.ts (1)

80-82: Obfuskierte Wörter werden nur am Anfang des Strings erkannt.

Durch den ^-Anker in Line 80 wird z. B. please i g n o r e ... nicht normalisiert. Für Security-Checks sollte die Erkennung im gesamten String greifen.

♻️ Vorschlag

-	if (/^(\w\s){4,}/.test(content.trim())) {
+	if (/(?:\b\w\s){4,}\w\b/.test(content)) {
 		return content.replace(/(\w)\s(?=\w)/g, "$1");
 	}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/plugins/lib/sanitizer.ts around lines 80 - 82, The detection regex
in sanitizer.ts currently uses the start-anchor (^ in the condition
/^(\\w\\s){4,}/) so obfuscated words are only detected at the string start;
change the condition to test the entire string (e.g. remove the ^ so it becomes
/(\w\s){4,}/ and run it against the full content rather than content.trim()),
leaving the replacement (/(\w)\s(?=\w)/g in the return call) intact so
spaced-out words anywhere in the string get normalized.

docs/architecture/adr/ADR-011-security-hardening.md (1)

127-137: Für die Architektur fehlt ein Mermaid-Detaildiagramm im einklappbaren Abschnitt.

Die ADR ist inhaltlich gut, aber für die gewünschte Doku-Form wäre ein kurzes Flow-Diagramm (Sanitizer -> Detection -> Decision -> Audit) hilfreich.
📈 Beispielstruktur
<details>
<summary>Mermaid: Security-Flow (Detail)</summary>

```mermaid
flowchart LR
  A[Tool Input] --> B[Sanitizer Pipeline]
  B --> C[detectInjections + Command Checks]
  C --> D{Decision}
  D -->|block/confirm/allow| E[security-audit.jsonl]
```

As per coding guidelines: `docs/**`: Mermaid diagrams in collapsible sections for detail.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/adr/ADR-011-security-hardening.md` around lines 127 - 137,
Füge in der ADR-011-security-hardening.md unter der Implementation- oder Files
Modified-Sektion einen einklappbaren Details-Block mit einem Mermaid-Flowchart
namens "Mermaid: Security-Flow (Detail)" ein; der Block soll einen flowchart LR
zeigen: Tool Input -> Sanitizer Pipeline -> detectInjections + Command Checks ->
Entscheidung (block/confirm/allow) -> security-audit.jsonl, und muss als
```mermaid fenced code block innerhalb eines <details><summary>Mermaid:
Security-Flow (Detail)</summary> ... </details> eingefügt werden, damit die
Dokumentationsrichtlinie docs/** erfüllt wird.

docs/epic/WP-B-SECURITY-HARDENING-PLAN.md (1)

55-73: ASCII-Übersicht ist gut — ergänzend fehlt Mermaid-Detailansicht.

Für den Security-Flow wäre zusätzlich ein einklappbares Mermaid-Diagramm sinnvoll, damit Architekturpfade schneller erfassbar sind.

As per coding guidelines: docs/**: ASCII diagrams for quick overview and Mermaid diagrams in collapsible sections for detail.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/epic/WP-B-SECURITY-HARDENING-PLAN.md` around lines 55 - 73, Add a
collapsible Mermaid diagram immediately after the "Scope: 3 Themenblöcke" ASCII
overview that visualizes the Security-Flow (showing interactions between
security-validator.ts, lib/injection-patterns.ts, lib/sanitizer.ts,
adapters/types.ts/DANGEROUS_PATTERNS, MCP Tool-Description checks, and
ADR-011-security-hardening.md); wrap the Mermaid block in a collapsible section
per docs/** guidelines, include clear directional arrows and node labels for
normalization → pattern-matching → validation → audit logging, and keep the
ASCII overview intact above the collapsible Mermaid detail.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/plugins/handlers/security-validator.ts:
- Around line 182-187: The call to logSecurityEvent is being awaited on the hot
path (e.g., await logSecurityEvent(...)), causing blocking I/O; change to
fire-and-forget by removing await and invoking logSecurityEvent(...) without
blocking the decision path, and ensure you attach a .catch(...) handler to the
returned promise to log failures (so no unhandled rejections). Update every
occurrence of await logSecurityEvent(...) in this file (and other handlers) to
this non-blocking pattern so the security decision does not wait for filesystem
I/O.
- Around line 204-207: The audit entry currently stores raw command text in the
commandPreview field (constructed from command.slice(0, 100)), which may persist
secrets; update the code that builds the object (the block referencing
dangerousMatch and commandPreview) to first redact sensitive values from command
(e.g., via an existing redactSecrets/redactCommand utility or by adding a small
sanitizer that masks API keys/tokens/credentials with regexes) and then set
commandPreview to the redacted result (e.g., redactedCommand.slice(0,100));
ensure the sanitizer is applied wherever commandPreview is set and add/adjust
tests to confirm secrets are masked in security-audit.jsonl.

In @.opencode/plugins/lib/injection-patterns.ts:
- Line 25: Das aktuelle Regex
/you\s+are\s+now\s+(a\s+|an\s+)?(?!Claude|an\s+AI)/i ist zu generisch und
matched z.B. "you are now ready"; ändere die Pattern-Logik in
injection-patterns.ts so dass ein echtes Rollen-Token verlangt wird (z.B. ein
explizites Role-Label oder ein in Anführungszeichen/Kapitalisiert angegebenes
Rollennamen), und ersetze das bestehende Pattern durch eines, das nach "you are
now (a|an)?", gefolgt von einem klar abgegrenzten Rollen-Token wie "role:" oder
einem in quotes/CapitalizedWord stehenden Rollennamen sucht; aktualisiere die
Einträge, die den alten Regex verwenden, auf das neue strengere Pattern
(verwende das vorhandene Literal
/you\s+are\s+now\s+(a\s+|an\s+)?(?!Claude|an\s+AI)/i als Referenz).

In @.opencode/plugins/lib/sanitizer.ts:
- Around line 53-63: The homoglyph mapping in the lookalikes object uses a Latin
"h" where a Cyrillic "х" is intended, so update the lookalikes record (variable
name: lookalikes in sanitizer.ts) to replace the Latin small letter h with the
Cyrillic small letter kha (U+0445) so the Cyrillic→Latin normalization works
correctly; verify other entries use correct Unicode characters and run tests to
ensure the "х"→"x" mapping is applied during sanitization in the
sanitize/normalize function that consumes lookalikes.
- Around line 18-21: The unicode-lookalike mapping in the sanitizer contains an
incorrect key `h: "x"`; locate the mapping object in
.opencode/plugins/lib/sanitizer.ts (the object used for Cyrillic→ASCII
replacements) and replace the Latin letter key `h` with the Cyrillic small
letter kh `х` (U+0445) so the entry becomes the Cyrillic-to-ASCII mapping (х ->
"x"); alternatively remove the entry if unintended—update the mapping where `h:
"x"` currently appears.

In `@docs/architecture/adr/ADR-011-security-hardening.md`:
- Around line 120-122: Update the "Regex DoS" bullet under
ADR-011-security-hardening to accurately describe the actual strategy: note that
while bounded quantifiers like `{0,50}` are preferred, several implemented
pattern sets still contain `.*` sequences as a deliberate trade-off; explicitly
list or reference the affected pattern sets, state the conditions under which
`.*` is used, and add the mitigations applied for those cases (e.g., input
length limits, anchored patterns, complexity review, and
regex-timeout/sanitization measures) so the documentation matches the real
implementation.
- Around line 1-7: This ADR (ADR-011: Security Hardening — Prompt Injection
Defense & Audit Logging) is missing the required YAML frontmatter for
Obsidian/Dataview; add a top YAML block delimited by '---' containing keys like
id: ADR-011, title, status, date (ISO), depends_on (or depends), type: ADR, and
tags/topics (e.g., security, architecture) so Dataview/Obsidian workflows can
index it; ensure the frontmatter appears at the very top of
ADR-011-security-hardening.md and that keys match existing ADRs' conventions
(use the same key names as other ADR files).

In `@docs/epic/WP-B-SECURITY-HARDENING-PLAN.md`:
- Around line 6-7: Aktualisiere alle Platzhalter-PR-Referenzen im Dokument:
ersetze jede Instanz von "target-pr: PR `#B`" und "Estimierter PR: PR `#B`" durch
die tatsächliche Referenz "PR `#43`" und sorge dafür, dass die Einträge konsistent
sind (prüfe besonders die Stellen, die im Kommentar genannt wurden sowie die
weiteren Vorkommen im Dokument). Verwende die bestehenden Schlüsselwörter
"target-pr" und "Estimierter PR" beim Austausch, passe ggf. das
Datums-/Metadatenformat an das vorhandene Schema an und führe einen schnellen
grep/Find-Check durch, um keine weiteren verwaisten Platzhalter übrig zu lassen.

---

Outside diff comments:
In @.opencode/plugins/handlers/security-validator.ts:
- Around line 177-192: The early-return path that allows inputs when
extractCommand(input) returns falsy skips the injection scanner; move or add
invocation of checkAllFieldsForInjection(input, ...) before the allow return in
the extractCommand branch so inputs without an extracted command still undergo
injection checks, and if checkAllFieldsForInjection reports an injection, call
logSecurityEvent(...) and return { action: "deny", reason: "injection detected"
} instead of allowing; apply the same fix to the analogous block that currently
allows before scanning (the block around the other early-return mentioned) so
both code paths use checkAllFieldsForInjection and logSecurityEvent
consistently.

---

Nitpick comments:
In @.opencode/plugins/lib/sanitizer.ts:
- Around line 80-82: The detection regex in sanitizer.ts currently uses the
start-anchor (^ in the condition /^(\\w\\s){4,}/) so obfuscated words are only
detected at the string start; change the condition to test the entire string
(e.g. remove the ^ so it becomes /(\w\s){4,}/ and run it against the full
content rather than content.trim()), leaving the replacement (/(\w)\s(?=\w)/g in
the return call) intact so spaced-out words anywhere in the string get
normalized.

In `@docs/architecture/adr/ADR-011-security-hardening.md`:
- Around line 127-137: Füge in der ADR-011-security-hardening.md unter der
Implementation- oder Files Modified-Sektion einen einklappbaren Details-Block
mit einem Mermaid-Flowchart namens "Mermaid: Security-Flow (Detail)" ein; der
Block soll einen flowchart LR zeigen: Tool Input -> Sanitizer Pipeline ->
detectInjections + Command Checks -> Entscheidung (block/confirm/allow) ->
security-audit.jsonl, und muss als ```mermaid fenced code block innerhalb eines
<details><summary>Mermaid: Security-Flow (Detail)</summary> ... </details>
eingefügt werden, damit die Dokumentationsrichtlinie docs/** erfüllt wird.

In `@docs/epic/WP-B-SECURITY-HARDENING-PLAN.md`:
- Around line 55-73: Add a collapsible Mermaid diagram immediately after the
"Scope: 3 Themenblöcke" ASCII overview that visualizes the Security-Flow
(showing interactions between security-validator.ts, lib/injection-patterns.ts,
lib/sanitizer.ts, adapters/types.ts/DANGEROUS_PATTERNS, MCP Tool-Description
checks, and ADR-011-security-hardening.md); wrap the Mermaid block in a
collapsible section per docs/** guidelines, include clear directional arrows and
node labels for normalization → pattern-matching → validation → audit logging,
and keep the ASCII overview intact above the collapsible Mermaid detail.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bfa91f7e-9ed2-4acb-966e-8f6255344a3a

📥 Commits

Reviewing files that changed from the base of the PR and between 5758038 and 3ad2e50.

📒 Files selected for processing (7)

.opencode/plugins/adapters/types.ts
.opencode/plugins/handlers/security-validator.ts
.opencode/plugins/lib/injection-patterns.ts
.opencode/plugins/lib/sanitizer.ts
docs/architecture/adr/ADR-011-security-hardening.md
docs/architecture/adr/README.md
docs/epic/WP-B-SECURITY-HARDENING-PLAN.md

.opencode/plugins/handlers/security-validator.ts

.opencode/plugins/lib/injection-patterns.ts

.opencode/plugins/lib/sanitizer.ts

coderabbitai · 2026-03-06T19:06:45Z

docs/architecture/adr/ADR-011-security-hardening.md

+# ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging
+
+**Status:** ✅ Implemented (WP-B)  
+**Date:** 2026-03-06  
+**Depends on:** ADR-006 (Security Validation Preservation)  
+
+---


⚠️ Potential issue | 🟡 Minor

Wichtiger ADR-Eintrag ohne YAML-Frontmatter.

Für diese ADR als zentrale Architekturentscheidung fehlt das Frontmatter für Obsidian/Dataview-Workflows.

📝 Vorschlag

+--- +title: ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging +status: accepted +date: 2026-03-06 +depends_on: ADR-006 +category: security +--- + # ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging

As per coding guidelines: `**/*.md`: Frontmatter for important documents.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging

**Status:** ✅ Implemented (WP-B)

**Date:** 2026-03-06

**Depends on:** ADR-006 (Security Validation Preservation)

---

---

title: ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging

status: accepted

date: 2026-03-06

depends_on: ADR-006

category: security

---

# ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging

**Status:** ✅ Implemented (WP-B)

**Date:** 2026-03-06

**Depends on:** ADR-006 (Security Validation Preservation)

---

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/architecture/adr/ADR-011-security-hardening.md` around lines 1 - 7, This ADR (ADR-011: Security Hardening — Prompt Injection Defense & Audit Logging) is missing the required YAML frontmatter for Obsidian/Dataview; add a top YAML block delimited by '---' containing keys like id: ADR-011, title, status, date (ISO), depends_on (or depends), type: ADR, and tags/topics (e.g., security, architecture) so Dataview/Obsidian workflows can index it; ensure the frontmatter appears at the very top of ADR-011-security-hardening.md and that keys match existing ADRs' conventions (use the same key names as other ADR files).

coderabbitai · 2026-03-06T19:06:46Z

docs/architecture/adr/ADR-011-security-hardening.md

+- **Regex DoS:** Complex patterns on long input could be slow
+  - Mitigation: Patterns use bounded quantifiers (`{0,50}` not `*`)
+- **False positives:** Aggressive patterns might block legitimate content


⚠️ Potential issue | 🟡 Minor

Risikomitigation beschreibt die Regex-Strategie aktuell ungenau.

Line 121 sagt, es würden bounded Quantifier statt * genutzt; in den implementierten Pattern-Sets kommen aber mehrfach .*-Sequenzen vor. Das sollte als Trade-off präzise dokumentiert werden.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/architecture/adr/ADR-011-security-hardening.md` around lines 120 - 122, Update the "Regex DoS" bullet under ADR-011-security-hardening to accurately describe the actual strategy: note that while bounded quantifiers like `{0,50}` are preferred, several implemented pattern sets still contain `.*` sequences as a deliberate trade-off; explicitly list or reference the affected pattern sets, state the conditions under which `.*` is used, and add the mitigations applied for those cases (e.g., input length limits, anchored patterns, complexity review, and regex-timeout/sanitization measures) so the documentation matches the real implementation.

coderabbitai · 2026-03-06T19:06:46Z

docs/epic/WP-B-SECURITY-HARDENING-PLAN.md

+target-pr: PR #B
+date: 2026-03-06


⚠️ Potential issue | 🟡 Minor

PR-Referenzen sind noch Platzhalter und inzwischen inkonsistent.

target-pr: PR #B und `Estimierter PR: PR `#B sollten auf die reale PR-Referenz aktualisiert werden (hier: PR #43), damit Plan- und Audit-Doku nachvollziehbar bleiben.

🧾 Vorschlag

-target-pr: PR `#B` +target-pr: PR `#43` @@ -*Estimierter PR: PR `#B`* +*PR: PR `#43`*

Also applies to: 435-437

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@docs/epic/WP-B-SECURITY-HARDENING-PLAN.md` around lines 6 - 7, Aktualisiere alle Platzhalter-PR-Referenzen im Dokument: ersetze jede Instanz von "target-pr: PR `#B`" und "Estimierter PR: PR `#B`" durch die tatsächliche Referenz "PR `#43`" und sorge dafür, dass die Einträge konsistent sind (prüfe besonders die Stellen, die im Kommentar genannt wurden sowie die weiteren Vorkommen im Dokument). Verwende die bestehenden Schlüsselwörter "target-pr" und "Estimierter PR" beim Austausch, passe ggf. das Datums-/Metadatenformat an das vorhandene Schema an und führe einen schnellen grep/Find-Check durch, um keine weiteren verwaisten Platzhalter übrig zu lassen.

7 patterns for API keys and private keys: Anthropic, OpenAI, GitHub PAT, AWS Access Key, PEM Private Key, Groq, HuggingFace Pattern sources: jcfischer/pai-content-filter (MIT) + own additions

coderabbitai

Actionable comments posted: 3

♻️ Duplicate comments (1)

.opencode/plugins/lib/injection-patterns.ts (1)
25-27: ⚠️ Potential issue | 🟠 Major

Rollen-Hijacking-Regexe blockieren weiter harmlose Prompts.

Line 25 und Line 27 erzwingen nach you are now bzw. pretend to be kein echtes Rollen-Token. Dadurch matchen auch Sätze wie you are now ready oder pretend to be careful, was legitime Inhalte unnötig als Injection markiert.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/plugins/lib/injection-patterns.ts around lines 25 - 27, Die beiden
Regex-Muster /you\s+are\s+now\s+(a\s+|an\s+)?(?!Claude|an\s+AI)/i und
/pretend\s+(you\s+are|to\s+be)\s+(a\s+|an\s+)?(?!(helpful|assistant))/i sind zu
permissiv und matchen harmlose Phrasen; enger die Negativ-Lookahead-Logik, z.B.
erfordern, dass nach „you are now“ bzw. „pretend to be“ ein echtes Rollen-Token
kommt (z.B. durch Lookahead auf ein Substantiv/Musterauswahl oder
Großbuchstaben/Minimallänge), oder ersetzen die current negative list by a
positive whitelist of role nouns (assistant|hacker|developer|doctor|Claude|...);
update die beiden Literale so, dass sie nur dann matchen, wenn unmittelbar ein
Rollenwort (nicht ein kurzes Adjektiv wie "ready" oder "careful") folgt.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/plugins/lib/injection-patterns.ts:
- Around line 109-112: The InjectionMatch interface currently stores full
secrets in matchedText (e.g., API keys) which leaks PII; change matchedText to
hold a redacted preview or metadata only (for example: masked value showing only
last 4 chars, length, and match position) and update all places that construct
InjectionMatch (references: InjectionMatch and matchedText and the match
creation sites around lines 134-136) to populate the new redacted field instead
of the raw match; ensure existing consumers/loggers use the redacted
preview/metadata for audit logging and never serialize the full secret.
- Line 70: Die Regex in the injection patterns uses .{0,50} which doesn't match
newlines, so a single line break between "this tool" and "send|exfiltrate|leak"
bypasses detection; update the pattern (the literal regex
/when\s+(you\s+)?(use|call|invoke)\s+this\s+tool.{0,50}(send|exfiltrate|leak)/i)
to allow multiline matching by replacing .{0,50} with a construct that matches
newlines as well (e.g., [\s\S]{0,50} or enable DOTALL with (?s)), keeping the
rest of the pattern and flags intact so the rule still limits the max span
between phrases.
- Line 84: The PEM private-key regex
'/-----BEGIN\s+(?:RSA\s+)?PRIVATE\s+KEY-----/' in
plugins/lib/injection-patterns.ts misses other key types (e.g. EC, OPENSSH);
replace the '(?:RSA\s+)?' fragment with a more general token like
'(?:[A-Z0-9]+\s+)?' (or '\w+' if preferred) and add the case-insensitive flag so
the pattern becomes e.g. '/-----BEGIN\s+(?:[A-Z0-9]+\s+)?PRIVATE\s+KEY-----/i'
to reliably match "BEGIN EC PRIVATE KEY", "BEGIN OPENSSH PRIVATE KEY", etc.

---

Duplicate comments:
In @.opencode/plugins/lib/injection-patterns.ts:
- Around line 25-27: Die beiden Regex-Muster
/you\s+are\s+now\s+(a\s+|an\s+)?(?!Claude|an\s+AI)/i und
/pretend\s+(you\s+are|to\s+be)\s+(a\s+|an\s+)?(?!(helpful|assistant))/i sind zu
permissiv und matchen harmlose Phrasen; enger die Negativ-Lookahead-Logik, z.B.
erfordern, dass nach „you are now“ bzw. „pretend to be“ ein echtes Rollen-Token
kommt (z.B. durch Lookahead auf ein Substantiv/Musterauswahl oder
Großbuchstaben/Minimallänge), oder ersetzen die current negative list by a
positive whitelist of role nouns (assistant|hacker|developer|doctor|Claude|...);
update die beiden Literale so, dass sie nur dann matchen, wenn unmittelbar ein
Rollenwort (nicht ein kurzes Adjektiv wie "ready" oder "careful") folgt.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fb4a230f-97db-4b7f-9177-153426eaf897

📥 Commits

Reviewing files that changed from the base of the PR and between 3ad2e50 and 85d4e9b.

📒 Files selected for processing (1)

.opencode/plugins/lib/injection-patterns.ts

.opencode/plugins/lib/injection-patterns.ts

Critical fixes: - security-validator.ts: Fire-and-forget audit logging (non-blocking) - security-validator.ts: Secrets redaction for commandPreview (API keys, PEM keys) - security-validator.ts: Early injection check (even when no command extracted) - injection-patterns.ts: Strengthen ROLE_HIJACKING patterns (require role tokens) - injection-patterns.ts: MCP pattern multiline-capable ([\s\S] instead of .) - injection-patterns.ts: PEM regex extended for EC, OPENSSH, etc. - sanitizer.ts: Fix Cyrillic kha (U+0445) instead of Latin h All Biome checks passing.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

.opencode/plugins/handlers/security-validator.ts (1)

232-245: ⚠️ Potential issue | 🟠 Major

Die Fallback-Preview umgeht die Secret-Redaction.

Im Injection-Pfad ohne extrahierten command wird commandPreview aus args[field] unverändert in security-audit.jsonl geschrieben. Wenn genau dieses Feld ein Token, eine Mailadresse oder andere PII enthält, persistiert der Audit-Log wieder Rohdaten — die Redaction greift hier nur im command-Zweig. Führt den Fallback durch dieselbe Redaction oder lasst die Preview in diesem Pfad ganz weg. As per coding guidelines, ADR-011 requires a 100-char commandPreview with no PII.

🔧 Mögliche Korrektur

 		if (injectionResult) {
 			const firstMatch = injectionResult.matches[0];
+			const previewSource = command
+				? command
+				: `${injectionResult.field}:${String(input.args?.[injectionResult.field] ?? "")}`;
 			fileLog(
 				`BLOCKED: Prompt injection detected in field '${injectionResult.field}'`,
 				"error",
 			);
@@
 			logSecurityEvent({
 				timestamp: new Date().toISOString(),
 				tool: input.tool,
 				action: "blocked",
 				reason: `Prompt injection in ${injectionResult.field}`,
 				category: firstMatch.category,
 				pattern: firstMatch.pattern.toString(),
-				commandPreview: command
-					? redactSecrets(command).slice(0, 100)
-					: `${injectionResult.field}:${input.args?.[injectionResult.field]}`.slice(
-							0,
-							100,
-						),
+				commandPreview: redactSecrets(previewSource).slice(0, 100),
 			});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/plugins/handlers/security-validator.ts around lines 232 - 245, Der
Fallback-Pfad für commandPreview schreibt unveränderte args[field] ins Audit-Log
und umgeht damit redactSecrets; fix: when building commandPreview in the
logSecurityEvent call, pass the fallback string through redactSecrets before
slicing (i.e. compute fallback =
`${injectionResult.field}:${input.args?.[injectionResult.field]}` then
commandPreview = redactSecrets(fallback).slice(0,100)), or alternatively omit
the preview when command is missing; update the logic around logSecurityEvent /
commandPreview so redactSecrets is always applied and slice(0,100) runs after
redaction to satisfy ADR-011 and avoid persisting PII.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/plugins/lib/sanitizer.ts:
- Around line 40-68: normalizeUnicode currently leaves combining marks after
NFKD, letting characters like "o + U+0301" remain non-ASCII; update
normalizeUnicode to remove Unicode combining marks immediately after calling
content.normalize("NFKD") (e.g. strip characters in the Unicode Mark category or
the U+0300–U+036F range) before constructing/applying the lookalikes mapping and
nonAsciiRegex mapping so composed diacritics cannot bypass ASCII-only checks in
the subsequent replacement logic.
- Around line 78-83: Die Funktion collapseObfuscatedSpacing verwendet einen
Anchored-Guard (^(\w\s){4,}) und eine globale Ersetzung, die Wortgrenzen
zerstört; ändere die Logik so dass du statt eines Präfix-Guards global nach
ganzen Sequenzen von Einzelbuchstaben (mehrere aufeinanderfolgende "letter space
letter" Token, mindestens z.B. 4 Paare) suchst und nur die Leerzeichen innerhalb
dieser gefundenen Sequenzen entfernst; konkret: in collapseObfuscatedSpacing
zuerst alle Matches für solche letter-by-letter-Sequenzen (gebunden an
Wortgrenzen) finden und für jedes Match nur die internen Leerzeichen
zusammenziehen, statt die ursprünglichen Regexes ^(\w\s){4,} und (\w)\s(?=\w)
direkt auf den gesamten Content anzuwenden.

---

Duplicate comments:
In @.opencode/plugins/handlers/security-validator.ts:
- Around line 232-245: Der Fallback-Pfad für commandPreview schreibt
unveränderte args[field] ins Audit-Log und umgeht damit redactSecrets; fix: when
building commandPreview in the logSecurityEvent call, pass the fallback string
through redactSecrets before slicing (i.e. compute fallback =
`${injectionResult.field}:${input.args?.[injectionResult.field]}` then
commandPreview = redactSecrets(fallback).slice(0,100)), or alternatively omit
the preview when command is missing; update the logic around logSecurityEvent /
commandPreview so redactSecrets is always applied and slice(0,100) runs after
redaction to satisfy ADR-011 and avoid persisting PII.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f100ff67-dc87-4946-be28-72a4459b2a8a

📥 Commits

Reviewing files that changed from the base of the PR and between 85d4e9b and 3c89bae.

📒 Files selected for processing (3)

.opencode/plugins/handlers/security-validator.ts
.opencode/plugins/lib/injection-patterns.ts
.opencode/plugins/lib/sanitizer.ts

🚧 Files skipped from review as they are similar to previous changes (1)

.opencode/plugins/lib/injection-patterns.ts

coderabbitai · 2026-03-06T22:05:47Z

.opencode/plugins/lib/sanitizer.ts

+export function normalizeUnicode(content: string): string {
+	// Use NFKD normalization to decompose characters
+	const normalized = content.normalize("NFKD");
+
+	// Build regex from char codes to avoid control character lint warning
+	// Match any character outside ASCII range (0-127)
+	const nonAsciiRegex = new RegExp(
+		`[^${String.fromCharCode(0)}-${String.fromCharCode(127)}]`,
+		"g",
+	);
+
+	return normalized.replace(nonAsciiRegex, (char) => {
+		// Map common Cyrillic/Greek lookalikes to ASCII
+		const lookalikes: Record<string, string> = {
+			а: "a", // Cyrillic а (U+0430)
+			е: "e", // Cyrillic е (U+0435)
+			і: "i", // Cyrillic і (U+0456)
+			о: "o", // Cyrillic о (U+043E)
+			р: "p", // Cyrillic р (U+0440)
+			с: "c", // Cyrillic с (U+0441)
+			ѕ: "s", // Cyrillic ѕ (U+0455)
+			у: "y", // Cyrillic у (U+0443)
+			х: "x", // Cyrillic х (U+0445) - kha, NOT Latin h
+			А: "A", // Cyrillic А (U+0410)
+			В: "B", // Cyrillic В (U+0412)
+			Е: "E", // Cyrillic Е (U+0415)
+		};
+		return lookalikes[char] ?? char;
+	});


⚠️ Potential issue | 🟠 Major

normalizeUnicode lässt Combining Marks stehen und verfehlt damit einfache Unicode-Bypässe.

normalize("NFKD") zerlegt z. B. ó zu o + U+0301, aber der Callback gibt unbekannte Non-ASCII-Zeichen unverändert zurück. Dadurch bleibt ignóre previous instructions nach der Pipeline weiter nicht-ASCII und ASCII-basierte Injection-Regexe sehen den Trigger nicht. Entfernt Combining Marks direkt nach NFKD, bevor die Lookalike-Mappings laufen.

🔧 Mögliche Korrektur

export function normalizeUnicode(content: string): string { // Use NFKD normalization to decompose characters - const normalized = content.normalize("NFKD"); + const normalized = content.normalize("NFKD").replace(/\p{M}+/gu, "");

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

export function normalizeUnicode(content: string): string {

// Use NFKD normalization to decompose characters

const normalized = content.normalize("NFKD");

// Build regex from char codes to avoid control character lint warning

// Match any character outside ASCII range (0-127)

const nonAsciiRegex = new RegExp(

`[^${String.fromCharCode(0)}-${String.fromCharCode(127)}]`,

"g",

);

return normalized.replace(nonAsciiRegex, (char) => {

// Map common Cyrillic/Greek lookalikes to ASCII

const lookalikes: Record<string, string> = {

а: "a", // Cyrillic а (U+0430)

е: "e", // Cyrillic е (U+0435)

і: "i", // Cyrillic і (U+0456)

о: "o", // Cyrillic о (U+043E)

р: "p", // Cyrillic р (U+0440)

с: "c", // Cyrillic с (U+0441)

ѕ: "s", // Cyrillic ѕ (U+0455)

у: "y", // Cyrillic у (U+0443)

х: "x", // Cyrillic х (U+0445) - kha, NOT Latin h

А: "A", // Cyrillic А (U+0410)

В: "B", // Cyrillic В (U+0412)

Е: "E", // Cyrillic Е (U+0415)

};

return lookalikes[char] ?? char;

});

export function normalizeUnicode(content: string): string {

// Use NFKD normalization to decompose characters

const normalized = content.normalize("NFKD").replace(/\p{M}+/gu, "");

// Build regex from char codes to avoid control character lint warning

// Match any character outside ASCII range (0-127)

const nonAsciiRegex = new RegExp(

`[^${String.fromCharCode(0)}-${String.fromCharCode(127)}]`,

"g",

);

return normalized.replace(nonAsciiRegex, (char) => {

// Map common Cyrillic/Greek lookalikes to ASCII

const lookalikes: Record<string, string> = {

а: "a", // Cyrillic а (U+0430)

е: "e", // Cyrillic е (U+0435)

і: "i", // Cyrillic і (U+0456)

о: "o", // Cyrillic о (U+043E)

р: "p", // Cyrillic р (U+0440)

с: "c", // Cyrillic с (U+0441)

ѕ: "s", // Cyrillic ѕ (U+0455)

у: "y", // Cyrillic у (U+0443)

х: "x", // Cyrillic х (U+0445) - kha, NOT Latin h

А: "A", // Cyrillic А (U+0410)

В: "B", // Cyrillic В (U+0412)

Е: "E", // Cyrillic Е (U+0415)

};

return lookalikes[char] ?? char;

});

}

🧰 Tools

🪛 ast-grep (0.41.0)

[warning] 45-48: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(
[^${String.fromCharCode(0)}-${String.fromCharCode(127)}],
"g",
)
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.opencode/plugins/lib/sanitizer.ts around lines 40 - 68, normalizeUnicode currently leaves combining marks after NFKD, letting characters like "o + U+0301" remain non-ASCII; update normalizeUnicode to remove Unicode combining marks immediately after calling content.normalize("NFKD") (e.g. strip characters in the Unicode Mark category or the U+0300–U+036F range) before constructing/applying the lookalikes mapping and nonAsciiRegex mapping so composed diacritics cannot bypass ASCII-only checks in the subsequent replacement logic.

coderabbitai · 2026-03-06T22:05:47Z

.opencode/plugins/lib/sanitizer.ts

+export function collapseObfuscatedSpacing(content: string): string {
+	// Detect letter-space-letter pattern (obfuscated words)
+	if (/^(\w\s){4,}/.test(content.trim())) {
+		return content.replace(/(\w)\s(?=\w)/g, "$1");
+	}
+	return content;


⚠️ Potential issue | 🟠 Major

collapseObfuscatedSpacing erkennt nur Präfixe und kann Wortgrenzen zerstören.

Der Guard ^(\w\s){4,} greift nur, wenn die Obfuskation direkt am Anfang steht. Wenn er greift, entfernt (\w)\s(?=\w) außerdem jede einzelne Leerstelle zwischen Wortzeichen, sodass i g n o r e all previous zu ignoreallprevious wird. Matcht besser echte Letter-by-letter-Sequenzen global und kollabiert nur innerhalb dieser Sequenzen.

🔧 Mögliche Korrektur

export function collapseObfuscatedSpacing(content: string): string { - // Detect letter-space-letter pattern (obfuscated words) - if (/^(\w\s){4,}/.test(content.trim())) { - return content.replace(/(\w)\s(?=\w)/g, "$1"); - } - return content; + return content.replace( + /(^|[^\w])((?:\w\s){4,}\w)(?=$|[^\w])/g, + (_, prefix, obfuscated) => prefix + obfuscated.replace(/\s+/g, ""), + ); }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.opencode/plugins/lib/sanitizer.ts around lines 78 - 83, Die Funktion collapseObfuscatedSpacing verwendet einen Anchored-Guard (^(\w\s){4,}) und eine globale Ersetzung, die Wortgrenzen zerstört; ändere die Logik so dass du statt eines Präfix-Guards global nach ganzen Sequenzen von Einzelbuchstaben (mehrere aufeinanderfolgende "letter space letter" Token, mindestens z.B. 4 Paare) suchst und nur die Leerzeichen innerhalb dieser gefundenen Sequenzen entfernst; konkret: in collapseObfuscatedSpacing zuerst alle Matches für solche letter-by-letter-Sequenzen (gebunden an Wortgrenzen) finden und für jedes Match nur die internen Leerzeichen zusammenziehen, statt die ursprünglichen Regexes ^(\w\s){4,} und (\w)\s(?=\w) direkt auf den gesamten Content anzuwenden.

- Add redactMatchedText() helper that masks API keys/tokens - For PII: show sk12...[REDACTED:51]...xYzQ format - For non-PII: truncate at 50 chars - Add matchPosition and originalLength metadata - Prevents full secrets from being stored in logs Addresses CodeRabbit finding: InjectionMatch stores full secrets

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

.opencode/plugins/lib/injection-patterns.ts (1)

25-33: ⚠️ Potential issue | 🟠 Major

Role-Hijacking-Regeln sind noch zu breit.

Line 27 ist trotz der Verschärfung weiter anfällig für False Positives: Durch das i-Flag matched ["']?[A-Z][a-z]+["']? auch normale Wörter wie ready, und developer|expert|specialist in Lines 27-31 blockiert sehr wahrscheinlich legitime Prompts. Für einen Block-Trigger würde ich hier nur explizit markierte Rollen (role: / quoted labels) oder klar schädliche Rollen zulassen.

🎯 Vorschlag

-	/you\s+are\s+now\s+(?:a\s+|an\s+)?(?:["']?[A-Z][a-z]+["']?|role:\s*\w+|hacker|attacker|malware|developer|expert|specialist)/i,
+	/you\s+are\s+now\s+(?:a\s+|an\s+)?(?:role:\s*\w+|["'][^"'\n]{2,30}["']|hacker|attacker|malware|jailbreak)\b/i,
...
-	/pretend\s+(?:you\s+are|to\s+be)\s+(?:a\s+|an\s+)?(?:["']?[A-Z][a-z]+["']?|hacker|attacker|expert|developer)/i,
+	/pretend\s+(?:you\s+are|to\s+be)\s+(?:a\s+|an\s+)?(?:["'][^"'\n]{2,30}["']|hacker|attacker|malware|jailbreak)\b/i,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.opencode/plugins/lib/injection-patterns.ts around lines 25 - 33, The
ROLE_HIJACKING_PATTERNS array is too broad and causes false positives (the
/["']?[A-Z][a-z]+["']?/ part and generic tokens like
developer|expert|specialist); update ROLE_HIJACKING_PATTERNS to only match
explicitly marked role labels (e.g., quoted strings or the "role:" prefix) and
clearly malicious roles (hacker, attacker, malware, jailbroken, evil, malicious)
— remove the ambiguous capitalized-word alternative and drop generic benign
tokens like developer/expert/specialist from the patterns used in functions or
modules that reference ROLE_HIJACKING_PATTERNS so only explicit markers or
clearly harmful role names trigger a block.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/plugins/lib/injection-patterns.ts:
- Around line 75-77: The second regex in MCP_TOOL_INJECTION_PATTERNS is too
broad and can match across very large inputs; replace the current
/tool\s+description.*ignore.*instructions/is pattern with a bounded-window
variant that limits how far apart the keywords can be (e.g., use non-greedy or
[\s\S]{0,50} between "tool description", "ignore" and "instructions") so it only
flags proximate occurrences; update the MCP_TOOL_INJECTION_PATTERNS array entry
accordingly to keep the same flags (i and s) and behavior but avoid large-span
false positives.

---

Duplicate comments:
In @.opencode/plugins/lib/injection-patterns.ts:
- Around line 25-33: The ROLE_HIJACKING_PATTERNS array is too broad and causes
false positives (the /["']?[A-Z][a-z]+["']?/ part and generic tokens like
developer|expert|specialist); update ROLE_HIJACKING_PATTERNS to only match
explicitly marked role labels (e.g., quoted strings or the "role:" prefix) and
clearly malicious roles (hacker, attacker, malware, jailbroken, evil, malicious)
— remove the ambiguous capitalized-word alternative and drop generic benign
tokens like developer/expert/specialist from the patterns used in functions or
modules that reference ROLE_HIJACKING_PATTERNS so only explicit markers or
clearly harmful role names trigger a block.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 85078add-d274-49b7-ae17-c0cd43e38cea

📥 Commits

Reviewing files that changed from the base of the PR and between 3c89bae and c08ac5f.

📒 Files selected for processing (1)

.opencode/plugins/lib/injection-patterns.ts

.opencode/plugins/lib/injection-patterns.ts

…ction - MCP Pattern (line 77): Replace unbounded .* with [\s\S]{0,50} Prevents large-span false positives across entire document - ROLE_HIJACKING: Remove ambiguous patterns - Remove quoted-capitalized-word pattern that matched any Capitalized word - Remove developer/expert/specialist (generic benign tokens) - Keep only: explicit role: prefix, quoted strings, or clearly malicious roles - Now requires explicit markers or harmful role names only Reduces false positives while maintaining security coverage.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

.opencode/plugins/lib/injection-patterns.ts (1)
116-121: Pattern-Serialisierung ist bereits korrekt implementiert — Refactoring bleibt optional.

Der Code konvertiert RegExp zu String (.toString()) bereits an allen Serialisierungspunkten (Zeilen 238, 280, 300, 332) vor logSecurityEvent(). Das SecurityAuditEntry-Interface erwartet pattern?: string, nicht RegExp. Die JSONL-Audit-Logging funktioniert damit ohne Verlust.

Die Refactoring-Empfehlung (Änderung von pattern: RegExp zu pattern: string in InjectionMatch) bleibt architektonisch sinnvoll für Design-Klarheit, ist aber keine kritische Fehlerbehebung — das System serialisiert aktuell korrekt.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.opencode/plugins/lib/injection-patterns.ts around lines 116 - 121, The
InjectionMatch interface currently types pattern as RegExp but the audit
pipeline (SecurityAuditEntry and calls to logSecurityEvent) already serializes
regexes to strings; change InjectionMatch.pattern from RegExp to string for
design clarity and to match SecurityAuditEntry.pattern?: string, then update any
code that constructs InjectionMatch instances (places that set pattern) to call
.toString() or otherwise pass a string, and adjust consumers that assume a
RegExp to parse/recreate the RegExp when needed (e.g., callers of
logSecurityEvent or any match-processing functions that relied on a RegExp).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.opencode/plugins/lib/injection-patterns.ts:
- Line 88: The existing regex /sk-(?!ant-)[a-zA-Z0-9]{32,}/ in
injection-patterns.ts misses documented OpenAI key formats that include
additional hyphenated prefixes (e.g., sk-proj-..., sk-admin-...); update that
pattern to allow hyphens in the token by expanding the character class (for
example change [a-zA-Z0-9] to include '-' so it becomes [a-zA-Z0-9-]) while
keeping the negative lookahead (sk-(?!ant-)) and the length requirement,
ensuring keys with internal hyphens are detected.

---

Nitpick comments:
In @.opencode/plugins/lib/injection-patterns.ts:
- Around line 116-121: The InjectionMatch interface currently types pattern as
RegExp but the audit pipeline (SecurityAuditEntry and calls to logSecurityEvent)
already serializes regexes to strings; change InjectionMatch.pattern from RegExp
to string for design clarity and to match SecurityAuditEntry.pattern?: string,
then update any code that constructs InjectionMatch instances (places that set
pattern) to call .toString() or otherwise pass a string, and adjust consumers
that assume a RegExp to parse/recreate the RegExp when needed (e.g., callers of
logSecurityEvent or any match-processing functions that relied on a RegExp).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7bb87f33-9bd8-463d-bac6-ee07171b8575

📥 Commits

Reviewing files that changed from the base of the PR and between c08ac5f and 91d4073.

📒 Files selected for processing (1)

.opencode/plugins/lib/injection-patterns.ts

coderabbitai · 2026-03-06T23:04:42Z

.opencode/plugins/lib/injection-patterns.ts

+// Pattern-Quellen: jcfischer/pai-content-filter (MIT) + eigene Ergänzungen
+export const PII_PATTERNS = [
+	/sk-ant-[A-Za-z0-9\-_]{20,}/, // Anthropic API Key
+	/sk-(?!ant-)[a-zA-Z0-9]{32,}/, // OpenAI API Key


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Welche Formate/Präfixe haben aktuelle OpenAI API Keys laut offizieller OpenAI-Dokumentation oder offiziellen Beispielen? Bitte speziell prüfen, ob nach sk-zusätzliche mit- getrennte Segmente vorkommen.

💡 Result:

Laut offiziellen OpenAI-Seiten/Beispielen kommen aktuell folgende Präfixe/Formate für OpenAI API Keys vor (relevante Ausschnitte in den Beispielen sind meist redaktiert):

sk-... (klassisches/„standard“ Secret-Key-Präfix; in Fehlermeldungsbeispielen so angezeigt). → keine weiteren, durch - getrennten „Namenssegmente“ direkt nach sk- erkennbar. [1]

sk-proj-... (Project-bezogene API Keys; in offiziellen Cookbook-Beispielen als OPENAI_API_KEY gesetzt). → ja, hier folgt nach sk- explizit ein weiteres Segment proj, getrennt durch -. [2]

sk-admin-... (Admin API Keys; im offiziellen API-Reference-Beispiel ist sogar ein vollständiger Beispiel-value gezeigt). → ja, hier folgt nach sk- explizit admin, getrennt durch -. [3]

Damit: Ja, es gibt offiziell dokumentierte/illustrierte API-Key-Präfixe, bei denen nach sk- weitere mit - getrennte Segmente vorkommen (z. B. sk-proj-…, sk-admin-…). [2][3]

Quellen: [1] [2] [3]

OpenAI-API-Key-Regex erfasst offizielle Keyformate mit Bindestrich nicht.

Die Regex auf Zeile 88 erlaubt nach sk- nur alphanumerische Zeichen und übersieht damit dokumentierte OpenAI-Formate wie sk-proj-... (Projekt-Keys) und sk-admin-... (Admin-Keys). Diese werden von der Leak-Erkennung nicht erfasst.

🎯 Vorschlag

- /sk-(?!ant-)[a-zA-Z0-9]{32,}/, // OpenAI API Key + /sk-(?!ant-)[A-Za-z0-9-]{32,}/, // OpenAI API Key

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

/sk-(?!ant-)[a-zA-Z0-9]{32,}/, // OpenAI API Key

/sk-(?!ant-)[A-Za-z0-9-]{32,}/, // OpenAI API Key

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.opencode/plugins/lib/injection-patterns.ts at line 88, The existing regex /sk-(?!ant-)[a-zA-Z0-9]{32,}/ in injection-patterns.ts misses documented OpenAI key formats that include additional hyphenated prefixes (e.g., sk-proj-..., sk-admin-...); update that pattern to allow hyphens in the token by expanding the character class (for example change [a-zA-Z0-9] to include '-' so it becomes [a-zA-Z0-9-]) while keeping the negative lookahead (sk-(?!ant-)) and the length requirement, ensuring keys with internal hyphens are detected.

File was superseded by actual implementation in WP-B PR #43. No longer needed as security features are documented in ADR-011 and implemented in plugins/handlers/ directory.

Steffen025 added 2 commits March 6, 2026 16:14

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

WP-B: Add PII credential leak patterns to injection-patterns.ts

85d4e9b

7 patterns for API keys and private keys: Anthropic, OpenAI, GitHub PAT, AWS Access Key, PEM Private Key, Groq, HuggingFace Pattern sources: jcfischer/pai-content-filter (MIT) + own additions

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

.opencode/plugins/lib/injection-patterns.ts Outdated Show resolved Hide resolved

.opencode/plugins/lib/injection-patterns.ts Outdated Show resolved Hide resolved

.opencode/plugins/lib/injection-patterns.ts Outdated Show resolved Hide resolved

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

.opencode/plugins/lib/injection-patterns.ts Outdated Show resolved Hide resolved

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

Steffen025 merged commit 1062aa9 into dev Mar 8, 2026
3 checks passed

Steffen025 deleted the feature/wp-b-security-hardening branch March 8, 2026 20:59

Steffen025 mentioned this pull request Mar 8, 2026

Security Hardening: Command Injection & Path Traversal Opportunities in PAI/Tools (Findings from PAI-OpenCode Updating Process) danielmiessler/Personal_AI_Infrastructure#904

Open

	/sk-(?!ant-)[a-zA-Z0-9]{32,}/, // OpenAI API Key
	/sk-(?!ant-)[A-Za-z0-9-]{32,}/, // OpenAI API Key

Conversation

Steffen025 commented Mar 6, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

New Files

Modified Files

Security Improvements

Testing

Related

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Steffen025 commented Mar 6, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 6, 2026 •

edited

Loading