
fix(dlp): scan tool_use.input and tool_result.content for secrets #333

Merged
4 commits merged into main from fix/dlp-scan-tool-io on Apr 28, 2026


@Destynova2
Contributor

Summary

Closes the DLP audit gap on chained tool calls. Previously the engine scanned only top-level message text, so secrets that flow through tool_use.input (tool arguments) or tool_result.content (tool output that flows back into the next provider call) were invisible — chained agent loops could quietly leak secrets across turns.

  • New DlpEngine::sanitize_tool_io_request walks every tool_use.input (serde_json::Value, serialised) and every tool_result.content (string OR array of content blocks) in the request. Default mode rejects the request and emits a ToolIoBlocked audit entry; [dlp] mode = "redact" redacts in-place and continues.
  • New DlpEngine::sanitize_tool_io_response does the same for the upstream provider response, so a model that echoes a leaked secret into a tool argument is caught before the response leaves the proxy.
  • New DlpBlockError::SecretInToolIo { stage, tool_name, reason } — stage is "input" (request tool_use.input), "result" (request tool_result.content), or "output" (upstream-response tool_use.input).
  • New AuditEvent::ToolIoBlocked variant; the existing dlp_rules_triggered field carries stage=…, tool=…, and the rule name.
  • New ProviderLoopAction::DlpBlocked propagates upstream-response blocks past the provider fallback cascade (no sibling retry on a poisoned response).
  • New [dlp] mode = "block" | "redact" config (default block).
  • New grob_dlp_tool_io_total{stage,rule,outcome} metric.
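
The walk described in the bullets above can be sketched roughly as follows. This is a std-only illustration, not the PR's code: `Block`, `ToolResultContent`, `Mode`, the `"sk-"` marker rule and the `redact` helper are hypothetical stand-ins, `tool_use.input` is modelled as an already-serialised string rather than a `serde_json::Value`, and the error is a plain struct instead of the `DlpBlockError::SecretInToolIo` variant.

```rust
// Simplified stand-ins for the request content blocks (assumption, not grob's types).
#[derive(Debug, Clone, PartialEq)]
enum ToolResultContent {
    Text(String),        // tool_result.content as a plain string
    Blocks(Vec<String>), // tool_result.content as an array of content blocks (text only here)
}

#[derive(Debug, Clone, PartialEq)]
enum Block {
    Text(String),
    ToolUse { name: String, input: String }, // input: serialised JSON arguments
    ToolResult { tool_name: String, content: ToolResultContent },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode { Block, Redact }

// Mirrors the fields named in the PR: stage, tool_name, reason.
#[derive(Debug, PartialEq)]
struct SecretInToolIo { stage: &'static str, tool_name: String, reason: String }

const MARKER: &str = "sk-";       // hypothetical detection rule: any token starting with "sk-"
const RULE: &str = "demo_api_key"; // hypothetical rule name

/// Replaces each token that starts with MARKER by "[REDACTED]".
fn redact(text: &str) -> String {
    let mut out = String::new();
    let mut rest = text;
    while let Some(pos) = rest.find(MARKER) {
        out.push_str(&rest[..pos]);
        out.push_str("[REDACTED]");
        let tail = &rest[pos..];
        // The token ends at the first character that is not [A-Za-z0-9_-].
        let end = tail
            .find(|c: char| !(c.is_ascii_alphanumeric() || c == '-' || c == '_'))
            .unwrap_or(tail.len());
        rest = &tail[end..];
    }
    out.push_str(rest);
    out
}

/// Scans one field; blocks or redacts in place depending on `mode`.
fn scan_field(text: &mut String, stage: &'static str, tool: &str, mode: Mode)
    -> Result<(), SecretInToolIo>
{
    if !text.contains(MARKER) {
        return Ok(()); // clean input is left byte-for-byte unchanged
    }
    match mode {
        Mode::Block => Err(SecretInToolIo {
            stage,
            tool_name: tool.to_string(),
            reason: format!("rule {RULE} matched"),
        }),
        Mode::Redact => {
            let cleaned = redact(text.as_str());
            *text = cleaned;
            Ok(())
        }
    }
}

/// Walks every tool_use.input and tool_result.content in a request.
fn sanitize_tool_io_request(blocks: &mut [Block], mode: Mode) -> Result<(), SecretInToolIo> {
    for block in blocks.iter_mut() {
        match block {
            Block::Text(_) => {} // top-level text is left to the pre-existing scan
            Block::ToolUse { name, input } => scan_field(input, "input", name, mode)?,
            Block::ToolResult { tool_name, content } => match content {
                ToolResultContent::Text(t) => scan_field(t, "result", tool_name, mode)?,
                ToolResultContent::Blocks(parts) => {
                    for part in parts.iter_mut() {
                        scan_field(part, "result", tool_name, mode)?;
                    }
                }
            },
        }
    }
    Ok(())
}
```

In this sketch, calling `sanitize_tool_io_request(&mut blocks, Mode::Block)` returns an error on the first matching field, while `Mode::Redact` rewrites the field and keeps walking. The response-side pass would look the same with the stage set to `"output"`.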

Honors existing [dlp] enabled / scan_input / scan_output flags.
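
To illustrate why `ProviderLoopAction::DlpBlocked` has to bypass the fallback cascade, here is a minimal sketch of a provider loop. Only the `DlpBlocked` variant name comes from this PR; the `Done` and `TryNext` variants, the `Outcome` enum and `run_cascade` are assumptions made for the example.

```rust
#[derive(Debug, PartialEq)]
enum ProviderLoopAction {
    Done(String),       // hypothetical: response accepted
    TryNext,            // hypothetical: provider failed, fall through to the next sibling
    DlpBlocked(String), // response carried a secret in tool I/O; must not be retried
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Response(String),
    DlpBlocked(String),
    AllFailed,
}

/// Tries providers in order and records which ones were actually called.
fn run_cascade<F>(providers: &[&str], mut call: F) -> (Outcome, Vec<String>)
where
    F: FnMut(&str) -> ProviderLoopAction,
{
    let mut tried = Vec::new();
    for &provider in providers {
        tried.push(provider.to_string());
        match call(provider) {
            ProviderLoopAction::Done(body) => return (Outcome::Response(body), tried),
            ProviderLoopAction::TryNext => continue,
            // A poisoned response ends the loop: no sibling provider is asked.
            ProviderLoopAction::DlpBlocked(reason) => {
                return (Outcome::DlpBlocked(reason), tried)
            }
        }
    }
    (Outcome::AllFailed, tried)
}
```

With providers `["a", "b", "c"]`, where `a` fails and `b` returns a blocked response, the sketch stops after `b` and never calls `c`.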

Test plan

  • dlp_blocks_secret_in_tool_use_input
  • dlp_blocks_secret_in_tool_result_content (covers the array-of-content-blocks branch)
  • dlp_blocks_secret_in_upstream_response_tool_use_input
  • dlp_redacts_when_mode_redact_in_tool_input (asserts the raw token does not survive in the serialized request)
  • dlp_audit_log_records_tool_io_block (round-trips the new variant through the signing pipeline)
  • dlp_passes_clean_tool_io_unchanged (byte-for-byte unchanged on clean input/result/output)
  • cargo fmt --check, cargo clippy --all-targets --all-features -- -D warnings, cargo nextest run --tests --all-features (1307 tests, all green)

Clément LIARD added 4 commits April 28, 2026 22:41
CI re-runs the full test suite (incl. doctests) on every PR via the
.github/workflows/ci.yml tests job, so local pre-push duplication
adds ~20 min per push without catching anything new. Pre-push hooks
should be fast-fail; expensive checks belong on the CI server.

Closes audit finding: silent productivity tax (pre-push duplication).

Documents the three-state intent (true/false/absent) of ProviderConfig.is_enabled
and the dependency on deny_unknown_fields (added in the next commit) to
reject typos like enbaled = false at parse time. Behaviour is unchanged;
this is purely contractual clarity to support the silent-typo-killer audit.

Closes audit finding: silent typo killer on provider config.

Adds #[serde(deny_unknown_fields)] to AppConfig and the major
sub-structs (ProviderConfig, ModelConfig, TierConfig, RouterConfig,
ScoringConfig, CacheConfig, BudgetConfig, DlpConfig, SecurityConfig).

Without this guard, a typo like enbaled = false in a [[providers]]
block silently parses (the unknown key is dropped) and the provider
remains enabled with the wrong intent. With the guard, parsing fails
loudly and the operator gets an actionable error pointing at the
offending key.
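
The failure mode can be reproduced without serde. The sketch below is a hand-rolled `key = value` parser, used as a std-only stand-in for `#[serde(deny_unknown_fields)]`; the `parse_provider` function, its `deny_unknown` switch and the field layout of this `ProviderConfig` are assumptions for illustration, not the project's real types.

```rust
#[derive(Debug, Default, PartialEq)]
struct ProviderConfig {
    name: Option<String>,
    enabled: Option<bool>, // three-state: Some(true), Some(false), or absent
}

impl ProviderConfig {
    /// Absent means "enabled", which is why a dropped typo is dangerous.
    fn is_enabled(&self) -> bool {
        self.enabled.unwrap_or(true)
    }
}

/// Parses `key = value` lines. With `deny_unknown` set, an unrecognised key is
/// an error; without it, the key is silently dropped (the pre-guard behaviour).
fn parse_provider(src: &str, deny_unknown: bool) -> Result<ProviderConfig, String> {
    let mut cfg = ProviderConfig::default();
    for (idx, raw) in src.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `key = value`", idx + 1))?;
        let (key, value) = (key.trim(), value.trim());
        match key {
            "name" => cfg.name = Some(value.trim_matches('"').to_string()),
            "enabled" => {
                let parsed = value
                    .parse::<bool>()
                    .map_err(|e| format!("line {}: {e}", idx + 1))?;
                cfg.enabled = Some(parsed);
            }
            other if deny_unknown => {
                return Err(format!("line {}: unknown field `{other}`", idx + 1));
            }
            _ => {} // lenient mode: the typo disappears without a trace
        }
    }
    Ok(cfg)
}
```

Feeding `enbaled = false` to the lenient parser yields a config whose `is_enabled()` is still `true`; the strict parser returns an error that names the offending key and line.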

Tested with the full nextest suite (1268 tests) plus all doctests:
no fixture, preset or example carries a stale field, so this is a
pure tightening with no migration cost.

Closes audit finding: silent typo killer on TOML config.

Each entry in DENIED_SECTIONS / DENIED_KEYS now carries a short
justification table covering why it cannot be hot-reloaded: either
because the data is sensitive (credentials, DLP rules) or because the
consumer is constructed once at process start (TLS listener, secret
backend, TEE attestation, FIPS gate).

Adds tee, fips, server.tls and secrets.backend to the deny-list so
the documented "static-init" rationale matches actual behaviour. Also
emits an INFO log on every denied attempt telling the operator to
restart instead of expecting the silent reload to apply.

Adds two unit tests covering the new deny entries (tee/fips sections
and server.tls / secrets.backend keys) and asserts that sibling keys
in the same sections remain editable.
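
A rough sketch of how such a deny-list check might behave is shown below. The constant names `DENIED_SECTIONS` and `DENIED_KEYS` and the four entries come from this commit message; the justification strings, the dotted-path convention, and the `denial_reason` and `apply_reload` helpers are assumptions for illustration. The log is collected into a `Vec<String>` instead of going through a real logger.

```rust
// Whole sections that can never be hot-reloaded, each with its justification.
const DENIED_SECTIONS: &[(&str, &str)] = &[
    ("tee", "static-init: TEE attestation is set up once at process start"),
    ("fips", "static-init: FIPS gate is set up once at process start"),
];

// Individual keys (and anything nested under them) that are denied.
const DENIED_KEYS: &[(&str, &str)] = &[
    ("server.tls", "static-init: TLS listener is constructed once"),
    ("secrets.backend", "static-init: secret backend is constructed once"),
];

/// Returns the justification if `path` (e.g. "server.tls.cert") is denied.
fn denial_reason(path: &str) -> Option<&'static str> {
    let section = path.split('.').next().unwrap_or(path);
    if let Some((_, why)) = DENIED_SECTIONS.iter().find(|(s, _)| *s == section) {
        return Some(*why);
    }
    DENIED_KEYS
        .iter()
        .find(|(k, _)| {
            // Match the key itself or a nested key, but not a mere string prefix.
            path.strip_prefix(*k)
                .map_or(false, |rest| rest.is_empty() || rest.starts_with('.'))
        })
        .map(|(_, why)| *why)
}

/// Returns true if the edit may be applied live; otherwise records an INFO line.
fn apply_reload(path: &str, log: &mut Vec<String>) -> bool {
    match denial_reason(path) {
        Some(why) => {
            log.push(format!(
                "INFO reload of `{path}` denied ({why}); restart the process to apply it"
            ));
            false
        }
        None => true,
    }
}
```

In this sketch a path such as `server.tls.cert` is denied and logged, while a sibling such as `server.port` (a hypothetical key) stays editable because only the `server.tls` subtree is listed.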

Closes audit finding: hot-reload UX (silent ignore of denied edits).
@Destynova2 enabled auto-merge (squash) April 28, 2026 21:29
@Destynova2 closed this pull request by merging all changes into main in a6a684e Apr 28, 2026
@Destynova2 deleted the fix/dlp-scan-tool-io branch April 28, 2026 21:47