Agent Guard: a runtime security proxy for MCP tool calls #781
Replies: 2 comments 1 reply
-
|
Taint tracking is the right call. Most proxies catch the outbound pattern but miss that the value was read two steps earlier. Hard case: agent reads a secret, a summarization step slightly transforms it, then it goes out. Does exact-value matching still catch it? How does it hold up when multiple agent threads are running at the same time? |
Beta Was this translation helpful? Give feedback.
-
|
Honest answer: both are known gaps, not solved yet. Transformed values: No, matching is exact substring (plus one level of base64/hex decode), so a paraphrase/truncation/re-encode breaks it. Called out in the README's limitations. Considering tagging subtokens to catch partial leaks, but that raises false positive, unsolved tradeoff. Concurrency: Taint store is per session, in memory, asyncio safe for concurrent calls within a session. Sessions are isolated by design. Real limit is FIFO eviction (max_entries) could drop an earlier tagged value before a later leak attempt, missing the match on long sessions. These are exactly the gaps that need solving before its prod ready |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Pre-submission Checklist
What would you like to share?
What I built:
A runtime security proxy for MCP: it sits between your MCP client (Claude Desktop, Cursor, or any MCP compatible client) and your real MCP servers, checking every tool call for secrets in transit, dangerous commands (
rm -rf,curl | sh, etc.), and accidental data exfiltration via taint tracking (e.g. if an agent reads.envand that value later shows up in a call to an external tool, it gets blocked). Also has a prompt-injection tripwire. Every call is logged with a risk score and verdict (allowed/warned/blocked/error), plus a kill switch andaudit-onlymode for tuning.There's a growing space of MCP runtime guards (e.g. policy/rate-limit proxies, static scanners), Agent Guard's main differentiator is taint tracking for accidental exfiltration: if an agent reads a sensitive file and that value later flows to an external-facing tool, the call is blocked.
How I built it:
Python, implemented as an MCP server itself: it proxies
tools/listandtools/callto whatever downstream servers you configure inagent-guard.yaml. Tool names get aggregated as<server>__<tool>. Detection runs pre and post-call (pattern matching + a small taint store).Challenges:
Getting taint tracking to not be too noisy — landed on exact-value matching (plus basic base64/hex decoding) for v1. Also hit an MCP tool-naming validation issue along the way (
.separators aren't allowed in tool names, switched to__).Still early, would love feedback, especially from anyone running agents with real tool access. Limitations are documented up front in the README.
Relevant Links
Beta Was this translation helpful? Give feedback.
All reactions