Skip to content
Aleksandr Artamokhov edited this page Jun 26, 2026 · 1 revision

WARDEN firewall

WARDEN is ARGUS's MCP security layer. Every third-party MCP server is treated as hostile-by-default until it clears a gate chain.

Full reference: argus/docs/security-warden.md


Why it exists

MCP tool descriptions are attacker-controlled text the model reads as instructions. A malicious server can:

  • Hide prompt injection in tool defs
  • Rug-pull (swap defs after approval)
  • Exfiltrate data via schema prose
  • Harvest secrets (API keys, seed phrases)

WARDEN blocks these before any tool reaches the model or runs on your machine.


Gate chain

flowchart TD
  IN[MCP server] --> SS[1 · static-scan]
  SS --> TF[2 · threat-feed]
  TF --> REP[3 · LUMEN reputation]
  REP --> PIN[4 · pinning]
  PIN --> OK{allow?}
  OK -->|yes| RUN[bridge tools]
  OK -->|no| BLOCK[block + report]
Loading
Gate Catches
static-scan Injection, exfil, secret-harvest signatures in descriptions/schemas
threat-feed Known-bad patterns (built-in + optional signed remote feed)
reputation Low LUMEN trust score for the server identity
pinning Tool-def hash drift since you approved (rug-pull)

Oracle unreachable? Reputation degrades to a neutral score — WARDEN keeps working offline (Autonomy).


CLI

argus warden scan          # vet all configured MCP servers
argus warden status        # last verdicts per server

Policy knobs live in argus.config.json under warden — see Configuration.


Sensitive tools

Even after a server is allowed, tools classified as sensitive (write, delete, exec, payment…) require explicit user approval at call time (CLI prompt, Telegram /yes, etc.).


Related

Clone this wiki locally