-
Notifications
You must be signed in to change notification settings - Fork 0
WARDEN
Aleksandr Artamokhov edited this page Jun 26, 2026
·
1 revision
WARDEN is ARGUS's MCP security layer. Every third-party MCP server is treated as hostile-by-default until it clears a gate chain.
Full reference: argus/docs/security-warden.md
MCP tool descriptions are attacker-controlled text the model reads as instructions. A malicious server can:
- Hide prompt injection in tool defs
- Rug-pull (swap defs after approval)
- Exfiltrate data via schema prose
- Harvest secrets (API keys, seed phrases)
WARDEN blocks these before any tool reaches the model or runs on your machine.
flowchart TD
IN[MCP server] --> SS[1 · static-scan]
SS --> TF[2 · threat-feed]
TF --> REP[3 · LUMEN reputation]
REP --> PIN[4 · pinning]
PIN --> OK{allow?}
OK -->|yes| RUN[bridge tools]
OK -->|no| BLOCK[block + report]
| Gate | Catches |
|---|---|
| static-scan | Injection, exfil, secret-harvest signatures in descriptions/schemas |
| threat-feed | Known-bad patterns (built-in + optional signed remote feed) |
| reputation | Low LUMEN trust score for the server identity |
| pinning | Tool-def hash drift since you approved (rug-pull) |
Oracle unreachable? Reputation degrades to a neutral score — WARDEN keeps working offline (Autonomy).
argus warden scan # vet all configured MCP servers
argus warden status # last verdicts per serverPolicy knobs live in argus.config.json under warden — see Configuration.
Even after a server is allowed, tools classified as sensitive (write, delete, exec, payment…) require explicit user approval at call time (CLI prompt, Telegram /yes, etc.).