Confinement wrapper for LLM coding agents (Claude Code, Codex, Aider, OpenCode) on Linux. Flips the default from --dangerously-skip-permissions to "confined by default": clone a third-party repo and run an agent on it without worrying about prompt-injection → credential theft or persistent compromise.
MVP, Linux x86_64 only (the seccomp BPF filter is currently amd64-specific; arm64 is tracked as a follow-up). Only one profile ships today (untrusted, the default); paranoid is stubbed. Auto-detects NixOS vs FHS layouts, so the same binary works on NixOS and Ubuntu/Debian/etc.
- Filesystem: $HOME becomes a fresh tmpfs; only the project directory is writable; credentials (~/.ssh, ~/.aws, ~/.gnupg) and sibling repos are invisible.
- Network: pasta-managed userns + a /32 route to the gateway + an in-namespace nft rule → no default route, no kernel-level reachability outside a single allowed flow. The agent's /etc/hosts maps allowed hostnames to an in-namespace forwarder, which splices to an SNI-sniffing TCP proxy on host loopback that gates outbound by hostname (no TLS termination, no MITM). Direct-IP egress and non-proxy ports on the host's loopback are both rejected. The allowlist refuses loopback names and IP literals so the host-side proxy can't be steered into dialing local services.
- Env: scrubbed, with per-agent passthrough only (no SSH_AUTH_SOCK, no arbitrary host env).
- Seccomp (amd64): BPF filter blocking ptrace, keyctl family, mount/pivot_root, bpf, kernel module syscalls, reboot/kexec, and other kernel-touching vectors.
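To make the allowlist rule concrete, here is a minimal sketch of a check that refuses loopback names and IP literals. It is not agentpen's actual code: validateAllowHost is a hypothetical name, and the set of loopback names covered (localhost and anything under .localhost) is an assumption that a real implementation would likely extend.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"strings"
)

// validateAllowHost is a hypothetical helper, not agentpen's real code.
// It rejects allowlist entries that could steer a host-side proxy
// toward local services.
func validateAllowHost(host string) error {
	// Normalize: trim whitespace, drop a trailing root dot, lowercase.
	h := strings.ToLower(strings.TrimSuffix(strings.TrimSpace(host), "."))
	if h == "" {
		return errors.New("empty hostname")
	}
	// Refuse IP literals (IPv4, IPv6, and bracketed IPv6).
	if _, err := netip.ParseAddr(strings.Trim(h, "[]")); err == nil {
		return fmt.Errorf("%q is an IP literal", host)
	}
	// Refuse loopback names. Assumption: only "localhost" and names
	// under ".localhost" are treated as loopback in this sketch.
	if h == "localhost" || strings.HasSuffix(h, ".localhost") {
		return fmt.Errorf("%q is a loopback name", host)
	}
	return nil
}

func main() {
	for _, h := range []string{"example.com", "127.0.0.1", "[::1]", "localhost", "api.localhost."} {
		fmt.Printf("%-16s -> %v\n", h, validateAllowHost(h))
	}
}
```

A syntactic check like this only covers how the entry is written; it says nothing about what address a permitted hostname resolves to.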
Rootless: no sudo, no setcap, no sysctl tweaks, no persistent host state. The whole sandbox tears down with the process. nft is required as a runtime binary, but it runs inside the sandbox's unprivileged user namespace — CAP_NET_ADMIN there is scoped to that namespace, so no host-level privilege is involved and the rules disappear when the namespace is torn down. (The original "drop nft" goal in issue #15 was about removing host-privileged sudo nft, not removing the nft binary dependency itself.)
Not defended against: steganographic exfil inside prompt bodies (a fundamental limit); a malicious agent binary stealing the API credential still in-sandbox (phase 2 work, see notes.md).
Runtime: bubblewrap, passt (provides pasta), nftables, iproute2, and a Linux kernel with user namespaces enabled.
Run agentpen --check to report which layers this host can actually enforce.
passt is in repos for Debian 12+, Ubuntu 23.10+, Fedora 38+, Arch, Alpine, NixOS, etc. — install with the usual package manager. For older releases (notably Ubuntu 22.04 LTS), build from upstream source — it's a small pure-C codebase with no exotic deps:
git clone https://passt.top/passt
cd passt
make
make prefix=$HOME/.local install # or sudo make install for /usr/local
Auditable (~26K lines of C, single tree), reproducible, and pinned by the commit hash you cloned. Don't curl | sh a binary off the internet — for a tool whose job is to gate egress, the supply chain matters.
With Nix (recommended, no host pollution):
nix develop --command go build .
With Docker (works anywhere docker compose does, pins the Go toolchain):
UID=$(id -u) GID=$(id -g) docker compose run --rm build
(The explicit UID/GID ensure the output binary is owned by you. Bash keeps UID as an unexported shell variable and does not set GID at all, so compose.yml's ${UID:-1000} substitution would otherwise silently fall back to 1000:1000.)
With system Go 1.21 or newer (the go 1.26.1 directive in go.mod auto-downloads the matching toolchain on first build):
go build .
Produces ./agentpen.
agentpen claude # run claude confined
agentpen codex # same for codex
agentpen --check # host capability report
agentpen --allow example.com claude # add to allowlist
Known agents auto-detected from the command's basename: claude, codex, aider, opencode. Override with --agent NAME. See agentpen --help for all flags.
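Basename detection can be sketched as follows. This is an illustration under assumptions, not agentpen's code: detectAgent is a hypothetical helper, and the rule that an explicit override is accepted as-is is a guess at how --agent behaves.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// knownAgents mirrors the agent list given in this README.
var knownAgents = map[string]bool{
	"claude": true, "codex": true, "aider": true, "opencode": true,
}

// detectAgent is a hypothetical helper. It returns the agent name for a
// command and whether one was identified. A non-empty override wins
// (assumption: the override is taken as given, without validation).
func detectAgent(command, override string) (string, bool) {
	if override != "" {
		return override, true
	}
	name := filepath.Base(command) // "/usr/local/bin/claude" -> "claude"
	return name, knownAgents[name]
}

func main() {
	for _, cmd := range []string{"claude", "/usr/local/bin/codex", "./run.sh"} {
		name, ok := detectAgent(cmd, "")
		fmt.Printf("%-22s name=%s known=%v\n", cmd, name, ok)
	}
}
```

Matching on the basename means a wrapper script with a different name is not recognized, which is the case --agent NAME exists for.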
See notes.md for the threat model, design decisions, and deferred work.
MIT. See LICENSE.