Adds design doc for docker alongside boxsh/local. Covers host↔container path translation, network-mode mapping, container-per-session lifecycle, and the runner-side preflight branching needed so a docker-configured runner doesn't misfire boxsh preflight on Windows. Assisted-by: claude:claude-opus-4-7
Introduces SandboxBackendDocker, SandboxDockerConfig (Image, User, AllowPull, ExtraMounts) with validation. No plugin or runner wiring yet — later phases of the docker sandbox plan build on this. Assisted-by: claude-sonnet:claude-sonnet-4-6
Introduces plugins/sandbox/docker/dockerclient/: shells out to the `docker` CLI for container lifecycle (create/start/stop), exec (blocking and streaming), and orphan cleanup. Includes label constants and a shimmable Client tested against a fake docker binary. No pkg/sandbox imports — this is the pure subprocess layer consumed by the plugin factory in a later phase. Assisted-by: claude-sonnet:claude-sonnet-4-6
Adds plugins/sandbox/docker: dockerFactory/dockerSession/dockerHost implementing the pkg/sandbox contract on top of the dockerclient subprocess layer. Container-per-session with bind-mounted workspace; host-side FS ops; Exec/StartProcess via docker exec with host→container path translation; HTTP fails closed; whitelist network mode rejected with RelaxedWouldHelp. Assisted-by: claude-sonnet:claude-sonnet-4-6
Registers dockerplugin.NewFactory in DefaultRegistry (always-on; daemon reachability is decided at session-create time). Adds createDockerSession and cleanupOrphanedDockerContainers in sandbox_backend.go, and refactors prepareSandbox into a switch on the backend name so a docker-configured runner runs docker preflight + docker orphan cleanup rather than misfiring the boxsh path. Assisted-by: claude-sonnet:claude-sonnet-4-6
Extends TestSessionContract and TestHostContract with DockerFactory subtests that skip gracefully when the docker daemon is unreachable. Adds TestDockerFactorySupported mirroring the boxsh policy-compat coverage (whitelist fails closed with RelaxedWouldHelp; disabled / allow_all supported). Extends TestDefaultRegistry to assert docker is always registered. Assisted-by: claude-sonnet:claude-sonnet-4-6
Updates the sandbox abstraction doc, configuration / deployment guides (en/zh/ja), changelog, and README to cover the new docker backend: config keys under sandbox.docker, when to choose it over boxsh/local, known tradeoffs (Docker Desktop bind-mount perf, no COW isolation, whitelist/HTTP fail-closed). Assisted-by: claude-sonnet:claude-sonnet-4-6
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 972a25a903
    func (h *dockerHost) ResolvePath(path string) (string, error) {
        if filepath.IsAbs(path) {
            return path, nil
        }
Enforce policy checks for absolute host file paths
ResolvePath returns absolute paths unchanged, and the docker host filesystem methods then use those host paths directly. In strict docker sessions (Relaxed=false), this lets absolute tool paths (for example /etc/passwd) bypass workspace/read-only policy boundaries and access host files outside the sandbox mount set. That breaks the backend’s “honor policies strictly” isolation expectation for non-exec file operations.
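One way to close the gap described above is to treat absolute and relative paths alike and accept a path only when it lies under a configured mount root. The following is a minimal sketch of that check, not the PR's actual fix: `resolvePath`, `within`, `errOutsideMounts`, and the example roots are hypothetical names chosen for illustration, and it assumes Unix-style paths.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errOutsideMounts is a hypothetical sentinel error for this sketch.
var errOutsideMounts = errors.New("path is outside the sandbox mount set")

// within reports whether path is root itself or lies below it.
func within(root, path string) bool {
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return false
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return false
	}
	return true
}

// resolvePath joins relative paths onto the workspace, then rejects any
// result (absolute input included) that is not under one of the mounts.
func resolvePath(workspace string, mounts []string, path string) (string, error) {
	if !filepath.IsAbs(path) {
		path = filepath.Join(workspace, path)
	}
	path = filepath.Clean(path)
	for _, root := range mounts {
		if within(filepath.Clean(root), path) {
			return path, nil
		}
	}
	return "", fmt.Errorf("%w: %s", errOutsideMounts, path)
}

func main() {
	ws := "/srv/anna/workspace" // hypothetical workspace root
	mounts := []string{ws}
	for _, p := range []string{"notes.txt", "/etc/passwd", "../escape"} {
		resolved, err := resolvePath(ws, mounts, p)
		fmt.Printf("%q -> %q, err=%v\n", p, resolved, err)
	}
}
```

Using `filepath.Rel` rather than a plain string-prefix test avoids accepting sibling directories such as `/srv/anna/workspace-evil`.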
    if !stale && createdAt != "" {
        t, err := time.Parse(time.RFC3339, createdAt)
        if err == nil && time.Since(t) > time.Hour {
            stale = true
Avoid force-removing running containers by age alone
The orphan cleanup marks a container as stale when its created_at label is older than one hour, regardless of whether it is still running, and then removes it with rm --force. Since cleanup filters by anna_home and runs at startup, this can terminate active sessions from other Anna processes that have been alive for more than an hour. Staleness-by-age should not apply to running containers without an ownership/liveness check tied to the current process.
    if opts.Stdin != nil {
        args = append(args, "-i")
    }
Pass -i when StartExec exposes a stdin pipe
buildExecArgs only adds -i when opts.Stdin is non-nil, but StartExec creates and returns handle.Stdin when opts.Stdin is nil. That means callers can get a writable stdin handle while docker exec was started without interactive stdin enabled. Per Docker CLI docs, -i/--interactive is what “Keep[s] STDIN open even if not attached”, so interactive subprocesses may see immediate EOF or dropped input.
📊 Coverage Report: total coverage 46.1%.
Remove all skills, agents, and PATH dir mounts from sandbox policy. Skills are read from DB; ReadOnlyPaths interface is preserved but unpopulated. Both boxsh and docker now receive identical policy: UserRoot as the sole writable workspace, no read-only mounts. Deletes collectSandboxReadOnlyDirs, sandboxReadableDirs, isWithinPathRoot, and the four agent/project path helpers that existed only to feed sandbox mounts. Assisted-by: Claude:claude-sonnet-4-6
Replaces the multi-stage builder Dockerfile with a simpler single-stage approach: copy mise/uv/bun from official images, install system tools via apt, and declare all mise tools in _mise.toml so adding a new binary is a one-line edit. GitHub token is injected at build time as a BuildKit secret (env mount, not file read). Adds sandbox:docker:build mise task that auto-passes USER_UID/GID and GITHUB_TOKEN. Assisted-by: Claude:claude-sonnet-4-6
Move foundational tools (python3, nodejs/npm, ripgrep, fd-find) from mise to apt so they track debian's stable versions and don't hit the network during every build. Drop uv/bun from mise since they're already COPY'd from upstream images. Pin mise, uv, and bun base tags to specific versions for reproducible builds. Assisted-by: claude-code:claude-opus-4-7
Mirrors the existing docker.yml workflow but targets plugins/sandbox/docker as the build context and publishes to ghcr.io/<repo>-sandbox. Path filters keep it from rebuilding on unrelated changes. Assisted-by: claude-code:claude-opus-4-7
Replace the shell-out-to-docker CLI layer in plugins/sandbox/docker/dockerclient with the moby/moby/client Go SDK. The client reads connection settings from the environment via client.FromEnv (DOCKER_HOST, DOCKER_API_VERSION, DOCKER_CERT_PATH, DOCKER_TLS_VERIFY); API-version negotiation is enabled by default. DOCKER_CONTEXT is a CLI-only concept and is not supported.
Structural changes:
- Introduce an exported `API` interface covering the subset of APIClient we use, with `NewWithAPI(api)` for test injection.
- Rewrite Create/Start/Stop/Inspect/Exec/ImagePull/ImageInspect against the SDK; demux exec output with stdcopy.StdCopy.
- Replace per-file subprocess fakes with `noopAPI` in the parent package and drive preflight/host tests through NewWithAPI.
- Retarget runner tests that keyed off a missing docker binary to a bogus DOCKER_HOST so Preflight's daemon ping fails deterministically.
Assisted-by: claude-code:claude-opus-4-7
When anna runs inside a container and talks to a daemon on the host, the paths anna sees (e.g. /mnt/anna/workspace) don't match the paths the daemon resolves (e.g. /var/anna/workspace). Add two config fields — sandbox.docker.container_path_prefix and sandbox.docker.host_path_prefix — and rewrite bind-mount sources at the CreateOptions boundary via Config.TranslateToDaemonPath. Both prefixes must be absolute and either both set or both empty. Scope is intentionally narrow: only paths sent to the daemon (workspace root, read-only mount sources) are translated. The session's internal mount table keeps anna-view paths so toContainerPath continues to map cwd/env correctly. Assisted-by: claude-code:claude-opus-4-7
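The prefix rewrite described in this commit can be sketched as follows. This is an illustrative approximation only: `pathPrefixes`, `validate`, and `translateToDaemonPath` are hypothetical names standing in for the real config type and `Config.TranslateToDaemonPath`, and the example paths are the ones quoted in the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// pathPrefixes is a hypothetical stand-in for the two config fields.
type pathPrefixes struct {
	ContainerPathPrefix string // path as anna sees it
	HostPathPrefix      string // path as the daemon sees it
}

// validate enforces "both absolute, and either both set or both empty".
func (p pathPrefixes) validate() error {
	if (p.ContainerPathPrefix == "") != (p.HostPathPrefix == "") {
		return errors.New("container_path_prefix and host_path_prefix must be set together")
	}
	if p.ContainerPathPrefix != "" &&
		(!filepath.IsAbs(p.ContainerPathPrefix) || !filepath.IsAbs(p.HostPathPrefix)) {
		return errors.New("both prefixes must be absolute")
	}
	return nil
}

// translateToDaemonPath rewrites a bind-mount source into the daemon's view.
// Paths outside the container prefix are returned unchanged.
func (p pathPrefixes) translateToDaemonPath(path string) string {
	if p.ContainerPathPrefix == "" {
		return path
	}
	rel, err := filepath.Rel(p.ContainerPathPrefix, path)
	if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return path
	}
	return filepath.Join(p.HostPathPrefix, rel)
}

func main() {
	cfg := pathPrefixes{ContainerPathPrefix: "/mnt/anna", HostPathPrefix: "/var/anna"}
	if err := cfg.validate(); err != nil {
		panic(err)
	}
	fmt.Println(cfg.translateToDaemonPath("/mnt/anna/workspace"))
}
```

Only the value handed to the daemon is rewritten; anything that later maps cwd or env into the container would keep using the untranslated path, matching the narrow scope the commit describes.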
- ResolvePath now rejects absolute paths outside the session's mount set so filesystem operations cannot bypass workspace/read-only policy.
- Orphan cleanup labels containers with the creating anna PID; running and paused containers are reaped only when that PID is verifiably gone (Unix Signal(0), Windows FindProcess), keeping peer anna processes safe. Transitional states fall back to a 1h age cutoff.
- StartExec always attaches stdin so caller writes on handle.Stdin are delivered to the daemon instead of being silently dropped.
- Default image is always allowed to auto-pull via AllowsImplicitPull, so a fresh install works without toggling allow_pull.
- extra_mounts accepts :rw to match the plugin parser and docker CLI.
- Document container_path_prefix / host_path_prefix for DooD setups in en/zh/ja configuration guides.
Assisted-by: claude-code:claude-opus-4-7
The docker backend migrated to the moby SDK and talks to the daemon over
the socket directly, so the CLI is no longer a runtime dependency. The
Available() probe still ran exec.LookPath("docker"), which made the
backend mis-report as unavailable inside images (anna:latest) that ship
without the docker CLI even when the daemon is fully reachable.
Replace the PATH probe with a short-timeout ServerVersion ping, update
the compatibility-error message and the orphan-cleanup log, and retune
the tests to simulate an unreachable daemon via DOCKER_HOST rather than
an empty PATH.
Assisted-by: claude-code:claude-opus-4-7
DooD (anna inside a container, docker daemon on the host) needed per-agent container_path_prefix/host_path_prefix to be set manually, which was the wrong layer: DooD is a deployment decision, not a per-agent one. Introduce ANNA_HOME_HOST as the host-side path of ANNA_HOME and have the runner derive the pair automatically.
Behavior:
- Explicit agent config still wins (back-compat).
- In-container + ANNA_HOME_HOST set → prefixes auto-filled.
- In-container + ANNA_HOME_HOST unset → fail preflight with a clear message rather than the opaque "bind source does not exist" from the daemon.
- Host mode + env set by mistake → logged and ignored.
Container detection is conservative: only /.dockerenv and /run/.containerenv are consulted. docker:dev now seeds ANNA_HOME_HOST=$HOME/.anna-dev so the dev flow works out of the box. Docs (en/zh/ja) name ANNA_HOME_HOST as the standard mechanism and demote the two prefix fields to advanced per-agent overrides.
Assisted-by: claude-code:claude-opus-4-7
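The four behaviors listed in this commit form a small decision table, sketched below. The function and type names (`derivePrefixes`, `prefixes`, `inContainer`) are hypothetical, and the error text is invented for illustration; only the marker files and the precedence order come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// prefixes is a hypothetical pair of container-view and host-view roots.
type prefixes struct{ Container, Host string }

// inContainer consults only the two marker files named in the commit.
func inContainer() bool {
	for _, marker := range []string{"/.dockerenv", "/run/.containerenv"} {
		if _, err := os.Stat(marker); err == nil {
			return true
		}
	}
	return false
}

// derivePrefixes applies the precedence: explicit config, then
// ANNA_HOME_HOST when running in a container, else no translation.
func derivePrefixes(explicit prefixes, containerized bool, annaHome, annaHomeHost string) (prefixes, error) {
	if explicit.Container != "" || explicit.Host != "" {
		return explicit, nil // explicit agent config wins
	}
	if !containerized {
		return prefixes{}, nil // host mode: a stray ANNA_HOME_HOST is ignored
	}
	if annaHomeHost == "" {
		return prefixes{}, errors.New("running in a container but ANNA_HOME_HOST is not set")
	}
	return prefixes{Container: annaHome, Host: annaHomeHost}, nil
}

func main() {
	p, err := derivePrefixes(prefixes{}, inContainer(), os.Getenv("ANNA_HOME"), os.Getenv("ANNA_HOME_HOST"))
	fmt.Printf("%+v err=%v\n", p, err)
}
```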
Move tap from mise-managed (under $HOME/.local/share/mise) to a pinned binary at /usr/local/bin/tap. HOME is remapped to /workspace at runtime, which hides mise's own install tree, so the mise-installed tap was not reachable from the bash tool inside the sandbox. A system-wide binary is HOME-independent and resolves from the image's default PATH. Note: rtk and boxsh still live under mise in _mise.toml and hit the same HOME-remap blind spot; leaving them for a follow-up per V's request. Assisted-by: claude-code:claude-opus-4-7
sandboxProcessEnv pins HOME to the sandbox workspace so host-filesystem backends (boxsh, local) cannot touch the real user's ~/.ssh, ~/.gitconfig, etc. That isolation is redundant inside docker — each session already has its own rootfs and image-baked user home. Pinning HOME for docker translates, via the mount table, to /workspace inside the container, which hides the image's mise config, mise install tree, and shell rc files that were set up against /home/anna at build time. Gate the HOME override on backend so docker sessions keep the container's native HOME=/home/anna. That restores reachability for every tool mise installs at build time — verified empirically that tap, rtk, and boxsh all resolve on PATH at runtime with a clean bind-mounted workspace. Revert the earlier direct-binary tap install since mise can now manage it alongside rtk and boxsh; re-add tap to _mise.toml to keep one source of truth for the sandbox tool set. Assisted-by: claude-code:claude-opus-4-7
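Gating the HOME override on the backend can be pictured with the sketch below. The signature is an assumption made for illustration (the real `sandboxProcessEnv` may take different arguments), and the backend is compared against the literal string "docker" purely for brevity.

```go
package main

import "fmt"

// sandboxProcessEnv returns a copy of base with HOME pinned to the
// workspace for host-filesystem backends. Docker sessions keep whatever
// HOME the image defines. (Hypothetical signature.)
func sandboxProcessEnv(backend, workspace string, base map[string]string) map[string]string {
	env := make(map[string]string, len(base)+1)
	for k, v := range base {
		env[k] = v
	}
	if backend != "docker" {
		env["HOME"] = workspace
	}
	return env
}

func main() {
	base := map[string]string{"LANG": "C.UTF-8"}
	fmt.Println(sandboxProcessEnv("boxsh", "/srv/anna/workspace", base))
	fmt.Println(sandboxProcessEnv("docker", "/srv/anna/workspace", base))
}
```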
CI runners have a reachable docker daemon but no pre-pulled alpine:3.20, so TestHostContract/DockerFactory and TestSessionContract/DockerFactory failed at CreateSession with "No such image". The config already set AllowPull: true, but CreateSession does not pull — only Preflight does. Invoke Preflight from the contract tests before creating the factory, and t.Skip (not t.Fatal) on pull failures so registry outages do not masquerade as contract regressions. Assisted-by: claude-code:claude-opus-4-7
- Register the sandbox core tools (bash/read/write/edit) to the exclusion of their plugin duplicates so agent commands cannot bypass the sandbox by running the plugin versions on the anna host filesystem.
- Introduce internal/version and internal/config/sandbox_image to pin the sandbox image to the anna binary version (with dev/release split).
- Apply DooD path translation at preflight and orphan cleanup so all sandbox paths scope to the same daemon view as the session.
- Bake HOME/PATH into plugins/sandbox/docker/Dockerfile; install tap directly in the image.
- Refresh docker sandbox config/session/preflight plus their tests, UI agents page, multilingual config docs, and release pipeline.
- Drop stale plan-docker-sandbox.md and wexin-bot-protol.md.
Assisted-by: claude-code:claude-opus-4-7
Add symlink traversal rejection in ResolvePath to close an escape where an agent creates a symlink in the workspace and accesses it from the anna-process side. Any symlink at or below the mount root is rejected, as legitimate code never creates symlinks in session workspaces.
Includes comprehensive tests verifying rejection of:
- Leaf symlinks pointing outside the mount
- Leaf symlinks pointing inside the mount
- Symlinked ancestor directories
Adds tests for pure functions across several packages that had 0% coverage, bringing total coverage from ~49.2% to ~50.3% in CI.
- pkg/sandbox: Policy validation, Registry CRUD, NewSessionID, log helpers
- internal/tools: Manifest load/save, ResolveAsset, DeduplicateByName, StatusFromSpecs
- plugins/sandbox/docker/dockerclient: buildContainerConfig, buildMounts, envSlice, isContainerStale, ownerProcessGone, buildExecCreateOptions
- plugins/sandbox/docker: mergeEnv, buildMountTable, mapNetworkMode, translateMountsForDaemon
- plugins/hooks: CloseHookPlugins (100% coverage)
- plugins/tools/skills/builtin: ExtractSkills, ExtractAgents, EnsureBuiltinSkills (84% coverage)
Assisted-by: claude-code:claude-sonnet-4-6
Admin now selects the active sandbox backend once on the Plugins page (auto/boxsh/docker/local) instead of per agent. Only one backend can be active at a time; enabling one disables the others. Per-agent config retains network policy (mode + allowlist). The active backend is resolved lazily via SandboxBackendFn at runner-creation time, so toggling the plugin takes effect immediately for the next session — no restart, no per-agent reload. Assisted-by: Claude:claude-sonnet-4-6
Summary
- Adds `docker` as a third sandbox backend alongside `boxsh` and `local`, driven by a new `sandbox.docker` config subsection (`image`, `user`, `allow_pull`, `extra_mounts`).
- … `alpine:3.20`), bind-mounts the workspace at `/workspace` and read-only paths at `/workspace-readonly/<i>`, then routes `Exec`/`StartProcess` through `docker exec` with host→container path translation.
- `auto` backend selection remains `boxsh` → `local`; docker requires explicit `sandbox.backend: "docker"`.
- `whitelist` network mode and in-host HTTP mediation both fail closed (same posture as `boxsh`).

Why
- `boxsh` is Linux/macOS only.
- … `boxsh` contract.

Design notes (full plan in `plan-docker-sandbox.md`)
- … `local`). Fails closed if `opts.Cwd` for `Exec`/`StartProcess` falls outside any configured mount.
- `prepareSandbox` in the runner is now a `switch` on backend name, so a docker-configured runner does not misfire `boxsh` preflight (important on Windows where `boxsh` isn't viable).
- … `Supported()` probes, but the runner builds its own factory from `cfg.Sandbox.Docker` so config overrides flow through.
- … `anna.sandbox.anna_home` + `anna.sandbox.session_id`; a `sync.Once`-guarded sweep on runner startup removes stale ones.

Commits
- `📝 docs: plan docker as a sandbox backend`: reviewed plan doc (plan + design decisions + phase list).
- `✨ feat: add docker backend name + SandboxDockerConfig`: config constants & validation.
- `✨ feat: add dockerclient subprocess layer for docker sandbox`: shells out to the `docker` CLI (container lifecycle, exec, orphan cleanup, labels), fully unit-tested with a shim binary.
- `✨ feat: implement docker sandbox plugin (factory, session, host)`: `pkg/sandbox` contract impl.
- `✨ feat: wire docker backend into registry and runner`: registry registration + `createDockerSession` + `prepareSandbox` refactor.
- `✅ test: add docker backend to sandbox contract + policy-compat suites`: contract + policy-compat coverage; skips when daemon absent.
- `📝 docs: document docker as a sandbox backend`: user-facing docs (en/zh/ja), changelog, README.

Test plan
- `mise run format`: green.
- `mise run test`: green (incl. `internal/sandbox` contract subtests running against a real Docker daemon on macOS; alpine:3.20 pulled at session create when `AllowPull=true`).
- `mise run release:check`: green (`.goreleaser.yaml` valid).
- `go test ./plugins/sandbox/docker/...`: green (shim-based; no daemon needed).
- … `auto` → `local` path still works and explicit `docker` backend launches a container end-to-end. Not possible from this machine; please verify before merging if you have access to a Windows box.
- … `plan.md` (Skill Management in Database, unrelated to this PR) should be archived or left untouched on `main`.

🤖 Generated with Claude Code