A TypeScript library for orchestrating AI coding agents in isolated sandboxes.
- You call `sanddune.run()`.
- An agent runs in a Docker container against a git worktree.
- You get commits on a branch.
- Your host working tree stays clean (or doesn't, depending on the branch strategy you pick).
Note: sanddune is heavily inspired by Matt Pocock's sandcastle. The core orchestration model (branch strategies, bind-mount providers, the `run()` API shape) is based on his work.
- Git
- Docker Desktop (or any Docker-compatible runtime)
- Bun ≥ 1.3 if you're working in this repo
- An `ANTHROPIC_API_KEY`
| Surface | Status | Notes |
|---|---|---|
| `run()` | ✅ shipped | Bind-mount only; inline `prompt` and `promptFile`; multi-iteration via `maxIterations` + `completionSignal` + `idleTimeoutSeconds` |
| `docker()` sandbox provider | ✅ shipped | Bind-mount, auto image-exists check, parent `.git` re-mount for worktrees |
| `claudeCode()` agent provider | ✅ shipped | `--print --output-format stream-json --verbose --dangerously-skip-permissions` |
| `branchStrategy: { type: "head" }` | ✅ shipped | Default. Agent writes directly to host working tree. |
| `branchStrategy: { type: "merge-to-head" }` | ✅ shipped | Worktree under `.sanddune/worktrees/<id>/`, fast-forward back to HEAD on success |
| `branchStrategy: { type: "branch", branch }` | ✅ shipped | Named branch in a worktree; reused on re-run |
| Env resolution | ✅ shipped | `.sanddune/.env` + agent/sandbox `env` + `RunOptions.env` |
| JSONL run log | ✅ shipped | Streamed to `.sanddune/logs/<run-id>.jsonl` (or `logging.path`) |
| Terminal mode (`logging: { type: "stdout" }`) | ✅ shipped | Spinners + styled status lines + run summary |
| `onAgentStreamEvent` callback | ✅ shipped | Sync, fire-and-forget; errors swallowed (file mode only) |
| `IterationResult.usage` | ✅ shipped | Raw token counts parsed from captured Claude session JSONL (per ADR-0005b) |
| Custom bind-mount provider | ✅ shipped | Build your own by constructing a `BindMountSandboxProvider` |
| `createSandbox()` | ✅ shipped | Long-lived reusable sandbox on a single branch; multiple `sandbox.run()` calls reuse the container; `await using` auto-disposes |
| `createWorktree()` | ✅ shipped | Long-lived worktree as an independent lifecycle; `wt.run()` / `wt.createSandbox()` / `wt.interactive()` layer on top with split-close ownership (ADR-0010) |
| `interactive()` | ✅ shipped | Launches the agent's TUI inside a sandbox or directly on the host; accepts bind-mount, isolated, or no-sandbox providers; uses the provider's default branch strategy |
| `noSandbox()` sandbox provider | ✅ shipped | Runs the agent on the host with no container; accepted only by `interactive()` / `wt.interactive()`; the agent's normal permission prompts stay active |
There is no sanddune init yet — set up by hand. The repo's .sanddune/ directory shows the layout.
- Install:

  ```sh
  npm install --save-dev @missingstudio/sanddune
  ```

- Create a `.sanddune/Dockerfile` that includes `git`, `gh`, the Claude Code CLI, and a non-root user (use ./.sanddune/Dockerfile as a starting point).
- Build the image. The default tag is `sanddune:<repo-dir-name>`:

  ```sh
  docker build -t sanddune:my-repo -f .sanddune/Dockerfile .sanddune
  ```

- Create `.sanddune/.env` with your `ANTHROPIC_API_KEY`:

  ```sh
  ANTHROPIC_API_KEY=sk-ant-...
  ```

- Write a script and run it with `bun` (or `tsx`):

  ```typescript
  import { run, claudeCode } from "@missingstudio/sanddune";
  import { docker } from "@missingstudio/sanddune/sandboxes/docker";

  const result = await run({
    agent: claudeCode("claude-opus-4-7"),
    sandbox: docker(),
    prompt: "Add a HELLO.md file with the text 'hi'. Commit it.",
  });

  console.log(result.branch); // host's active branch
  console.log(result.commits); // ["<sha>"]
  console.log(result.logFilePath); // .sanddune/logs/<run-id>.jsonl
  ```

The agent runs once in Docker, makes a commit (or doesn't), and the container is destroyed.
A run goes through three phases:
- Setup: resolve env vars, determine the host's current branch, plan the branch strategy, create a worktree if needed, start the container with the worktree bind-mounted at `/workspace`.
- Agent invocation: invoke the agent with the (inline) prompt, stream JSON events into the run log, capture stdout text events.
- Teardown: read commits off the worktree HEAD, fast-forward back to the host branch (`merge-to-head` only), tear down the container, and clean up the worktree (preserving it on disk if dirty).
The iteration loop honors four options:
- `maxIterations` (default 1).
- `completionSignal` (default `<promise>COMPLETE</promise>`), substring-matched against the agent's text events; the first match across iterations wins.
- `idleTimeoutSeconds` (default 600), which resets on every agent stream event. On expiry the agent subprocess is killed and the run rejects with `AgentIdleTimeoutError`.
- `signal`, a caller-supplied `AbortSignal`. When it fires mid-iteration, the agent subprocess is killed and `run()` rejects with `signal.reason` verbatim; the worktree is left as-is (per ADR-0011).

For prompt templates, prompt expansion evaluates !`shell expressions` once per iteration before the agent is invoked.

With `claudeCode()` (default `captureSessions: true`), each iteration's session JSONL is captured from the sandbox to `~/.claude/projects/<encoded-cwd>/sessions/<id>.jsonl` on the host, with `cwd` fields rewritten so `claude --resume` works natively. Capture is best-effort: a failure logs a warning and the run still resolves successfully.
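The completion-signal rule described above (substring match against text events, first match across iterations wins, bounded by `maxIterations`) can be sketched as a small pure function. This is an illustration only, not sanddune's implementation: `findCompletion`, `CompletionMatch`, and the array-of-arrays input shape are hypothetical names chosen for the example.

```typescript
// Hypothetical sketch: illustrates the documented matching rule, not sanddune's code.
const DEFAULT_SIGNAL = "<promise>COMPLETE</promise>";

interface CompletionMatch {
  iteration: number; // 1-based iteration that produced the match
  signal: string; // the matched completion-signal string
}

// textEventsByIteration[i] holds the text events emitted during iteration i + 1.
function findCompletion(
  textEventsByIteration: string[][],
  maxIterations = 1,
  signal = DEFAULT_SIGNAL,
): CompletionMatch | undefined {
  const index = textEventsByIteration
    .slice(0, maxIterations) // iterations past the limit never run
    // Substring match: the signal may be embedded in a longer text event.
    .findIndex((events) => events.some((text) => text.includes(signal)));
  return index === -1 ? undefined : { iteration: index + 1, signal };
}
```

Returning the 1-based iteration alongside the matched string mirrors how `completionSignal` appears on both the run result and the matching iteration result.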
Configure where the agent's commits land via the branchStrategy option on run(). When omitted, defaults to { type: "head" } for bind-mount providers (the only kind that can run today).
| Strategy | Where commits land | Host HEAD touched? |
|---|---|---|
| `{ type: "head" }` | Host working tree directly. No worktree. | Yes |
| `{ type: "merge-to-head" }` | Temp branch in a worktree, fast-forwarded into HEAD | Yes (on success) |
| `{ type: "branch", branch: "agent/fix-42" }` | Named branch in a worktree | No |
```typescript
// head: fast iteration during development
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "...",
});

// merge-to-head: safer; HEAD untouched if anything goes wrong mid-run
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  branchStrategy: { type: "merge-to-head" },
  prompt: "...",
});

// branch: commits on a named branch you can pick up later (e.g. for a PR)
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  branchStrategy: { type: "branch", branch: "agent/fix-42" },
  prompt: "...",
});
```

For `merge-to-head` and `branch`, sanddune creates the worktree under `.sanddune/worktrees/<id>/` with a lock under `.sanddune/locks/<id>.lock`. If the agent leaves the worktree dirty, it's preserved on disk and surfaced as `result.worktreePath`. For `branch`, re-running with the same branch name reuses the worktree (per ADR-0003).
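To make the three strategies easier to compare, here is a minimal sketch that encodes the strategy table as a function. The `BranchStrategy` union follows the shapes documented here; `planStrategy` and `StrategyPlan` are hypothetical names for illustration and are not part of the library's API.

```typescript
// Hypothetical sketch: encodes the strategy table above, not sanddune's planner.
type BranchStrategy =
  | { type: "head" }
  | { type: "merge-to-head" }
  | { type: "branch"; branch: string };

interface StrategyPlan {
  needsWorktree: boolean; // "head" writes to the host working tree directly
  resultBranch: string; // the branch a run result would report
  touchesHostHead: boolean; // for merge-to-head this only happens on success
}

function planStrategy(strategy: BranchStrategy, hostBranch: string): StrategyPlan {
  switch (strategy.type) {
    case "head":
      return { needsWorktree: false, resultBranch: hostBranch, touchesHostHead: true };
    case "merge-to-head":
      return { needsWorktree: true, resultBranch: hostBranch, touchesHostHead: true };
    case "branch":
      return { needsWorktree: true, resultBranch: strategy.branch, touchesHostHead: false };
  }
}
```

For example, `planStrategy({ type: "branch", branch: "agent/fix-42" }, "main")` reports `agent/fix-42` as the result branch and leaves host HEAD alone.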
```typescript
import { run, claudeCode } from "@missingstudio/sanddune";
import { docker } from "@missingstudio/sanddune/sandboxes/docker";

const result = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "Fix the typo in README.md and commit.",
  // Optional:
  cwd: "../other-repo", // host repo root, defaults to process.cwd()
  branchStrategy: { type: "branch", branch: "agent/x" }, // see Branch strategies
  env: { MY_VAR: "value" }, // call-site env override (ADR-0012)
});
```

| Option | Type | Behavior |
|---|---|---|
| `agent` | `AgentProvider` | Required. Today: `claudeCode(model, { env? })` |
| `sandbox` | `BindMountSandboxProvider` | Required. Today: `docker()` or your own bind-mount provider |
| `prompt` | `string` | Inline prompt; passed to the agent verbatim (ADR-0008) |
| `promptFile` | `string` | Path to a prompt template file; relative paths resolve from `process.cwd()` |
| `cwd` | `string` | Host repo dir; relative paths resolve from `process.cwd()` |
| `branchStrategy` | `BranchStrategy` | `head` (default) / `merge-to-head` / `branch` |
| `env` | `Record<string, string>` | Call-site env override; highest precedence (ADR-0012) |
Exactly one of prompt / promptFile is required. On the template path, sanddune performs host-side {{KEY}} substitution before the agent runs: every {{KEY}} placeholder is replaced with its value from promptArgs, plus the built-in arguments {{SOURCE_BRANCH}} and {{TARGET_BRANCH}} (resolved from the active branch strategy). Passing SOURCE_BRANCH or TARGET_BRANCH in promptArgs throws — built-ins cannot be overridden. A {{KEY}} with no matching arg throws naming the key; unused promptArgs keys log a warning. Inline prompt skips substitution entirely (per ADR-0008), and combining prompt with promptArgs throws. Templates can also embed !`command` shell expressions, evaluated in parallel inside the sandbox before each iteration; substitution runs first, so {{KEY}} placeholders inside shell expressions are valid (e.g. !`gh issue view {{ISSUE}}`). All of this is owned by the preparePromptPipeline() module, exported from @missingstudio/sanddune for callers who want to reuse the host-side validation outside run().
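The substitution rules above can be sketched as a small host-side function. This is an illustration under assumptions, not the library's `preparePromptPipeline()`: the function name `substitutePlaceholders`, the `\w+` placeholder pattern, and the error messages are all hypothetical. It returns unused keys instead of logging so the caller can decide how to warn.

```typescript
// Hypothetical sketch of {{KEY}} substitution; not sanddune's preparePromptPipeline().
type PromptArgs = Record<string, string | number>;

interface BuiltIns {
  SOURCE_BRANCH: string;
  TARGET_BRANCH: string;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function substitutePlaceholders(
  template: string,
  promptArgs: PromptArgs,
  builtIns: BuiltIns,
): { prompt: string; unusedKeys: string[] } {
  // Built-ins cannot be overridden by caller-supplied args.
  for (const key of Object.keys(builtIns)) {
    if (hasOwn(promptArgs, key)) {
      throw new Error(`promptArgs cannot override built-in ${key}`);
    }
  }
  const values: PromptArgs = { ...promptArgs, ...builtIns };
  const used = new Set<string>();
  // Assumption: placeholder keys are word characters only.
  const prompt = template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    if (!hasOwn(values, key)) {
      throw new Error(`No prompt argument supplied for {{${key}}}`);
    }
    used.add(key);
    return String(values[key]);
  });
  // Unused caller keys are reported so the caller can log a warning.
  const unusedKeys = Object.keys(promptArgs).filter((key) => !used.has(key));
  return { prompt, unusedKeys };
}
```

Because substitution is a plain text pass over the whole template, placeholders inside shell expressions are filled in before any command would be evaluated.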
copyToWorktree accepts a list of host paths (relative paths resolve against cwd, the target-repo perspective; absolute paths are used as-is) that are copied into the worktree before any hook fires. Rejected at runtime with branchStrategy: { type: "head" } — there is no separate worktree to copy into.
hooks runs in this order: host.onWorktreeReady (sequential) → sandbox created → host.onSandboxReady ∥ sandbox.onSandboxReady (parallel; the two sides are not coordinated, so setup that needs ordering across host/sandbox must live entirely on one side). Host hooks are { command, timeoutMs? }; sandbox hooks add { sudo? }. Per-hook timeout defaults to 60_000ms and is caller-overridable via timeoutMs. The copyToWorktree step has its own timeout (timeouts.copyToWorktreeMs, default 60_000ms). Non-zero exit fails the run with the offending command and exit code; the caller signal is threaded into every hook so abort kills in-flight commands.
```typescript
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  branchStrategy: { type: "merge-to-head" },
  prompt: "...",
  copyToWorktree: [".env.example", "fixtures/"],
  hooks: {
    host: {
      onWorktreeReady: [{ command: "cp .env.example .env" }],
    },
    sandbox: {
      onSandboxReady: [{ command: "bun install", timeoutMs: 120_000 }],
    },
  },
  timeouts: { copyToWorktreeMs: 30_000 },
});
```

`resumeSession: "<id>"` continues a prior Claude Code session in a fresh sandbox. It is validated before sandbox creation: the host session file must exist (at the path written by a previous capture), and the option is rejected when combined with `maxIterations > 1`. The file is transferred into the sandbox with `cwd` rewritten, and `--resume <id>` is passed to Claude Code on iteration 1 only; subsequent iterations start fresh. Non-Claude agent providers ignore `resumeSession`.
`timeouts.idleSeconds` and `timeouts.totalSeconds` are silently ignored. Don't depend on them.
```typescript
logging: { type: "file" } // default
logging: { type: "file", path: "/abs/or/relative.jsonl" } // override location
logging: { type: "file", onAgentStreamEvent: (event) => { /* forward */ } }
logging: { type: "stdout" } // terminal mode
```

In log-to-file mode (default), sanddune writes a run log to `.sanddune/logs/<run-id>.jsonl` (or `path` when supplied) and prints a `tail -f` hint. `RunResult.logFilePath` is the absolute path of the file. The optional `onAgentStreamEvent` callback fires synchronously for every parsed agent stream event carrying `{ iteration, timestamp, type: "text" | "toolCall", ... }`, and is intended for forwarding to an observability system. The callback is fire-and-forget; thrown errors are swallowed onto stderr so a broken forwarder cannot kill the run.
In terminal mode ({ type: "stdout" }), sanddune renders an interactive UI directly: spinners while iterations run, a status line per iteration, and a final summary. RunResult.logFilePath is undefined; onAgentStreamEvent is not surfaced in this mode (the rendered UI is the channel).
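The fire-and-forget contract for `onAgentStreamEvent` in file mode can be sketched as a wrapper that catches whatever the callback throws. This is a minimal illustration, not sanddune's code: `fireAndForget` and the `report` parameter are hypothetical, and the event type lists only the fields named in this README (the `timestamp` type is assumed to be a string).

```typescript
// Hypothetical event shape based on the documented fields; other fields omitted.
interface AgentStreamEvent {
  iteration: number;
  timestamp: string; // assumption: the real type is not stated in this README
  type: "text" | "toolCall";
}

type StreamListener = (event: AgentStreamEvent) => void;

// Wraps a listener so that a throwing forwarder is reported and then ignored.
function fireAndForget(
  listener: StreamListener | undefined,
  report: (message: string) => void = (message) => console.error(message),
): StreamListener {
  return (event) => {
    if (!listener) return;
    try {
      listener(event); // synchronous call; any return value is ignored
    } catch (error) {
      // Swallow the error so a broken forwarder cannot abort the caller.
      report(`onAgentStreamEvent threw: ${String(error)}`);
    }
  };
}
```

A design note: because the callback is invoked synchronously, a slow forwarder would still delay event handling, so handing events to a queue inside the callback is a reasonable pattern.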
```typescript
name: "issue-42"
```

Optional display name prefixed in log output (`[issue-42] tail -f …`) and terminal mode rendering for parallel-run readability. Cosmetic only: it is not persisted in the run log records.
```typescript
interface RunResult {
  branch: string; // result branch: host's active branch (head/merge-to-head) or named branch
  worktreePath?: string; // set when the worktree was preserved on disk after a dirty close
  iterations: IterationResult[]; // one entry per iteration that ran
  commits: string[]; // SHAs reachable from worktree HEAD past the pre-run tip, ordered by iteration
  completionSignal?: string; // the matched completion-signal string, if the loop terminated by signal
  stdout: string; // concatenated text events from the agent stream
  logFilePath?: string; // path to the JSONL run log; undefined in terminal mode
}

interface IterationResult {
  iteration: number;
  commitSha?: string; // last commit produced on this iteration, if any
  sessionId?: string; // agent session id (Claude Code system/init), when sessionCapture is configured
  sessionFilePath?: string; // host path to captured JSONL, when capture succeeded
  usage?: IterationUsage; // raw token counts parsed from the captured session JSONL (per ADR-0005b)
  completionSignal?: string; // set on the iteration that matched the completion signal
}

interface IterationUsage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationInputTokens?: number;
  cacheReadInputTokens?: number;
}
```

Per ADR-0012, env vars are declaration-driven: a key reaches the sandbox iff it's declared in one of these layers. `process.env` only supplies values for already-declared keys.
Layers, lowest → highest precedence:
1. `process.env`: value source only, never declares
2. `.sanddune/.env`: declaration site and value source
3. Agent provider `env` and sandbox provider `env`: declaration sites; must not overlap
4. `RunOptions.env`: call-site escape hatch; declares, sets, and overrides
```typescript
await run({
  agent: claudeCode("claude-opus-4-7", {
    env: { ANTHROPIC_API_KEY: "sk-ant-..." },
  }),
  sandbox: docker({
    env: { SOME_DOCKER_VAR: "x" },
  }),
  env: { OVERRIDE: "y" },
  prompt: "...",
});
```

If an agent provider's `env` and a sandbox provider's `env` share a key, `run()` throws.
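The layering can be sketched as a pure merge function. This is an illustration under a loud assumption, not the behavior specified by ADR-0012: the sketch assumes `process.env` fills in a declared key only when the declared value is the empty string (for example `KEY=` in `.sanddune/.env`). The names `resolveEnv` and `EnvLayers` are hypothetical.

```typescript
// Hypothetical sketch of declaration-driven env resolution; not sanddune's resolver.
type Env = Record<string, string>;

interface EnvLayers {
  processEnv: Record<string, string | undefined>; // value source only, never declares
  dotEnv: Env; // .sanddune/.env: declares and supplies values
  agentEnv: Env; // agent provider env
  sandboxEnv: Env; // sandbox provider env
  runEnv: Env; // RunOptions.env: highest precedence
}

function resolveEnv(layers: EnvLayers): Env {
  // Agent and sandbox provider env must not declare the same key.
  const overlap = Object.keys(layers.agentEnv).filter((key) =>
    Object.prototype.hasOwnProperty.call(layers.sandboxEnv, key),
  );
  if (overlap.length > 0) {
    throw new Error(`agent and sandbox env both declare: ${overlap.join(", ")}`);
  }
  // Later spreads win, so this order is lowest to highest precedence.
  const declared: Env = {
    ...layers.dotEnv,
    ...layers.agentEnv,
    ...layers.sandboxEnv,
    ...layers.runEnv,
  };
  const resolved: Env = {};
  for (const [key, value] of Object.entries(declared)) {
    // ASSUMPTION: an empty declared value falls back to process.env.
    resolved[key] = value !== "" ? value : (layers.processEnv[key] ?? "");
  }
  return resolved; // undeclared process.env keys never appear here
}
```

The key property the sketch preserves is that iterating over declared keys, rather than over `process.env`, keeps undeclared host variables out of the sandbox.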
```typescript
docker({
  image: "sanddune:my-repo", // optional; defaults to sanddune:<repo-dir-name>
  env: { SOMETHING: "value" }, // optional; merged per ADR-0012
})
```

Today's `DockerOptions` is `{ image?, env? }`. The image must already exist locally; sanddune does not build it for you (no `sanddune init` / `build-image` yet). Build manually:

```sh
docker build -t sanddune:my-repo -f .sanddune/Dockerfile .sanddune
```

The container is started with `-w /workspace` and the worktree bind-mounted there. If the worktree's `.git` is a pointer file (true for git worktrees), sanddune also bind-mounts the parent `.git` directory at its host path inside the container so the pointer resolves (per ADR-0006).
```typescript
claudeCode("claude-opus-4-7", {
  env: { ANTHROPIC_API_KEY: "sk-ant-..." }, // optional; merged per ADR-0012
  captureSessions: true, // optional; default true. Set false to skip session capture
})
```

`captureSessions` controls whether each iteration's agent session JSONL is captured from the sandbox to the host (`~/.claude/projects/<encoded-cwd>/sessions/<id>.jsonl`, with `cwd` rewritten so `claude --resume` works natively). Capture is best-effort: a failure logs a warning and the run still resolves successfully. The `effort` option shown in the brief doesn't exist yet.
interactive() launches the agent's interactive UI (e.g. Claude Code's TUI) and resolves when the user exits. It accepts all three sandbox provider kinds — bind-mount, isolated, and no-sandbox — and always uses the provider's default branch strategy. There is no branchStrategy option on top-level interactive(); for a non-default strategy with a TUI, route through createWorktree() + wt.interactive() (per ADR-0009).
```typescript
import { createWorktree, interactive, claudeCode } from "@missingstudio/sanddune";
import { docker } from "@missingstudio/sanddune/sandboxes/docker";
import { noSandbox } from "@missingstudio/sanddune/sandboxes/no-sandbox";

// TUI inside Docker: same flow as run(), but you drive it.
await interactive({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "Help me refactor the prompt pipeline.", // optional seed prompt
});

// TUI directly on the host: no container, agent's permission prompts stay on.
await interactive({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: noSandbox(),
  cwd: "../other-repo", // optional; relative paths resolve against process.cwd()
});

// TUI on a worktree branch (non-default strategy needs the wt.* path).
await using wt = await createWorktree({
  branchStrategy: { type: "branch", branch: "agent/explore" },
});
await wt.interactive({ agent: claudeCode("claude-opus-4-7") });
// `sandbox` defaults to noSandbox() on wt.interactive(); pass docker() etc. to override.
```

| Option | Type | Behavior |
|---|---|---|
| `agent` | `AgentProvider` | Required. Must declare `buildInteractiveCommand` (`claudeCode` does) |
| `sandbox` | `InteractiveSandboxProvider` | Required on `interactive()`; defaults to `noSandbox()` on `wt.interactive()` |
| `prompt` | `string` | Optional seed prompt; passed as a positional arg to the agent |
| `promptFile` | `string` | Optional template file; `{{KEY}}` substitution + shell expressions |
| `promptArgs` | `Record<string, string \| number>` | Values for `{{KEY}}` placeholders; only valid with `promptFile` |
| `cwd` | `string` | Host repo dir; relative paths resolve from `process.cwd()` |
| `env` | `Record<string, string>` | Call-site env override (per ADR-0012) |
| `hooks` | `SandboxHooks` | Same shape as `run()`; sandbox-side hooks are skipped under `noSandbox()` |
| `signal` | `AbortSignal` | Abort cancels the launch handshake; once the TUI is live, exit semantics depend on the underlying agent process |
| `copyToWorktree` | `string[]` | Copies host items into the worktree before hooks fire (rejected with branch strategy `head`) |
maxIterations, completionSignal, idleTimeoutSeconds, and logging are not part of InteractiveOptions — interactive sessions are user-driven, not iteration-bounded.
Under bind-mount and isolated providers, sanddune passes --dangerously-skip-permissions to Claude Code so the agent can act freely inside the container. Under noSandbox(), the agent runs directly on the host and the flag is not passed — Claude Code's normal permission prompts stay active. This is enforced via the agent provider's buildInteractiveCommand({ skipPermissions }) callback; the orchestrator decides skipPermissions based on the sandbox kind.
noSandbox() is accepted only by interactive() / wt.interactive(). The type system rejects it for run() and createSandbox() — AFK runs require a real sandbox.
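The permission rule described above (skip prompts inside a container, keep them on the host) can be sketched as follows. This is an illustration, not the orchestrator's code: the `"bind-mount"` kind string appears in the custom provider example in this README, while `"isolated"` and `"no-sandbox"` are assumed spellings, and `shouldSkipPermissions` and `sketchInteractiveCommand` are hypothetical helpers.

```typescript
// Hypothetical sketch: only "bind-mount" is a kind string confirmed by this README.
type SandboxKind = "bind-mount" | "isolated" | "no-sandbox";

// The orchestrator, not the agent provider, decides based on the sandbox kind.
function shouldSkipPermissions(kind: SandboxKind): boolean {
  return kind !== "no-sandbox";
}

// Loosely modelled on a buildInteractiveCommand({ skipPermissions }) callback.
function sketchInteractiveCommand(opts: {
  skipPermissions: boolean;
  prompt?: string;
}): string[] {
  const args = ["claude"];
  if (opts.skipPermissions) args.push("--dangerously-skip-permissions");
  // An optional seed prompt is passed as a positional argument.
  if (opts.prompt !== undefined) args.push(opts.prompt);
  return args;
}

// On the host, the flag is omitted so normal permission prompts stay active.
const hostCommand = sketchInteractiveCommand({
  skipPermissions: shouldSkipPermissions("no-sandbox"),
  prompt: "Help me refactor the prompt pipeline.",
});
```

Keeping the decision in one function means an agent provider only has to honor the boolean it is given, whatever sandbox kinds exist.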
If you want to wrap something other than Docker (a different container runtime, an SSH host, a local subprocess), construct a BindMountSandboxProvider directly:
```typescript
import type {
  BindMountCreateOptions,
  BindMountSandboxHandle,
  BindMountSandboxProvider,
} from "@missingstudio/sanddune";

const myProvider = (): BindMountSandboxProvider => ({
  kind: "bind-mount",
  name: "my-provider",
  create: async (opts: BindMountCreateOptions): Promise<BindMountSandboxHandle> => {
    // opts.worktreePath: host path to mount into your sandbox
    // opts.hostRepoPath: caller's `cwd`, e.g. for default image-name derivation
    // opts.env: resolved env vars to inject
    return {
      worktreePath: "/workspace", // sandbox-side path
      exec: async (command, execOpts) => ({
        stdout: "...",
        stderr: "...",
        exitCode: 0,
      }),
      close: async () => { /* tear down */ },
    };
  },
});
```

Reference implementation: `packages/sanddune/src/sandboxes/docker.ts`.
The isolated-provider factory is declared in the type system but not yet usable from run().
Every run() in log-to-file mode (the default) streams JSONL events to .sanddune/logs/<run-id>.jsonl (or the path supplied via logging.path) and prints a tail -f hint at start. Pass logging: { type: "stdout" } to switch to terminal mode — sanddune then renders the run inline (spinners + status lines + summary) and RunResult.logFilePath is undefined. Follow the file from another terminal:
```sh
tail -f .sanddune/logs/*.jsonl
```

`./.sanddune/` contains three runnable demos, one per branch strategy. After `bun install && bun run build`, build the image and run any of:

```sh
bun .sanddune/docker-head-claude-code.ts
bun .sanddune/docker-merge-to-head-claude-code.ts
bun .sanddune/docker-branch-claude-code.ts
```

See .sanddune/README.md for the full walkthrough.
- `CONTEXT.md`: domain language and relationships
- `docs/adr/`: architecture decision records
```sh
bun install
bun run build # tsc -p tsconfig.build.json + bun build, orchestrated by turbo
bun test # bun test
bun run typecheck # tsgo --noEmit
```

sanddune is heavily inspired by sandcastle by Matt Pocock. If you're evaluating agent orchestration libraries, go check out the original.
MIT
