GitHub - missingstudio/sanddune: A TypeScript library for orchestrating AI coding agents in isolated sandboxes

What is sanddune?

A TypeScript library for orchestrating AI coding agents in isolated sandboxes.

You call sanddune.run().
An agent runs in a Docker container against a git worktree.
You get commits on a branch.
Your host working tree stays clean (or doesn't, depending on the branch strategy you pick).

Note: sanddune is heavily inspired by Matt Pocock's sandcastle. The core orchestration model — branch strategies, bind-mount providers, the run() API shape — is based on his work.

Prerequisites

Git
Docker Desktop (or any Docker-compatible runtime)
Bun ≥ 1.3 if you're working in this repo
An ANTHROPIC_API_KEY

What's implemented today

Surface	Status	Notes
`run()`	✅ shipped	Bind-mount only; inline `prompt` and `promptFile`; multi-iteration via `maxIterations` + `completionSignal` + `idleTimeoutSeconds`
`docker()` sandbox provider	✅ shipped	Bind-mount, auto image-exists check, parent `.git` re-mount for worktrees
`claudeCode()` agent provider	✅ shipped	`--print --output-format stream-json --verbose --dangerously-skip-permissions`
`branchStrategy: { type: "head" }`	✅ shipped	Default. Agent writes directly to host working tree.
`branchStrategy: { type: "merge-to-head" }`	✅ shipped	Worktree under `.sanddune/worktrees/<id>/`, fast-forward back to HEAD on success
`branchStrategy: { type: "branch", branch }`	✅ shipped	Named branch in a worktree; reused on re-run
Env resolution	✅ shipped	`.sanddune/.env` + agent/sandbox `env` + `RunOptions.env`
JSONL run log	✅ shipped	Streamed to `.sanddune/logs/<run-id>.jsonl` (or `logging.path`)
Terminal mode (`logging: { type: "stdout" }`)	✅ shipped	Spinners + styled status lines + run summary
`onAgentStreamEvent` callback	✅ shipped	File mode only. Synchronous, fire-and-forget; thrown errors are caught and written to stderr
`IterationResult.usage`	✅ shipped	Raw token counts parsed from captured Claude session JSONL (per ADR-0005b)
Custom bind-mount provider	✅ shipped	Build your own by constructing a `BindMountSandboxProvider`
`createSandbox()`	✅ shipped	Long-lived reusable sandbox on a single branch; multiple `sandbox.run()` calls reuse the container; `await using` auto-disposes
`createWorktree()`	✅ shipped	Long-lived worktree as an independent lifecycle; `wt.run()` / `wt.createSandbox()` / `wt.interactive()` layer on top with split-close ownership (ADR-0010)
`interactive()`	✅ shipped	Launches the agent's TUI inside a sandbox or directly on the host; accepts `bind-mount`, `isolated`, or `no-sandbox` providers; uses the provider's default branch strategy
`noSandbox()` sandbox provider	✅ shipped	Runs the agent on the host with no container; accepted only by `interactive()` / `wt.interactive()`; the agent's normal permission prompts stay active

Quick start

There is no sanddune init yet — set up by hand. The repo's .sanddune/ directory shows the layout.

Install:

npm install --save-dev @missingstudio/sanddune

Create a .sanddune/Dockerfile that includes git, gh, the Claude Code CLI, and a non-root user (use ./.sanddune/Dockerfile as a starting point).
Build the image. The default tag is sanddune:<repo-dir-name>:

docker build -t sanddune:my-repo -f .sanddune/Dockerfile .sanddune

Create .sanddune/.env with your ANTHROPIC_API_KEY:

ANTHROPIC_API_KEY=sk-ant-...

Write a script and run it with bun (or tsx):

import { run, claudeCode } from "@missingstudio/sanddune";
import { docker } from "@missingstudio/sanddune/sandboxes/docker";

const result = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "Add a HELLO.md file with the text 'hi'. Commit it.",
});

console.log(result.branch);          // host's active branch
console.log(result.commits);         // ["<sha>"]
console.log(result.logFilePath);     // .sanddune/logs/<run-id>.jsonl

The agent runs once in Docker, makes a commit (or doesn't), and the container is destroyed.

How it works

A run goes through three phases:

Setup — resolve env vars, determine the host's current branch, plan the branch strategy, create a worktree if needed, start the container with the worktree bind-mounted at /workspace.
Agent invocation — invoke the agent with the (inline) prompt, stream JSON events into the run log, capture stdout text events.
Teardown — read commits off the worktree HEAD, fast-forward back to the host branch (merge-to-head only), tear down the container, and clean up the worktree (preserving it on disk if dirty).

The iteration loop honors four options:

maxIterations (default 1)
completionSignal (default <promise>COMPLETE</promise>): substring-matched against the agent's text events; the first match across iterations wins
idleTimeoutSeconds (default 600): resets on every agent stream event; on expiry the agent subprocess is killed and the run rejects with AgentIdleTimeoutError
signal (caller-supplied AbortSignal): when it fires mid-iteration, the agent subprocess is killed and run() rejects with signal.reason verbatim; the worktree is left as-is (per ADR-0011)

For prompt templates, prompt expansion evaluates !`shell expressions` once per iteration, before the agent is invoked.

With claudeCode() (default captureSessions: true), each iteration's session JSONL is captured from the sandbox to ~/.claude/projects/<encoded-cwd>/sessions/<id>.jsonl on the host, with cwd fields rewritten so claude --resume works natively. Capture is best-effort: a failure logs a warning and the run still resolves successfully.
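To make the stop conditions concrete, here is a minimal sketch of a loop bounded by maxIterations that ends early on a completion-signal substring match. It is not sanddune's source: runLoop, LoopOptions, LoopOutcome, and the fake agent are hypothetical names, and the agent is modeled as a synchronous function for brevity.

```typescript
// Hypothetical sketch of the loop's stop conditions; not sanddune's implementation.
const DEFAULT_COMPLETION_SIGNAL = "<promise>COMPLETE</promise>";

interface LoopOptions {
  maxIterations?: number;    // defaults to 1, as described above
  completionSignal?: string; // substring searched for in the agent's text events
}

interface LoopOutcome {
  iterationsRun: number;
  completionSignal?: string; // set only when the loop stopped on a match
}

// `runIteration` stands in for one agent invocation and returns its text events.
function runLoop(
  runIteration: (iteration: number) => string[],
  options: LoopOptions = {},
): LoopOutcome {
  const maxIterations = options.maxIterations ?? 1;
  const signal = options.completionSignal ?? DEFAULT_COMPLETION_SIGNAL;

  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    const textEvents = runIteration(iteration);
    // Substring match: the signal may appear anywhere inside a text event.
    if (textEvents.some((text) => text.includes(signal))) {
      return { iterationsRun: iteration, completionSignal: signal };
    }
  }
  return { iterationsRun: maxIterations };
}

// A fake agent that reports completion on its second iteration.
const fakeAgent = (iteration: number): string[] =>
  iteration === 2 ? ["All done. <promise>COMPLETE</promise>"] : ["still working"];

const outcome = runLoop(fakeAgent, { maxIterations: 5 });
console.log(outcome.iterationsRun); // 2
```

The idle timeout and abort signal are left out of the sketch; both terminate the run by rejecting rather than by returning a result.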

Branch strategies

Configure where the agent's commits land via the branchStrategy option on run(). When omitted, defaults to { type: "head" } for bind-mount providers (the only kind that can run today).

Strategy	Where commits land	Host HEAD touched?
`{ type: "head" }`	Host working tree directly. No worktree.	Yes
`{ type: "merge-to-head" }`	Temp branch in a worktree, fast-forwarded into HEAD	Yes (on success)
`{ type: "branch", branch: "agent/fix-42" }`	Named branch in a worktree	No

// head — fast iteration during development
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "...",
});

// merge-to-head — safer; HEAD untouched if anything goes wrong mid-run
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  branchStrategy: { type: "merge-to-head" },
  prompt: "...",
});

// branch — commits on a named branch you can pick up later (e.g. for a PR)
await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  branchStrategy: { type: "branch", branch: "agent/fix-42" },
  prompt: "...",
});

For merge-to-head and branch, sanddune creates the worktree under .sanddune/worktrees/<id>/ with a lock under .sanddune/locks/<id>.lock. If the agent leaves the worktree dirty, it's preserved on disk and surfaced as result.worktreePath. For branch, re-running with the same branch name reuses the worktree (per ADR-0003).
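The strategy table can also be read as a small decision function. The sketch below restates it as a discriminated union; planStrategy and StrategyPlan are hypothetical names used only for illustration, and the resultBranch field follows the RunResult.branch description (host's active branch for head and merge-to-head, the named branch otherwise).

```typescript
// Hypothetical restatement of the branch-strategy table; not sanddune's internals.
type BranchStrategy =
  | { type: "head" }
  | { type: "merge-to-head" }
  | { type: "branch"; branch: string };

interface StrategyPlan {
  usesWorktree: boolean;
  touchesHostHead: "always" | "on-success" | "never";
  resultBranch: string;
}

function planStrategy(strategy: BranchStrategy, hostBranch: string): StrategyPlan {
  switch (strategy.type) {
    case "head":
      // The agent writes directly to the host working tree.
      return { usesWorktree: false, touchesHostHead: "always", resultBranch: hostBranch };
    case "merge-to-head":
      // Temp branch in a worktree, fast-forwarded into HEAD on success.
      return { usesWorktree: true, touchesHostHead: "on-success", resultBranch: hostBranch };
    case "branch":
      // Commits stay on the named branch.
      return { usesWorktree: true, touchesHostHead: "never", resultBranch: strategy.branch };
  }
}

const plan = planStrategy({ type: "branch", branch: "agent/fix-42" }, "main");
console.log(plan.resultBranch); // "agent/fix-42"
```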

API

`run(options)`

import { run, claudeCode } from "@missingstudio/sanddune";
import { docker } from "@missingstudio/sanddune/sandboxes/docker";

const result = await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "Fix the typo in README.md and commit.",

  // Optional:
  cwd: "../other-repo",                                  // host repo root, defaults to process.cwd()
  branchStrategy: { type: "branch", branch: "agent/x" }, // see Branch strategies
  env: { MY_VAR: "value" },                              // call-site env override (ADR-0012)
});

Currently accepted

Option	Type	Behavior
`agent`	`AgentProvider`	Required. Today: `claudeCode(model, { env? })`
`sandbox`	`BindMountSandboxProvider`	Required. Today: `docker()` or your own bind-mount provider
`prompt`	`string`	Inline prompt; passed to the agent verbatim (ADR-0008)
`promptFile`	`string`	Path to a prompt template file; relative paths resolve from `process.cwd()`
`cwd`	`string`	Host repo dir; relative paths resolve from `process.cwd()`
`branchStrategy`	`BranchStrategy`	`head` (default) / `merge-to-head` / `branch`
`env`	`Record<string, string>`	Call-site env override; highest precedence (ADR-0012)

Exactly one of prompt / promptFile is required. On the template path, sanddune performs host-side {{KEY}} substitution before the agent runs: every {{KEY}} placeholder is replaced with its value from promptArgs, plus the built-in arguments {{SOURCE_BRANCH}} and {{TARGET_BRANCH}} (resolved from the active branch strategy). Passing SOURCE_BRANCH or TARGET_BRANCH in promptArgs throws — built-ins cannot be overridden. A {{KEY}} with no matching arg throws naming the key; unused promptArgs keys log a warning. Inline prompt skips substitution entirely (per ADR-0008), and combining prompt with promptArgs throws. Templates can also embed !`command` shell expressions, evaluated in parallel inside the sandbox before each iteration; substitution runs first, so {{KEY}} placeholders inside shell expressions are valid (e.g. !`gh issue view {{ISSUE}}`). All of this is owned by the preparePromptPipeline() module, exported from @missingstudio/sanddune for callers who want to reuse the host-side validation outside run().
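The substitution rules above can be sketched in a few lines. This is an illustration of the described behavior, not the preparePromptPipeline() implementation: the function name substitute, its signature, and the error and warning messages are all assumptions.

```typescript
// Hypothetical sketch of host-side {{KEY}} substitution; names and messages are assumptions.
const BUILT_INS = ["SOURCE_BRANCH", "TARGET_BRANCH"] as const;

type PromptArgs = Record<string, string | number>;

interface SubstitutionResult {
  prompt: string;
  warnings: string[];
}

const has = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function substitute(
  template: string,
  promptArgs: PromptArgs,
  builtIns: { SOURCE_BRANCH: string; TARGET_BRANCH: string },
): SubstitutionResult {
  // Built-ins cannot be overridden by the caller.
  for (const name of BUILT_INS) {
    if (has(promptArgs, name)) {
      throw new Error(`promptArgs cannot override built-in argument ${name}`);
    }
  }

  const values: PromptArgs = { ...promptArgs, ...builtIns };
  const used = new Set<string>();

  const prompt = template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    if (!has(values, key)) {
      // A placeholder with no matching argument is an error that names the key.
      throw new Error(`No value supplied for placeholder {{${key}}}`);
    }
    used.add(key);
    return String(values[key]);
  });

  // Unused caller-supplied keys produce warnings rather than errors.
  const warnings = Object.keys(promptArgs)
    .filter((key) => !used.has(key))
    .map((key) => `promptArgs key "${key}" is not used by the template`);

  return { prompt, warnings };
}

const branches = { SOURCE_BRANCH: "agent/x", TARGET_BRANCH: "main" };
const expanded = substitute(
  "Fix issue {{ISSUE}} on {{TARGET_BRANCH}}. Context: !`gh issue view {{ISSUE}}`",
  { ISSUE: 42, EXTRA: "unused" },
  branches,
);
console.log(expanded.prompt);
```

Because substitution runs before shell expansion, the placeholder inside the !`...` expression is already filled in by the time the command would be evaluated in the sandbox.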

Lifecycle hooks and `copyToWorktree`

copyToWorktree accepts a list of host paths (relative paths resolve against cwd, the target-repo perspective; absolute paths are used as-is) that are copied into the worktree before any hook fires. It is rejected at runtime with branchStrategy: { type: "head" }, because there is no separate worktree to copy into.

Hooks run in this order: host.onWorktreeReady (sequential) → sandbox created → host.onSandboxReady ∥ sandbox.onSandboxReady (parallel). The two sides of onSandboxReady are not coordinated, so setup that needs ordering across host and sandbox must live entirely on one side. Host hooks are { command, timeoutMs? }; sandbox hooks add { sudo? }. The per-hook timeout defaults to 60_000 ms and can be overridden via timeoutMs. The copyToWorktree step has its own timeout (timeouts.copyToWorktreeMs, default 60_000 ms). A non-zero exit fails the run, reporting the offending command and exit code; the caller's signal is threaded into every hook, so an abort kills in-flight commands.

await run({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  branchStrategy: { type: "merge-to-head" },
  prompt: "...",
  copyToWorktree: [".env.example", "fixtures/"],
  hooks: {
    host: {
      onWorktreeReady: [{ command: "cp .env.example .env" }],
    },
    sandbox: {
      onSandboxReady: [{ command: "bun install", timeoutMs: 120_000 }],
    },
  },
  timeouts: { copyToWorktreeMs: 30_000 },
});
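As a reading aid for the ordering and timeout defaults, the sketch below turns a hooks object into the two phases described above, with the 60_000 ms default applied. It only plans; it does not execute anything. HooksSketch, PlannedHook, HookPhase, and planHookPhases are hypothetical names, not sanddune exports.

```typescript
// Hypothetical shapes mirroring the hook options described above.
interface HostHook { command: string; timeoutMs?: number }
interface SandboxHook extends HostHook { sudo?: boolean }

interface HooksSketch {
  host?: { onWorktreeReady?: HostHook[]; onSandboxReady?: HostHook[] };
  sandbox?: { onSandboxReady?: SandboxHook[] };
}

const DEFAULT_HOOK_TIMEOUT_MS = 60_000;

interface PlannedHook { command: string; timeoutMs: number; sudo: boolean }

interface HookPhase {
  name: "onWorktreeReady" | "onSandboxReady";
  host: PlannedHook[];
  sandbox: PlannedHook[]; // runs alongside `host`, with no cross-side ordering
}

// Apply defaults to one list of hooks.
const planHooks = (hooks: SandboxHook[] | undefined): PlannedHook[] =>
  (hooks ?? []).map((hook) => ({
    command: hook.command,
    timeoutMs: hook.timeoutMs ?? DEFAULT_HOOK_TIMEOUT_MS,
    sudo: hook.sudo ?? false,
  }));

function planHookPhases(hooks: HooksSketch): HookPhase[] {
  return [
    // Phase 1: host-only, before the sandbox exists.
    { name: "onWorktreeReady", host: planHooks(hooks.host?.onWorktreeReady), sandbox: [] },
    // Phase 2: after the sandbox is created; the two sides run in parallel.
    {
      name: "onSandboxReady",
      host: planHooks(hooks.host?.onSandboxReady),
      sandbox: planHooks(hooks.sandbox?.onSandboxReady),
    },
  ];
}

const phases = planHookPhases({
  host: { onWorktreeReady: [{ command: "cp .env.example .env" }] },
  sandbox: { onSandboxReady: [{ command: "bun install", timeoutMs: 120_000 }] },
});
console.log(phases.map((phase) => phase.name).join(" -> "));
```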

resumeSession: "<id>" continues a prior Claude Code session in a fresh sandbox. Validated before sandbox creation: the host session file must exist (at the path written by a previous capture), and the option is rejected when combined with maxIterations > 1. The file is transferred into the sandbox with cwd rewritten, and --resume <id> is passed to Claude Code on iteration 1 only — subsequent iterations start fresh. Non-Claude agent providers ignore resumeSession.

Accepted by the type but not yet wired

timeouts.idleSeconds and timeouts.totalSeconds are accepted by the type but silently ignored. Don't depend on them; the idle timeout that is wired today is idleTimeoutSeconds.

`logging`

logging: { type: "file" }                                // default
logging: { type: "file", path: "/abs/or/relative.jsonl" } // override location
logging: { type: "file", onAgentStreamEvent: (event) => { /* forward */ } }
logging: { type: "stdout" }                              // terminal mode

In log-to-file mode (default), sanddune writes a run log to .sanddune/logs/<run-id>.jsonl (or path when supplied) and prints a tail -f hint. RunResult.logFilePath is the absolute path of the file. The optional onAgentStreamEvent callback fires synchronously for every parsed agent stream event, carrying { iteration, timestamp, type: "text" | "toolCall", ... }, and is intended for forwarding to an observability system. The callback is fire-and-forget: thrown errors are caught and written to stderr, so a broken forwarder cannot kill the run.
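The fire-and-forget contract can be pictured as a thin wrapper around the caller's callback. The sketch below is an assumption about the general technique, not sanddune's code: fireAndForget and its report parameter are hypothetical, and the timestamp type is a guess since the source does not state it.

```typescript
// Hypothetical event shape based on the fields listed above; `timestamp` type is assumed.
interface AgentStreamEvent {
  iteration: number;
  timestamp: string;
  type: "text" | "toolCall";
}

// Wraps a caller-supplied callback so a thrown error is reported instead of propagating.
function fireAndForget<E>(
  callback: ((event: E) => void) | undefined,
  report: (message: string) => void = (message) => console.error(message),
): (event: E) => void {
  return (event) => {
    if (!callback) return;
    try {
      callback(event);
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);
      report(`onAgentStreamEvent threw: ${detail}`);
    }
  };
}

// A broken forwarder: the wrapper keeps it from interrupting the caller.
const reports: string[] = [];
const safeForward = fireAndForget<AgentStreamEvent>(
  () => { throw new Error("collector unreachable"); },
  (message) => { reports.push(message); },
);
safeForward({ iteration: 1, timestamp: "2024-01-01T00:00:00Z", type: "text" });
console.log(reports.length); // 1
```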

In terminal mode ({ type: "stdout" }), sanddune renders an interactive UI directly: spinners while iterations run, a status line per iteration, and a final summary. RunResult.logFilePath is undefined; onAgentStreamEvent is not surfaced in this mode (the rendered UI is the channel).

`name`

name: "issue-42"

Optional display name prefixed in log output ([issue-42] tail -f …) and terminal mode rendering for parallel-run readability. Cosmetic only — not persisted in the run log records.

`RunResult`

interface RunResult {
  branch: string;                  // result branch — host's active branch (head/merge-to-head) or named branch
  worktreePath?: string;           // set when the worktree was preserved on disk after a dirty close
  iterations: IterationResult[];   // one entry per iteration that ran
  commits: string[];               // SHAs reachable from worktree HEAD past the pre-run tip, ordered by iteration
  completionSignal?: string;       // the matched completion-signal string, if the loop terminated by signal
  stdout: string;                  // concatenated text events from the agent stream
  logFilePath?: string;            // path to the JSONL run log; undefined in terminal mode
}

interface IterationResult {
  iteration: number;
  commitSha?: string;              // last commit produced on this iteration, if any
  sessionId?: string;              // agent session id (Claude Code system/init), when session capture (captureSessions) is enabled
  sessionFilePath?: string;        // host path to captured JSONL, when capture succeeded
  usage?: IterationUsage;          // raw token counts parsed from the captured session JSONL (per ADR-0005b)
  completionSignal?: string;       // set on the iteration that matched the completion signal
}

interface IterationUsage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationInputTokens?: number;
  cacheReadInputTokens?: number;
}
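Since usage is reported per iteration and may be absent, callers who want a run-level total have to aggregate it themselves. A minimal sketch, assuming the interfaces above; totalUsage is a hypothetical helper, not a sanddune export, and the interfaces are redeclared (trimmed) so the block stands alone.

```typescript
// Redeclared from the interfaces above so the sketch is self-contained.
interface IterationUsage {
  inputTokens: number;
  outputTokens: number;
  cacheCreationInputTokens?: number;
  cacheReadInputTokens?: number;
}

// Only the fields this helper needs from IterationResult.
interface IterationWithUsage {
  iteration: number;
  usage?: IterationUsage;
}

// Sums token counts across iterations; missing usage and missing cache fields count as 0.
function totalUsage(iterations: IterationWithUsage[]): Required<IterationUsage> {
  const total: Required<IterationUsage> = {
    inputTokens: 0,
    outputTokens: 0,
    cacheCreationInputTokens: 0,
    cacheReadInputTokens: 0,
  };
  for (const { usage } of iterations) {
    if (!usage) continue; // capture is best-effort, so usage may be undefined
    total.inputTokens += usage.inputTokens;
    total.outputTokens += usage.outputTokens;
    total.cacheCreationInputTokens += usage.cacheCreationInputTokens ?? 0;
    total.cacheReadInputTokens += usage.cacheReadInputTokens ?? 0;
  }
  return total;
}

const totals = totalUsage([
  { iteration: 1, usage: { inputTokens: 100, outputTokens: 20, cacheReadInputTokens: 50 } },
  { iteration: 2 },
  { iteration: 3, usage: { inputTokens: 10, outputTokens: 5 } },
]);
console.log(totals.inputTokens, totals.outputTokens); // 110 25
```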

Environment resolution

Per ADR-0012, env vars are declaration-driven: a key reaches the sandbox iff it's declared in one of these layers. process.env only supplies values for already-declared keys.

Layers, lowest → highest precedence:

process.env — value source only, never declares
.sanddune/.env — declaration site and value source
Agent provider env and sandbox provider env — declaration sites; must not overlap
RunOptions.env — call-site escape hatch; declares, sets, and overrides

await run({
  agent: claudeCode("claude-opus-4-7", {
    env: { ANTHROPIC_API_KEY: "sk-ant-..." },
  }),
  sandbox: docker({
    env: { SOME_DOCKER_VAR: "x" },
  }),
  env: { OVERRIDE: "y" },
  prompt: "...",
});

If an agent provider's env and a sandbox provider's env share a key, run() throws.
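The layering can be sketched as a pure function over the four declaring layers plus process.env. This is a reading of the rules above, not the ADR-0012 implementation. One point is an assumption made loudly here: the sketch treats an empty string as "declared without a value", which is the case where process.env is consulted. resolveEnv and EnvInputs are hypothetical names.

```typescript
// Hypothetical sketch of declaration-driven env resolution.
interface EnvInputs {
  processEnv: Record<string, string | undefined>; // value source only, never declares
  dotEnv: Record<string, string>;                 // .sanddune/.env
  agentEnv: Record<string, string>;               // agent provider env
  sandboxEnv: Record<string, string>;             // sandbox provider env
  runEnv: Record<string, string>;                 // RunOptions.env, highest precedence
}

function resolveEnv(inputs: EnvInputs): Record<string, string> {
  // Agent and sandbox provider env must not share keys.
  const overlap = Object.keys(inputs.agentEnv).filter((key) =>
    Object.prototype.hasOwnProperty.call(inputs.sandboxEnv, key),
  );
  if (overlap.length > 0) {
    throw new Error(`agent and sandbox env both declare: ${overlap.join(", ")}`);
  }

  // Walk layers from lowest to highest precedence; later non-empty values win.
  const declared = new Map<string, string>();
  for (const layer of [inputs.dotEnv, inputs.agentEnv, inputs.sandboxEnv, inputs.runEnv]) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== "" || !declared.has(key)) declared.set(key, value);
    }
  }

  // ASSUMPTION: an empty declared value falls back to process.env for that key.
  const resolved: Record<string, string> = {};
  for (const [key, value] of declared) {
    const finalValue = value !== "" ? value : inputs.processEnv[key];
    if (finalValue !== undefined) resolved[key] = finalValue;
  }
  return resolved;
}

const env = resolveEnv({
  processEnv: { ANTHROPIC_API_KEY: "from-shell", HOME: "/home/me" },
  dotEnv: { ANTHROPIC_API_KEY: "" },
  agentEnv: { AGENT_ONLY: "a" },
  sandboxEnv: { SOME_DOCKER_VAR: "x" },
  runEnv: { SOME_DOCKER_VAR: "y" },
});
console.log(Object.keys(env).sort().join(","));
```

In this example HOME never reaches the sandbox, because no layer declares it, while SOME_DOCKER_VAR takes the call-site value.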

`docker(options?)`

docker({
  image: "sanddune:my-repo",                  // optional; defaults to sanddune:<repo-dir-name>
  env: { SOMETHING: "value" },                // optional; merged per ADR-0012
})

Today's DockerOptions is { image?, env? }. The image must already exist locally; sanddune does not build it for you (no sanddune init / build-image yet). Build manually:

docker build -t sanddune:my-repo -f .sanddune/Dockerfile .sanddune

The container is started with -w /workspace and the worktree bind-mounted there. If the worktree's .git is a pointer file (true for git worktrees), sanddune also bind-mounts the parent .git directory at its host path inside the container so the pointer resolves (per ADR-0006).
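For context on the pointer-file re-mount: in a git worktree, .git is a file containing a line of the form gitdir: <main-repo>/.git/worktrees/<name>. The sketch below derives the parent .git directory from that line and builds a same-path bind-mount argument for docker run. It illustrates the idea only; parentGitDir and gitMountArgs are hypothetical helpers, and sanddune's docker provider may compute this differently.

```typescript
// Hypothetical helper: given the contents of a worktree's `.git` pointer file,
// return the parent `.git` directory, or undefined if the text is not a worktree pointer.
function parentGitDir(pointerFileContents: string): string | undefined {
  const match = /^gitdir:\s*(.+)$/m.exec(pointerFileContents.trim());
  if (!match) return undefined;

  const gitdir = (match[1] ?? "").trim().replace(/\/+$/, "");
  const segments = gitdir.split("/");
  // Expect a path ending in .../.git/worktrees/<name>
  if (segments.length < 3 || segments[segments.length - 2] !== "worktrees") {
    return undefined;
  }
  return segments.slice(0, -2).join("/");
}

// Mount the parent .git at the same path inside the container so the pointer resolves.
function gitMountArgs(pointerFileContents: string): string[] {
  const dir = parentGitDir(pointerFileContents);
  return dir === undefined ? [] : ["-v", `${dir}:${dir}`];
}

const pointer = "gitdir: /home/me/repo/.git/worktrees/abc123\n";
console.log(parentGitDir(pointer)); // "/home/me/repo/.git"
console.log(gitMountArgs(pointer).join(" ")); // "-v /home/me/repo/.git:/home/me/repo/.git"
```

The path handling here assumes POSIX-style separators.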

`claudeCode(model, options?)`

claudeCode("claude-opus-4-7", {
  env: { ANTHROPIC_API_KEY: "sk-ant-..." },   // optional; merged per ADR-0012
  captureSessions: true,                       // optional; default true. Set false to skip session capture
})

captureSessions controls whether each iteration's agent session JSONL is captured from the sandbox to the host (~/.claude/projects/<encoded-cwd>/sessions/<id>.jsonl, with cwd rewritten so claude --resume works natively). Capture is best-effort: a failure logs a warning and the run still resolves successfully. The effort option described in the brief (sanddune-brief.md) doesn't exist yet.

`interactive(options)` and `wt.interactive(options)`

interactive() launches the agent's interactive UI (e.g. Claude Code's TUI) and resolves when the user exits. It accepts all three sandbox provider kinds — bind-mount, isolated, and no-sandbox — and always uses the provider's default branch strategy. There is no branchStrategy option on top-level interactive(); for a non-default strategy with a TUI, route through createWorktree() + wt.interactive() (per ADR-0009).

import { createWorktree, interactive, claudeCode } from "@missingstudio/sanddune";
import { docker } from "@missingstudio/sanddune/sandboxes/docker";
import { noSandbox } from "@missingstudio/sanddune/sandboxes/no-sandbox";

// TUI inside Docker — same flow as run(), but you drive it.
await interactive({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: docker(),
  prompt: "Help me refactor the prompt pipeline.",  // optional seed prompt
});

// TUI directly on the host — no container, agent's permission prompts stay on.
await interactive({
  agent: claudeCode("claude-opus-4-7"),
  sandbox: noSandbox(),
  cwd: "../other-repo",   // optional; relative paths resolve against process.cwd()
});

// TUI on a worktree branch (non-default strategy needs the wt.* path).
await using wt = await createWorktree({
  branchStrategy: { type: "branch", branch: "agent/explore" },
});
await wt.interactive({ agent: claudeCode("claude-opus-4-7") });
// `sandbox` defaults to noSandbox() on wt.interactive(); pass docker() etc. to override.

Option	Type	Behavior
`agent`	`AgentProvider`	Required. Must declare `buildInteractiveCommand` (claudeCode does)
`sandbox`	`InteractiveSandboxProvider`	Required on `interactive()`; defaults to `noSandbox()` on `wt.interactive()`
`prompt`	`string`	Optional seed prompt; passed as a positional arg to the agent
`promptFile`	`string`	Optional template file; `{{KEY}}` substitution + shell expressions
`promptArgs`	`Record<string, string\|number>`	Values for `{{KEY}}` placeholders; only valid with `promptFile`
`cwd`	`string`	Host repo dir; relative paths resolve from `process.cwd()`
`env`	`Record<string, string>`	Call-site env override (per ADR-0012)
`hooks`	`SandboxHooks`	Same shape as `run()`; sandbox-side hooks are skipped under `noSandbox()`
`signal`	`AbortSignal`	Abort cancels the launch handshake; once the TUI is live, exit semantics depend on the underlying agent process
`copyToWorktree`	`string[]`	Copies host items into the worktree before hooks fire (rejected with branch strategy `head`)

maxIterations, completionSignal, idleTimeoutSeconds, and logging are not part of InteractiveOptions — interactive sessions are user-driven, not iteration-bounded.

`noSandbox()` and the `--dangerously-skip-permissions` policy

Under bind-mount and isolated providers, sanddune passes --dangerously-skip-permissions to Claude Code so the agent can act freely inside the container. Under noSandbox(), the agent runs directly on the host and the flag is not passed — Claude Code's normal permission prompts stay active. This is enforced via the agent provider's buildInteractiveCommand({ skipPermissions }) callback; the orchestrator decides skipPermissions based on the sandbox kind.

noSandbox() is accepted only by interactive() / wt.interactive(). The type system rejects it for run() and createSandbox(), because unattended (AFK) runs require a real sandbox.
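The policy reduces to one decision keyed on the sandbox kind. A minimal sketch under stated assumptions: shouldSkipPermissions and buildInteractiveArgs are hypothetical stand-ins for the orchestrator's decision and the provider's buildInteractiveCommand({ skipPermissions }) callback, and the argument layout (model flag, then the optional positional prompt) is illustrative rather than taken from the source.

```typescript
// The three provider kinds named in this README.
type SandboxKind = "bind-mount" | "isolated" | "no-sandbox";

// Orchestrator side: only containerized kinds may skip permission prompts.
function shouldSkipPermissions(kind: SandboxKind): boolean {
  return kind !== "no-sandbox";
}

// Provider side (hypothetical): turn the decision into CLI arguments.
function buildInteractiveArgs(opts: {
  model: string;
  skipPermissions: boolean;
  prompt?: string;
}): string[] {
  const args = ["--model", opts.model];
  if (opts.skipPermissions) args.push("--dangerously-skip-permissions");
  if (opts.prompt !== undefined) args.push(opts.prompt); // seed prompt as a positional arg
  return args;
}

const inDocker = buildInteractiveArgs({
  model: "claude-opus-4-7",
  skipPermissions: shouldSkipPermissions("bind-mount"),
  prompt: "hi",
});
const onHost = buildInteractiveArgs({
  model: "claude-opus-4-7",
  skipPermissions: shouldSkipPermissions("no-sandbox"),
});
console.log(inDocker.includes("--dangerously-skip-permissions")); // true
console.log(onHost.includes("--dangerously-skip-permissions"));   // false
```

Keeping the decision in the orchestrator means an agent provider cannot opt into skipping permissions on the host by itself.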

Custom bind-mount providers

If you want to wrap something other than Docker (a different container runtime, an SSH host, a local subprocess), construct a BindMountSandboxProvider directly:

import type {
  BindMountCreateOptions,
  BindMountSandboxHandle,
  BindMountSandboxProvider,
} from "@missingstudio/sanddune";

const myProvider = (): BindMountSandboxProvider => ({
  kind: "bind-mount",
  name: "my-provider",
  create: async (opts: BindMountCreateOptions): Promise<BindMountSandboxHandle> => {
    // opts.worktreePath  — host path to mount into your sandbox
    // opts.hostRepoPath  — caller's `cwd`, e.g. for default image-name derivation
    // opts.env           — resolved env vars to inject

    return {
      worktreePath: "/workspace",                // sandbox-side path
      exec: async (command, execOpts) => ({
        stdout: "...",
        stderr: "...",
        exitCode: 0,
      }),
      close: async () => { /* tear down */ },
    };
  },
});

Reference implementation: packages/sanddune/src/sandboxes/docker.ts.

The isolated-provider factory is declared in the type system but not yet usable from run().

Run log

Every run() in log-to-file mode (the default) streams JSONL events to .sanddune/logs/<run-id>.jsonl (or the path supplied via logging.path) and prints a tail -f hint at start. Pass logging: { type: "stdout" } to switch to terminal mode — sanddune then renders the run inline (spinners + status lines + summary) and RunResult.logFilePath is undefined. Follow the file from another terminal:

tail -f .sanddune/logs/*.jsonl
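If you would rather post-process the log than tail it, JSONL is straightforward to parse line by line. The sketch below tolerates a partially written final line, which can occur while a run is still streaming. parseRunLog is a hypothetical helper, and the record fields in the sample are illustrative; the README does not specify the run-log record schema.

```typescript
interface ParsedLog {
  records: unknown[];   // one parsed JSON value per well-formed line
  skippedLines: number; // lines that were not valid JSON (e.g. a partial trailing write)
}

// Parse JSONL text: one JSON document per line, blank lines ignored.
function parseRunLog(contents: string): ParsedLog {
  const records: unknown[] = [];
  let skippedLines = 0;
  for (const line of contents.split("\n")) {
    if (line.trim() === "") continue;
    try {
      records.push(JSON.parse(line));
    } catch {
      skippedLines += 1;
    }
  }
  return { records, skippedLines };
}

// Sample with two complete records and one truncated line (field names are illustrative).
const sample =
  '{"type":"text","iteration":1}\n' +
  '{"type":"toolCall","iteration":1}\n' +
  '{"type":"te';
const parsed = parseRunLog(sample);
console.log(parsed.records.length, parsed.skippedLines); // 2 1
```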

Trying it in this repo

./.sanddune/ contains three runnable demos, one per branch strategy. After bun install && bun run build, build the image and run any of:

bun .sanddune/docker-head-claude-code.ts
bun .sanddune/docker-merge-to-head-claude-code.ts
bun .sanddune/docker-branch-claude-code.ts

See .sanddune/README.md for the full walkthrough.

Design docs

CONTEXT.md — domain language and relationships
docs/adr/ — architecture decision records

Development

bun install
bun run build      # tsc -p tsconfig.build.json + bun build, orchestrated by turbo
bun test           # run the test suite
bun run typecheck  # tsgo --noEmit

Acknowledgments

sanddune is heavily inspired by sandcastle by Matt Pocock. If you're evaluating agent orchestration libraries, go check out the original.

License

MIT
