Releases: jbeshir/demesne
Release v0.2.0
Added
- `background` option on `sandbox_script`, `sandbox_agent`, and `sandbox_research` (host and in-sandbox child surfaces): when `true`, the tool returns immediately with `{job_id, status:"running"}` instead of blocking.
- `sandbox_status` tool (host and child): non-blocking status snapshot for a background job; returns status, elapsed time, a stdout tail, and cost/exit-code once terminal.
- `sandbox_wait` tool (host and child): blocks up to `timeout_seconds` (default 30, hard-capped at 120) for a background job to reach a terminal state; returns the final result or `{status:"running", message:"still running; call sandbox_wait again"}` on timeout.
- `sandbox_cancel` tool (host and child): cancels a background job and its entire descendant subtree depth-first, tearing down each sandbox via the existing sidecar/egress deferred path.
- In-memory job registry (`internal/sandbox/jobs.go`): job state lives in memory for the process lifetime with no on-disk persistence; jobs do NOT survive an MCP-server restart (a stale `job_id` then returns `ErrJobNotFound`); a TTL reaper retains terminal jobs ~1h to bound memory; orphaned containers from a crashed/restarted process are reaped independently by `ReapOrphans` via the `demesne.owner` label.
- `sandbox_usage_report` tool: token-usage & cost introspection for a finished job; walks the job's `/out` tree and breaks spend down by child/phase, model, token-type, and (claude-code) per-subagent, joining the per-request `usage.jsonl` against a distilled transcript `attribution.jsonl`; unattributed spend rolls up to `(main)` and is never dropped, with an `OutputRoot`-escape guard.
- `AgentResult` / `results.json` gain an additive `per_model_tokens` breakdown rolled up the descendant tree, and previously-silent parse / no-usage-block drops are now counted in `usage.json`.
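A client-side loop over the background-job tools might look like the sketch below. It is illustrative only: `call_tool` is a hypothetical stand-in for whatever MCP client invocation you use, and the `script` and `job_id` argument names are assumptions. Only the tool names, the `background` flag, `timeout_seconds`, the 120-second cap, and the `status:"running"` reply come from the notes above.

```python
from typing import Any, Callable, Dict

# Hypothetical client hook: (tool_name, arguments) -> tool result as a dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]

MAX_WAIT_SECONDS = 120  # hard cap on timeout_seconds, per the release notes


def run_in_background(call_tool: ToolCall, script: str, max_polls: int = 20) -> Dict[str, Any]:
    """Start a background job, then call sandbox_wait until it is terminal."""
    # "script" is an assumed argument name; "background" is from the notes.
    started = call_tool("sandbox_script", {"script": script, "background": True})
    job_id = started["job_id"]

    for _ in range(max_polls):
        result = call_tool(
            "sandbox_wait",
            {"job_id": job_id, "timeout_seconds": MAX_WAIT_SECONDS},
        )
        if result.get("status") != "running":
            return result  # terminal state: this is the final result

    # Still running after max_polls waits: cancel the job and its subtree.
    call_tool("sandbox_cancel", {"job_id": job_id})
    raise TimeoutError(f"job {job_id} still running after {max_polls} waits")
```

Because job state is held in memory only, a loop like this should also be prepared for a job-not-found error if the MCP server restarts mid-run.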
Changed
- claude-code agent image tracks the host Claude Code version: the image build folds a `CLAUDE_CODE_VERSION` build arg (resolved from the host `claude --version`, falling back to `npm view` then `latest`) into its content-hash cache key, so the sandbox CLI rebuilds automatically on a host Claude Code upgrade instead of drifting behind and rejecting a freshly released model alias. No demesne release is needed for the sandbox to track the host. Image builders without a `BuildArgs` resolver hash exactly as before.
- Build-toolchain telemetry disabled in the sandbox env: `sandboxEnv()` injects telemetry/analytics opt-out vars (wrangler `WRANGLER_SEND_METRICS/ERROR_REPORTS`, the Next/Nuxt/Angular/Storybook/Vercel/Yarn CLIs, npm update-notifier/funding noise, pip's version check, Prisma/HashiCorp checkpoint, Nx Cloud, plus `DO_NOT_TRACK=1` as a catch-all) so build tools don't phone home. Under restricted egress these calls previously stalled the build against the deny-by-default network policy until they timed out, so this also de-flakes and speeds up sandboxed builds.
- Internal job hooks (`JobHooks`, `internalAgentSpec`, `sandboxPrepOptions`): the mid-run job-tracking plumbing was reduced to a single `OnOutputReady(outHost, resultsHost)` callback that records the live output/results paths for `sandbox_status`; the write-only `OnSandboxCreated` hook and run-UUID parameter (and their now-dead job fields) were dropped. Internal only, with no behaviour change; the MCP tool surface (`sandbox_status` / `sandbox_wait` / `sandbox_cancel`) is unchanged.
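The cache-key change can be illustrated with a small sketch. This is not demesne's implementation: `image_cache_key` and its hashing scheme are hypothetical, chosen only to show the property described above. Resolved build args feed the content hash, and a builder with no build args produces the same key it did before.

```python
import hashlib
from typing import Dict, Optional


def image_cache_key(build_context: bytes, build_args: Optional[Dict[str, str]] = None) -> str:
    """Hash the build context, folding in build args only when present."""
    h = hashlib.sha256()
    h.update(build_context)
    # With no build args nothing extra is hashed, so existing keys are unchanged.
    for name in sorted(build_args or {}):
        h.update(b"\x00" + name.encode() + b"=" + build_args[name].encode())
    return h.hexdigest()


# Made-up context and version strings, for illustration only.
context = b"FROM node:22\n"
old_key = image_cache_key(context, {"CLAUDE_CODE_VERSION": "1.0.0"})
new_key = image_cache_key(context, {"CLAUDE_CODE_VERSION": "1.0.1"})
needs_rebuild = old_key != new_key  # a host upgrade changes the key
```

Sorting the arg names keeps the key stable regardless of the order in which args are resolved.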
Removed
- `agent` parameter on `sandbox_agent` / `sandbox_research` (and the in-sandbox child variants): model aliases are globally unique across providers, so the provider is now inferred from `model` via a registry-driven lookup guarded by a uniqueness test. Setting only a claude-code `model` such as `sonnet` no longer fails against the codex-first default provider (`model "sonnet" is not in the Codex allowlist`). An empty `model` preserves the credential-aware default: codex / `gpt-5.5` when Codex credentials are configured, otherwise claude-code / `sonnet`.
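As a rough sketch of how provider inference from a model alias can work, consider the following. The registry contents and the `build_model_index` / `resolve` helpers are hypothetical. Only the aliases themselves, the uniqueness requirement, and the credential-aware default are taken from these notes.

```python
from typing import Dict, List, Tuple

# Illustrative registry; the real one's contents and shape are assumptions.
REGISTRY: Dict[str, List[str]] = {
    "claude-code": ["sonnet", "opus", "fable"],
    "codex": ["gpt-5.5"],
}


def build_model_index(registry: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert provider -> models into model -> provider, rejecting duplicates."""
    index: Dict[str, str] = {}
    for provider, models in registry.items():
        for model in models:
            if model in index:
                raise ValueError(
                    f"model alias {model!r} is claimed by both "
                    f"{index[model]!r} and {provider!r}"
                )
            index[model] = provider
    return index


def resolve(model: str, have_codex_credentials: bool) -> Tuple[str, str]:
    """Return (provider, model); an empty model uses the credential-aware default."""
    if not model:
        if have_codex_credentials:
            return ("codex", "gpt-5.5")
        return ("claude-code", "sonnet")
    index = build_model_index(REGISTRY)
    if model not in index:
        raise ValueError(f"unknown model {model!r}")
    return (index[model], model)
```

The duplicate check in `build_model_index` plays the role of the uniqueness test: inference is only well defined if no alias belongs to two providers.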
Caveats
- This is a pre-1.0 release; APIs and the tool surface may change.
Release v0.1.1
Added
- `fable` model tier: the Claude `fable` alias (most capable tier, above `opus`) is now selectable as the `model` for `sandbox_agent` / `sandbox_research` and the in-sandbox child variants when claude-code credentials are configured. Added to the pricing catalog so its usage counts toward cost reporting and the cap.
- `media` sandbox image: a new demesne-built image (`FROM ubuntu:24.04`) carrying ffmpeg, ImageMagick, libvips, and a broad audio toolbox (sox, lame, flac, opus-tools) for video/audio/image conversion. Wired through `sandbox_script` / `sandbox_create` / in-sandbox child variants exactly like the existing `browser` image; built lazily on the host on first use and content-hash cached via `agentcommon.ImageBuilder`.
Caveats
- This is a pre-1.0 release; APIs and the tool surface may change.
Release v0.1.0
First public release — an agent-agnostic, local, containerised agent-orchestration MCP server you drive from your agent of choice. It runs untrusted shell, scripts, and AI coding agents in disposable OpenSandbox containers, with read-only host mounts and egress allowlists.
Tools
- Sandboxes: `sandbox_script` (one-shot) plus `sandbox_create` / `sandbox_exec` / `sandbox_upload` / `sandbox_download` / `sandbox_destroy` (persistent) run shell and scripts in disposable containers.
- Agents: `sandbox_agent` and `sandbox_research` run a coding-agent CLI inside a sandbox, using `codex` by default when Codex credentials are configured, otherwise `claude-code`. Each tool advertises its `agent` / `model` options filtered to the providers you have credentials for. Containerised agents can spawn child sandboxes and, with configuration, reach a read-only subset of the host's MCP server tools.
Security and orchestration
- Read-only host inputs at `/in`; an output-only `/out` whose host directory defaults to `~/.demesne/out` (always included in the mount allowlist); per-tool egress allowlists; agent outbound HTTPS confined to a credential-isolating per-sandbox proxy sidecar, so the agent never sees the real token.
- Separate, tail-bounded stdout/stderr in tool results; indicative per-run cost reporting; a results roll-up across the child-sandbox tree.
- Host MCP proxy: re-expose a curated, read-only subset of the stdio MCP servers from your Claude Code (`DEMESNE_CLAUDE_CODE_MCP_CONFIG`, default `~/.claude.json`) and Codex (`DEMESNE_CODEX_MCP_CONFIG`, default `~/.codex/config.toml`) configs, merged with Codex winning on name conflicts, to containerised agents through a per-sandbox tunnel.
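The merge rule for the two MCP configs (Codex wins on name conflicts) can be sketched as below. This assumes both config files have already been parsed into name-to-server-config dictionaries. File parsing and the read-only filtering are omitted, and `merge_mcp_servers` plus the server names are hypothetical.

```python
from typing import Any, Dict


def merge_mcp_servers(
    claude_servers: Dict[str, Any], codex_servers: Dict[str, Any]
) -> Dict[str, Any]:
    """Merge two name -> server-config maps; Codex entries win on conflicts."""
    merged = dict(claude_servers)  # start from the Claude Code entries
    merged.update(codex_servers)   # Codex overwrites any shared names
    return merged


# Made-up server entries, for illustration only.
claude = {"docs": {"command": "docs-mcp"}, "search": {"command": "search-a"}}
codex = {"search": {"command": "search-b"}}
merged = merge_mcp_servers(claude, codex)
# "docs" comes from the Claude Code config; "search" is the Codex definition.
```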
The milestone sections below (M1–M6) are the per-feature development log that rolls into this release.
Caveats
- This is a pre-1.0 release; APIs and the tool surface may change.