Releases: lxcshine/nexusbox

v0.1.2

25 Jun 16:32
a233f90

Release Notes

NexusBox Sandbox v0.1.2

Release date: 2026-06-26

This release delivers Phase 2 (Developer Experience), four complementary features (snapshot/restore, multi-language runtimes, hot-reload configuration, and workspace isolation) that take NexusBox from "sandbox that runs" to "sandbox developers actually want to use". Alongside the features, this release adds project visualizations, a self-contained HTML test report, and two stdio-driven MCP availability scripts, which caught a real gap in the MCP entrypoint: code_run did not support Go or Java.


Highlights

  • Snapshot / Restore (VSS) — Return a corrupted workspace to the last good state without restarting the service. Backend abstraction auto-selects Windows VSS (vssadmin) on Windows and a filesystem mirror elsewhere, with graceful fallback when VSS is unavailable.
  • Multi-language runtime (Python / Node / Go / Java) — Cover mainstream AI programming scenarios. Both the Gateway REST entrypoint and the MCP stdio entrypoint now compile and run all four languages.
  • Hot-reload configuration — Update sandbox runtime configuration (concurrency, timeouts, endpoints) without restarting. A dependency-free polling watcher applies new config to the RuntimeManager live, inheriting empty fields from defaults so partial updates are safe.
  • Workspace isolation (multi-project parallelism) — Multiple AI sessions no longer collide. Workspace.ResolvePath confines every file path to the session's root, rejecting ../ and re-rooting absolute paths; resource quotas cap CPU, memory, disk, and file count per workspace.
  • Visual test report — A self-contained HTML report (inline SVG charts, no external deps) summarizes the latest MCP availability run at a glance.

New Features

Snapshot / Restore (VSS)

  • Introduce SnapshotBackend interface with three implementations: vssBackend (Windows Volume Shadow Copy), filesystemBackend (cross-platform hardlink + mirror with byte-copy fallback), and containerdBackend (Linux CRIU checkpoint, retained for compat).
  • Auto-select backend by OS; allow injection of a custom backend for tests.
  • Track per-snapshot metadata (source path, backend, checksum, created time) and prune by age.
  • RestoreSnapshot reproduces the source tree from the snapshot directory into the target path; an empty target reuses the original source path recorded in metadata.

Multi-Language Runtime

  • Extend CodeService (Gateway entrypoint) with executeGo (temp module + go run) and executeJava (class-name normalization to Main, javac + java).
  • Add language aliases (golang, js, node, python3) and a unified Execute dispatcher.
  • Align the MCP code_run tool with the Gateway: add go and java branches alongside python and nodejs.
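
The alias handling above can be sketched as a single normalization function that both entrypoints could share, which is one way to keep them from drifting apart again. This is a minimal sketch, not the actual NexusBox dispatcher: the function name canonicalLanguage and the canonical keys are assumptions; only the alias names come from these notes.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalLanguage maps a user-supplied language name or alias to one
// canonical runtime key. The function name and the canonical keys are
// hypothetical; the aliases (golang, js, node, python3) are from the notes.
func canonicalLanguage(name string) (string, error) {
	switch strings.ToLower(strings.TrimSpace(name)) {
	case "python", "python3":
		return "python", nil
	case "nodejs", "node", "js":
		return "nodejs", nil
	case "go", "golang":
		return "go", nil
	case "java":
		return "java", nil
	default:
		return "", fmt.Errorf("unsupported language: %s", name)
	}
}

func main() {
	for _, name := range []string{"golang", "js", "python3", "ruby"} {
		lang, err := canonicalLanguage(name)
		if err != nil {
			fmt.Println(name, "->", err)
			continue
		}
		fmt.Println(name, "->", lang)
	}
}
```

Resolving aliases once, before dispatch, means each executor only has to handle its canonical key.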

Hot-Reload Configuration

  • Add config.Watcher, a cross-platform polling watcher (no fsnotify dependency) that tracks file mtimes and notifies registered Reloader implementations on change.
  • Define Reloader interface: Reload(ctx, newConfig) error with the contract that a rejected config leaves the old one in effect.
  • RuntimeManager implements Reloader: validates new concurrency/timeout/endpoint values, falls back to previous values on rejection, and inherits empty endpoint fields from defaults.

Workspace Isolation

  • Add Workspace struct with isolated Root, DataDir, TmpDir, CacheDir, and a Quota (CPU, memory, disk, file count).
  • ResolvePath is the core isolation primitive: strips volume/absolute prefixes, cleans the path, and rejects any result that escapes Root via .. — so a session bound to workspace A cannot reach workspace B's files.
  • Add WorkspaceManager for create/list/delete with owner-session binding and arbitrary labels.

Project Visualizations

  • Add 6 chart generators under scripts/visualizations/ (MCP tools distribution, startup benchmark, feature radar, concurrency throughput, multi-language benchmark, security layers) with a shared flat theme and a one-shot gen_all.py runner.
  • Embed the charts in both READMEs in a 2×3 grid with one-line captions.

MCP Availability Checks & HTML Report

  • Add scripts/check_mcp_availability.ps1 — drives nexusbox-mcp.exe over stdio, asserts the MCP initialize handshake and that tools/list returns all 18 tools.
  • Add scripts/check_mcp_tools.ps1 — exercises shell_exec, file_write+file_read, code_run across Python/Node/Go/Java, and the path-traversal guard; expects PASS: 9 FAIL: 0.
  • Add scripts/generate_test_report.py — produces a self-contained docs/test-report.html with inline SVG donut/stacked-bar charts, a language result grid, and a bug-found-and-fixed card.

Bug Fixes

fix(mcp): code_run did not support Go or Java

The MCP entrypoint pkg/mcp/code_server.go only handled python and nodejs in its language switch, returning Unsupported language: go despite Phase 2 having extended the Gateway CodeService with Go and Java. The two entrypoints were not synchronized. Fixed by adding go (go run) and java (javac + java, public class normalized to Main) branches to the MCP dispatcher, aligning it with the Gateway. All four languages now pass the availability check.

fix(snapshot): hardlinks caused snapshots to mutate with source

The filesystem backend originally created snapshots using hardlinks, which share inodes between source and snapshot. Editing a source file after snapshotting therefore changed the snapshot's content, breaking the "return to last good state" guarantee. Fixed by copying bytes on create instead of hardlinking, ensuring snapshot independence. Checksum verification was added to detect any later drift.

fix(migration): RestoreSnapshot call missing the targetPath argument

RestoreSnapshot was widened to a 4-argument signature (ctx, snapshotID, sandboxID, targetPath) but the migration manager still called it with 3 arguments, causing a compile error. Fixed by passing an empty targetPath so the restore reuses the source path recorded in the snapshot metadata.

fix(workspace): absolute paths were rejected instead of re-rooted

The workspace confinement design intends to re-root absolute paths inside the workspace, but the initial implementation rejected them outright. Fixed ResolvePath to strip the volume/absolute prefix and join the remainder onto Root, then guard the final result with filepath.Rel to ensure it stays inside Root.

fix(sandbox): Windows Job Object CPU rate-control structure length

JOBOBJECT_CPU_RATE_CONTROL_INFORMATION was laid out with incorrect alignment, producing ERROR_BAD_LENGTH (0x18) from SetInformationJobObject. Fixed by sizing the struct to 8 bytes (ControlFlags + CpuRate). A conflicting JOB_OBJECT_LIMIT_AFFINITY flag that caused ERROR_INVALID_PARAMETER was also removed, and the API call order was corrected.


Testing

  • All Phase 2 packages pass go test ./... with zero failures; no regressions in existing packages.
  • A new integration test, test/integration/phase2_integration_test.go, verifies the four features working together in a realistic scenario: two parallel AI sessions, multi-language execution, snapshot/restore (with VSS gracefully falling back to filesystem on non-admin Windows), and hot-reload taking effect without restart.
  • Two stdio-driven PowerShell scripts drive the actual nexusbox-mcp.exe binary the same way Trae does, asserting PASS: 9 FAIL: 0 across shell/file/multi-language/security.

Test & Script Files Added

| File | Purpose |
| --- | --- |
| pkg/snapshot/manager_test.go | Snapshot create/restore, backend selection, pruning |
| pkg/config/hotreload_test.go | Watcher polling, Reloader apply/reject, file mtime detection |
| pkg/workspace/manager_test.go | ResolvePath confinement, quotas, owner-session binding |
| pkg/sandbox/runtime/manager_test.go | RuntimeManager reload validation and fallback |
| pkg/gateway/code_service_test.go | Go and Java execution paths |
| test/integration/phase2_integration_test.go | End-to-end interplay of all four Phase 2 features |
| scripts/check_mcp_availability.ps1 | MCP handshake + tools/list availability check |
| scripts/check_mcp_tools.ps1 | MCP tool-call e2e (shell/file/code/security) |
| scripts/generate_test_report.py | Self-contained HTML test report generator |

Breaking Changes

None for end users. RestoreSnapshot gained a targetPath argument, but this is an internal API not consumed externally. Existing sandbox configurations continue to work unchanged.


Upgrade Notes

No migration required. Phase 2 features are additive and auto-activate:

  • Snapshot — VSS is used automatically on Windows (requires admin for vssadmin); on non-admin Windows or other OSes the filesystem backend is used transparently.
  • Multi-language — ensure go, javac/java are on PATH if you want Go/Java execution; Python and Node are unchanged.
  • Hot-reload — point the watcher at your runtime config file; no new flags required.
  • Workspace isolation: WorkspaceManager is opt-in; existing single-workspace flows are unaffected.

Regenerate the test report after a run:

python scripts/generate_test_report.py
# open docs/test-report.html

Regenerate project visualizations:

python scripts/visualizations/gen_all.py

Files Changed

New Files

| File | Description |
| --- | --- |
| pkg/snapshot/vss_windows.go | Windows VSS backend (vssadmin create shadow) |
| pkg/snapshot/vss_other.go | Non-Windows VSS stub |
| pkg/snapshot/manager.go | Snapshot manager with backend abstraction, metadata, pruning |
| pkg/config/hotreload.go | Polling config watcher + Reloader interface |
| pkg/workspace/manager.go | Workspace isolation with ResolvePath confinement and quotas |
`p...

v0.1.1

25 Jun 05:16
a489557

Release Notes

NexusBox Sandbox v0.1.1

Release date: 2026-06-25

This release delivers the P0 and P1 priority features modeled on the CubeSandbox architecture, moving NexusBox toward a production-ready AI Agent sandbox. All new code is covered by unit tests, and a full local integration test pass found no regressions.


Highlights

  • E2B SDK drop-in compatibility — existing E2B clients can switch to NexusBox by changing only the API base URL.
  • Template system — reusable sandbox configurations with four seeded defaults for common AI Agent workloads.
  • Pre-warming pool — per-template sandbox pools with TTL eviction and utilization-based auto-scaling for sub-100ms cold-starts.
  • Egress security gateway — domain allowlist/denylist, dynamic credential injection, private IP blocking, and full audit logging.
  • eBPF network policy engine — L3/L4 ingress/egress rules with auto-detection and graceful fallback to iptables.

New Features

E2B API Compatibility Layer

  • Implement full E2B SDK-compatible REST API under /e2b/v1/*.
  • Cover sandbox lifecycle (create, get, list, kill), command execution, file I/O, code execution, timeout refresh, pause/resume, logs, and stats.
  • Enable drop-in replacement for E2B Python/JS SDK, LangChain integrations, and OpenAI Agents SDK clients.

Template System

  • Add TemplateManager for reusable sandbox configurations (image, runtime, resources, env vars, working directory, restart policy).
  • Seed four default templates: python-data-science, node-fullstack, browser-automation, ai-agent-default.
  • Expose CRUD API at /v1/templates with full validation and automatic defaults.
  • Support ApplyToSandbox to inherit template defaults while preserving user-overridden fields.
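
The "inherit defaults, preserve overrides" behavior of ApplyToSandbox can be sketched as a simple merge. This is an illustration under stated assumptions: the SandboxSpec type, its fields, and applyTemplate are hypothetical, not the TemplateManager API.

```go
package main

import "fmt"

// SandboxSpec is a hypothetical subset of a sandbox configuration.
type SandboxSpec struct {
	Image      string
	WorkingDir string
	Env        map[string]string
}

// applyTemplate fills empty fields of spec from tpl. Fields the user set
// are preserved; env vars are merged with the user's values winning.
func applyTemplate(tpl, spec SandboxSpec) SandboxSpec {
	out := spec
	if out.Image == "" {
		out.Image = tpl.Image
	}
	if out.WorkingDir == "" {
		out.WorkingDir = tpl.WorkingDir
	}
	// Build a fresh map so neither input is mutated.
	merged := make(map[string]string, len(tpl.Env)+len(spec.Env))
	for k, v := range tpl.Env {
		merged[k] = v
	}
	for k, v := range spec.Env {
		merged[k] = v
	}
	out.Env = merged
	return out
}

func main() {
	tpl := SandboxSpec{
		Image:      "python:3.12",
		WorkingDir: "/workspace",
		Env:        map[string]string{"PYTHONUNBUFFERED": "1"},
	}
	user := SandboxSpec{WorkingDir: "/srv/app", Env: map[string]string{"DEBUG": "true"}}
	fmt.Printf("%+v\n", applyTemplate(tpl, user))
}
```

One limitation of this zero-value approach is that it cannot tell "not set" apart from "explicitly empty"; pointer or optional fields would be needed if that distinction matters.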

Resource Pool Pre-warming

  • Add TemplatePoolManager that maintains per-template pre-warmed sandbox pools.
  • Support configurable target size, min/max bounds, TTL-based eviction, and utilization-based auto-scaling.
  • Track detailed statistics: total created, total reused, hit rate, average create/reuse latency.

Egress Security Gateway

  • Intercept outbound HTTPS traffic from sandboxes via reverse proxy.
  • Enforce domain allowlist/denylist with wildcard subdomain matching (e.g., *.openai.com).
  • Inject credentials dynamically via CredentialProvider interface (supports Vault and other secret backends).
  • Block private IP ranges (loopback, private, link-local) to prevent SSRF.
  • Audit all outbound requests with URL, method, status code, bytes sent/received, and duration.
  • Expose policy management API at /v1/egress/policies, audit log at /v1/egress/audit, and stats at /v1/egress/stats.
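
Two of the checks listed above, wildcard subdomain matching and private IP blocking, can be sketched with the standard library. The function names are hypothetical, and two policy choices here are assumptions rather than documented NexusBox behavior: "*.openai.com" does not match the bare apex domain, and unparsable addresses are blocked.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// matchDomain reports whether host matches pattern. A pattern of the form
// "*.example.com" matches any subdomain of example.com but not the apex.
func matchDomain(pattern, host string) bool {
	pattern = strings.ToLower(pattern)
	host = strings.ToLower(strings.TrimSuffix(host, "."))
	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // keep the leading dot, e.g. ".openai.com"
		return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
	}
	return host == pattern
}

// isBlockedAddr reports whether addr is a loopback, private, or link-local
// IP literal. Anything that does not parse as an IP is blocked (fail closed).
func isBlockedAddr(addr string) bool {
	ip := net.ParseIP(addr)
	if ip == nil {
		return true
	}
	return ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast()
}

func main() {
	fmt.Println(matchDomain("*.openai.com", "api.openai.com"))
	fmt.Println(matchDomain("*.openai.com", "evilopenai.com"))
	fmt.Println(isBlockedAddr("169.254.169.254"))
	fmt.Println(isBlockedAddr("8.8.8.8"))
}
```

Matching on the suffix with its leading dot is what stops look-alike hosts such as evilopenai.com from passing. For SSRF protection, the IP check has to run on the address the hostname actually resolves to, not only on IP literals in the URL.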

eBPF Network Policy Engine

  • Add pluggable Engine with three backends: EBPFBackend (production), IPTablesBackend (fallback), NoopBackend (testing).
  • Support L3/L4 ingress/egress rules with port ranges and protocol filtering (tcp, udp, icmp).
  • Auto-detect eBPF availability on Linux with graceful fallback to iptables on unsupported kernels.
  • Validate CIDRs and enforce default-deny policies per sandbox.
  • Expose policy CRUD and statistics via thread-safe methods.

Internal Helpers

  • Add ShellService.ExecSync for synchronous command execution used by the E2B compatibility layer.
  • Add FileService.ReadFile and FileService.WriteFile for synchronous file I/O with path traversal protection and atomic writes.
  • Add CodeService.ExecuteCode for synchronous Python/Node.js code execution.

Bug Fixes

fix(egress): audit log drop count underflow when maxSize < 10

The AuditLog.Append method computed dropCount = maxSize / 10, which evaluated to 0 for small log sizes, preventing any entries from being dropped when the log was full. This caused the log to grow unboundedly. Fixed by enforcing a minimum drop count of 1.

fix(egress): policy and audit API routes returned 502

The egress gateway's HTTP server handler was bound exclusively to handleRequest (the proxy handler), so requests to /v1/egress/policies, /v1/egress/audit, and /v1/egress/stats were proxied instead of handled by the policy API. Fixed by routing requests with the /v1/egress/ prefix to Gateway.ServeHTTP.

fix(code): int32 to int type mismatch in ExecuteCode

The CodeService.ExecuteCode method passed an int32 timeout directly to CodeExecuteRequest.Timeout (which expects int), causing a compile error because Go does not convert between integer types implicitly. Fixed with an explicit conversion, int(timeoutSec).

fix(e2b): undefined metav1.ObjectMeta in e2bObjectMeta

The e2bObjectMeta helper returned an anonymous struct instead of metav1.ObjectMeta, causing a compile error. Fixed by importing metav1 and returning the correct type.


Testing

  • Add 64 new unit test cases across 5 new test files, covering all P0/P1 features.
  • All 16 packages pass go test ./... with zero failures.
  • Local integration test confirms all live API endpoints respond correctly: templates CRUD, E2B compatibility, egress policy/audit/stats, shell exec, code execute, and file list.

Test Files Added

| File | Cases | Coverage |
| --- | --- | --- |
| pkg/template/manager_test.go | 13 | CRUD, defaults, idempotent seed, ApplyToSandbox |
| pkg/network/ebpf/engine_test.go | 11 | Policy validation, CIDR validation, backends, stats |
| pkg/network/egress/gateway_test.go | 16 | Domain matching, private IP, audit log, credentials |
| pkg/sandbox/runtime/template_pool_test.go | 10 | Register, acquire, release, recycle, stats |
| pkg/gateway/e2b_service_test.go | 14 | Routes, health, templates, sandbox lifecycle |

Breaking Changes

None. All new APIs are additive and do not affect existing endpoints.


Upgrade Notes

No migration required. Start the dev server with the new -egress-port flag (default 8082) to enable the egress gateway. Set it to 0 to disable.

go run ./cmd/sandbox-dev/main.go \
  -port=8080 \
  -mcp-port=8079 \
  -egress-port=8082 \
  -workspace="$PWD"

Files Changed

New Files

| File | Description |
| --- | --- |
| pkg/gateway/e2b_service.go | E2B SDK-compatible REST API layer |
| pkg/gateway/template_service.go | Template CRUD REST API service |
| pkg/template/manager.go | Sandbox template manager with seeded defaults |
| pkg/sandbox/runtime/template_pool.go | Template-aware pre-warming pool manager |
| pkg/network/egress/gateway.go | Egress security gateway with credential injection |
| pkg/network/egress/policy.go | Egress policy management API handler |
| pkg/network/ebpf/engine.go | eBPF network policy engine with iptables fallback |
| pkg/template/manager_test.go | Unit tests for template manager |
| pkg/network/ebpf/engine_test.go | Unit tests for network policy engine |
| pkg/network/egress/gateway_test.go | Unit tests for egress gateway |
| pkg/sandbox/runtime/template_pool_test.go | Unit tests for template pool manager |
| pkg/gateway/e2b_service_test.go | Unit tests for E2B compatibility layer |

Modified Files

| File | Changes |
| --- | --- |
| pkg/gateway/gateway.go | Wire TemplateService and E2BService into gateway, add routes and accessors |
| pkg/gateway/shell_service.go | Add ExecSync method for synchronous command execution |
| pkg/gateway/file_service.go | Add ReadFile and WriteFile synchronous helpers |
| pkg/gateway/code_service.go | Add ExecuteCode method, fix int32 to int type cast |
| cmd/sandbox-dev/main.go | Integrate template manager, egress gateway, and network policy engine |

v0.1.0

24 Jun 12:09
7cc7ef1

What is NexusBox?

NexusBox is a secure sandbox platform for AI Agents. It provides an isolated execution environment where AI agents can safely run shell commands, read/write files, execute code, and automate browsers — without any risk to the host machine.

Highlights

MCP (Model Context Protocol) Integration

  • 18 real tools exposed via JSON-RPC 2.0 over HTTP
  • 4 built-in MCP servers: Shell, File, Code, Browser
  • Seamless integration with Trae, Claude Desktop, Cursor, and other MCP-compatible AI assistants
  • Workspace-scoped isolation with path traversal protection

Shell Execution

  • shell_exec — synchronous command execution with timeout control (max 300s)
  • shell_background — background long-running tasks
  • shell_check — monitor background process status

File Operations

  • file_read / file_write / file_list / file_search
  • file_replace / file_delete / file_move
  • Atomic writes and path traversal prevention via resolvePath()

Code Execution

  • code_run — execute Python and Node.js code with timeout limits (max 120s)
  • code_install — install pip/npm packages
  • Temporary file handling with automatic cleanup

Browser Automation

  • CDP (Chrome DevTools Protocol) integration with Chromium
  • browser_navigate / browser_screenshot / browser_click / browser_type
  • browser_eval / browser_get_text

REST API Gateway

  • Unified entry point for shell, file, code, browser, and sandbox management
  • Panic recovery middleware for stability
  • JWT authentication support

Multi-Tenant Isolation

  • 3 isolation levels: Standard, Enhanced, Maximum
  • Per-tenant workspace, network policy, and resource quotas
  • Token bucket rate limiting per tenant
  • VXLAN VNI and cgroup-based hard isolation

Security Hardening

  • Docker: cap_drop ALL, no-new-privileges, memory limits
  • Rootless mode support with UID mapping
  • Seccomp and AppArmor profile management
  • mTLS certificate generation

Scheduling Framework

  • 11-phase scheduling pipeline inspired by Kubernetes scheduler
  • Pluggable plugins: ResourceFit, TenantAffinity, ImageLocality, NodeResourcesBalancedAllocation
  • Priority queue and batch scheduling support

CRI (Container Runtime Interface)

  • CRI-compatible gRPC server for direct kubelet integration
  • Enables Kubernetes to schedule pods onto NexusBox-managed sandboxes

Observability

  • Prometheus metrics (nexusbox_sandbox_creation_total, nexusbox_sandbox_creation_duration, etc.)
  • OpenTelemetry distributed tracing
  • Structured audit logging with JSON output
  • Health checker with liveness/readiness probes

Full Development Environment (Docker)

  • JupyterLab (port 8888)
  • code-server — VS Code in browser (port 8200)
  • noVNC remote desktop (port 6080)
  • Chromium with CDP (port 9222)
  • Supervisor process manager for 7 services

Kubernetes Ready

  • CRDs for Sandbox, Tenant, and SandboxTemplate
  • Deployment manifests
  • Admission webhook for validation