v0.4.0

Latest

@fslongjin fslongjin released this 15 Jun 12:32
4004a6e

2026.06.14 Release v0.4.0

CubeSandbox 0.4.0 introduces CubeEgress, an OpenResty-based security proxy that brings credential injection, domain filtering, and access auditing to sandbox egress traffic. This release also delivers container log forwarding with a new cubecli cubebox logs command, a node component version matrix with cluster-wide visibility, template replica compatibility checking, a daemonless template image build pipeline, and significant network performance improvements (35% faster network P50). The builder base image has been downgraded to ubuntu:20.04, lowering the minimum glibc requirement from 2.34 to 2.31 for broader distribution compatibility. The release comprises 58 commits from 15 contributors.

🎯 Major Features

CubeEgress: Security Proxy

CubeEgress is a new OpenResty-based egress gateway that sits in the sandbox outbound traffic path via TPROXY, enforcing L7 policy before requests leave the cluster. It consists of ~2,200 lines of Lua across 9 modules running on OpenResty/nginx, plus Go-side integration in CubeMaster (CA provisioning, policy push), network-agent (TPROXY iptables rules), and Cubelet (per-sandbox routing, protobuf egress rule model).

  • Credential injection (#518): Per-sandbox secrets are attached to outbound requests at the proxy layer via EgressRule.inject — user code inside the sandbox never handles raw credentials. The CubeNetworkConfig protobuf message (formerly CubeVSContext) now carries L7 egress rules with match conditions (SNI, host, method, path, scheme) and actions (allow/deny, audit, inject). Credential material is redacted as ***REDACTED*** in CubeMaster safe-log output (#520).
  • Domain filtering (#518): Policy-driven allow/deny lists gate which destinations a sandbox may reach, evaluated first-match-wins against the L7 request. DNS queries are permitted even when domain-based allow-out rules are set (38fe997).
  • Access auditing (#518): Structured JSON logs of every egress request with optional body redaction via a redactor Lua module, enabling downstream compliance review.
  • Kernel 5.4 compatibility (38fe997): The security proxy runs on kernel v5.4+, expanding deployment coverage.
  • CubeVS fast-path hardening (#527): SYN-only packets are now rejected in the port-mapping BPF fast path, preventing guest-initiated connection attempts from bypassing egress policy.
  • TAP TX offload (#505): TX checksum/TSO offload and tx-tcp-mangleid-segmentation are enabled on TAP devices so redirected packets skip GSO before reaching the guest.
  • CubeEgress version reporting (9d76195): CubeEgress participates in the node component version matrix with build-time version metadata injection, a /admin/v1/health endpoint extension, release manifest entries, and cubelet-side file-based collection.
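The first-match-wins evaluation described under domain filtering can be illustrated with a minimal Python sketch. The Rule fields, the glob-style matching, and the evaluate helper are hypothetical names chosen for illustration; they are not the actual CubeEgress Lua modules or the EgressRule protobuf model, and the default action when no rule matches is an assumption.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical egress rule; a field left as None matches any value."""
    action: str                    # "allow" or "deny"
    host: Optional[str] = None     # glob pattern, e.g. "*.example.com"
    method: Optional[str] = None   # e.g. "GET"
    path: Optional[str] = None     # glob pattern, e.g. "/v1/*"

    def matches(self, host: str, method: str, path: str) -> bool:
        return (
            (self.host is None or fnmatch(host, self.host))
            and (self.method is None or method.upper() == self.method.upper())
            and (self.path is None or fnmatch(path, self.path))
        )


def evaluate(rules: Sequence[Rule], host: str, method: str, path: str,
             default: str = "deny") -> str:
    """Return the action of the first matching rule, else the default.

    The default-deny fallback is an assumption of this sketch.
    """
    for rule in rules:
        if rule.matches(host, method, path):
            return rule.action
    return default


rules = [
    Rule("deny", host="api.example.com", path="/admin/*"),
    Rule("allow", host="*.example.com"),
]
print(evaluate(rules, "api.example.com", "GET", "/admin/users"))  # deny
print(evaluate(rules, "api.example.com", "GET", "/v1/items"))     # allow
print(evaluate(rules, "other.test", "GET", "/"))                  # deny
```

Because the first matching rule decides, the narrower deny rule has to be listed before the broader allow rule; reversing the order would let the /admin/ request through.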

New files: CubeEgress/ (20 files — Lua modules, nginx config, Dockerfile, iptables scripts, systemd units, CA generation); CubeMaster/pkg/service/httpservice/cube/ca_download.go; CubeMaster/pkg/templatecenter/cube_egress_ca/; CubeMaster/pkg/templatecenter/cube_egress_ca_bake.go; DB migration 0005_cube_egress.sql.

Container Log Forwarding

Container init-process stdout/stderr is now streamed from the agent to the shim via a dedicated vsock connection and appended to log files on the host. A new cubecli cubebox logs subcommand lets operators read these logs from outside the sandbox.

  • Log streaming (#535): The shim injects a cube.container.log_forwarding=true annotation into the OCI spec, causing the agent to create stdout/stderr pipes (1 MiB buffer, O_NONBLOCK) for the init process. A dedicated vsock channel carries the log stream to the shim, which appends to /data/log/template/<id>/stdout|stderr during template builds and to ./stdout / ./stderr in the bundle directory for normal sandboxes. Log forwarding is cleanly cancelled before pause/snapshot/teardown, and pipe write fds are closed on process exit so readers receive EOF (#541). Exec I/O relay (FIFO-based) is kept separate from init log forwarding.
  • cubecli cubebox logs (#528): New subcommand to read container stdout/stderr from /data/cubelet/state/io.containerd.runtime.v2.task/default/<id>/stdout|stderr. Supports --tail N, --head N, --all, and --stderr flags. Since log files live inside the cubelet mount namespace, the command re-execs itself via the existing C constructor in pkg/cubemnt/nsenter.c to safely enter the namespace before any Go code runs. Includes openNoFollow() path validation hardened against symlink-following attacks.
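To make the symlink hardening and the --tail behaviour concrete, here is a small Python sketch of the same idea. It is not the cubecli implementation (which is Go and re-execs into the cubelet mount namespace first); open_no_follow and tail_lines are hypothetical helpers, and O_NOFOLLOW only guards the final path component, so the real openNoFollow() may validate more than this.

```python
import os
import tempfile
from collections import deque
from typing import List


def open_no_follow(path: str) -> int:
    """Open path read-only, refusing to follow a symlink at the last component."""
    return os.open(path, os.O_RDONLY | os.O_NOFOLLOW | os.O_CLOEXEC)


def tail_lines(path: str, n: int) -> List[str]:
    """Return the last n lines of the file, similar in spirit to --tail N."""
    fd = open_no_follow(path)
    with os.fdopen(fd, "r", encoding="utf-8", errors="replace") as f:
        # A bounded deque keeps only the most recent n lines while streaming.
        return list(deque(f, maxlen=n))


with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "stdout")
    with open(log, "w", encoding="utf-8") as f:
        f.write("a\nb\nc\n")
    link = os.path.join(d, "link")
    os.symlink(log, link)

    print(tail_lines(log, 2))
    try:
        tail_lines(link, 2)
    except OSError:
        print("refused to follow symlink")
```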

Node Component Version Matrix

A new version tracking infrastructure gives operators cluster-wide visibility of component versions across all nodes, with a dedicated Web UI page.

  • Version collection and matrix (#500): Cubelet collects component versions (guest-image, cube-agent, kernel, plus control-plane components from the release manifest) and reports them to CubeMaster, which maintains a version matrix in the node_component_version table (DB migration 0004). The matrix groups nodes by reported version for each component, surfaces version skew, and exposes summary and detail APIs through CubeAPI.
  • Standardized version injection (#493): All Go and Rust binaries now receive version, commit, and build-time metadata via ldflags / build.rs. A machine-readable release-manifest.json is generated in one-click release bundles so every artifact is traceable to the same release. The cubecli version and cubemastercli version output formats are unified across components.
  • Web UI Versions page (#500, #481): A new Versions.tsx page (762 lines) with i18n support (en/zh) shows per-component version distribution across nodes. The sidebar and Settings About section now display the actual release tag (injected at build time as __APP_VERSION__) instead of hardcoded versions.
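The grouping that the version matrix performs can be sketched in a few lines of Python. The data shapes and the build_matrix and skewed_components names are assumptions for illustration only; the actual logic lives in CubeMaster (Go) and is backed by the node_component_version table.

```python
from collections import defaultdict
from typing import Dict, List

# node name -> {component: reported version}; sample data is made up.
Reports = Dict[str, Dict[str, str]]


def build_matrix(reports: Reports) -> Dict[str, Dict[str, List[str]]]:
    """Group nodes by reported version: component -> version -> sorted nodes."""
    matrix: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for node, components in reports.items():
        for component, version in components.items():
            matrix[component][version].append(node)
    return {c: {v: sorted(nodes) for v, nodes in vs.items()}
            for c, vs in matrix.items()}


def skewed_components(matrix: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """A component is skewed when nodes report more than one version of it."""
    return sorted(c for c, versions in matrix.items() if len(versions) > 1)


reports = {
    "node-a": {"kernel": "6.6.119", "cube-agent": "0.4.0"},
    "node-b": {"kernel": "6.6.119", "cube-agent": "0.3.1"},
}
matrix = build_matrix(reports)
print(matrix["cube-agent"])       # {'0.4.0': ['node-a'], '0.3.1': ['node-b']}
print(skewed_components(matrix))  # ['cube-agent']
```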

New files: CubeMaster/pkg/nodemeta/versionmatrix.go; web/src/pages/Versions.tsx; web/src/locales/en/versions.json, zh/versions.json; DB migration 0004_node_component_version.sql.

Template Replica Compatibility

Template replicas are now checked against node component versions, with stale/missing replicas surfaced in both the API and Web UI.

  • Compatibility matrix and version binding (#510): The template compatibility system compares each template's bound component versions (guest-image, cube-agent, kernel) against what each node currently reports. Results are stored in template_versions (DB migration 0006) and exposed via /templates/compat (summary) and /templates/compat/{id} (per-template detail). Version binding management lets operators pin a template to specific component versions at creation time.
  • Web UI (#545): The template detail page now shows per-replica compatibility badges, version delta between bound and current component versions, and a stale-replica warning banner with a rebuild trigger. New components: CompatBadge, CompatSection, CompatWarning, CompatNodeCard, VersionDeltaList.
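As a rough illustration of the compatibility check, the sketch below compares a template's bound versions with what a node reports and derives a per-replica status. The function names and the "compatible" / "stale" / "missing" status strings are hypothetical; the release notes only state that stale and missing replicas are surfaced, not how they are encoded.

```python
from typing import Dict, Optional, Tuple

Versions = Dict[str, str]  # component -> version


def version_delta(bound: Versions,
                  reported: Versions) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {component: (bound, current)} for every component that differs."""
    return {
        component: (want, reported.get(component))
        for component, want in bound.items()
        if reported.get(component) != want
    }


def replica_status(bound: Versions, reported: Versions, has_replica: bool) -> str:
    """Classify one replica on one node (status names are assumptions)."""
    if not has_replica:
        return "missing"
    return "stale" if version_delta(bound, reported) else "compatible"


bound = {"guest-image": "0.4.0", "cube-agent": "0.4.0", "kernel": "6.6.119"}
node = {"guest-image": "0.4.0", "cube-agent": "0.3.1", "kernel": "6.6.119"}
print(version_delta(bound, node))         # {'cube-agent': ('0.4.0', '0.3.1')}
print(replica_status(bound, node, True))  # stale
```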

New files: CubeMaster/pkg/templatecenter/compat.go; CubeMaster/pkg/service/httpservice/cube/template_compat.go; DB migration 0006_template_replica_compat.sql.

Template Image Build Pipeline Overhaul

The template image build pipeline has been rearchitected to support daemonless operation via skopeo/umoci, with a 72% reduction in peak disk usage and file-level content deduplication.

  • Daemonless export path (#492, #506): When skopeo and umoci are available on the CubeMaster node, template images are pulled via skopeo copy into a local OCI layout and unpacked with umoci unpack --rootless, eliminating the Docker daemon requirement. When either tool is missing, the pipeline falls back to Docker for backward compatibility. The export strategy is chosen once at image resolution time so preparation and export stay consistent.
  • Artifact management (#506): A new job runner orchestrates the full pipeline (image export → rootfs artifact build → distribution), with redo support that can resume from the last completed phase. File-level content fingerprints (SHA256) enable artifact deduplication across builds, and artifact cleanup is managed through a structured lifecycle. Redo operations now carry the correct template ID through working requests (#544).
  • Disk usage optimization (#472): Peak disk usage during image-to-ext4 build is reduced from ~4.2× to ~1.2× image size through five complementary optimizations:
    1. Pipe-streamed export: Docker export stdout is connected directly to tar -xf stdin via a 1 MiB pipe (F_SETPIPE_SZ), eliminating the intermediate rootfs.tar file.
    2. Early workDir cleanup: The scratch workDir is removed immediately after the rootfs reaches the store directory, before ext4 creation begins.
    3. Precise ext4 sizing: Power-of-2 alignment is replaced with a triple-overhead model (fixed 256 MiB + 10% of data + 1 KiB per file), aligned to 256 MiB boundaries.
    4. Direct-to-storeDir export: On local fast filesystems (detected via statfs magic), the rootfs is exported directly into the store directory, skipping the workDir→storeDir relocate step. NFS/CIFS fall back to the relocate path to avoid cross-device copies.
    5. Disk-space pre-check: A fail-fast statfs check on the store directory parent ensures sufficient space before the build starts, with a configurable safety margin (CUBEMASTER_DISK_SPACE_SAFETY_MARGIN, default 1.5×).
      SHA256 computation uses a 4 MiB buffer to reduce read syscalls. A loop-mount streaming ext4 build phase (gated behind CUBEMASTER_LOOP_MOUNT_EXT4_ENABLED, default false) is also implemented with CAP_SYS_ADMIN detection.
  • SDK alignment (#485): CubeAPI POST /templates and Python/Go SDKs now expose DNS, egress CIDRs, registry auth, command/args, network type, and node scope options, matching the full cubemastercli template create-from-image option set.
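The precise ext4 sizing model from the disk usage optimization (#472) lends itself to a short worked example. The sketch below assumes the three overhead terms are added on top of the data size and the total is rounded up to the next 256 MiB boundary; ext4_size is a hypothetical name, and the real CubeMaster code may differ in rounding details.

```python
MIB = 1024 * 1024
ALIGN = 256 * MIB          # image sizes are aligned to 256 MiB boundaries
FIXED_OVERHEAD = 256 * MIB
PER_FILE_OVERHEAD = 1024   # 1 KiB per file


def ext4_size(data_bytes: int, file_count: int) -> int:
    """Estimate the ext4 image size: data + overhead, rounded up to ALIGN."""
    overhead = FIXED_OVERHEAD + data_bytes // 10 + PER_FILE_OVERHEAD * file_count
    total = data_bytes + overhead
    return -(-total // ALIGN) * ALIGN  # ceiling division, then scale back


# 1000 MiB of data in 50,000 files:
#   overhead = 256 MiB + 100 MiB + ~48.8 MiB, total ~1404.8 MiB
#   rounded up to the next multiple of 256 MiB = 1536 MiB
print(ext4_size(1000 * MIB, 50_000) // MIB)  # 1536
```

For comparison, power-of-2 alignment of the same ~1404.8 MiB total would yield a 2048 MiB image, which is the kind of over-allocation the new model avoids.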

New files: CubeMaster/pkg/templatecenter/image/ (export, ext4, disk, command, ref, source, types, paths, util); CubeMaster/pkg/templatecenter/artifact_build.go, artifact_cleanup.go, distribution.go, fingerprint.go, image_job_runner.go, job_constants.go, job_dto.go.

Network Performance

  • TAP fd acquisition optimization (#487): A three-tier GetTapFile strategy replaces the old single-path approach:

    • Fast path: When state.tap.File is already cached, return it immediately (0 syscalls).
    • Hot path: For pooled taps with a closed fd, reopen with just 2 syscalls (open + TUNSETIFF), skipping the expensive restoreTap flow (netlink lookup, LinkSetUp, SetMTU, TC filter attach, ARP entry).
    • Recovery path: Fall back to full restoreTap only when there is no in-memory state or the tap is held externally.

    The fdserver JSON response now includes the ifindex, allowing cubelet to skip its own netlink.LinkByName call — eliminating a serialization point during concurrent sandbox creation. Cubelet falls back to LinkByName only when ifindex is 0 (backward-compatible with older agents).

    A TOCTOU race between EnsureNetwork and ReleaseNetwork is fixed by replacing singleflight-style dedup with a per-sandbox creating guard channel registered in the same critical section as the state check. The change also adds a pprof debug server (--pprof-listen flag) and 390 lines of concurrency tests (6 functions, including a 64-goroutine stress test that runs clean under -race).

    Benchmarks (BMI5, Xeon Platinum 8255C, kernel 6.6.119): Network P50 35.3→23.1ms (35% faster), Network P99 86.6→51.2ms (41% faster), Total P50 106.1→92.0ms (13% faster), Throughput 194.8→209.8 sandboxes/s (8% higher).

  • BPF checksum optimization (#469): bpf_csum_diff() is replaced with bpf_{l3,l4}_csum_replace helpers in both from_world and from_cube BPF programs. Combined with the TAP TX offload work (#505), this enables TSO/UFO/CSUM offloads to be re-enabled on virtio-net TAPs (reverting #110), and the disableGRO() requirement on host NICs is dropped.
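The three-tier GetTapFile strategy (#487) is essentially a decision ladder. The Python sketch below shows only the path selection, not the syscalls; TapState, its fields, and choose_path are hypothetical stand-ins for the network-agent's Go state, and the precedence between a cached file and an externally held tap is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TapState:
    """Hypothetical in-memory view of one tap device."""
    cached_file: Optional[object] = None  # open fd wrapper, if still cached
    pooled: bool = False                  # tap came from the pre-created pool
    held_externally: bool = False         # another process owns the tap


def choose_path(state: Optional[TapState]) -> str:
    """Pick the cheapest path that can produce a usable tap fd."""
    if state is None:
        return "recovery"   # no in-memory state: full restoreTap
    if state.cached_file is not None:
        return "fast"       # 0 syscalls: return the cached file
    if state.pooled and not state.held_externally:
        return "hot"        # 2 syscalls: open + TUNSETIFF
    return "recovery"       # fall back to full restoreTap


print(choose_path(TapState(cached_file=object())))  # fast
print(choose_path(TapState(pooled=True)))           # hot
print(choose_path(None))                            # recovery
```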

✨ Enhancements

Scheduling

  • Configurable overcommit and Redis allocation bypass (#525): Two new scheduler configuration knobs are available: overcommit_ratio (default CPU=3, Mem=2), with optional per-instance-type overrides via overcommit_ratio_conf, and ignore_redis_allocation (default false), which treats Redis-recorded allocations as zero. Both are applied consistently across filter and score plugins, with non-positive ratios clamped back to defaults. Physical load guards (CPU utilization ceiling, real-time free memory) are intentionally preserved.
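A minimal sketch of how the two knobs could interact is shown below. The defaults (CPU=3, Mem=2) and the clamping of non-positive ratios come from the notes above; the dictionary shapes, the way per-instance-type overrides are merged, and the effective_ratio and schedulable names are assumptions, not the scheduler's actual Go code.

```python
from typing import Dict, Optional

DEFAULT_RATIO = {"cpu": 3.0, "mem": 2.0}  # defaults stated in the release notes


def effective_ratio(
    base: Dict[str, float],
    per_type: Optional[Dict[str, Dict[str, float]]] = None,
    instance_type: Optional[str] = None,
) -> Dict[str, float]:
    """Merge the per-instance-type override over the base ratio, then clamp."""
    merged = dict(base)
    if per_type and instance_type in per_type:
        merged.update(per_type[instance_type])
    # Non-positive or missing ratios fall back to the defaults.
    return {
        res: merged[res] if merged.get(res, 0) > 0 else default
        for res, default in DEFAULT_RATIO.items()
    }


def schedulable(physical: float, ratio: float, allocated: float,
                ignore_allocation: bool = False) -> float:
    """Remaining overcommitted capacity; allocations may be treated as zero."""
    used = 0.0 if ignore_allocation else allocated
    return physical * ratio - used


ratio = effective_ratio({"cpu": 0, "mem": 4.0})
print(ratio)                                                    # {'cpu': 3.0, 'mem': 4.0}
print(schedulable(8, ratio["cpu"], 10))                         # 14.0
print(schedulable(8, ratio["cpu"], 10, ignore_allocation=True)) # 24.0
```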

Affinity

  • Custom node affinity selector (#504, #467): The com.nodeaffinity.selector annotation now accepts arbitrary NodeSelectorRequirements (In, NotIn, Exists, DoesNotExist, Gt, Lt) as a JSON array of {key, operator, values}. Node labels from registration are carried through Node.NodeLabels, merged into Labels() with an atomic.Pointer cache and InvalidateLabelsCache() for mutation safety. DoS hardening: max annotation size 4 KB, 10 selectors per request, 50 values per In/NotIn. Configurable allowed keys default to zone, cluster-id, cpu-type, memory-size, cpu-cores, instance-type. 872 lines of tests covering 47 cases.
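The selector format and its DoS limits can be illustrated with a short Python sketch. The operator semantics here follow the Kubernetes NodeSelectorRequirement conventions (for example, NotIn also matches when the label is absent, and Gt/Lt compare a single integer value), which is an assumption about CubeSandbox's behaviour; parse_selector and node_matches are hypothetical helpers, and 4 KB is taken as 4096 bytes.

```python
import json
from typing import Dict, List

MAX_ANNOTATION_BYTES = 4 * 1024
MAX_SELECTORS = 10
MAX_VALUES = 50


def parse_selector(annotation: str) -> List[dict]:
    """Parse the JSON array and enforce the size limits from the notes."""
    if len(annotation.encode("utf-8")) > MAX_ANNOTATION_BYTES:
        raise ValueError("annotation too large")
    reqs = json.loads(annotation)
    if not isinstance(reqs, list) or len(reqs) > MAX_SELECTORS:
        raise ValueError("expected a JSON array of at most 10 selectors")
    for req in reqs:
        if req["operator"] in ("In", "NotIn") and len(req.get("values", [])) > MAX_VALUES:
            raise ValueError("too many values")
    return reqs


def _match_one(labels: Dict[str, str], req: dict) -> bool:
    key, op, values = req["key"], req["operator"], req.get("values", [])
    present = key in labels
    if op == "In":
        return present and labels[key] in values
    if op == "NotIn":
        return not present or labels[key] not in values
    if op == "Exists":
        return present
    if op == "DoesNotExist":
        return not present
    if op in ("Gt", "Lt"):
        if not present or len(values) != 1:
            return False
        try:
            have, want = int(labels[key]), int(values[0])
        except ValueError:
            return False
        return have > want if op == "Gt" else have < want
    raise ValueError(f"unknown operator: {op}")


def node_matches(labels: Dict[str, str], reqs: List[dict]) -> bool:
    """All requirements must hold (logical AND)."""
    return all(_match_one(labels, req) for req in reqs)


annotation = json.dumps([
    {"key": "zone", "operator": "In", "values": ["zone-a", "zone-b"]},
    {"key": "cpu-cores", "operator": "Gt", "values": ["16"]},
])
reqs = parse_selector(annotation)
print(node_matches({"zone": "zone-a", "cpu-cores": "32"}, reqs))  # True
print(node_matches({"zone": "zone-c", "cpu-cores": "32"}, reqs))  # False
```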

Template Management

  • tpl- prefix enforcement (#474): Template IDs are now always auto-generated with a tpl- prefix across all creation paths (API, CLI, Web UI, sandbox commit). User-specified IDs are accepted for backward compatibility but silently ignored — the server always returns an auto-generated tpl- prefixed ID as the authoritative template identifier. Validation rejects bare tpl- / snap- prefixes and non-conforming annotation prefixes.
  • Builder image downgrade to ubuntu:20.04 (#468): The builder base image is changed from ubuntu:22.04 to ubuntu:20.04, lowering the minimum glibc requirement from 2.34 to 2.31. Affects Dockerfile.builder, one-click installer preflight checks, CI workflows, and documentation.

Web UI

  • Template policy display (#486): The template detail page now shows environment variables, network type, internet access, DNS servers, allow-out rules, and deny-out rules parsed from createRequest. A dedicated "Network Policy" section includes per-rule copy buttons. A BoolBadge component is extracted as a shared UI primitive.
  • CubeAPI container image (#513): A container build for the cube-api service produces a self-contained runtime image suitable for one-click and orchestrated deployments, with a lean build context.

SDK

  • Python SDK v0.3.0 (#521): Bump to 0.3.0 with new APIs for security proxy configuration.

PVM

  • Kernel LOCALVERSION rename (#511, #534): The PVM host and guest kernel LOCALVERSION is renamed to a clean descriptive scheme so the distribution base and host/guest role are obvious from uname -r. Deployment configs, user-facing guides, and blog references are updated to match.

🐛 Bug Fixes

These fixes address issues present in v0.3.1:

  • Virtiofs config skipped when shareDirs is empty (#533): Cubelet no longer generates virtiofs configuration or annotations when no shared directories are specified, preventing broken config generation.
  • DNS server IP automatically added to AllowOut (#526): When any DNS rule is configured, the DNS server IP is now added to AllowOut to ensure DNS resolution works through egress policy. Includes regression test coverage.
  • Cubelog nil trace panic (#512): Background workers and detached job contexts that run without a request trace no longer panic on nil dereference — trace handling is now tolerant of a missing trace.
  • Storage symlink resolution in host-dir cleanup (#530): cleanupHostDirVolumes now resolves base-path symlinks when walking sandbox directories, so bind mounts under paths like /data → /mnt/ssd/data are correctly identified and unmounted instead of leaking or having their backing directories wiped.
  • Network plugin bootstrap warnings (#491): Cubelet startup no longer logs valid network configuration keys as "unknown TOML fields" — the existing config struct is now reused when reading bootstrap overrides.
  • DNS not auto-allowed when internet is disabled (#490): When AllowInternetAccess=false, resolved DNS servers are no longer appended to allow_out, so the deny-all outbound policy consistently blocks DNS resolution. Fixes #408.
  • Ripgrep dependency removed from one-click runtime (#496): The one-click install and startup path no longer requires or auto-installs ripgrep. Shell checks now use grep-based helpers.
  • Virtiofs migration_on_error set to GuestError (#482): The native virtiofs server now uses MigrationOnError::GuestError instead of Abort. Per-inode failures during snapshot restore surface as guest FS errors (ENOENT/EIO) on the affected paths rather than tearing down the entire live migration.
  • VMM virtio-fs queue fault tolerance (#464): process_queue_serial() no longer panics on malformed descriptors. Failures are recovered by writing an EIO FUSE error reply to the guest and continuing to serve the queue. A new device_memory view is added for device-backed memory regions (virtio-pmem, virtio-fs DAX, ivshmem/zshm BARs).
  • Cgroup v2 manager creation (#488): The agent now uses the cgroup v2 creation path from cgroups-rs and attaches container processes through cgroup.procs, avoiding v1 controller name failures in unified cgroup mode. Process ID collection for cleanup and signals also reads from cgroup.procs.
  • Node health expiry on stale heartbeat (#455): Node health is now derived from heartbeat freshness — stale heartbeats are correctly reported as unhealthy in nodemeta reads, localcache-backed reads, and scheduler prefilter. A shared helper centralizes the timeout rule across all three paths.
  • SELinux context restore after one-click install (#471): File contexts under the install prefix are now restored before starting systemd services, fixing one-click installs on SELinux Enforcing hosts. Fixes #465.
  • Glibc preflight pipefail race (#473): The ldd --version output is now fully captured before parsing, preventing strict-mode preflight checks from exiting on an expected SIGPIPE.
  • Python SDK streaming request body read (377a99d): Request bodies in IPOverrideTransport are now buffered before copying, so multipart uploads no longer fail with RequestNotRead.
  • CLI help text corrections (#478): Fixed incorrect command names (e.g., cuebcli → cubecli), spelling mistakes, outdated deprecation hints, and truncated descriptions in both cubecli and cubemastercli.

📚 Documentation

  • DEB install instructions (#532): Added apt (DEB) install instructions alongside existing yum (RPM) steps for Python SDK setup in the Quick Start guide.
  • Benchmark blog env var fixes (#497): Fixed benchmark setup examples that mixed environment variables from different client stacks — E2B variables for e2b_code_interpreter examples, CUBE_API_URL + CubeProxy settings for CubeSandbox SDK examples.
  • CNCF Landscape badge (#477): Added CNCF Landscape badge and footer note to README in both English and Chinese.
  • Template ID documentation cleanup (#476): Removed all --template-id flags from create-from-image documentation and examples since template IDs are now auto-generated with tpl- prefix.
  • Install guide links in benchmark posts (#475): Added installation guide callouts to the §2.1 Hardware section of all four benchmark blog posts (EN + ZH, bare-metal + PVM).
  • Troubleshooting links (#466): Added GitHub issue #311 troubleshooting URL to XFS filesystem check error messages in install.sh, online-install.sh, and check-deps.sh. Updated install docs to use direct links to the Releases page.
  • CODEOWNERS (#522): Added CubeEgress maintainer entry.

⚙️ Engineering Improvements

  • Build system reorganization (#529): Per-target .PHONY declarations replace the single bulk list. A new clean-rust-target-dirs target removes target/ under each top-level Rust workspace. The all target is driven from a shared BINARIES list.
  • Format check CI (#524): fmt targets are added to all component Makefiles (Go and Rust), with a new .github/workflows/fmt-check.yml CI workflow that runs format checking on PRs. The agent's fmt target automatically generates required files (version.rs, protocol .rs) before formatting.
  • CI review-comment via stdin (#494): PR review comments are now passed via stdin (--body-file -) instead of temp files, keeping review content out of the checkout directory.
  • CI auto-review comment reuse (#489): Automated review comments now update the bot's existing marked comment on repeated PR synchronizations instead of creating new top-level comments each time.
  • Metric report jitter (#479): The Cubelet CLS metric report loop now adds random jitter, drawing each interval uniformly from [t, 1.5t], to prevent thundering-herd load when multiple agents start concurrently.
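The jitter rule from #479 amounts to drawing each report interval uniformly from [t, 1.5t]. A minimal Python sketch, assuming t is the configured base interval in seconds (jittered_interval is a hypothetical name; the Cubelet implementation is Go):

```python
import random


def jittered_interval(t: float) -> float:
    """Return an interval drawn uniformly from [t, 1.5 * t]."""
    return random.uniform(t, 1.5 * t)


# With a 10 s base interval, every sample falls between 10 s and 15 s,
# so agents that start together drift apart over successive reports.
samples = [jittered_interval(10.0) for _ in range(5)]
print(all(10.0 <= s <= 15.0 for s in samples))  # True
```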