feat(template): optimize disk usage peak during image-to-ext4 build#472
Conversation
Reduce peak disk usage from ~4.2X image size down to ~1.2X (72% reduction)
via five complementary optimisations:
1. Pipe-streamed export (eliminate rootfs.tar intermediate file):
- Connect docker export stdout directly to tar -xf stdin
- Proper subprocess lifecycle: kill+wait tar on export failure
- 1 MiB pipe buffer via F_SETPIPE_SZ for throughput
- Context.Background() for cleanup to survive request cancellation
2. Early workDir cleanup:
- Remove workDir immediately after rootfs reaches storeDir,
before ext4 creation, instead of defer-to-end
3. Precise ext4 sizing with file-count overhead:
- Replace power-of-2 alignment with 256 MiB boundary
- Triple overhead model: fixed (256 MiB default) + percentage
(10% default) + per-file (1 KiB/file)
- Single-pass directorySizeAndFileCount avoids extra I/O
4. Direct-to-storeDir export (local fast filesystems):
- Detect NFS/CIFS via statfs magic and fall back to
workDir+relocate only for network filesystems
- Eliminates the relocate step and potential cross-device copy
5. Disk-space pre-check (fail-fast):
- Statfs on storeDir parent before build starts
- Configurable safety margin (CUBEMASTER_DISK_SPACE_SAFETY_MARGIN,
default 1.5x)
- Uses docker image inspect --format {{.Size}} for estimation
Phase 2 (loop-mount streaming ext4 build) is also implemented,
gated behind CUBEMASTER_LOOP_MOUNT_EXT4_ENABLED (default false)
with automatic capability detection (CAP_SYS_ADMIN, /dev/loop-control).
Security hardening:
- Reject image references starting with "-" in normalisation
- Add "--" separator to all docker/mount/umount/losetup invocations
- Mount with nosuid,noexec,nodev,noatime for loop devices
- Use sync.Once for idempotent umount and losetup detach
Additional improvements:
- SHA256 computation uses 4 MiB buffer to reduce read syscalls
- Cleanup failures are logged at WARN level instead of silent discard
- Configurable ext4 overhead via CUBEMASTER_EXT4_OVERHEAD_PERCENT
and CUBEMASTER_EXT4_FIXED_OVERHEAD_MIB
All existing tests pass. Verified on x86_64 host with alpine (~8 MiB),
mysql:8.0 (~786 MiB), golang:1.22 (~754 MiB) and rust:1.77 (~1.32 GiB)
images. Peak disk increment measured at 3.92 GiB for a 1.32 GiB
compressed image (rust:1.77-bullseye) including docker pull, compared
to an estimated ~13.9 GiB under the previous power-of-2 alignment.
Signed-off-by: jinlong <jinlong@tencent.com>
|
Claude finished @fslongjin's task in 1m 59s —— View job Code Review: feat(template): optimize disk usage peak during image-to-ext4 build
Tasks:
OverviewThis PR significantly reduces peak disk usage during template image-to-ext4 builds from ~4.2× image size down to ~1.2× (72% reduction) through a combination of pipe-streamed exports, early cleanup, precise ext4 sizing, direct-to-store exports, a pre-flight disk space check, and an optional loop-mount streaming path. It also adds security hardening across docker/mount invocations. The code is generally well-structured with clear comments, good error messages, and appropriate use of Go idioms (deferred cleanup, Security: ✅ Overall solid improvements
Issues Found1. Dead code:
|
| Metric | Old (power-of-2) | New | Improvement |
|---|---|---|---|
| Peak disk (5 GiB image) | ~20.8 GiB (4.2X) | ~3.9 GiB (~1.2X) | ↓ 72% |
This is a high-quality PR with measurable improvements and good attention to security. The main actionable items are removing the dead code and fixing the duplicated comment. The rest are design observations that the maintainers can weigh.
Tasks:
- Read changed files and understand the changes
- Review Phase 1 optimizations (pipe-export, early cleanup, ext4 sizing)
- Review Phase 2 loop-mount streaming build
- Review security hardening
- Review new helper functions (checkDiskSpace, isLocalFastFS, canUseLoopMount, etc.)
- Check for potential issues, dead code, or bugs
- Post comprehensive review feedback
| Branch: fix-cubemaster-tpl-create-disk-usage
…ation
- Remove unused dockerRootFS struct and RootFS field (estimateImageSizeFromInspect
now uses docker image inspect --format {{.Size}} directly)
- Deduplicate pipe logic: exportImageRootfs now calls pipeExportToDir instead of
duplicating the same docker export | tar -xf - pipeline
- Fix isLocalFastFS to fall back to parent directory on ENOENT (consistent with
checkDiskSpace), and add FUSE_SUPER_MAGIC (0x65735546) detection
- Use strconv.ParseBool in loopMountExt4Enabled for broader boolean support
(true/TRUE/1/false/0)
- Delete dead code: cleanupIntermediateArtifacts (logic is now inline in
buildRootfsArtifact defer closure)
- Replace TestCleanupIntermediateArtifacts* tests with:
TestIsLocalFastFSFallsBackToParentForMissingArtifactDir
TestLoopMountExt4EnabledParsesBoolValues
- Fix alignment of JobPhase constant declarations
- Fix error message in pipeExportToDir to include destDir parameter
Signed-off-by: jinlong <jinlong@tencent.com>
|
Claude finished @fslongjin's task in 2m 49s —— View job Re-review Results: All Previous Issues Resolved ✅Thanks for the updates! I've reviewed the commit ✅ Resolved Issues1. Dead code removed — 2. Dead struct fields removed — 3. Duplicated comment fixed — The duplicate opening sentence in 4. 5. 6. Pipe-logic deduplicated — 7. ✅ Updated TestsThe test changes match the production code changes:
The new Code QualityThe indentation fix in SummaryAll previously-identified issues have been addressed. The PR is in good shape — clean, well-structured, and well-tested. No further issues found in this re-review. |

Summary
Reduce peak disk usage from ~4.2X image size down to ~1.2X (72% reduction) during template image-to-ext4 builds.
Changes
Phase 1 (default on)
rootfs.tarfile by connectingdocker export | tar -xf -via Go pipePhase 2 (opt-in, default off)
CUBEMASTER_LOOP_MOUNT_EXT4_ENABLED=true, auto-detects CAP_SYS_ADMIN and /dev/loop-controlSecurity hardening
-prefixed image references--separator to all docker/mount/umount/losetup invocationsnosuid,noexec,nodev,noatimefor loop devicessync.Oncefor idempotent umount and losetup detachOther improvements
Verification
Tested with alpine (~8 MiB), mysql:8.0 (~786 MiB), golang:1.22 (~754 MiB), rust:1.77 (~1.32 GiB) on x86_64.
All existing unit tests pass.
🤖 Generated with Claude Code