feat(guest-image): shrink ext4 image after creation and optimize Dockerfile#347
Conversation
…erfile - Add shrink_ext4_image to minimize guest image size with configurable headroom - Strip runtime-only packages from Dockerfile (util-linux, busybox) and clean caches - Expose ONE_CLICK_GUEST_IMAGE_RESERVED_BYTES env variable for headroom tuning Signed-off-by: jinlong <jinlong@tencent.com>
|
Claude finished @fslongjin's task in 1m 13s —— View job Code Review: PR #347 — feat(guest-image): shrink ext4 image after creation and optimize DockerfileI've reviewed all three changed files. Overall this is a well-structured PR with good error handling and clear intent. Below are my findings. Summary
deploy/guest-image/Dockerfile — ✅ Looks goodThe Dockerfile changes are solid:
Nit: The comment on line 7 says busybox provides deploy/one-click/build-vm-assets.sh — Findings1.
|
…ndling - Optimize run_as_root to try without sudo first for better performance - Add dump_ext4_header to ensure C locale for dumpe2fs parsing - Add defensive check against shrinking below filesystem minimum size - Clarify busybox limitations in Dockerfile comments Signed-off-by: jinlong <jinlong@tencent.com>
|
Claude finished @fslongjin's task in 1m 41s —— View job Code Review: PR #347 — feat(guest-image): shrink ext4 image after creation and optimize DockerfileI've reviewed all three changed files in detail. The second commit ( deploy/guest-image/DockerfileThe Dockerfile changes are clean and well-motivated. Removing build-time tools (
deploy/one-click/build-vm-assets.shFinding 1:
|
| Code | Meaning |
|---|---|
| 0 | No errors |
| 1 | Errors corrected |
| 2 | Errors corrected, system should be rebooted |
| 4 | Errors left uncorrected |
| 8 | Operational error |
The || true swallows all non-zero exits, including exit code 4 (uncorrected errors). If e2fsck encounters corruption it cannot fix, execution continues to resize2fs -M on a corrupt FS. The second e2fsck at line 320 offers a safety net, but operating on a corrupt filesystem with resize2fs risks data loss.
Recommendation: Check for exit code >= 4 and abort:
run_as_root e2fsck -fy "${img}" >&2 || {
rc=$?
if (( rc >= 4 )); then
die "e2fsck encountered uncorrectable errors (exit ${rc}) on ${img}"
fi
}Finding 2: run_as_root stderr silencing can hide non-permission failures
Line 254: "$@" 2>/dev/null
The first (non-sudo) attempt silences all stderr, not just permission errors. If the command fails for a reason other than EACCES (e.g., invalid arguments, wrong image format), the error is hidden. The sudo fallback then runs and fails with its own error, which may be less informative than the original error message.
This is a minor concern since the commands in this file are well-tested system tools. But if someone debugs a failure here, the hidden stderr could be confusing.
Recommendation: Not strictly necessary, but if you want to improve it, capture stderr from the first attempt and log it if the command fails:
local errfile; errfile="$(mktemp)"
"$@" 2>"${errfile}"
rc=$?
if [[ ${rc} -eq 0 ]]; then
rm -f "${errfile}"
return 0
fi
# First attempt failed — show its stderr before retrying with sudo
[[ -s "${errfile}" ]] && log "info: first attempt stderr: $(< "${errfile}")" >&2
rm -f "${errfile}"
require_cmd sudo
sudo "$@"Finding 3: shrink_ext4_image error message lacks debugging context
Line 298:
die "failed to parse ext4 metadata from ${img}"If this triggers, the operator has no information about what dumpe2fs actually returned. Including the raw output (or part of it) would aid debugging:
die "failed to parse ext4 metadata from ${img}: block_size=${block_size:-<unset>} min_blocks=${min_blocks:-<unset>}"Or log the full dumpe2fs output ahead of time:
dumpe2fs_out="$(dump_ext4_header "${img}")" || die "dumpe2fs failed on ${img}"Finding 4: (Minor) The initial 64MB headroom is wasted when followed by shrink
In calculate_guest_image_size_bytes (line 222-223), 64MB is reserved above the raw file-size estimate before mkfs.ext4 creates the image. This is intended to give mkfs.ext4 -d enough space for metadata overhead.
However, shrink_ext4_image then runs resize2fs -M, which compacts the FS to its absolute minimum — discarding that 64MB. The final image gets 32MB from the new SHRINK_RESERVED_BYTES.
The net effect is correct (32MB headroom in the final image), but the shrink step has to scan a much larger image (256MB/512MB step) to get there. If performance of the shrink step matters, you could reduce calculate_guest_image_size_bytes's reserved_bytes now that the shrink step provides its own headroom. Not a correctness issue, just an efficiency observation.
Adjusted Finding: truncate ordering is correct
The previous review noted a concern about truncating before resize2fs. Looking at the current code:
resize2fs -Mshrinks FS tomin_blocks(inside a still-large file)truncate -s final_bytestrims the file tomin_bytes + reserved_bytesresize2fs img target_blocksextends the FS to fill the trimmed file
The min_bytes <= final_bytes assertion on line 309 guarantees no data is truncated. This is correct.
Adjusted Finding: run_as_root flexibility
The previous review noted that run_as_root is less flexible than run_mkfs_ext4_with_optional_sudo. Commit 5a7d73f addressed this by making run_as_root try without sudo first — this is now equivalent in behavior to the existing pattern. Good fix.
Adjusted Finding: LC_ALL=C locale handling
Commit 5a7d73f added dump_ext4_header with LC_ALL=C. This fix properly addresses the original concern. Good.
deploy/one-click/env.example
Clean documentation. The variable is well-named (ONE_CLICK_GUEST_IMAGE_RESERVED_BYTES) and consistent with the existing ONE_CLICK_* naming convention.
Summary
| Category | Count |
|---|---|
| Potential issues | 2 |
| Suggestions | 2 |
| Previously addressed issues | 3 |
| Correct patterns | 4+ |
Key strengths of this PR:
e2fsckbefore and after the shrink — excellent safety practice- Sparse file documentation — clarifies trade-offs for downstream packagers
- Configurable reserved bytes via env variable — flexible without hardcoding
- Defensive
min_bytes <= final_bytesassertion — prevents silent data truncation - Second commit (
5a7d73f) responded well to earlier review feedback
The two issues I'd prioritize are: (1) handling e2fsck exit code 4+ with a hard abort rather than silent continuation, and (2) making the metadata parse error message more actionable.
|
Item 3 (sudo escalation): Fixed. Item 4 (locale-fragile parsing): Fixed. Added a Dockerfile nit: Comment updated to call out that busybox applets are reimplementations without every GNU extension, and to note the current startup path doesn't rely on those. Item 1 (exit code asymmetry): Adjusted for readability — both branches now Item 2 (truncate before resize2fs): Added a Item 5 (sparse): Added an inline comment documenting the sparse output. No behavior change needed — the current tar.gz pipeline produces a 261MB release. Item 6 (double headroom 96MB): Doesn't hold. The 64MB in |
Summary
This PR shrinks the one-click guest image
cube-guest-image-cpu.imgby removing runtime-only packages from the guest rootfs and shrinking the generated ext4 image after creation.Motivation
The guest image used to be around
768MB, while the actual rootfs contents were much smaller. That left a large amount of unused space in the released package and on disk.What changed
deploy/guest-image/Dockerfileto keep only the runtime essentials.yum clean alland removed cache/docs/man/locale/temporary files during image build.deploy/one-click/build-vm-assets.sh.ONE_CLICK_GUEST_IMAGE_RESERVED_BYTESand defaulted it to32MB.Results
Measured on the built package:
cube-guest-image-cpu.img:245MB205MB41MB261MBThis reduces the guest image size from roughly
768MBto245MB.