image/compress-checksum: maximize CPU + memory use#9758
Conversation
📝 Walkthrough

output_images_compress_and_checksum() now probes host CPU/load/memory, counts concurrent compressor jobs, computes a per-job memory budget and total input size, selects elastic xz/zstd compression levels, and applies per-file thread caps and memory-aware compression parameters while logging decisions and compression metrics.

Changes: Resource-Aware Compression Planning and Execution
Sequence Diagram

```mermaid
sequenceDiagram
    actor Script
    participant Host as "Host (nproc, ps)"
    participant Mem as "/proc/meminfo"
    participant Planner as "Elastic Level Planner"
    participant Images as "Image Files"
    participant xz as "xz"
    participant zstd as "zstd/zstdmt"
    Script->>Host: query nproc & count active xz/zstd jobs
    Script->>Mem: read MemAvailable / MemTotal
    Script->>Planner: compute mem_budget_mb, total input size
    Planner->>Planner: select xz_elastic_level, zstd_elastic_level
    Planner-->>Script: log resource share and selection trace
    loop for each image
        Script->>Images: stat file size
        Script->>Planner: compute file_threads, peak_mem_mb
        Script->>xz: run with -T ${file_threads} and -${xz_level}
        xz-->>Script: compressed chunk
        Script->>zstd: run zstdmt --long=27 -T ${compress_threads} [--ultra?]
        zstd-->>Script: compressed output
        Script->>Script: delete input, compute elapsed & throughput, log result
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Force-pushed cf53f27 to cb2b882
Force-pushed cb2b882 to 4c394dd
🧹 Nitpick comments (1)

lib/functions/image/compress-checksum.sh (1)

Lines 43-69: ⚡ Quick win: DRY violation — combine level selection and trace building into a single loop.

The xz level selection (lines 43-50) and trace building (lines 56-69) iterate over the same `9:674 6:94 3:32 1:9` list with nearly identical logic. If the memory values are updated later, both loops must be kept in sync.

♻️ Proposed refactor to single loop:

```diff
-	# Pick the strongest xz preset whose footprint (compress_threads * per_thread_MB)
-	# fits the per-job memory budget. Per-thread mem from xz manpage.
-	declare xz_elastic_level="1"
-	for lvl_mem in 9:674 6:94 3:32 1:9; do
-		lvl="${lvl_mem%:*}"
-		pt="${lvl_mem#*:}"
-		if (( compress_threads * pt <= mem_budget_mb )); then
-			xz_elastic_level="$lvl"
-			break
-		fi
-	done
-
-	# zstd: bump to --ultra -22 only when there's clear memory headroom.
-	declare zstd_elastic_level="19"
-	(( mem_budget_mb >= 2048 )) && zstd_elastic_level="22"
-
-	# Trace which xz levels were considered against the budget; useful when
-	# the picked level seems surprising in a build log.
-	declare xz_walk_trace=""
-	for lvl_mem in 9:674 6:94 3:32 1:9; do
-		lvl="${lvl_mem%:*}"
-		pt="${lvl_mem#*:}"
-		need=$(( compress_threads * pt ))
-		if (( need <= mem_budget_mb )); then
-			xz_walk_trace+="-${lvl}:${need}MB<=budget OK; "
-			break
-		else
-			xz_walk_trace+="-${lvl}:${need}MB>budget skip; "
-		fi
-	done
+	# Pick the strongest xz preset whose footprint (compress_threads * per_thread_MB)
+	# fits the per-job memory budget. Per-thread mem from xz manpage.
+	# Also build a trace of considered levels for diagnostics.
+	declare xz_elastic_level="1" xz_walk_trace=""
+	for lvl_mem in 9:674 6:94 3:32 1:9; do
+		lvl="${lvl_mem%:*}"
+		pt="${lvl_mem#*:}"
+		need=$(( compress_threads * pt ))
+		if (( need <= mem_budget_mb )); then
+			xz_elastic_level="$lvl"
+			xz_walk_trace+="-${lvl}:${need}MB<=budget OK; "
+			break
+		else
+			xz_walk_trace+="-${lvl}:${need}MB>budget skip; "
+		fi
+	done
+
+	# zstd: bump to --ultra -22 only when there's clear memory headroom.
+	declare zstd_elastic_level="19"
+	(( mem_budget_mb >= 2048 )) && zstd_elastic_level="22"
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/functions/image/compress-checksum.sh` around lines 43 - 69, The two loops that iterate over "9:674 6:94 3:32 1:9" should be merged into one so level selection and trace building stay in sync: in a single for loop over lvl_mem (used by xz_elastic_level and xz_walk_trace) extract lvl and pt as before, compute need=$(( compress_threads * pt )), append the appropriate trace fragment ("-${lvl}:${need}MB<=budget OK; " or "-${lvl}:${need}MB>budget skip; "), and when need <= mem_budget_mb set xz_elastic_level="$lvl" and break; preserve existing variables compress_threads, mem_budget_mb, and that zstd_elastic_level logic remains unchanged.
The default For The "resource share" heuristic looks at
Force-pushed 4c394dd to afc1e8c
@iav — good catch, agree the elastic decision was memory-only and over-escalated on weak ARM hosts. Pushed a fix in
Behaviour at the cases you raised, plus a few sanity points:

- The 2-core cases now max at xz
- The xz level-walk trace logged before each compression phase now shows both the memory and CPU constraint per preset.

So when something looks surprising on a given host, the reason is in the build log without an ssh trip.
Actionable comments posted: 1
🧹 Nitpick comments (1)

lib/functions/image/compress-checksum.sh (1)

Lines 50-88: ⚡ Quick win: Preset table embedded twice — divergence risk when updating levels.

The selection loop (lines 50–59) and the trace loop (lines 72–88) parse the same `9:674:8 6:94:4 3:32:2 1:9:1` table and apply identical logic. If a new level or adjusted per-thread memory value is added to one loop, the other silently diverges, and the logged trace will no longer match the actual selection. Merging into a single pass (trace everything, set `xz_elastic_level` on first fit) eliminates the duplication.

♻️ Proposed refactor: single combined loop

```diff
-	declare xz_elastic_level="1"
-	for entry in 9:674:8 6:94:4 3:32:2 1:9:1; do
-		lvl="${entry%%:*}"
-		rest="${entry#*:}"
-		pt="${rest%%:*}"
-		floor="${rest##*:}"
-		if (( compress_threads >= floor )) && (( compress_threads * pt <= mem_budget_mb )); then
-			xz_elastic_level="$lvl"
-			break
-		fi
-	done
-
-	# ...zstd block stays unchanged...
-
-	declare xz_walk_trace=""
-	for entry in 9:674:8 6:94:4 3:32:2 1:9:1; do
-		lvl="${entry%%:*}"
-		rest="${entry#*:}"
-		pt="${rest%%:*}"
-		floor="${rest##*:}"
-		need=$(( compress_threads * pt ))
-		if (( compress_threads < floor )); then
-			xz_walk_trace+="-${lvl}:cpu<${floor} skip; "
-			continue
-		fi
-		if (( need <= mem_budget_mb )); then
-			xz_walk_trace+="-${lvl}:${need}MB<=budget OK; "
-			break
-		else
-			xz_walk_trace+="-${lvl}:${need}MB>budget skip; "
-		fi
-	done
+	declare xz_elastic_level="1" xz_walk_trace=""
+	for entry in 9:674:8 6:94:4 3:32:2 1:9:1; do
+		lvl="${entry%%:*}"
+		rest="${entry#*:}"
+		pt="${rest%%:*}"
+		floor="${rest##*:}"
+		need=$(( compress_threads * pt ))
+		if (( compress_threads < floor )); then
+			xz_walk_trace+="-${lvl}:cpu<${floor} skip; "
+			continue
+		fi
+		if (( need <= mem_budget_mb )); then
+			xz_walk_trace+="-${lvl}:${need}MB<=budget OK; "
+			xz_elastic_level="$lvl"
+			break
+		else
+			xz_walk_trace+="-${lvl}:${need}MB>budget skip; "
+		fi
+	done
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/functions/image/compress-checksum.sh` around lines 50 - 88, The xz preset table is duplicated across two loops which can diverge; consolidate into a single loop that iterates the same entries (e.g. "for entry in 9:674:8 6:94:4 3:32:2 1:9:1"), computes lvl/pt/floor and need once, appends the same trace snippets to xz_walk_trace, and sets xz_elastic_level to lvl on the first entry that satisfies (( compress_threads >= floor )) && (( need <= mem_budget_mb )) then break; remove the separate selection loop so xz_elastic_level and xz_walk_trace are derived from the same pass (use the existing variable names compress_threads, mem_budget_mb, xz_elastic_level, xz_walk_trace to locate and update the code).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/functions/image/compress-checksum.sh`:
- Around line 155-161: The unconditional addition of --long=27 to zstd_args
forces a 128MB decoder window for all compressed images; modify the logic in the
compression block (where zstd_args, zstdmt and zstd_level are used) to only
append --long=27 for high compression levels (e.g., when zstd_level >= 19) so
that lower levels (like 9–18) keep the smaller default window; keep the existing
--ultra handling (which already gates levels >=20) and ensure the conditional
mirrors that threshold to avoid increasing decompressor memory requirements for
low/medium levels.
@igorpecovnik Thanks for the quick turnaround — the CPU-floor table makes the decision logic clear and flexible.
✅ This PR has been reviewed and approved — all set for merge!
Replace static `xz -T 0` / `zstdmt -9` with a CPU- and memory-aware
share so a single runner alone on a big box uses every core, peers
share fairly without OOMing each other, and small sibling artefacts
don't reserve per-thread memory they can't use.
Threads = full nproc (kernel time-slices when peer xz/zstd jobs contend;
we get the cores when they idle). Memory is fair-shared among peers:
budget = MemAvailable * 0.6 / (active+1).
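The fair-share budget above can be sketched as a small helper. This is a minimal sketch, not the script's actual code; the function name `compute_mem_budget_mb` is hypothetical, and `MemAvailable` is assumed to be read from /proc/meminfo in kB:

```bash
#!/usr/bin/env bash
# Sketch of the fair-share memory budget: 60% of MemAvailable, split evenly
# between this job and the peer xz/zstd jobs already running.
compute_mem_budget_mb() {
	local mem_available_kb="$1" active_jobs="$2"
	# Integer arithmetic: kB -> MB, scaled by 0.6, divided among active+1 jobs.
	echo $(( mem_available_kb * 6 / 10 / 1024 / (active_jobs + 1) ))
}

# Example: 8 GB available, one peer already compressing -> 2457 MB each.
compute_mem_budget_mb $(( 8 * 1024 * 1024 )) 1
```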
The level walk requires BOTH a memory ceiling and a CPU floor — without
the floor, a 2-core ARM box with a few GB free would have escalated to
xz -6 / zstd -22 --ultra, which is much heavier on weak CPUs than the
old static defaults. Floors map per-thread compress throughput to
roughly equivalent wall-time bands:
xz -9 needs 8 threads + memory
xz -6 needs 4 threads + memory
xz -3 needs 2 threads + memory
xz -1 any (final fallback)
zstd -22 --ultra 16 threads, 4 GB budget
zstd -19 8 threads, 2 GB budget
zstd -12 4 threads
zstd -9 any (matches the old static default)
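The xz half of the walk above can be sketched as a single loop that enforces both constraints. Variable names and the `level:perThreadMB:cpuFloor` entry format follow the review discussion; treat the exact values as assumptions:

```bash
#!/usr/bin/env bash
# Sketch: pick the strongest xz preset that satisfies BOTH the CPU floor and
# the per-job memory budget, tracing every decision for the build log.
pick_xz_level() {
	local compress_threads="$1" mem_budget_mb="$2"
	local level="1" trace="" entry lvl rest pt floor need
	for entry in 9:674:8 6:94:4 3:32:2 1:9:1; do # level:perThreadMB:cpuFloor
		lvl="${entry%%:*}"; rest="${entry#*:}"
		pt="${rest%%:*}"; floor="${rest##*:}"
		need=$(( compress_threads * pt ))
		if (( compress_threads < floor )); then
			trace+="-${lvl}:cpu<${floor} skip; "
			continue
		fi
		if (( need <= mem_budget_mb )); then
			trace+="-${lvl}:${need}MB<=budget OK; "
			level="$lvl"
			break
		fi
		trace+="-${lvl}:${need}MB>budget skip; "
	done
	echo "${level} ${trace}"
}

# A 2-core ARM box with a 4 GB budget stops at -3: the CPU floor, not
# memory, is what blocks -9 and -6.
pick_xz_level 2 4096
```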
zstd's --long=27 (128 MB matching window) is gated to level >= 19. At
lower levels the size win doesn't justify the 128 MB decoder memory
requirement, and zstd's default windowLog (~4-16 MB) decompresses much
cheaper on small devices. --ultra stays gated at level >= 20.
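That gating can be sketched as follows (hypothetical helper name; the level thresholds come from the paragraph above, and zstd itself requires `--ultra` for levels above 19):

```bash
#!/usr/bin/env bash
# Sketch: assemble zstd flags so --long=27 only appears at level >= 19 and
# --ultra only at level >= 20.
build_zstd_args() {
	local level="$1" threads="$2"
	local args="-${level} -T${threads}"
	(( level >= 19 )) && args+=" --long=27"
	(( level >= 20 )) && args+=" --ultra"
	echo "$args"
}

build_zstd_args 9 4    # -> -9 -T4
build_zstd_args 22 16  # -> -22 -T16 --long=27 --ultra
```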
Per-file thread cap on the xz path: file_size / block_size, where
block_size is ~3x the preset's dict (192 MB at -9, 48 MB at -7/8,
24 MB at -6, 12 MB at -3/4/5, 3 MB at -0/1). Threads beyond that
count sit idle and reserve per-thread memory for nothing — capping
trims ~21 GB on a 100 MB hyperv.zip at -9. zstd auto-scales workers
to input size, so it keeps the full compress_threads.
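The cap can be sketched as a clamp (hypothetical helper; sizes in MB, block sizes per the paragraph above):

```bash
#!/usr/bin/env bash
# Sketch: xz threads beyond file_size / block_size sit idle but still reserve
# per-thread memory, so clamp the budgeted thread count to what the file fills.
cap_file_threads() {
	local file_size_mb="$1" block_size_mb="$2" budget_threads="$3"
	local max=$(( file_size_mb / block_size_mb ))
	(( max < 1 )) && max=1
	if (( max < budget_threads )); then echo "$max"; else echo "$budget_threads"; fi
}

# A 100 MB file at -9 (192 MB blocks) can only drive one thread. -> 1
cap_file_threads 100 192 32
```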
Stable images (BETA=no) force xz -0 — they deploy to our own infra
without a size cap and benefit most from speed. Nightly/user builds
use the elastic level to fit GitHub's 2 GB per-asset cap as snugly
as memory allows.
Diagnostics, four alerts before the loop:
Compression host nproc, loadavg, MemTotal, MemAvail
Compression resource share active peers -> threads, budget, picks
Compression xz level walk per-level cpu/budget skip/OK trace
Compression input file count + total MB
Plus per-file:
Compressing with xz/zstd level, threads_used/budgeted, file size,
block size, predicted peak MB
Compressed input -> output, ratio %, elapsed s, MB/s
Makes it possible to diagnose why a given build picked level X without
ssh to the runner. ZSTD_COMPRESSION_LEVEL / IMAGE_XZ_COMPRESSION_RATIO
overrides remain honored.
Force-pushed afc1e8c to 22ecfcc
Follow-up —
Benefits
- `-T 0` at level 1. Order-of-magnitude reduction in compression time on hosts that were previously CPU-starved by the cap.
- The old static default (`xz -T 0`) didn't account for memory at all — N peers at level 9 could collectively reserve more RAM than the box has. The new memory budget per-job (MemAvailable × 0.6 / (active+1)) makes that mathematically impossible at any peer count.
- Builds can now reach `-9` (down from `-1`), shaving ~25% off the compressed size. Critical for staying under GitHub's 2 GB per-asset cap on desktop builds that were previously borderline.
- `BETA=no` deploys to our own infra (no size cap) and is now pinned to xz `-0` — roughly 2× faster than `-1` for the same wall-clock budget per release.
- A 100 MB `hyperv.zip` at `-9` previously reserved ~21 GB for parallelism it couldn't use; the file-size-aware thread cap brings it down to the few hundred MB the file can actually drive. Frees that headroom for whoever's using the box next.

How it works
- Threads: full `nproc`. Kernel time-slices when peer `xz`/`zstd` jobs contend; we get the cores when they idle.
- Memory budget: `MemAvailable × 0.6 / (active_jobs + 1)`. Walks xz presets `9 → 6 → 3 → 1` and picks the strongest whose footprint (compress_threads × per_thread_mem) fits.
- zstd: `--ultra -22` when budget ≥ 2 GB, else `-19`. Always uses `--long=27` (128 MB window — exactly the decoder's default cap) so plain `zstd -d` still works.
- Per-file xz thread cap: `file_size / block_size`, where block size is ~3× the preset's dict (192 MB at `-9`, 24 MB at `-6`, 12 MB at `-3/4/5`, 3 MB at `-0/1`). zstd auto-scales workers and uses the full budget directly.
- `BETA=no` forces xz `-0` for stable releases (own infra, speed first).
- `ZSTD_COMPRESSION_LEVEL` / `IMAGE_XZ_COMPRESSION_RATIO` overrides remain honored.
CPU stays maxed; level slides down only when memory genuinely can't absorb the strongest preset across all concurrent peers.
Diagnostic output
Four alerts before the loop:
Per file:
Test plan
- Verify `threads=$(nproc)` and a sane level pick.
- Build with `BETA=no` and confirm the per-file alert reads `xz=-0`.
- Confirm `threads_used` is lower than the budget when the chosen level has a large block size.
- Each `.xz`/`.zst` decompresses cleanly with stock `xz -d` / `zstd -d`.