build: use threads to speed up tar layer creation #77

Merged: jlebon merged 3 commits into main from issue-15-multithreaded-tar on Mar 3, 2026

jlebon commented on Mar 3, 2026

By far the largest time spent during a build is in the tar layer creation (and specifically the SHA-256 calculations). Since tar layers are independent, we can pretty easily parallelize this.

Do this by default, but add a `-T`/`--threads` knob (and `CHUNKAH_THREADS` env) to control this.
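As a rough illustration of how such a knob could be resolved, here is a minimal sketch using only the standard library. The function name `resolve_threads` is hypothetical, and the precedence (command line over `CHUNKAH_THREADS`, with zero or unparsable values falling back to auto-detection) is an assumption, not something this PR description confirms.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper: pick the worker count from an optional CLI value
/// and an optional environment value, else auto-detect.
/// Assumption: the CLI value wins over the environment, and a zero or
/// unparsable value falls back to auto-detection.
fn resolve_threads(cli: Option<usize>, env: Option<&str>) -> usize {
    // Parse the environment value leniently; bad input becomes None.
    let from_env = env.and_then(|v| v.trim().parse::<usize>().ok());
    cli.or(from_env)
        .filter(|&n| n > 0)
        .unwrap_or_else(|| {
            // Number of threads the system reports as usable; 1 if unknown.
            thread::available_parallelism()
                .map(NonZeroUsize::get)
                .unwrap_or(1)
        })
}

fn main() {
    // No CLI parsing here; only the environment variable is consulted.
    let env = std::env::var("CHUNKAH_THREADS").ok();
    let n = resolve_threads(None, env.as_deref());
    println!("using {n} worker thread(s)");
}
```

Keeping the resolution in one pure function makes the precedence easy to test without touching the process environment.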

This reduces split time for my workstation bootc image (2.5 GiB compressed) from 21s to 12.5s.

Closes #15.

Assisted-by: Claude Opus 4.6

jlebon added 3 commits March 3, 2026 12:22
This gives nicer output and feedback from e.g. hyperfine and also
propagates Ctrl-C properly.

I hit this weird issue running benchmarks between two different
worktrees to compare their results and the chunkah from one worktree
ended up being reused by the benchmark for another tree.

I think this is basically the consequence of (1) using a shared cache
for the cargo target dir, and (2) cargo using mtime for freshness which
can be wrong (e.g. a worktree with older timestamps than the one that
was last built would not trigger a rebuild).

Fix this by making the cache name unique to the workdir.

By far the largest time spent during a build is in the tar layer
creation (and specifically the SHA-256 calculations). Since tar layers
are independent, we can pretty easily parallelize this.

Do this by default, but add a `-T`/`--threads` knob (and
`CHUNKAH_THREADS` env) to control this.

This reduces split time for my workstation bootc image (2.5 GiB
compressed) from 21s to 12.5s.

Closes #15.

Assisted-by: Claude Opus 4.6

gemini-code-assist (bot) left a comment

Code Review

This pull request introduces parallel processing for tar layer creation, which significantly speeds up the build process. The implementation is well-done, using scoped threads and an atomic counter for work distribution, which is a robust pattern. A new command-line argument and environment variable are added to control the number of threads, with auto-detection as a sensible default. My feedback includes a couple of suggestions for improving code clarity and robustness in the new parallel processing logic.
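The pattern the review describes (scoped threads pulling work via an atomic counter) can be sketched as follows. This is not the code from `src/ocibuilder.rs`: `process_layers` and `checksum` are hypothetical names, and the per-layer work is an FNV-1a hash standing in for the real tar writing and SHA-256 computation, since the standard library has no SHA-256.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Stand-in for the expensive per-layer work (FNV-1a, NOT SHA-256).
fn checksum(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Process independent layers on up to `threads` workers. Each worker
/// claims the next unprocessed index from a shared atomic counter until
/// no indices remain, so faster workers naturally take more layers.
fn process_layers(layers: &[Vec<u8>], threads: usize) -> Vec<u64> {
    let next = AtomicUsize::new(0);
    let results = Mutex::new(vec![0u64; layers.len()]);
    // Never spawn more workers than there are layers, and at least one.
    let workers = threads.clamp(1, layers.len().max(1));

    // Scoped threads may borrow `layers`, `next`, and `results` directly;
    // the scope joins every worker before returning.
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                let Some(layer) = layers.get(i) else { break };
                // Do the heavy work before taking the lock.
                let sum = checksum(layer);
                let mut out = results.lock().unwrap();
                out[i] = sum;
            });
        }
    });

    results.into_inner().unwrap()
}

fn main() {
    let layers: Vec<Vec<u8>> = (0u8..8).map(|i| vec![i; 1 << 16]).collect();
    let sums = process_layers(&layers, 4);
    for (i, sum) in sums.iter().enumerate() {
        println!("layer {i}: {sum:016x}");
    }
}
```

Results are stored by index, so the output order matches the input order regardless of which worker finishes first.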

Comment thread src/ocibuilder.rs
Comment thread src/ocibuilder.rs
jlebon merged commit 02101f6 into main on Mar 3, 2026
8 checks passed
jlebon deleted the issue-15-multithreaded-tar branch on March 3, 2026 at 21:24

Linked issue: Use multi-threading for tar layer writing