
solver: run image and cache exports in parallel #6451

Merged
tonistiigi merged 1 commit into moby:master from amrmahdi:amrmahdi/parallel-export
Feb 5, 2026

Conversation

@amrmahdi
Contributor

@amrmahdi amrmahdi commented Jan 11, 2026

Image export and cache export currently run sequentially. This adds unnecessary latency when both are configured, especially when exporting to different destinations.

Tested with vLLM CI builds. A full build-from-cache takes about 7 minutes, with export dominating: image export runs for ~2min, then cache export for another ~2min. Running them in parallel cuts export time in half, saving ~2 minutes per build (~30% of total time).

The one exception is inline cache: it works by running the cache export first to generate metadata, which is then embedded into the image config before export. This means image export cannot start until inline cache completes, so we preserve sequential execution in that case.

Member

@tonistiigi tonistiigi left a comment

I forgot one other case that affects this. In ExportTo, based on Mode, when cache export checks whether a result is to be exported, it checks whether it already exists. E.g. if the image exporter (or a previous cache source) already generated the layer, then it is exported in CacheExportModeMin; otherwise only metadata is exported and no new layer tarball is created by the exporter. I wonder whether we need to instead break the export phase into two passes, so that the push can run in parallel but creating the objects is still sequential.

@tonistiigi
Member

Windows CI doesn't seem to like this either, @rzlink

failed to commit ngtnv4s8gc8tp0beynhncp87s to oexb7oycg3u2l5y777xguo7u6 during finalize: failed to reimport snapshot: hcsshim::ActivateLayer failed in Win32: The process cannot access the file because it is being used by another process. (0x20)

@amrmahdi
Contributor Author

I forgot one other case that affects this. In ExportTo, based on Mode, when cache export checks whether a result is to be exported, it checks whether it already exists. E.g. if the image exporter (or a previous cache source) already generated the layer, then it is exported in CacheExportModeMin; otherwise only metadata is exported and no new layer tarball is created by the exporter. I wonder whether we need to instead break the export phase into two passes, so that the push can run in parallel but creating the objects is still sequential.

@tonistiigi would a simplification be to treat CacheExportModeMin as sequential like the inline cache?

@amrmahdi
Contributor Author

I wonder whether we need to instead break the export phase into two passes, so that the push can run in parallel but creating the objects is still sequential.

@tonistiigi If we go with the 2-phase approach, we will need to change the interfaces to add the 2 export phases (maybe Prepare and Finalize), and we could keep Export as is for backward compatibility (internally it just calls Prepare + Finalize). If I understand correctly, Finalize will mostly be a no-op for most exporters. Does this align with your thoughts?

@tonistiigi
Member

would a simplification be to treat CacheExportModeMin as sequential like the inline cache?

I think it could still be different even with max. Blob creation depends on how the compression algorithm and level were configured for the exporter, if the blobs are shared.

and we could keep Export as is for backward compatibility

In the buildkit repo we don't care about backward compatibility of the Go API, only of on-wire APIs. We could either split it in two, or maybe Export can optionally return a "finalize" callback that needs to be called (but can be called in parallel with other tasks).

@amrmahdi
Contributor Author

I wonder whether we need to instead break the export phase into two passes, so that the push can run in parallel but creating the objects is still sequential.

@tonistiigi I took a stab at it; let me know what you think.

@amrmahdi amrmahdi requested a review from tonistiigi January 14, 2026 21:26
Member

@tonistiigi tonistiigi left a comment

I think the approach is ok, check the comments and CI.

This doesn't need to be in the first PR, but if we add Finalize to exporter then all the exporters should use it, not just the image exporter.

@amrmahdi amrmahdi force-pushed the amrmahdi/parallel-export branch from a4a707c to caf2afb Compare January 16, 2026 06:15
@amrmahdi
Contributor Author

I think the approach is ok, check the comments and CI.

This doesn't need to be in the first PR, but if we add Finalize to exporter then all the exporters should use it, not just the image exporter.

Sure, I can address the other exporters in a follow-up PR.

@amrmahdi amrmahdi force-pushed the amrmahdi/parallel-export branch 8 times, most recently from 8eeaa5c to 59c5372 Compare January 16, 2026 08:47
@amrmahdi amrmahdi changed the title from "solver: add option to run image and cache exports in parallel" to "solver: run image and cache exports in parallel" Jan 16, 2026
@amrmahdi amrmahdi requested a review from tonistiigi January 16, 2026 16:31
@amrmahdi
Contributor Author

@tonistiigi any chance this can make it into the v0.27.0 release? 😇

Member

@tonistiigi tonistiigi left a comment

any chance this can make it into the v0.27.0 release?

Too late for that.

@amrmahdi amrmahdi requested a review from tonistiigi January 21, 2026 07:13
@amrmahdi amrmahdi force-pushed the amrmahdi/parallel-export branch 4 times, most recently from c4f6f9a to 156699f Compare January 22, 2026 03:28
@tonistiigi tonistiigi requested a review from Copilot February 3, 2026 23:56

Copilot AI left a comment

Pull request overview

This PR reduces overall build latency by parallelizing image finalization (e.g., registry push) with remote cache exports, while preserving the required sequential behavior for inline cache.

Changes:

  • Introduces an exporter.FinalizeFunc returned from ExporterInstance.Export to split artifact creation from post-export operations (like pushes).
  • Updates the solver to run exporter finalizers and remote cache exporters concurrently via an errgroup.
  • Refactors remote cache export execution into a per-exporter function and adjusts existing exporters to the new Export signature.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| solver/llbsolver/solver.go | Runs exporter finalizers and cache exporters in parallel; refactors cache export runner and updates exporter execution flow. |
| exporter/exporter.go | Extends exporter interface to return an optional finalize callback. |
| exporter/containerimage/export.go | Moves registry push work into a finalize callback to enable parallelism. |
| exporter/oci/export.go | Updates exporter signature to return (resp, finalize, descref, err) (finalize is nil). |
| exporter/local/export.go | Updates exporter signature to include finalize callback (nil). |
| exporter/tar/export.go | Updates exporter signature and adjusts tar send completion handling (finalize is nil). |
Comments suppressed due to low confidence (1)

solver/llbsolver/solver.go:893

  • runExporters assigns to the outer named return variable err from inside multiple eg.Go goroutines (resps[i], finalizeFuncs[i], descs[i], err = exp.Export(...)). This is a data race and can also cause the wrong error value to be observed/returned. Use a goroutine-local variable (e.g., resp, fin, desc, expErr := exp.Export(...)) and then assign into the slices, returning expErr without touching the outer err from concurrent goroutines.
    resps[i], finalizeFuncs[i], descs[i], err = exp.Export(ctx, inp, exporter.ExportBuildInfo{
    	Ref:         ref,
    	SessionID:   job.SessionID,
    	InlineCache: inlineCache,
    })
    if err != nil {
    	return err
    }

Member

@tonistiigi tonistiigi left a comment

I think this is close. Maybe remove the parallelization of multiple --cache-to in the initial PR (see comments).

Also squash the commits. If you have separate logical chunks, then they can be in separate commits, but avoid code that gets reverted later or "review commits" in the version that gets merged. Also, all commits should build.

dgst := desc.Digest
finalize := func(ctx context.Context) error {
	for _, targetName := range namesToPush {
		err := e.pushImage(ctx, srcCopy, sessionID, targetName, dgst)
Member

A potential follow-up would be to push to separate registries (or repo names?) in parallel.

ctx := withDescHandlerCacheOpts(ctx, ref)

// Configure compression
compressionConfig := exp.Config().Compression
Member

Is the parallel cache export really an issue for you? IIUC, if you really have multiple --cache-to and they use different compression config, then the current implementation would be nondeterministic.

I also wonder whether, with multiple --cache-to mode=max exporters, they could compete with each other to create the same layer tarballs in parallel and actually be slower overall.

@amrmahdi amrmahdi force-pushed the amrmahdi/parallel-export branch 3 times, most recently from cfaedb0 to 1b74db1 Compare February 5, 2026 00:33
@tonistiigi tonistiigi requested a review from Copilot February 5, 2026 00:59

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

@amrmahdi amrmahdi force-pushed the amrmahdi/parallel-export branch from 1b74db1 to 73ab5b8 Compare February 5, 2026 01:07
@amrmahdi amrmahdi requested a review from tonistiigi February 5, 2026 01:08
Split image export into two phases to enable parallel execution:
1. Export creates artifacts (layers, manifests) in the content store
2. FinalizeFunc pushes artifacts to the registry

This allows image push to run in parallel with cache export, reducing
overall build time when both image and cache exports are configured.

The cache exporters run after image Export completes, ensuring they can
see and reuse the layers in the content store.

Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
@amrmahdi amrmahdi force-pushed the amrmahdi/parallel-export branch from 73ab5b8 to 88ef66c Compare February 5, 2026 02:21
@tonistiigi tonistiigi merged commit e07b5ed into moby:master Feb 5, 2026
222 of 223 checks passed
2 participants