
Fix cache cannot reuse lazy layers #3109

Merged · 1 commit · Feb 15, 2023

Conversation

ktock (Collaborator) commented Sep 12, 2022

Fixes docker/buildx#1306

Currently, the cache cannot reuse lazy layers that return cache.NeedsRemoteProviderError on cacheresult.cacheResultStorage.Exist().
This causes the cache to fail to reuse lazy layers, as reported in docker/buildx#1306.
This commit fixes the issue by allowing cache.NeedsRemoteProviderError during cache lookup.
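The idea of the fix, reduced to its essence, is that an existence check returning "needs a remote provider" should count as a cache hit rather than a miss. The sketch below is illustrative only, assuming a simplified stand-in type for cache.NeedsRemoteProviderError and a hypothetical exists helper; it is not BuildKit's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// NeedsRemoteProviderError is a simplified stand-in for
// cache.NeedsRemoteProviderError: the layer is lazy, and a remote
// provider must be attached before its content can be materialized.
type NeedsRemoteProviderError struct{ digest string }

func (e NeedsRemoteProviderError) Error() string {
	return "needs remote provider for " + e.digest
}

// exists mimics an existence check like cacheResultStorage.Exist: a lazy
// layer fails with NeedsRemoteProviderError even though it is usable later.
func exists(id string) error {
	switch id {
	case "lazy":
		return NeedsRemoteProviderError{digest: "sha256:abc"}
	case "local":
		return nil
	default:
		return errors.New("not found")
	}
}

// usableForCache treats "needs remote provider" as a hit rather than a
// miss, which is the essence of the change discussed in this PR.
func usableForCache(id string) bool {
	err := exists(id)
	if err == nil {
		return true
	}
	var nrpe NeedsRemoteProviderError
	return errors.As(err, &nrpe)
}

func main() {
	fmt.Println(usableForCache("local"))   // fully local layer: usable
	fmt.Println(usableForCache("lazy"))    // lazy layer: now also usable
	fmt.Println(usableForCache("missing")) // genuinely missing: not usable
}
```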

@ktock ktock marked this pull request as ready for review September 12, 2022 22:19

tonistiigi (Member) left a comment


Is this specific to stargz, or does it apply to unpulled layers as well? E.g. a cache match for merge(image(), ..).

Do we have a guarantee that we have connected the correct remote provider when the actual cache load would happen in this case?

sipsma (Collaborator) commented Sep 17, 2022

> Is this specific to stargz or for unpulled layers as well. Eg. cache match for merge(image(), ..).

Thinking out loud: not sure if a cache match on unpulled non-stargz layers is really that useful. If you need to pull the layers anyways then that's as good as a cache miss. I'm guessing that for stargz this matters because you can have a "half-pulled" layer where some of the data is present locally but you still need a remote provider for the unpulled data, is that correct @ktock?

For the change itself, it seems like the main code path where Exists gets called is CacheManager.Records, which I think only gets called in solver/edge.go when e.cacheMap is available. That in turn probably means that we could plumb the cache opts from cacheMap through ctx to Exists and verify that the caller actually has a remote provider hooked up? Maybe? It's obviously kind of convoluted, so not sure if this actually adds up. If it does make sense though then maybe that'd help guarantee the result is actually going to be loadable.

The other place Exists gets called is ReleaseUnreferenced, which we'd need to handle too.

If there's a different way of guaranteeing the record will be loadable that's simpler then I'm all for it, the above is just my first thought looking through the code paths.

ktock (Collaborator, Author) commented Sep 19, 2022

@tonistiigi @sipsma

Thank you for taking a look at this.

> I'm guessing that for stargz this matters because you can have a "half-pulled" layer where some of the data is present locally but you still need a remote provider for the unpulled data, is that correct @ktock?

Thank you for elaborating on this. Yes, that's correct.

> For the change itself, it seems like the main code path where Exists gets called is CacheManager.Records, which I think only gets called in solver/edge.go when e.cacheMap is available. That in turn probably means that we could plumb the cache opts from cacheMap through ctx to Exists and verify that the caller actually has a remote provider hooked up? Maybe? It's obviously kind of convoluted, so not sure if this actually adds up. If it does make sense though then maybe that'd help guarantee the result is actually going to be loadable.
> The other place Exists gets called is ReleaseUnreferenced, which we'd need to handle too.

Thank you for the suggestion. I fixed the patch to pass cache opts to CacheManager.Records through ctx (similar to what sharedOp.LoadCache does when calling CacheManager.Load).

tonistiigi (Member) commented

@ktock Could you also take a look at docker/buildx#1325 (comment) . Lmk if you think that is completely different.

ktock (Collaborator, Author) commented Oct 27, 2022

> @ktock Could you also take a look at docker/buildx#1325 (comment) . Lmk if you think that is completely different.

@tonistiigi @sipsma It seems that docker/buildx#1325 (comment) can be addressed separately from this PR. (Please see also #3229).
Could we move this PR forward if the patch looks fine?

@ktock ktock requested a review from tonistiigi October 31, 2022 22:49
tonistiigi (Member) left a comment


This does not mean that Records() can become a long-running call, right? It looks like it is called directly from the scheduler event loop, where blocking is not allowed.

ktock (Collaborator, Author) commented Nov 4, 2022

@tonistiigi It seems that a Records API call can reach (worker/base).Worker.LoadRef(), which calls the cache.cacheManager.Get() API. The current implementation of Get checks for the existence of descHandlers, but it doesn't appear to trigger slow operations such as unlazying a layer. It calls the snapshotter's Stat API, but that is just a metadata lookup in the bolt DB.

@ktock ktock force-pushed the reuseremotelayers branch 2 times, most recently from e9d3c6f to e198844 Compare November 29, 2022 08:32
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
ktock (Collaborator, Author) commented Jan 25, 2023

CI passed; can we move this patch forward? @tonistiigi

@ktock ktock requested a review from tonistiigi February 7, 2023 05:31

Successfully merging this pull request may close these issues:

- Cache is not used in simple Dockerfile with stargz snapshotter

4 participants