client.Pull(..., WithPullUnpack) doesn't fetch layer contents if snapshot already exists #8973

Open
vvoland opened this issue Aug 16, 2023 · 5 comments

@vvoland
Contributor

vvoland commented Aug 16, 2023

Description

When pulling an image with client.Pull and the WithPullUnpack option, the manifest's children are split into "layers" and "non-layers". This makes it possible to defer the layer download, which is what allows remote snapshotters to work.

// Split layers from non-layers, layers will be handled after
// the config
for i, child := range children {
	span.SetAttributes(
		tracing.Attribute("descriptor.child."+strconv.Itoa(i), []string{child.MediaType, child.Digest.String()}),
	)
	if images.IsLayerType(child.MediaType) {
		manifestLayers = append(manifestLayers, child)
	} else {
		nonLayers = append(nonLayers, child)
	}
}
lock.Lock()
for _, nl := range nonLayers {
	layers[nl.Digest] = manifestLayers
}
lock.Unlock()
children = nonLayers
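
To make the split easier to follow in isolation, here is a minimal standalone sketch. The `descriptor` type and the `isLayerType` helper are hypothetical, simplified stand-ins for the OCI descriptor and `images.IsLayerType` (the real function recognizes more media types); only the partitioning logic mirrors the snippet above.

```go
package main

import (
	"fmt"
	"strings"
)

// descriptor is a hypothetical, simplified stand-in for an OCI descriptor.
type descriptor struct {
	MediaType string
	Digest    string
}

// isLayerType is a simplified stand-in for images.IsLayerType (assumption:
// it only checks two well-known media type prefixes).
func isLayerType(mediaType string) bool {
	return strings.HasPrefix(mediaType, "application/vnd.oci.image.layer.") ||
		strings.HasPrefix(mediaType, "application/vnd.docker.image.rootfs.diff.")
}

// splitChildren partitions manifest children into layers and non-layers,
// preserving their original order.
func splitChildren(children []descriptor) (layers, nonLayers []descriptor) {
	for _, child := range children {
		if isLayerType(child.MediaType) {
			layers = append(layers, child)
		} else {
			nonLayers = append(nonLayers, child)
		}
	}
	return layers, nonLayers
}

func main() {
	children := []descriptor{
		{MediaType: "application/vnd.docker.container.image.v1+json", Digest: "sha256:config"},
		{MediaType: "application/vnd.docker.image.rootfs.diff.tar.gzip", Digest: "sha256:layer"},
	}
	layers, nonLayers := splitChildren(children)
	// Only the non-layers are fetched right away; the layers are deferred.
	fmt.Println("deferred layers:", len(layers), "fetched now:", len(nonLayers))
}
```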

However, the unpack code doesn't reach the point where it fetches the layers because it exits early when the snapshot is already present:

if _, err := sn.Stat(ctx, chainID); err == nil {
	// no need to handle
	return nil
}
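
The effect of that early return can be reproduced with a toy model. This is a sketch under stated assumptions, not containerd code: the `store` type and both methods are hypothetical. `unpackLayer` imitates the early exit quoted above, while `unpackLayerEnsureContent` illustrates one possible alternative (fetch a missing blob even when the snapshot exists) and is not necessarily what any proposed fix does.

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for both the snapshotter
// and the content store.
type store struct {
	snapshots map[string]bool // keyed by chain ID
	content   map[string]bool // keyed by layer digest
	fetched   []string        // digests downloaded so far
}

// fetch records a simulated download of a layer blob.
func (s *store) fetch(digest string) {
	s.content[digest] = true
	s.fetched = append(s.fetched, digest)
}

// unpackLayer imitates the early exit: if the snapshot already exists,
// it returns before the layer blob is ever fetched.
func (s *store) unpackLayer(chainID, digest string) {
	if s.snapshots[chainID] {
		return
	}
	s.fetch(digest)
	s.snapshots[chainID] = true
}

// unpackLayerEnsureContent is a hypothetical variant that fetches a
// missing blob first and only then skips the unpack step.
func (s *store) unpackLayerEnsureContent(chainID, digest string) {
	if !s.content[digest] {
		s.fetch(digest)
	}
	if s.snapshots[chainID] {
		return
	}
	s.snapshots[chainID] = true
}

func main() {
	// State after "pull, run, rm -f": snapshot kept, content deleted.
	s := &store{snapshots: map[string]bool{"chain-1": true}, content: map[string]bool{}}

	s.unpackLayer("chain-1", "sha256:layer")
	fmt.Println("blobs fetched with early exit:", len(s.fetched))

	s.unpackLayerEnsureContent("chain-1", "sha256:layer")
	fmt.Println("blobs fetched with content check:", len(s.fetched))
}
```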

Steps to reproduce the issue

  1. Pull image
  2. Add snapshot lease or create child snapshot
  3. Delete the image
  4. Pull the image again

Nerdctl reproduction:

$ nerdctl image pull busybox
docker.io/library/busybox:latest:                                                 resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:3fbc632167424a6d997e74f52b878d7cc478225cffac6bc977eedfe51c7f4e79:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:1fa89c01cd0473cedbd1a470abb8c139eeb80920edf1bc55de87851bfb63ea11: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:fc9db2894f4e4b8c296b8c9dab7e18a6e78de700d21bc0cfaf5c78484226db9c:   done           |++++++++++++++++++++++++++++++++++++++|
layer-sha256:8a0af25e8c2e5dc07c14df3b857877f58bf10c944685cb717b81c5a90974a5ee:    done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 1.4 s                                                                    total:  1.8 Mi (1.3 MiB/s)


$ nerdctl image run -d busybox
b8b8e9599bccbeacdaa83fcd63f5039dd9a619f5f9ae9511f5bd0da9e4cdcac1

$ nerdctl image rm busybox
FATA[0000] 1 errors:
conflict: unable to delete busybox (must be forced) - image is being used by stopped container b8b8e9599bccbeacdaa83fcd63f5039dd9a619f5f9ae9511f5bd0da9e4cdcac1

$ nerdctl image rm -f busybox
Untagged: docker.io/library/busybox:latest@sha256:3fbc632167424a6d997e74f52b878d7cc478225cffac6bc977eedfe51c7f4e79
Deleted: sha256:3694737149b11ec4d2c9f15ad24788e81955cd1c7f2c6f555baf1e4a3615bd26


### At this point snapshot exists because of a stopped container, but all image content is deleted.

$ nerdctl pull busybox
docker.io/library/busybox:latest:                                                 resolved       |++++++++++++++++++++++++++++++++++++++|
index-sha256:3fbc632167424a6d997e74f52b878d7cc478225cffac6bc977eedfe51c7f4e79:    done           |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:1fa89c01cd0473cedbd1a470abb8c139eeb80920edf1bc55de87851bfb63ea11: done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:fc9db2894f4e4b8c296b8c9dab7e18a6e78de700d21bc0cfaf5c78484226db9c:   done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 1.1 s                                                                    total:   0.0 B (0.0 B/s)

## ^^^ Note that the layer was not fetched

# Now we can't do anything that involves the packed content (for example push).
$ nerdctl tag busybox myregistry:5000/test
$ nerdctl push myregistry:5000/test
FATA[0000] failed to create a tmp single-platform image "myregistry:5000/test:latest-tmp-reduced-platform": content digest sha256:8a0af25e8c2e5dc07c14df3b857877f58bf10c944685cb717b81c5a90974a5ee: not found

Describe the results you received and expected

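Expected: operations that need the packed content (for example push) work after client.Pull(..., WithPullUnpack). Received: they fail with a "content digest ...: not found" error because the layer blobs were never fetched.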

What version of containerd are you using?

v1.7.3

Any other relevant information

No response

Show configuration if it is related to CRI plugin.

No response

@fuweid
Member

fuweid commented Aug 16, 2023

> create child snapshot

It means that the container holds a reference to the parent snapshot. Deleting the image only removes the image's reference to that snapshot; the container still holds one, so the reference count stays at one and the snapshots won't be deleted.

I think the solution is to generate the content blob on push if it doesn't exist.

cc @cardyok

containerd/nerdctl#2295

@dmcgowan
Member

This is a common situation today in the pull->tag->push flow. By default, pull+unpack optimizes for running a container on the current platform and skips unneeded content (whether because it belongs to a different platform or because it already exists). Changing the default pull behavior should be up to the client, but we could also try to be smarter on tag and push operations.

  1. Add a flag to pull to get all the related content for an image so that it can be pushed.
  2. Check content on tag, pulling missing content.
  3. Attempt to pull missing content on push if needed using cross repo mount metadata, or just perform a cross repo mount if the content exists in the same registry (this might assume the WithAllMetadata option is at least provided on pull).
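
Options 2 and 3 both start from knowing which blobs are absent. A minimal sketch of such a check follows, assuming the content store can be modeled as a set of digests; `missingContent` is a hypothetical helper, not an existing containerd function.

```go
package main

import "fmt"

// missingContent returns the digests from required that are not in the
// (simulated) content store, in their original order.
func missingContent(required []string, present map[string]bool) []string {
	var missing []string
	for _, digest := range required {
		if !present[digest] {
			missing = append(missing, digest)
		}
	}
	return missing
}

func main() {
	// After the second pull in the report: manifest and config are
	// present, the layer is not.
	required := []string{"sha256:manifest", "sha256:config", "sha256:layer"}
	present := map[string]bool{"sha256:manifest": true, "sha256:config": true}

	for _, digest := range missingContent(required, present) {
		// A tag or push operation could fetch these before proceeding.
		fmt.Println("needs fetch:", digest)
	}
}
```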

@yankay
Contributor

yankay commented Aug 21, 2023

The same as containerd/nerdctl#2327

@henry118
Member

#8878 seems like a fix to this issue.

@manishmalik

Facing the exact same issue. While we are waiting for the fix PR #8878 to be reviewed and merged, is there any workaround we can try?
I tried removing the image with nerdctl rmi and with ctr i prune --all, but whenever I pull the same image again I hit the same issue.

nerdctl pull "amazon-ecr.amazonaws.com/some/image:latest"
amazon-ecr.amazonaws.com/some/image:latest: resolved       |++++++++++++++++++++++++++++++++++++++|
manifest-sha256:somesha:           done           |++++++++++++++++++++++++++++++++++++++|
config-sha256:some-different-sha:             done           |++++++++++++++++++++++++++++++++++++++|
elapsed: 1.3 s                                                                              total:  9.1 Ki (7.0 KiB/s)

ctr i check
REF                                                                                        TYPE                                                 DIGEST                                                                  STATUS            SIZE             UNPACKED
amazon-ecr.amazonaws.com/some/image:latest application/vnd.docker.distribution.manifest.v2+json sha256:somesha incomplete (1/44) 14.5 KiB/1.1 GiB true

Whenever I try to export this image, I get the following error:

nerdctl save -o ./image.tar "amazon-ecr.amazonaws.com/some/image:latest"
FATA[0000] failed to get reader: content digest sha256:some-entirely-different-sha: not found
