devmapper snapshotter image pulls result in error unpacking image: failed to extract layer #8674
Comments
hmm.. I've noticed if I am on the node and do a
I tried with the 1.7.0 and 1.7.2 binaries and got exactly the same problem, so there must be something else going on.
@skaegi how reproducible is this on the other nodes? Have you seen it with a different snapshotter?
Thanks @dmcgowan -- I'm still working the problem and trying to create as small a reproducer as I can. Even without the devmapper snapshotter enabled if I
OK, I think I understand what's happening. We're running into a case where the same diff id maps to two different digests depending on where the layer is pulled from. This can happen because of differences in gzip versions and in how gzip is used to compress the content (e.g. fast vs. best); see google/go-containerregistry#895 (comment) and scroll to "The Issue" for details. [Sorry for the long comment here.] For example...
In this case both the following digests...
map to the same diff id
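To illustrate the digest vs. diff id distinction described above, here is a minimal sketch in Python. It assumes only what the OCI image format defines: the layer digest is the SHA-256 of the compressed blob, and the diff id is the SHA-256 of the uncompressed content. The `layer` bytes are a made-up stand-in for a real layer tarball, and the two compression levels stand in for two registries that compressed the same layer differently.

```python
import gzip
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return a digest string in the usual 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for an uncompressed layer tarball.
layer = b"".join(f"file-{i}: some layer content\n".encode() for i in range(5000))

# The diff id is computed over the uncompressed layer.
diff_id = sha256_digest(layer)

# The same layer compressed two ways (gzip "fast" vs. "best").
blob_fast = gzip.compress(layer, compresslevel=1, mtime=0)
blob_best = gzip.compress(layer, compresslevel=9, mtime=0)

# The layer digest is computed over the compressed blob.
digest_fast = sha256_digest(blob_fast)
digest_best = sha256_digest(blob_best)

# Decompressing either blob yields the same bytes, hence the same diff id.
diff_id_fast = sha256_digest(gzip.decompress(blob_fast))
diff_id_best = sha256_digest(gzip.decompress(blob_best))

print("digest (fast):", digest_fast)
print("digest (best):", digest_best)
print("digests differ:", digest_fast != digest_best)
print("diff ids match:", diff_id_fast == diff_id_best == diff_id)
```

The two blobs are different byte streams, so a content store keyed by digest treats them as unrelated, while a snapshotter keyed by diff id treats them as the same layer.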
Also notice that in the second docker pull we get ... If I do the equivalent with
The alpine image is incomplete because we don't download the
Even though the alpine image is incomplete, this will work fine when we later create a container using the default snapshotter, because the diff id is already there and we don't have to unpack the content. However, it will fail if we try to
@dmcgowan this might seem naive, but what if the image puller always also checked that the layer digest was present in the content store and, if not, downloaded the layer even though the resulting diff id is already present in the default snapshotter?
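The failure mode and the suggestion above can be sketched with a toy model. This is not containerd code: `Node`, `pull`, `can_unpack`, `simulate`, the `always_fetch` flag, and the placeholder digests are all hypothetical names invented for illustration. The model assumes only what this thread describes: a pull skips fetching a blob when its diff id is already unpacked in the snapshotter being used, and a later unpack into a different snapshotter needs the blob in the content store.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of a node: one content store, one snapshot set per snapshotter."""
    content: set = field(default_factory=set)      # layer digests (compressed blobs)
    snapshots: dict = field(default_factory=dict)  # snapshotter name -> set of diff ids


def pull(node, snapshotter, layer_digest, diff_id, always_fetch=False):
    """Pull one layer into the given snapshotter; return True if the blob was fetched."""
    unpacked = node.snapshots.setdefault(snapshotter, set())
    already_unpacked = diff_id in unpacked
    fetched = False
    # Behaviour described in the thread: skip the fetch when the diff id is
    # already unpacked. The hypothetical always_fetch flag disables that skip.
    if layer_digest not in node.content and (always_fetch or not already_unpacked):
        node.content.add(layer_digest)
        fetched = True
    if not already_unpacked:
        unpacked.add(diff_id)
    return fetched


def can_unpack(node, layer_digest):
    """Unpacking into another snapshotter needs the blob in the content store."""
    return layer_digest in node.content


def simulate(always_fetch):
    node = Node()
    # Image A and image B share a diff id but have different layer digests.
    pull(node, "overlayfs", "sha256:aaa", "sha256:ddd", always_fetch)
    pull(node, "overlayfs", "sha256:bbb", "sha256:ddd", always_fetch)
    # Later, image B has to be unpacked for the devmapper snapshotter.
    return can_unpack(node, "sha256:bbb")


print("skip fetch when diff id exists:", simulate(always_fetch=False))
print("always fetch missing blobs:    ", simulate(always_fetch=True))
```

In this model the second pull leaves image B incomplete in the content store, which only becomes visible once a snapshotter without the unpacked diff id needs the blob.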
Hmm... after the fact I noticed that this is what
Anyway... it would be good if |
Related, or in some sense the real issue: #8580
@dmcgowan @mikebrow I need a fix here, as we really need devmapper to work in IKS. I can do the work, but I am trying to figure out how to do this without creating a PR that would never be accepted. I can see tweaking the logic around https://github.com/containerd/containerd/blob/release/1.7/pkg/unpack/unpacker.go#L321 or perhaps adding a config parameter to force content store fetching. Because the devmapper snapshotter is native, perhaps it could be made smarter to understand the layer mapping in the overlayfs snapshotter and get the content that way? Do you have any suggestions on an approach I might try here?
Tagged to the 2.0 milestone, as Runtime Specific Snapshotters is currently listed to come out of experimental in 2.0: https://github.com/containerd/containerd/blob/main/RELEASES.md#experimental-features
Description
When running a pod with a RuntimeClass that uses the devmapper snapshotter, container creation fails with a CreateContainerError. This previously worked with containerd 1.7.0 but fails with 1.7.1. We are using a managed K8s platform (IBM Cloud IKS), so it's possible something else changed too.
Using kubectl describe on the pod we see events similar to...
No matter what image we use we always see the same error with the same sha256 values, even on different nodes. If we nsenter on to the node and do a
ctr -n k8s.io images check
we see that the image is incomplete. If we then manually do a
ctr -n k8s.io images pull --snapshotter devmapper docker.io/library/alpine:latest
the image seems to download successfully and becomes complete. After this first manual pull, everything works again, e.g. we can run pods normally without problems.
Steps to reproduce the issue
Describe the results you received and expected
We now see... CreateContainerError
previously this was... Running.
Suspect this is a change in 1.7.1 but not certain as we are still narrowing down the problem
What version of containerd are you using?
containerd github.com/containerd/containerd v1.7.1 1677a17
Any other relevant information
crictl 1.26.0
Show configuration if it is related to CRI plugin.
[plugins."io.containerd.snapshotter.v1.devmapper"]
pool_name = "devpool"
root_path = "/var/data/containerd/devmapper"
base_image_size = "100GB"
fs_type = "ext4"
discard_blocks = true
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
runtime_type = "io.containerd.kata.v2"
privileged_without_host_devices = true
pod_annotations = ["io.katacontainers.*"]
snapshotter = "devmapper"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata.options]
ConfigPath = "/opt/kata/share/defaults/kata-containers/configuration.toml"