ctr.exe images import can fail on Windows #5690

Closed
claudiubelu opened this issue Jul 6, 2021 · 6 comments · Fixed by #5916

@claudiubelu
Contributor

claudiubelu commented Jul 6, 2021

Description

Importing images into containerd through ctr.exe can fail with the following error:

unpacking docker.io/claudiubelu/busybox:1.29 (sha256:435b5572d74b64c9646803376baad711af044e2130e3567bed07b809af4ca8bf)...ctr: content digest sha256:7c3822e9f6e642ac53d1da8e3aa2f08b556d82ee8827dda1c93868cde18799d9: not found

This can happen because the garbage collector deletes the layers while they are still being unpacked (that is, while diffs are being applied). This can be seen in the server logs:

level=debug msg="removed content" digest="sha256:77b0672dda4a9a40f4a00b73dc21b446e2bc1b2ffb3e6f37525fc5967a971d46"
level=warning msg="content garbage collection failed" error="remove C:\\Program Files\\Git\\var\\lib\\containerd-test\\io.containerd.content.v1.content\\blobs\\sha256\\7c3822e9f6e642ac53d1da8e3aa2f08b556d82ee8827dda1c93868cde18799d9: The process cannot access the file because it is being used by another process."
level=debug msg="garbage collected" d=126.4602ms
level=debug msg="diff applied" d=501.3442ms digest="sha256:7c3822e9f6e642ac53d1da8e3aa2f08b556d82ee8827dda1c93868cde18799d9" media=application/vnd.docker.image.rootfs.diff.tar size=1376768

That blob couldn't be removed because the diff is still being applied at that moment.
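
The race described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not containerd's actual garbage collector: `ToyStore`, `import_image`, and the `gc_during_import` switch are hypothetical names, and the collector here simply removes every blob that no image record references.

```python
class ToyStore:
    """Toy content store: blobs survive GC only if an image record references them."""

    def __init__(self):
        self.blobs = {}    # digest -> bytes
        self.images = {}   # image name -> list of referenced digests

    def write_blob(self, digest, data):
        self.blobs[digest] = data

    def create_image(self, name, digests):
        self.images[name] = list(digests)

    def collect_garbage(self):
        """Remove every blob that no image record references; return removed digests."""
        referenced = {d for digests in self.images.values() for d in digests}
        removed = sorted(d for d in self.blobs if d not in referenced)
        for d in removed:
            del self.blobs[d]
        return removed


def import_image(store, name, layers, gc_during_import=False):
    """Write the layers first and create the image record last.

    Returns True if every blob the image references is still present.
    """
    for digest, data in layers.items():
        store.write_blob(digest, data)
    if gc_during_import:
        # Nothing references the freshly written blobs yet, so GC removes them.
        store.collect_garbage()
    store.create_image(name, layers.keys())
    return all(d in store.blobs for d in store.images[name])
```

In this model, an import that is interrupted by a collection still leaves an image record behind, but the record points at blobs that no longer exist, which mirrors the "listed but unusable" state reported here.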

The image still shows up in ctr.exe image list:

REF                                                                                                                  TYPE                                                      DIGEST          SIZE      PLATFORMS                                                                    LABELS
docker.io/claudiubelu/busybox:1.29                                                                                   application/vnd.docker.distribution.manifest.v2+json      sha256:435b5572d74b64c9646803376baad711af044e2130e3567bed07b809af4ca8bf 402.5 MiB -                                                                            io.cri-containerd.image=managed

However, the image is unusable: trying to spawn containers from it generates errors.

Steps to reproduce the issue:

# export image to tar.
docker save docker.io/claudiubelu/busybox:1.29 -o image.tar

# make sure we don't have it in store already.
ctr.exe --address //./pipe//run/containerd-test/containerd --namespace k8s.io image remove docker.io/claudiubelu/busybox:1.29
ctr.exe --address //./pipe//run/containerd-test/containerd --namespace k8s.io images import .\image.tar

Describe the results you received:

Output:

PS C:\containerd> ctr.exe --address //./pipe//run/containerd-test/containerd --namespace k8s.io images import .\image.tar
unpacking docker.io/claudiubelu/busybox:1.29 (sha256:435b5572d74b64c9646803376baad711af044e2130e3567bed07b809af4ca8bf)...ctr: content digest sha256:7c3822e9f6e642ac53d1da8e3aa2f08b556d82ee8827dda1c93868cde18799d9: not found

Server logs: https://paste.ubuntu.com/p/pNdkv6YFSm/

Describe the results you expected:

The image should have been imported properly and be usable for starting new containers.

At the very least, broken images should not be kept / listed in containerd.

What version of containerd are you using:

$ containerd --version
containerd github.com/containerd/containerd v1.5.0-197-g25d7f907c 25d7f907c03a58b5fa30c6f90c3a36aa308e124c

Any other relevant information (runC version, CRI configuration, OS/Kernel version, etc.):

runc --version
$ runc --version

crictl info
$ crictl info
{
  "status": {
    "conditions": [
      {
        "type": "RuntimeReady",
        "status": true,
        "reason": "",
        "message": ""
      },
      {
        "type": "NetworkReady",
        "status": true,
        "reason": "",
        "message": ""
      }
    ]
  },
  "cniconfig": {
    "PluginDirs": [
      "C:\\Program Files\\containerd\\cni\\bin"
    ],
    "PluginConfDir": "C:\\Program Files\\containerd\\cni\\conf",
    "PluginMaxConfNum": 1,
    "Prefix": "eth",
    "Networks": [
      {
        "Config": {
          "Name": "nat",
          "CNIVersion": "0.2.0",
          "Plugins": [
            {
              "Network": {
                "cniVersion": "0.2.0",
                "name": "nat",
                "type": "nat",
                "capabilities": {
                  "dns": true,
                  "portMappings": true
                },
                "ipam": {},
                "dns": {}
              },
              "Source": "{\"capabilities\":{\"dns\":true,\"portMappings\":true},\"cniVersion\":\"0.2.0\",\"ipam\":{\"routes\":[{\"GW\":\"172.27.240.1\"}],\"subnet\":\"172.27.240.0/12\"},\"master\":\"Ethernet\",\"name\":\"nat\",\"type\":\"nat\"}"
            }
          ],
          "Source": "{\"cniVersion\":\"0.2.0\",\"name\":\"nat\",\"plugins\":[{\"capabilities\":{\"dns\":true,\"portMappings\":true},\"cniVersion\":\"0.2.0\",\"ipam\":{\"routes\":[{\"GW\":\"172.27.240.1\"}],\"subnet\":\"172.27.240.0/12\"},\"master\":\"Ethernet\",\"name\":\"nat\",\"type\":\"nat\"}]}"
        },
        "IFName": "eth0"
      }
    ]
  },
  "config": {
    "containerd": {
      "snapshotter": "windows",
      "defaultRuntimeName": "default",
      "defaultRuntime": {
        "runtimeType": "io.containerd.runhcs.v1",
        "runtimeEngine": "",
        "PodAnnotations": null,
        "ContainerAnnotations": null,
        "runtimeRoot": "",
        "options": {
          "Debug": true,
          "DebugType": 2,
          "SandboxImage": "mcr.microsoft.com/oss/kubernetes/pause:3.4.1-windows-1809-amd64",
          "SandboxIsolation": 0,
          "SandboxPlatform": "windows/amd64"
        },
        "privileged_without_host_devices": false,
        "baseRuntimeSpec": ""
      },
      "untrustedWorkloadRuntime": {
        "runtimeType": "",
        "runtimeEngine": "",
        "PodAnnotations": null,
        "ContainerAnnotations": null,
        "runtimeRoot": "",
        "options": null,
        "privileged_without_host_devices": false,
        "baseRuntimeSpec": ""
      },
      "runtimes": {
        "default": {
          "runtimeType": "io.containerd.runhcs.v1",
          "runtimeEngine": "",
          "PodAnnotations": null,
          "ContainerAnnotations": null,
          "runtimeRoot": "",
          "options": {
            "Debug": true,
            "DebugType": 2,
            "SandboxImage": "mcr.microsoft.com/oss/kubernetes/pause:3.4.1-windows-1809-amd64",
            "SandboxIsolation": 0,
            "SandboxPlatform": "windows/amd64"
          },
          "privileged_without_host_devices": false,
          "baseRuntimeSpec": ""
        },
        "runhcs-wcow-process": {
          "runtimeType": "io.containerd.runhcs.v1",
          "runtimeEngine": "",
          "PodAnnotations": null,
          "ContainerAnnotations": null,
          "runtimeRoot": "",
          "options": {
            "Debug": true,
            "DebugType": 2,
            "SandboxImage": "mcr.microsoft.com/oss/kubernetes/pause:3.4.1-windows-1809-amd64",
            "SandboxPlatform": "windows/amd64"
          },
          "privileged_without_host_devices": false,
          "baseRuntimeSpec": ""
        }
      },
      "noPivot": false,
      "disableSnapshotAnnotations": false,
      "discardUnpackedLayers": true
    },
    "cni": {
      "binDir": "C:\\Program Files\\containerd\\cni\\bin",
      "confDir": "C:\\Program Files\\containerd\\cni\\conf",
      "maxConfNum": 1,
      "confTemplate": ""
    },
    "registry": {
      "configPath": "",
      "mirrors": {
        "docker.io": {
          "endpoint": [
            "https://registry-1.docker.io"
          ]
        }
      },
      "configs": null,
      "auths": null,
      "headers": null
    },
    "imageDecryption": {
      "keyModel": "node"
    },
    "disableTCPService": true,
    "streamServerAddress": "127.0.0.1",
    "streamServerPort": "0",
    "streamIdleTimeout": "4h0m0s",
    "enableSelinux": false,
    "selinuxCategoryRange": 0,
    "sandboxImage": "mcr.microsoft.com/oss/kubernetes/pause:3.4.1-windows-1809-amd64",
    "statsCollectPeriod": 10,
    "systemdCgroup": false,
    "enableTLSStreaming": false,
    "x509KeyPairStreaming": {
      "tlsCertFile": "",
      "tlsKeyFile": ""
    },
    "maxContainerLogSize": 16384,
    "disableCgroup": false,
    "disableApparmor": false,
    "restrictOOMScoreAdj": false,
    "maxConcurrentDownloads": 3,
    "disableProcMount": false,
    "unsetSeccompProfile": "",
    "tolerateMissingHugetlbController": false,
    "disableHugetlbController": false,
    "ignoreImageDefinedVolumes": false,
    "netnsMountsUnderStateDir": false,
    "containerdRootDir": "C:\\Program Files\\Git\\var\\lib\\containerd-test",
    "containerdEndpoint": "//./pipe//run/containerd-test/containerd",
    "rootDir": "C:\\Program Files\\Git\\var\\lib\\containerd-test\\io.containerd.grpc.v1.cri",
    "stateDir": "C:\\Program Files\\Git\\run\\containerd-test\\io.containerd.grpc.v1.cri"
  },
  "golang": "go1.16.5",
  "lastCNILoadStatus": "OK"
}

uname -a
$ uname -a

@kzys
Member

kzys commented Jul 6, 2021

I added and then removed the windows label. The issue does not seem to be Windows-specific.

kzys added a commit to kzys/containerd that referenced this issue Jul 6, 2021
"ctr image import" didn't have a lease. Due to that, GC would
remove contents during the import process.

Fixes containerd#5690.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
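
The idea behind the fix in this commit message, holding a lease for the duration of the import so that GC treats in-flight content as live, can be sketched with a toy model. This is an illustration under stated assumptions, not containerd's lease implementation: `LeasedStore`, `lease`, and `import_with_lease` are hypothetical names.

```python
from contextlib import contextmanager


class LeasedStore:
    """Toy content store where an active lease protects blobs from GC."""

    def __init__(self):
        self.blobs = {}     # digest -> bytes
        self.images = {}    # image name -> list of referenced digests
        self.leases = {}    # lease id -> set of digests written under it
        self._next_lease = 0

    @contextmanager
    def lease(self):
        """Open a lease; it is released when the with-block exits."""
        lease_id = self._next_lease
        self._next_lease += 1
        self.leases[lease_id] = set()
        try:
            yield lease_id
        finally:
            del self.leases[lease_id]

    def write_blob(self, digest, data, lease_id):
        self.blobs[digest] = data
        self.leases[lease_id].add(digest)

    def collect_garbage(self):
        """Remove blobs referenced by neither an image nor an active lease."""
        live = {d for digests in self.images.values() for d in digests}
        for held in self.leases.values():
            live |= held
        removed = sorted(d for d in self.blobs if d not in live)
        for d in removed:
            del self.blobs[d]
        return removed


def import_with_lease(store, name, layers):
    """Import under a lease; return what a mid-import GC run removed."""
    with store.lease() as lease_id:
        for digest, data in layers.items():
            store.write_blob(digest, data, lease_id)
        removed = store.collect_garbage()   # simulate GC firing mid-import
        store.images[name] = list(layers)   # image record now references the blobs
    return removed
```

Once the image record exists, the lease can be released safely, because the record itself keeps the blobs reachable.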
@oldthreefeng
Copy link

oldthreefeng commented Aug 18, 2021

I see the same error on arm64. I compiled the arm64 binaries myself from tag v1.5.5.
Exporting an image fails:

$ uname  -a
Linux sealos 4.18.0-80.7.2.el7.aarch64 #1 SMP Thu Sep 12 16:13:20 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

[root@sealos ~]# ctr version
Client:
  Version:  v1.5.5
  Revision: 72cec4be58a9eb6b2910f5d10f1c01ca47d231c0
  Go version: go1.16.6

Server:
  Version:  v1.5.5
  Revision: 72cec4be58a9eb6b2910f5d10f1c01ca47d231c0
  UUID: f97f6dda-9179-4ff2-933d-5940a3cbbf55


 ctr -n=k8s.io image export image.tar k8s.gcr.io/pause:3.5 
ctr: content digest sha256:94ee124c4b0ca7c1315c06c31532f78a929051ae8da7f122f905b2cbbfb1ecba: not found

But when I switch back to ctr v1.4.3, it works.

[root@sealos bin]# ./ctr  version
Client:
  Version:  v1.4.3
  Revision: 269548fa27e0089a8b8278fc4fc781d7f65a939b
  Go version: go1.15.5

Server:
  Version:  v1.5.5
  Revision: 72cec4be58a9eb6b2910f5d10f1c01ca47d231c0
  UUID: f97f6dda-9179-4ff2-933d-5940a3cbbf55
WARNING: version mismatch
WARNING: revision mismatch

[root@sealos bin]#  ./ctr  -n=k8s.io image export image.tar k8s.gcr.io/pause:3.5

Logs related to pause:3.5:


Aug 18 14:15:24 sealos containerd[1799]: time="2021-08-18T14:15:24.611413033+08:00" level=info msg="PullImage \"k8s.gcr.io/pause:3.2\""
Aug 18 14:15:25 sealos containerd[1799]: time="2021-08-18T14:15:25.933975382+08:00" level=info msg="ImageCreate event &ImageCreate{Name:k8s.gcr.io/pause:3.2,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"
Aug 18 14:15:25 sealos containerd[1799]: time="2021-08-18T14:15:25.941916771+08:00" level=info msg="ImageUpdate event &ImageUpdate{Name:k8s.gcr.io/pause:3.2,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"
Aug 18 14:15:25 sealos containerd[1799]: time="2021-08-18T14:15:25.945033162+08:00" level=info msg="ImageCreate event &ImageCreate{Name:k8s.gcr.io/pause@sha256:927d98197ec1141a368550822d18fa1c60bdae27b78b0c004f705f548c07814f,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"
Aug 18 14:15:25 sealos containerd[1799]: time="2021-08-18T14:15:25.945425033+08:00" level=info msg="PullImage \"k8s.gcr.io/pause:3.2\" returns image reference \"sha256:2a060e2e7101d419352bf82c613158587400be743482d9a537ec4a9d1b4eb93c\""
Aug 18 14:16:35 sealos containerd[1799]: time="2021-08-18T14:16:35.064097875+08:00" level=info msg="ImageCreate event &ImageCreate{Name:k8s.gcr.io/pause:3.5,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"
Aug 18 14:16:35 sealos containerd[1799]: time="2021-08-18T14:16:35.071463944+08:00" level=info msg="ImageUpdate event &ImageUpdate{Name:k8s.gcr.io/pause:3.5,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"
Aug 18 14:16:35 sealos containerd[1799]: time="2021-08-18T14:16:35.074446456+08:00" level=info msg="ImageCreate event &ImageCreate{Name:k8s.gcr.io/pause@sha256:1ff6c18fbef2045af6b9c16bf034cc421a29027b800e4f9b68ae9b1cb3e9ae07,Labels:map[string]string{io.cri-containerd.image: managed,},XXX_unrecognized:[],}"

Notes:
Only the pause:3.5 image fails to export; other images export fine, which really confuses me.
With ctr v1.4.3, all images export fine.

@claudiubelu
Contributor Author

Hm, this issue is about ctr image import, not export. But I think there might be an underlying issue somewhere in the garbage collector. From what I can tell, if some content is not "visible", it gets removed, including content that shouldn't be, like your sha256:94ee124c4b0ca7c1315c06c31532f78a929051ae8da7f122f905b2cbbfb1ecba.

Question: after switching ctr versions, do you delete / pull the pause image, and then export it?

@oldthreefeng

oldthreefeng commented Aug 24, 2021

Question: after switching ctr versions, do you delete / pull the pause image, and then export it?

After switching to the old version (1.4.4), everything is OK. I have exported and imported many times.
After switching to the new version (1.5.5), I only tested the export command; I did not delete or pull the image or do anything else. I can test again if needed.
@claudiubelu

claudiubelu pushed a commit to claudiubelu/containerd that referenced this issue Aug 25, 2021
"ctr image import" didn't have a lease. Due to that, GC would
remove contents during the import process.

Fixes containerd#5690.

Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
@claudiubelu
Contributor Author

I've sent some updates here regarding what was the issue I was facing, and what the solution was: #5692 (comment)

I'd say that ctr import --all-platforms might solve your issue for now, but it would still be useful to know whether the PRs that have been sent actually help people.

There is one thing I didn't understand from your scenario: are you importing the image, or are you pulling it? These are different scenarios, so it would be good to know. If it's from pulling, the above might not help you. :)

I wonder if ctr has issues picking the right image from a manifest list for your platform. I see that you're on linux/arm64. Can you try pulling and exporting the gcr.io/k8s-staging-kubernetes/pause:3.5-linux-arm64 instead?
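
To make the manifest-list question concrete, here is a minimal sketch of selecting a per-platform manifest from an image index. It assumes an index shaped like an OCI image index (a `manifests` list whose entries carry a `platform` object with `os` and `architecture`). `select_manifest` is a hypothetical helper, not ctr's or containerd's matcher, and it ignores details such as the architecture variant that a real matcher would likely consider.

```python
def select_manifest(index, os_name, architecture):
    """Return the digest of the first manifest matching the platform, or None."""
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if (platform.get("os") == os_name
                and platform.get("architecture") == architecture):
            return entry["digest"]
    return None


# Placeholder digests for illustration only; they do not belong to any real image.
example_index = {
    "manifests": [
        {"digest": "sha256:amd64-placeholder",
         "platform": {"os": "linux", "architecture": "amd64"}},
        {"digest": "sha256:arm64-placeholder",
         "platform": {"os": "linux", "architecture": "arm64"}},
    ]
}
```

If only the manifest selected for the local platform has its content stored locally, any operation that walks the other entries of the index would find their content missing, which is one way a "content digest ... not found" error could arise.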

@oldthreefeng

The k8s.gcr.io/pause:3.5 image already contains arm64; multi-arch images are supported for this repo,
so I think this is the right image. In the meantime, I will try yours.
