
kind load docker-image fails, but nodes believe image exists #3479

Closed
biqiangwu opened this issue Jan 12, 2024 · 9 comments
Labels
area/provider/docker Issues or PRs related to docker kind/bug Categorizes issue or PR as related to a bug.

Comments


biqiangwu commented Jan 12, 2024

What happened:

kind  load --loglevel trace docker-image --name=egressgateway my-image
WARNING: --loglevel is deprecated, please switch to -v and -q!
Image: "my-image" with ID "sha256:1345d28350994835ab873e08bbc194ed5187477d9412d6accefe015717519d49" not yet present on node "egressgateway-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i egressgateway-control-plane ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -" failed with error: exit status 1
Command Output: unpacking my-image (sha256:1f69c5e4cd45c9ca8b50711b297266de6cfee58e75cd497a1f6fa714941a1700)...ctr: mismatched image rootfs and manifest layers
Stack Trace: 
sigs.k8s.io/kind/pkg/errors.WithStack
	sigs.k8s.io/kind/pkg/errors/errors.go:59
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
	sigs.k8s.io/kind/pkg/exec/local.go:124
sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run
	sigs.k8s.io/kind/pkg/cluster/internal/providers/docker/node.go:146
sigs.k8s.io/kind/pkg/cluster/nodeutils.LoadImageArchive
	sigs.k8s.io/kind/pkg/cluster/nodeutils/util.go:86
sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image.loadImage
	sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image/docker-image.go:205
sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image.runE.func1
	sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image/docker-image.go:190
sigs.k8s.io/kind/pkg/errors.UntilErrorConcurrent.func1
	sigs.k8s.io/kind/pkg/errors/concurrent.go:30
runtime.goexit
	runtime/asm_amd64.s:1598

A similar problem was found: #2402

I experienced this problem in a VM on Windows 11.

# kk  get po 
NAME                                                  READY   STATUS                      RESTARTS   AGE
calico-kube-controllers-6d6dc86d84-wjljs              0/1     Pending                     0          20m
calico-node-477dm                                     0/1     Init:CreateContainerError   0          20m
calico-node-8sp7m                                     0/1     Init:CreateContainerError   0          20m
calico-node-p8294                                     0/1     Init:CreateContainerError   0          20m
coredns-787d4945fb-5f888                              0/1     Pending                     0          17h
coredns-787d4945fb-qq7cb                              0/1     Pending                     0          17h
etcd-egressgateway-control-plane                      1/1     Running                     0          17h
kube-apiserver-egressgateway-control-plane            1/1     Running                     0          17h
kube-controller-manager-egressgateway-control-plane   1/1     Running                     0          17h
kube-proxy-bnxc4                                      1/1     Running                     0          17h
kube-proxy-nbs97                                      1/1     Running                     0          17h
kube-proxy-pnbkd                                      1/1     Running                     0          17h
kube-scheduler-egressgateway-control-plane            1/1     Running                     0          17h

# kk describe po calico-node-477dm
...
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  22m                   default-scheduler  Successfully assigned kube-system/calico-node-477dm to egressgateway-control-plane
  Warning  Failed     20m (x12 over 22m)    kubelet            Error: failed to create containerd container: error unpacking image: mismatched image rootfs and manifest layers
  Normal   Pulled     2m40s (x94 over 22m)  kubelet            Container image "docker.io/calico/cni:v3.26.4" already present on machine

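For anyone debugging this: the load path can be reproduced by hand, since `kind load docker-image` essentially pipes `docker save` output into `ctr` inside the node container. A minimal sketch, using the image and node names from this report (substitute your own):

```shell
# Reproduce what `kind load docker-image` does under the hood
# (names taken from this report; substitute your own image/node).
docker save my-image -o image.tar

# Inspect the archive docker produced -- a corrupt or inconsistent
# manifest here points at `docker save`, not at kind or containerd.
tar -tf image.tar | grep manifest.json

# Import the archive into the node's containerd, same flags kind uses.
docker exec --privileged -i egressgateway-control-plane \
  ctr --namespace=k8s.io images import \
  --all-platforms --digests --snapshotter=overlayfs - < image.tar
```

If the import fails here with the same "mismatched image rootfs and manifest layers" error, the archive itself is bad before kind ever touches it.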

Environment:

  • kind version: (use kind version):
    kind v0.20.0 go1.20.4 linux/amd64

  • Runtime info: (use docker info or podman info):
    docker info
    Client: Docker Engine - Community
    Version: 25.0.0-beta.3
    Context: default
    Debug Mode: false
    Plugins:
    buildx: Docker Buildx (Docker Inc.)
    Version: v0.12.0
    Path: /usr/libexec/docker/cli-plugins/docker-buildx
    compose: Docker Compose (Docker Inc.)
    Version: v2.23.3
    Path: /usr/libexec/docker/cli-plugins/docker-compose

Server:
Containers: 4
Running: 4
Paused: 0
Stopped: 0
Images: 6
Server Version: 25.0.0-beta.3
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3dd1e886e55dd695541fdcd67420c2888645a495
runc version: v1.1.10-0-g18a0cb0
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.2.0-39-generic
Operating System: Ubuntu 22.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 15.58GiB
Name: t2-virtual-machine
ID: 2821dd6e-513e-44e5-8d29-3a2fba4b9acb
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

@biqiangwu biqiangwu added the kind/bug Categorizes issue or PR as related to a bug. label Jan 12, 2024
@aojea
Contributor

aojea commented Jan 15, 2024

are you using latest kind images https://github.com/kubernetes-sigs/kind/releases/tag/v0.20.0 ?

@biqiangwu
Author

are you using latest kind images https://github.com/kubernetes-sigs/kind/releases/tag/v0.20.0 ?

Yes, kind v0.20.0 go1.20.4 linux/amd64

@alfsch

alfsch commented Jan 22, 2024

I'm also getting this message when loading the cilium image into the kind container

Image: "quay.io/cilium/cilium:v1.13.9" with ID "sha256:b2c3b0a10e701ff1bacff8e154c3ce793ae4a761306c9a9f7d278d039b3a8b2d" not yet present on node "sinapse-control-plane", loading...
ERROR: failed to load image: command "docker exec --privileged -i sinapse-control-plane ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -" failed with error: exit status 1
Command Output: unpacking quay.io/cilium/cilium:v1.13.9 (sha256:74d80929fbeb70819778fbacc0a324fa7940b794346ae2305f142d526ca1a08b)...time="2024-01-22T08:58:07Z" level=info msg="apply failure, attempting cleanup" error="wrong diff id calculated on extraction \"sha256:ed018dfbc761bd3a6087a167dc71c11f57649321043601e56237633870cd501d\"" key="extract-716946236-ihhx sha256:df72b8ed5945342750c33f74d5cf8d0872426d3df82604f5cdf686cfa719c19d"
ctr: wrong diff id calculated on extraction "sha256:ed018dfbc761bd3a6087a167dc71c11f57649321043601e56237633870cd501d"
Stack Trace: 
sigs.k8s.io/kind/pkg/errors.WithStack
	sigs.k8s.io/kind/pkg/errors/errors.go:59
sigs.k8s.io/kind/pkg/exec.(*LocalCmd).Run
	sigs.k8s.io/kind/pkg/exec/local.go:124
sigs.k8s.io/kind/pkg/cluster/internal/providers/docker.(*nodeCmd).Run
	sigs.k8s.io/kind/pkg/cluster/internal/providers/docker/node.go:146
sigs.k8s.io/kind/pkg/cluster/nodeutils.LoadImageArchive
	sigs.k8s.io/kind/pkg/cluster/nodeutils/util.go:86
sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image.loadImage
	sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image/docker-image.go:205
sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image.runE.func1
	sigs.k8s.io/kind/pkg/cmd/kind/load/docker-image/docker-image.go:190
sigs.k8s.io/kind/pkg/errors.UntilErrorConcurrent.func1
	sigs.k8s.io/kind/pkg/errors/concurrent.go:30
runtime.goexit
	runtime/asm_amd64.s:1598

This is also with kind v0.20.0 and kindest/node:v1.26.6@sha256:6e2d8b28a5b601defe327b98bd1c2d1930b49e5d8c512e1895099e4504007adb. It makes loading images into the cluster impossible.

docker info
Client:
 Version:    24.0.7
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx

Server:
 Containers: 183
  Running: 3
  Paused: 0
  Stopped: 180
 Images: 440
 Server Version: 25.0.0
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: local
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: a1496014c916f9e62104b33d1bb5bd03b0858e59
 runc version: v1.1.11-0-g4bccb38
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.5.0-14-generic
 Operating System: Ubuntu 22.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 16
 Total Memory: 42.93GiB
 ID: BW5V:QDYB:QE2E:KDA2:RJ7K:2NA6:QIQR:3WNK:QHVK:REIP:Q64J:XIJS
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Default Address Pools:
   Base: 10.172.0.0/16, Size: 24

@pablodgcatulo

pablodgcatulo commented Jan 22, 2024

Quoting @alfsch's report above: "I'm also getting this message when loading the cilium image into the kind container …"

Same here. Since the Docker update (v25, released 3 days ago) there is no way to load images.

The only workaround I found (works on my machine™) was to downgrade to Docker v24.x.x.

@jglick

jglick commented Jan 22, 2024

Looks like this will be fixed in 25.0.1: moby/moby#47161 (comment)

This worked for me on Ubuntu:

sudo apt install docker-ce=5:24.0.7-1~ubuntu.22.04~jammy

(after identifying the old version from /var/log/apt/history.log)
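On Debian/Ubuntu the available versions can be listed and the downgrade pinned so a routine upgrade doesn't pull v25 back in. A sketch; the exact version string varies by distro and release:

```shell
# List docker-ce versions known to apt (version strings vary by distro/release)
apt-cache madison docker-ce

# Or recover the previously installed version from apt's history
grep docker-ce /var/log/apt/history.log

# Downgrade, then hold so `apt upgrade` doesn't reinstall v25
sudo apt install docker-ce=5:24.0.7-1~ubuntu.22.04~jammy
sudo apt-mark hold docker-ce
```

Remember to `sudo apt-mark unhold docker-ce` once a fixed v25 release is out.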

@alfsch

alfsch commented Jan 24, 2024

My temporary workaround is using skopeo to generate the image archive to import, since a Docker downgrade isn't possible for me.

I do it like this:

skopeo copy -f oci --multi-arch all docker://quay.io/cilium/cilium:v1.13.9 oci-archive:cilium-v1.13.9.tar
kind load image-archive cilium-v1.13.9.tar --name myCluster

cheers,

Alfred

@tamalsaha
Contributor

This is fixed by the latest release of docker 0.25.1

@BenTheElder BenTheElder added the area/provider/docker Issues or PRs related to docker label Jan 24, 2024
@BenTheElder
Member

Docker v25 (including the betas) has a bug with docker save.

FWIW, I recommend running a stable release.

This is fixed by the latest release of docker 0.25.1

v25.0.1*, though kind build node-image still doesn't work with this release, FYI.

@BenTheElder
Member

This should've been resolved by docker.
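Until everyone is off the broken builds, a quick client-side guard before loading images may help. A hedged sketch; the version matching is approximate and only covers the v25.0.0 builds discussed in this thread:

```shell
# Warn if the local engine is an affected v25.0.0 build (broken `docker save`,
# see moby/moby#47161); fixed in 25.0.1.
ver="$(docker version --format '{{.Server.Version}}')"
case "$ver" in
  25.0.0|25.0.0-*)
    echo "Docker $ver has a broken 'docker save'; upgrade to >= 25.0.1" >&2
    ;;
esac
```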

7 participants