Description
Environmental Info:
K3s Version:
k3s version v1.32.2+k3s1 (381620ef)
go version go1.23.6
Node(s) CPU architecture, OS, and Version:
2-node test cluster, uname -r reporting 6.1.0-30-amd64
Both installed with a minimal Debian 12 OS (Ansible deployment)
root@staging1:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
staging1 Ready control-plane,master 61m v1.32.2+k3s1 172.16.0.1 172.16.0.1 Debian GNU/Linux 12 (bookworm) 6.1.0-30-amd64 containerd://2.0.2-k3s2
staging2 Ready control-plane,master 60m v1.32.2+k3s1 172.16.0.2 172.16.0.2 Debian GNU/Linux 12 (bookworm) 6.1.0-30-amd64 containerd://2.0.2-k3s2
Cluster Configuration:
- 2 Node (both as control-plane,master) VMs running in stock KVM (libvirt)
- every node has, next to lo, two interfaces: eth0 for WAN traffic and eth1 for LAN traffic
- Host: staging1 | LAN-IP: 172.16.0.1 | WAN-IP: 192.168.122.246
- Host: staging2 | LAN-IP: 172.16.0.2 | WAN-IP: 192.168.122.90
- The following Helm charts are installed:
root@staging2:~# helm list -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
cilium cilium-system 1 2025-02-28 07:28:32.804174608 +0100 CET deployed cilium-1.17.1 1.17.1
ingress-nginx ingress-nginx 1 2025-02-28 07:31:31.713591888 +0100 CET failed ingress-nginx-4.12.0 1.12.0
longhorn longhorn-system 1 2025-02-28 07:30:47.23270258 +0100 CET deployed longhorn-1.8.0 v1.8.0
Describe the bug:
New workload deployments on K3s v1.32.2+k3s1 fail and hang in ContainerCreating status. It looks like something has changed in the format of the registries.yaml config.
root@staging1:~# kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-system cilium-envoy-krklb 0/1 ContainerCreating 0 4m29s 172.16.0.2 staging2 <none> <none>
cilium-system cilium-envoy-vmckv 0/1 ContainerCreating 0 5m48s 172.16.0.1 staging1 <none> <none>
cilium-system cilium-operator-85bf6f5694-sc5x8 0/1 ContainerCreating 0 5m48s 172.16.0.2 staging2 <none> <none>
cilium-system cilium-operator-85bf6f5694-x6k7v 0/1 ContainerCreating 0 5m48s 172.16.0.1 staging1 <none> <none>
cilium-system cilium-p9fpg 0/1 Init:0/6 0 4m29s 172.16.0.2 staging2 <none> <none>
cilium-system cilium-pnpn6 0/1 Init:0/6 0 5m48s 172.16.0.1 staging1 <none> <none>
cilium-system hubble-relay-75d4f954d-gnlsg 0/1 ContainerCreating 0 5m48s <none> staging1 <none> <none>
cilium-system hubble-relay-75d4f954d-slc5r 0/1 ContainerCreating 0 5m48s <none> staging1 <none> <none>
ingress-nginx ingress-nginx-admission-create-n2852 0/1 ContainerCreating 0 2m51s <none> staging2 <none> <none>
kube-system coredns-ff8999cc5-mwjg6 0/1 ContainerCreating 0 6m11s <none> staging1 <none> <none>
kube-system coredns-ff8999cc5-nzpfl 0/1 ContainerCreating 0 6m11s <none> staging1 <none> <none>
kube-system local-path-provisioner-774c6665dc-bzlcr 0/1 ContainerCreating 0 6m11s <none> staging1 <none> <none>
kube-system metrics-server-6f4c6675d5-zjdpk 0/1 ContainerCreating 0 6m11s <none> staging1 <none> <none>
longhorn-system longhorn-driver-deployer-b8bc4675f-wfhw2 0/1 Init:0/1 0 3m34s <none> staging2 <none> <none>
longhorn-system longhorn-manager-qcg9t 0/2 ContainerCreating 0 3m34s <none> staging1 <none> <none>
longhorn-system longhorn-manager-zdjmp 0/2 ContainerCreating 0 3m34s <none> staging2 <none> <none>
longhorn-system longhorn-ui-7749bb466f-52gcb 0/1 ContainerCreating 0 3m34s <none> staging1 <none> <none>
longhorn-system longhorn-ui-7749bb466f-gjk9k 0/1 ContainerCreating 0 3m34s <none> staging2 <none> <none>
Output from a cilium-agent Pod describe:
Warning FailedCreatePodSandBox 7s (x9 over 110s) kubelet (combined from similar events): Failed to create pod sandbox: rpc error: code = NotFound desc = failed to start sandbox "eb7a7309006ec34550497ac44a5aea5b18e4348f076babd763cb1c1a19fe5d6d": failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to resolve reference "docker.io/rancher/mirrored-pause:3.6": docker.io/rancher/mirrored-pause:3.6: not found
So I moved /etc/rancher/k3s/registries.yaml to another location and restarted k3s.service, and voilà, everything got pulled.
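The workaround above can be sketched roughly as follows. This is only an illustration: the backup path and the `DRY_RUN` switch are assumptions added for this sketch, not something taken from the original setup.

```shell
#!/bin/sh
# Sketch of the workaround: move registries.yaml out of the way, restart k3s.
# BACKUP_FILE is a hypothetical location; pick whatever suits your host.
# Set DRY_RUN=1 to print the commands instead of running them.
REGISTRIES_FILE="${REGISTRIES_FILE:-/etc/rancher/k3s/registries.yaml}"
BACKUP_FILE="${BACKUP_FILE:-/root/registries.yaml.disabled}"

# Run a command, or just echo it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Move the registries config aside and restart the k3s service.
disable_registries() {
    run mv "$REGISTRIES_FILE" "$BACKUP_FILE" &&
    run systemctl restart k3s.service
}
```

Calling `disable_registries` with `DRY_RUN=1` set only prints the two commands; without it, the function needs root and has to be run on every affected node.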
Content of the registries.yaml:
mirrors:
  "*":
    endpoint:
      - "http://localhost:5000"
configs:
  "docker.io":
  "quay.io":
  "*":
    tls:
      insecure_skip_verify: true
Steps To Reproduce:
See the bug description above.
Expected behavior:
registries.yaml hasn't changed since my earlier deployments, and the documentation doesn't mention any change.
So everything should be working.
Actual behavior:
Images can't be pulled. The root cause is currently unknown; I don't have much time right now to dive into the code.
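One possible way to narrow this down is to ask the mirror directly whether it has the pause image. The sketch below assumes the endpoint at http://localhost:5000 speaks the standard registry HTTP API v2 and stores the image under the repository name `rancher/mirrored-pause`. The helper names `manifest_url` and `check_mirror` are made up for this example.

```shell
#!/bin/sh
# Build the registry v2 manifest URL for an image on a mirror endpoint.
# Arguments: endpoint, repository, tag.
manifest_url() {
    endpoint="$1"; repo="$2"; tag="$3"
    # Strip one trailing slash from the endpoint so the URL has no "//".
    printf '%s/v2/%s/manifests/%s\n' "${endpoint%/}" "$repo" "$tag"
}

# Print the HTTP status code the mirror returns for that manifest.
# Needs curl and a reachable endpoint.
check_mirror() {
    curl -s -o /dev/null -w '%{http_code}\n' \
        -H 'Accept: application/vnd.oci.image.index.v1+json' \
        -H 'Accept: application/vnd.oci.image.manifest.v1+json' \
        -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json' \
        -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
        "$(manifest_url "$@")"
}
```

For the failing image this would be `check_mirror http://localhost:5000 rancher/mirrored-pause 3.6`. The status code only shows whether the mirror itself serves the manifest; it does not show how k3s or containerd interprets registries.yaml.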
Additional context / logs:
See the bug description above.