
After upgrading to 21/stable all my pods fail to create #2223

Closed
Aaron-Ritter opened this issue May 1, 2021 · 5 comments

Comments

Aaron-Ritter commented May 1, 2021

Hi, I wanted to ask if anyone has tried to upgrade from 19/stable to 21/stable and experienced the same issue. Every single container is affected. (For now I have downgraded to 20/stable and everything is working.)

sofo@k8s-test-m:~$ kubectl get all --all-namespaces
NAMESPACE                  NAME                                                          READY   STATUS    RESTARTS   AGE
metallb-system             pod/speaker-qcj9x                                             0/1     Unknown   23         138d
kube-system                pod/calico-node-bs46g                                         0/1     Unknown   1          27d
metallb-system             pod/speaker-cn6nx                                             0/1     Unknown   29         160d
kube-system                pod/calico-node-62x6q                                         0/1     Unknown   1          27d
sofo@k8s-test-m:~$ kubectl -n metallb-system describe pod/speaker-qcj9x
Name:         speaker-qcj9x
Namespace:    metallb-system
Priority:     0
Node:         k8s-test-m/10.14.214.40
Start Time:   Sun, 13 Dec 2020 22:26:36 +0000
Labels:       app=metallb
              component=speaker
              controller-revision-hash=69f56bb877
              pod-template-generation=1
Annotations:  prometheus.io/port: 7472
              prometheus.io/scrape: true
Status:       Running
IP:           10.14.214.40
IPs:
  IP:           10.14.214.40
Controlled By:  DaemonSet/speaker
Containers:
  speaker:
    Container ID:  containerd://d56653f8ee10a407bd8383ebc2b0c9735a09bb6beba2ae9c49ec0500ce08a506
    Image:         metallb/speaker:v0.8.2
    Image ID:      docker.io/metallb/speaker@sha256:f1941498a28cdb332429e25d18233683da6949ecfc4f6dacf12b1416d7d38263
    Port:          7472/TCP
    Host Port:     7472/TCP
    Args:
      --port=7472
      --config=config
    State:          Terminated
      Reason:       Unknown
      Exit Code:    255
      Started:      Sun, 04 Apr 2021 03:45:28 +0000
      Finished:     Sat, 01 May 2021 20:46:47 +0000
    Ready:          False
    Restart Count:  23
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:     100m
      memory:  100Mi
    Environment:
      METALLB_NODE_NAME:   (v1:spec.nodeName)
      METALLB_HOST:        (v1:status.hostIP)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from speaker-token-gwnsh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  speaker-token-gwnsh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  speaker-token-gwnsh
    Optional:    false
QoS Class:       Guaranteed
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                 node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists
                 node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                 node.kubernetes.io/unreachable:NoExecute op=Exists
                 node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason                  Age                    From             Message
  ----     ------                  ----                   ----             -------
  Warning  NodeNotReady            27m                    node-controller  Node is not ready
  Warning  FailedCreatePodSandBox  22m (x4 over 24m)      kubelet          Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:334: copying bootstrap data to pipe caused: write init-p: broken pipe: unknown
  Warning  FailedCreatePodSandBox  21m (x8 over 24m)      kubelet          Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:338: getting the final child's pid from pipe caused: read init-p: connection reset by peer: unknown
  Normal   SandboxChanged          9m9s (x71 over 24m)    kubelet          Pod sandbox changed, it will be killed and re-created.
  Normal   SandboxChanged          2m22s (x9 over 3m52s)  kubelet          Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  2m22s (x9 over 3m51s)  kubelet          Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:338: getting the final child's pid from pipe caused: read init-p: connection reset by peer: unknown
  Warning  FailedCreatePodSandBox  43s                    kubelet          Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:334: copying bootstrap data to pipe caused: write init-p: broken pipe: unknown
  Normal   SandboxChanged          1s (x9 over 94s)       kubelet          Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox  1s (x8 over 93s)       kubelet          Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:338: getting the final child's pid from pipe caused: read init-p: connection reset by peer: unknown
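
One quick check when pods loop through these FailedCreatePodSandBox / SandboxChanged events is whether the containerd daemon bundled with the microk8s snap is running at all (a diagnostic sketch using the snap's service name, not something from the original report):

# check the containerd daemon shipped with the microk8s snap
sudo systemctl status snap.microk8s.daemon-containerd
# the NodeNotReady warning above should also show up at the node level
microk8s kubectl get nodes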
ktsakalozos (Member) commented

Hi @Aaron-Ritter, it is recommended to upgrade one track at a time, so you could try upgrading first to 1.20 and then to 1.21. It would also help if you could attach an inspection tarball.

It looks like a problem with containerd, but I cannot tell without some logs (journalctl -u snap.microk8s.daemon-containerd).

The issue you are seeing may already be addressed on 1.21/edge; the fix will soon land on 1.21/stable.
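
For reference, both artifacts asked for above can be collected like this (the inspect command prints the path of the tarball it creates):

# build the inspection tarball; the command prints where it wrote it
sudo microk8s inspect
# dump containerd logs since the last boot, for attaching to the issue
sudo journalctl -u snap.microk8s.daemon-containerd -b > containerd.log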

Aaron-Ritter (Author) commented

Thanks a lot for the clarification @ktsakalozos. You are right, 1.21/edge fixes it.

In our test environment I first upgraded from 1.20/stable to 1.21/stable, one track at a time as suggested, but hit the same problem. After that I upgraded to the edge release and everything is working as expected.
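
For anyone following the same path, the track-by-track upgrade amounts to the following (channel names as discussed in this thread; assuming MicroK8s was installed as a snap):

# one track at a time, verifying the cluster between steps
sudo snap refresh microk8s --channel=1.20/stable
sudo snap refresh microk8s --channel=1.21/stable
# if pods still fail with the sandbox errors above, the fix is on edge
sudo snap refresh microk8s --channel=1.21/edge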

What's the current timeline for updating 1.21/stable?

ktsakalozos (Member) commented

The 1.21/stable channel should be updated with the v1.21.1 release.

Aaron-Ritter (Author) commented

Closing this for now, as I am waiting for the 1.21.1 release before upgrading the production installation.

Aaron-Ritter (Author) commented Jun 25, 2021

@ktsakalozos do you have any news on the 1.21.1 release?

Never mind, I did not see this: 1.21/stable: v1.21.1 2021-06-16 (2262) 191MB classic
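
That line comes from snap's channel listing; what each channel currently serves can be checked with:

snap info microk8s
# the channels section lists version, release date, revision, size and
# confinement per track, e.g.:
#   1.21/stable: v1.21.1 2021-06-16 (2262) 191MB classic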
