v1.17.4+k3s1 CoreDNS and LocalPathProvisioner pods crashlooping #1583

Closed
p-hash opened this issue Mar 25, 2020 · 6 comments

p-hash commented Mar 25, 2020

Version:
v1.17.4+k3s1
Does not reproduce in the same setup (Vagrant generic/centos7 box) with v1.17.3+k3s1.

K3s arguments:

INSTALL_K3S_EXEC="\
        --advertise-address 192.168.10.10 \
        --no-deploy traefik,metrics-server \
        --write-kubeconfig-mode 664"

Describe the bug
CoreDNS and LocalPathProvisioner pods are unable to start

To Reproduce

firewall-cmd --permanent --add-port 6443/tcp
firewall-cmd --reload
curl -sfL https://get.k3s.io | sh - # with $INSTALL_K3S_EXEC from above

Expected behavior
CoreDNS and Local Path Provisioner are started and operating normally

Actual behavior
CoreDNS and LocalPathProvisioner pods crashlooping

Additional context / logs

[vagrant@node1 ~]$ kubectl -n kube-system get po
NAME                                      READY   STATUS             RESTARTS   AGE
local-path-provisioner-58fb86bdfd-bh5n8   0/1     CrashLoopBackOff   5          3m40s
coredns-6c6bb68b64-nxvqb                  0/1     CrashLoopBackOff   5          3m40s
[vagrant@node1 ~]$ kubectl -n kube-system get events
LAST SEEN   TYPE      REASON              OBJECT                                         MESSAGE
3m44s       Normal    ScalingReplicaSet   deployment/coredns                             Scaled up replica set coredns-6c6bb68b64 to 1
3m44s       Normal    ScalingReplicaSet   deployment/local-path-provisioner              Scaled up replica set local-path-provisioner-58fb86bdfd to 1
3m44s       Normal    SuccessfulCreate    replicaset/local-path-provisioner-58fb86bdfd   Created pod: local-path-provisioner-58fb86bdfd-bh5n8
3m44s       Normal    SuccessfulCreate    replicaset/coredns-6c6bb68b64                  Created pod: coredns-6c6bb68b64-nxvqb
<unknown>   Warning   FailedScheduling    pod/local-path-provisioner-58fb86bdfd-bh5n8    0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
<unknown>   Warning   FailedScheduling    pod/coredns-6c6bb68b64-nxvqb                   0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
<unknown>   Warning   FailedScheduling    pod/local-path-provisioner-58fb86bdfd-bh5n8    0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
<unknown>   Normal    Scheduled           pod/coredns-6c6bb68b64-nxvqb                   Successfully assigned kube-system/coredns-6c6bb68b64-nxvqb to node1
<unknown>   Normal    Scheduled           pod/local-path-provisioner-58fb86bdfd-bh5n8    Successfully assigned kube-system/local-path-provisioner-58fb86bdfd-bh5n8 to node1
3m37s       Normal    Pulling             pod/local-path-provisioner-58fb86bdfd-bh5n8    Pulling image "rancher/local-path-provisioner:v0.0.11"
3m37s       Normal    Pulling             pod/coredns-6c6bb68b64-nxvqb                   Pulling image "rancher/coredns-coredns:1.6.3"
3m31s       Normal    Pulled              pod/coredns-6c6bb68b64-nxvqb                   Successfully pulled image "rancher/coredns-coredns:1.6.3"
3m31s       Warning   Failed              pod/coredns-6c6bb68b64-nxvqb                   Error: failed to get sandbox container task: no running task found: task 53deca49741d68fc6dd7dc96e6c55ec5f03bceea55bfd12ff1dd4fb82d01f59d not found: not found
3m31s       Normal    Pulled              pod/local-path-provisioner-58fb86bdfd-bh5n8    Successfully pulled image "rancher/local-path-provisioner:v0.0.11"
3m31s       Warning   Failed              pod/local-path-provisioner-58fb86bdfd-bh5n8    Error: failed to get sandbox container task: no running task found: task 31545f69fe5383c2d613437097c40500c87cbc5009ae522f7f91ab89dbea1b35 not found: not found
3m30s       Warning   Failed              pod/local-path-provisioner-58fb86bdfd-bh5n8    Error: sandbox container "6af8fd577d97197b49e22e001e942d8ffc64778f1bf94d5433b8c8eb8ddba4c0" is not running
3m30s       Warning   Failed              pod/coredns-6c6bb68b64-nxvqb                   Error: failed to create containerd task: OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1920: running lstat on namespace path \"/proc/24201/ns/ipc\" caused \"lstat /proc/24201/ns/ipc: no such file or directory\"": unknown
3m29s       Normal    Pulled              pod/coredns-6c6bb68b64-nxvqb                   Container image "rancher/coredns-coredns:1.6.3" already present on machine
3m29s       Normal    Pulled              pod/local-path-provisioner-58fb86bdfd-bh5n8    Container image "rancher/local-path-provisioner:v0.0.11" already present on machine
3m29s       Normal    Created             pod/coredns-6c6bb68b64-nxvqb                   Created container coredns
3m29s       Normal    Created             pod/local-path-provisioner-58fb86bdfd-bh5n8    Created container local-path-provisioner
3m29s       Warning   Failed              pod/coredns-6c6bb68b64-nxvqb                   Error: failed to create containerd task: OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1920: running lstat on namespace path \"/proc/24504/ns/ipc\" caused \"lstat /proc/24504/ns/ipc: no such file or directory\"": unknown
3m29s       Warning   Failed              pod/local-path-provisioner-58fb86bdfd-bh5n8    Error: failed to create containerd task: OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1920: running lstat on namespace path \"/proc/24546/ns/ipc\" caused \"lstat /proc/24546/ns/ipc: no such file or directory\"": unknown
3m22s       Normal    SandboxChanged      pod/local-path-provisioner-58fb86bdfd-bh5n8    Pod sandbox changed, it will be killed and re-created.
3m22s       Normal    SandboxChanged      pod/coredns-6c6bb68b64-nxvqb                   Pod sandbox changed, it will be killed and re-created.
3m22s       Warning   BackOff             pod/coredns-6c6bb68b64-nxvqb                   Back-off restarting failed container
3m22s       Warning   BackOff             pod/local-path-provisioner-58fb86bdfd-bh5n8    Back-off restarting failed container

benfairless commented Mar 26, 2020

@erikwilson Could this be related to f2a4e1d, which introduced a toleration if the node has node-role.kubernetes.io/master?

Can you share the list of taints on your nodes?
kubectl get nodes -o=jsonpath="{.items[*]['metadata.name', 'spec.taints']}" should provide the necessary information.
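
(For a quicker human-readable check, something like the following should also work, assuming kubectl access to the cluster:)

kubectl describe nodes | grep Taints    # prints the Taints: line for each node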


p-hash commented Mar 27, 2020

Hi @benfairless,

I don't see any taints here:

$ kubectl get nodes
NAME    STATUS   ROLES    AGE     VERSION
node1   Ready    master   2m47s   v1.17.4+k3s1
$ kubectl get nodes -o=jsonpath="{.items[*]['metadata.name', 'spec.taints']}"
node1

Hope this helps.

I think the following must be key to the problem:
Error: failed to create containerd task: OCI runtime create failed: container_linux.go:338: creating new parent process caused "container_linux.go:1920: running lstat on namespace path \"/proc/24546/ns/ipc\" caused \"lstat /proc/24546/ns/ipc: no such file or directory\"": unknown


p-hash commented Mar 27, 2020

Might be related to cri-o/cri-o#528.

@erikwilson

f2a4e1d is not in 1.17.4; we don't use cri-o, but we did add SELinux support to the CRI.

SELinux is now turned on in containerd for 1.17.4, which is probably what is causing the issue. You can revert to the old behavior by disabling SELinux with the --disable-selinux flag (a sketch of that route follows below), or keep SELinux support and install the k3s-selinux policy like the following:

yum install -y container-selinux selinux-policy-base
rpm -i https://rpm.rancher.io/k3s-selinux-0.1.1-rc1.el7.noarch.rpm
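
For reference, a minimal sketch of the --disable-selinux route, reusing the install command and INSTALL_K3S_EXEC arguments from the original report (the flag name comes from the comment above; the rest is just the reporter's setup):

# re-run the installer with SELinux support disabled (reverts to pre-1.17.4 behavior);
# assumes the same INSTALL_K3S_EXEC value from the original report is set in this shell
export INSTALL_K3S_EXEC="$INSTALL_K3S_EXEC --disable-selinux"
curl -sfL https://get.k3s.io | sh -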

Ultimately we would like to recommend installing k3s via RPM from a repo, which will require the k3s-selinux policy, but a few things still need to happen before that process is ready.

Apologies that this was a problem & thanks for your patience while we iron this out.


cjellick commented May 5, 2020

To test, we just need to verify that when we follow the instructions for properly installing the k3s-selinux policy, coredns and local-path-provisioner do not go into a crash loop.
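
A minimal check, using the same commands from the original report (pod names and counts will differ per install):

# expect both pods to reach Running, with no CrashLoopBackOff or repeated restarts
kubectl -n kube-system get po
# expect no repeated Failed / BackOff warnings
kubectl -n kube-system get events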

@ShylajaDevadiga

Verified using 1.18.2-rc2+k3s1. Installed the following packages as described in the installation docs at https://rancher.com/docs/k3s/latest/en/advanced/:

yum install -y container-selinux selinux-policy-base
rpm -i https://rpm.rancher.io/k3s-selinux-0.1.1-rc1.el7.noarch.rpm
