
CrashLoopBackOff / Error #1019

Closed

a1exus opened this issue Nov 5, 2019 · 30 comments

Comments
@a1exus

a1exus commented Nov 5, 2019

Hello World!

I just installed k3s (Single Master Install), yet I'm seeing Error and/or CrashLoopBackOff for a few of the deployments that come out of the box:

# k3s kubectl get pods --all-namespaces=true
NAMESPACE     NAME                                      READY   STATUS             RESTARTS   AGE
kube-system   coredns-57d8bbb86-jzhmv                   1/1     Running            0          29m
kube-system   local-path-provisioner-58fb86bdfd-bwhsj   0/1     CrashLoopBackOff   10         29m
kube-system   helm-install-traefik-nmvrz                0/1     CrashLoopBackOff   10         29m
# 
# k3s -v
k3s version v0.10.2 (8833bfd9)
# 
# cat /etc/redhat-release 
CentOS Linux release 8.0.1905 (Core) 
# uname -a
Linux X.X.X 4.18.0-80.11.2.el8_0.x86_64 #1 SMP Tue Sep 24 11:32:19 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
# 

Please advise.

@erikwilson
Contributor

Please kubectl describe the failing pods for more info.

@a1exus
Author

a1exus commented Nov 6, 2019

Please kubectl describe the failing pods for more info.

# k3s kubectl get pods --namespace=kube-system
NAME                                      READY   STATUS             RESTARTS   AGE
coredns-57d8bbb86-4v5z6                   1/1     Running            0          2m48s
local-path-provisioner-58fb86bdfd-5v85x   0/1     CrashLoopBackOff   4          2m48s
helm-install-traefik-5m5rv                0/1     CrashLoopBackOff   3          2m48s
# k3s kubectl describe --namespace=kube-system local-path-provisioner-58fb86bdfd-5v85x
error: the server doesn't have a resource type "local-path-provisioner-58fb86bdfd-5v85x"
# k3s kubectl describe --namespace=kube-system helm-install-traefik-5m5rv
error: the server doesn't have a resource type "helm-install-traefik-5m5rv"
# 

@erikwilson
Contributor

Looks like you need to give it the resource type. Is k3s kubectl describe --help useful?
(hint: k3s kubectl describe -n kube-system pod local-path-provisioner-58fb86bdfd-5v85x)

@a1exus
Author

a1exus commented Nov 6, 2019

Looks like you need to give it the resource type. Is k3s kubectl describe --help useful?
(hint: k3s kubectl describe -n kube-system pod local-path-provisioner-58fb86bdfd-5v85x)

Whoops, I apologize for that.

# k3s kubectl --namespace=kube-system describe pod local-path-provisioner-58fb86bdfd-npcxt
Name:         local-path-provisioner-58fb86bdfd-npcxt
Namespace:    kube-system
Priority:     0
Node:         noc.uftwf.local/216.236.150.116
Start Time:   Wed, 06 Nov 2019 16:54:06 -0500
Labels:       app=local-path-provisioner
              pod-template-hash=58fb86bdfd
Annotations:  <none>
Status:       Running
IP:           10.42.0.2
IPs:
  IP:           10.42.0.2
Controlled By:  ReplicaSet/local-path-provisioner-58fb86bdfd
Containers:
  local-path-provisioner:
    Container ID:  containerd://2fdbe42fe969b642d4d9c4370da98a820a496ee86eaf7607991ecdc738088fe8
    Image:         rancher/local-path-provisioner:v0.0.11
    Image ID:      docker.io/rancher/local-path-provisioner@sha256:0d60b97b101e432606035ab955c623604493e8956484af1cfa207753329bdf81
    Port:          <none>
    Host Port:     <none>
    Command:
      local-path-provisioner
      start
      --config
      /etc/config/config.json
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 06 Nov 2019 16:55:11 -0500
      Finished:     Wed, 06 Nov 2019 16:55:12 -0500
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 06 Nov 2019 16:54:39 -0500
      Finished:     Wed, 06 Nov 2019 16:54:40 -0500
    Ready:          False
    Restart Count:  3
    Environment:
      POD_NAMESPACE:  kube-system (v1:metadata.namespace)
    Mounts:
      /etc/config/ from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from local-path-provisioner-service-account-token-sjhkv (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      local-path-config
    Optional:  false
  local-path-provisioner-service-account-token-sjhkv:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  local-path-provisioner-service-account-token-sjhkv
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age               From                      Message
  ----     ------            ----              ----                      -------
  Warning  FailedScheduling  <unknown>         default-scheduler         0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
  Warning  FailedScheduling  <unknown>         default-scheduler         0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
  Normal   Scheduled         <unknown>         default-scheduler         Successfully assigned kube-system/local-path-provisioner-58fb86bdfd-npcxt to noc.uftwf.local
  Normal   Pulling           64s               kubelet, noc.uftwf.local  Pulling image "rancher/local-path-provisioner:v0.0.11"
  Normal   Pulled            60s               kubelet, noc.uftwf.local  Successfully pulled image "rancher/local-path-provisioner:v0.0.11"
  Normal   Pulled            8s (x3 over 55s)  kubelet, noc.uftwf.local  Container image "rancher/local-path-provisioner:v0.0.11" already present on machine
  Normal   Created           8s (x4 over 57s)  kubelet, noc.uftwf.local  Created container local-path-provisioner
  Normal   Started           8s (x4 over 57s)  kubelet, noc.uftwf.local  Started container local-path-provisioner
  Warning  BackOff           6s (x5 over 53s)  kubelet, noc.uftwf.local  Back-off restarting failed container
# 
# k3s kubectl --namespace=kube-system describe pod helm-install-traefik-mh72n
Name:         helm-install-traefik-mh72n
Namespace:    kube-system
Priority:     0
Node:         noc.uftwf.local/216.236.150.116
Start Time:   Wed, 06 Nov 2019 16:54:06 -0500
Labels:       controller-uid=97d88936-7330-41dd-afe8-c11982dbcb5c
              helmcharts.helm.cattle.io/chart=traefik
              job-name=helm-install-traefik
Annotations:  <none>
Status:       Running
IP:           10.42.0.4
IPs:
  IP:           10.42.0.4
Controlled By:  Job/helm-install-traefik
Containers:
  helm:
    Container ID:  containerd://e651ef2ed4c8fa5801f6988d01518c457252d3e962a40bafbb58517303b3f064
    Image:         rancher/klipper-helm:v0.1.5
    Image ID:      docker.io/rancher/klipper-helm@sha256:72690de1ae2259a41075e87ff453936a74e0f2dbf2ad1dd96a4f72136a48038d
    Port:          <none>
    Host Port:     <none>
    Args:
      install
      --name
      traefik
      https://%{KUBERNETES_API}%/static/charts/traefik-1.77.1.tgz
      --set-string
      kubernetes.ingressEndpoint.useDefaultPublishedService=true
      --set-string
      metrics.prometheus.enabled=true
      --set-string
      rbac.enabled=true
      --set-string
      ssl.enabled=true
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 06 Nov 2019 16:55:50 -0500
      Finished:     Wed, 06 Nov 2019 16:56:03 -0500
    Ready:          False
    Restart Count:  3
    Environment:
      NAME:         traefik
      VERSION:      
      REPO:         
      VALUES_HASH:  e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
      NO_PROXY:     ,10.42.0.0/16,10.43.0.0/16
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from helm-traefik-token-79kxw (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  helm-traefik-token-79kxw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  helm-traefik-token-79kxw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                  From                      Message
  ----     ------            ----                 ----                      -------
  Warning  FailedScheduling  <unknown>            default-scheduler         0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.
  Normal   Scheduled         <unknown>            default-scheduler         Successfully assigned kube-system/helm-install-traefik-mh72n to noc.uftwf.local
  Normal   Pulling           2m23s                kubelet, noc.uftwf.local  Pulling image "rancher/klipper-helm:v0.1.5"
  Normal   Pulled            2m15s                kubelet, noc.uftwf.local  Successfully pulled image "rancher/klipper-helm:v0.1.5"
  Normal   Pulled            48s (x3 over 2m1s)   kubelet, noc.uftwf.local  Container image "rancher/klipper-helm:v0.1.5" already present on machine
  Normal   Created           48s (x4 over 2m14s)  kubelet, noc.uftwf.local  Created container helm
  Normal   Started           48s (x4 over 2m14s)  kubelet, noc.uftwf.local  Started container helm
  Warning  BackOff           12s (x6 over 103s)   kubelet, noc.uftwf.local  Back-off restarting failed container
# 

@erikwilson
Contributor

No worries, how about the log for the container?
k3s kubectl logs -n kube-system local-path-provisioner-58fb86bdfd-5v85x

@a1exus
Author

a1exus commented Nov 6, 2019

No worries, how about the log for the container?
k3s kubectl logs -n kube-system local-path-provisioner-58fb86bdfd-5v85x

# k3s kubectl logs -n kube-system local-path-provisioner-58fb86bdfd-fpjb9
time="2019-11-06T22:07:30Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get https://10.43.0.1:443/version?timeout=32s: dial tcp 10.43.0.1:443: connect: no route to host" 
# 

@erikwilson
Contributor

Is there a firewall turned on? What is the output of iptables-save?

@a1exus
Author

a1exus commented Nov 6, 2019

Is there a firewall turned on? What is the output of iptables-save?

# systemctl status iptables.service 
Unit iptables.service could not be found.
# systemctl status firewalld.service 
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-11-04 15:13:56 EST; 2 days ago
     Docs: man:firewalld(1)
 Main PID: 22769 (firewalld)
    Tasks: 2 (limit: 26213)
   Memory: 28.1M
   CGroup: /system.slice/firewalld.service
           └─22769 /usr/libexec/platform-python -s /usr/sbin/firewalld --nofork --nopid

Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete FORWARD --out-interface virbr0 --jump REJECT' failed: iptables: Bad rule (does a ma>
Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete FORWARD --in-interface virbr0 --jump REJECT' failed: iptables: Bad rule (does a mat>
Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete INPUT --in-interface virbr0 --protocol udp --destination-port 53 --jump ACCEPT' fai>
Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete INPUT --in-interface virbr0 --protocol tcp --destination-port 53 --jump ACCEPT' fai>
Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete OUTPUT --out-interface virbr0 --protocol udp --destination-port 68 --jump ACCEPT' f>
Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete INPUT --in-interface virbr0 --protocol udp --destination-port 67 --jump ACCEPT' fai>
Nov 04 15:13:57 noc.uftwf.local firewalld[22769]: WARNING: COMMAND_FAILED: '/usr/sbin/iptables -w10 -w --table filter --delete INPUT --in-interface virbr0 --protocol tcp --destination-port 67 --jump ACCEPT' fai>
Nov 04 15:15:17 noc.uftwf.local firewalld[22769]: ERROR: INVALID_SERVICE: 9090/tcp
Nov 04 15:16:26 noc.uftwf.local firewalld[22769]: ERROR: INVALID_SERVICE: 9090/tcp
Nov 04 15:16:36 noc.uftwf.local firewalld[22769]: WARNING: NOT_ENABLED: '9090:tcp' not in 'public'
# 

iptables-save:

# iptables-save
# Generated by xtables-save v1.8.2 on Wed Nov  6 17:21:16 2019
*filter
:INPUT ACCEPT [11506090:3333494189]
:FORWARD ACCEPT [143634:6529376]
:OUTPUT ACCEPT [11279288:2269388874]
-A INPUT -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A INPUT -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A INPUT -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A OUTPUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
COMMIT
# Completed on Wed Nov  6 17:21:16 2019
# Generated by xtables-save v1.8.2 on Wed Nov  6 17:21:16 2019
*security
:INPUT ACCEPT [11258722:3317933603]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [11279299:2269402367]
COMMIT
# Completed on Wed Nov  6 17:21:16 2019
# Generated by xtables-save v1.8.2 on Wed Nov  6 17:21:16 2019
*raw
:PREROUTING ACCEPT [11704694:3344391070]
:OUTPUT ACCEPT [11279299:2269402367]
COMMIT
# Completed on Wed Nov  6 17:21:16 2019
# Generated by xtables-save v1.8.2 on Wed Nov  6 17:21:16 2019
*mangle
:PREROUTING ACCEPT [11704694:3344391070]
:INPUT ACCEPT [11506104:3333496006]
:FORWARD ACCEPT [198588:10894939]
:OUTPUT ACCEPT [11279297:2269401231]
:POSTROUTING ACCEPT [11279637:2269461842]
-A POSTROUTING -o virbr0 -p udp -m udp --dport 68 -j CHECKSUM --checksum-fill
COMMIT
# Completed on Wed Nov  6 17:21:16 2019
# Generated by xtables-save v1.8.2 on Wed Nov  6 17:21:16 2019
*nat
:PREROUTING ACCEPT [238187:13763445]
:INPUT ACCEPT [13851:683254]
:POSTROUTING ACCEPT [50516:3062865]
:OUTPUT ACCEPT [50516:3062865]
-A POSTROUTING -s 192.168.122.0/24 -d 224.0.0.0/24 -j RETURN
-A POSTROUTING -s 192.168.122.0/24 -d 255.255.255.255/32 -j RETURN
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A POSTROUTING -s 192.168.122.0/24 ! -d 192.168.122.0/24 -j MASQUERADE
COMMIT
# Completed on Wed Nov  6 17:21:16 2019
# Table `firewalld' is incompatible, use 'nft' tool.
# Warning: iptables-legacy tables present, use iptables-legacy-save to see them
# 

@erikwilson
Contributor

It looks like iptables needs to be changed to legacy mode.
The firewall may still cause problems after that change, though.
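
For reference, a rough sketch of checking which backend is in use and, on distributions that package both backends (e.g. Debian/Ubuntu), switching to legacy mode; stock CentOS 8 may not provide a legacy backend at all, in which case adjusting firewalld instead (see further down this thread) is the more practical route:

# Check the backend: "(nf_tables)" in the version string means nftables mode
iptables --version

# On distros that ship both backends, select legacy mode (not available on stock CentOS 8)
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy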

@LukeShortCloud

@erikwilson
Contributor

Sorry? This doesn't seem like a problem that k3s can easily fix. The root of the problem is with iptables and how they decided to implement the upgrade. If Kubernetes/containerization has issues with nftables support, you would think that RHEL would provide legacy support; maybe you should file an issue there?
Any container which uses legacy iptables will cause problems if the host has switched to nftables.

@MikeCongdon1

@erikwilson etc., is that going to be the answer for my issue as well? #1159 (I'm on CentOS as well and getting a CrashLoopBackOff)

@erikwilson
Contributor

Not sure @MikeCongdon1. Could you check the logs of the failing containers and use this thread to help determine the root issue?

@ethanmdavidson

I'm seeing the same issue with CentOS 8 and k3s v1.0. It seems like the only way to get it working is to disable/stop firewalld (systemctl stop firewalld), but that's not really a solution.
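
If stopping firewalld is not acceptable, one alternative worth trying is to keep firewalld running but trust the pod and service networks. A minimal sketch assuming the default k3s CIDRs (not verified in this thread):

# Trust the default k3s pod (10.42.0.0/16) and service (10.43.0.0/16) networks
sudo firewall-cmd --permanent --zone=trusted --add-source=10.42.0.0/16
sudo firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16
sudo firewall-cmd --reload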

@lerit

lerit commented Feb 2, 2020

Same issue.
centos-release-7-7.1908.0.el7.centos.x86_64
Docker version 19.03.5, build 633a0ea
k3s version v1.17.0+k3s.1 (0f64465)

This is a fresh CentOS install with Docker, Rancher, and k3s.
When importing the k3s cluster into Rancher, the cluster state stays at pending ("Waiting for API to be available").
The related cattle container logs show:

time="2020-02-02T04:26:20Z" level=fatal msg="Get https://10.43.0.1:443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions: dial tcp 10.43.0.1:443: connect: no route to host

Disabling firewalld resolves it, but is that really a solution?

@Lohann

Lohann commented Feb 25, 2020

+1

Same issue here on CentOS 7; disabling firewalld isn't an option for me.

Environment:

  • Operating System: CentOS Linux 7 (Core)
  • Kernel: Linux 3.10.0-1062.12.1.el7.x86_64
  • Docker: Docker version 19.03.6, build 369ce74a3c
  • k3s: k3s version v1.17.2+k3s1 (cdab19b)

Start script:
k3s server --docker

Output of docker ps -a

CONTAINER ID   IMAGE              CREATED         STATUS                     NAMES
085e91faa2b5   9d12f9848b99       2 seconds ago   Exited (1) 1 second ago    k8s_local-path-provisioner_local-path-provisioner-58fb86bdfd-fkmg7_kube-system_c85a2dd6-ab79-4738-82fe-da5af08cf642_117
523d4cefc295   274808e7f6b8       26 seconds ago  Exited (1) 22 seconds ago  k8s_helm_helm-install-traefik-t6q9n_kube-system_b7f439fb-3445-4710-9928-5ecb3bf92ef5_116
6a2e05bcffdb   9dd718864ce6       3 minutes ago   Exited (2) 3 minutes ago   k8s_metrics-server_metrics-server-6d684c7b5-5nrmc_kube-system_06df24a4-dc2a-44f2-887e-408d10ae1092_116
2ffd3622d1ff   c4d3d16fe508       13 minutes ago  Up 13 minutes              k8s_coredns_coredns-d798c9dd-jnhns_kube-system_2fc16556-44c6-4613-97e3-f3aa6ec4dc79_0
0e0a7cb164ab   rancher/pause:3.1  13 minutes ago  Up 13 minutes              k8s_POD_sh-6fff558d56-5vzcj_default_7cc67992-8f2d-4b3c-9c69-e5ca55a74047_0
308e8391a06f   rancher/pause:3.1  13 minutes ago  Up 13 minutes              k8s_POD_metrics-server-6d684c7b5-5nrmc_kube-system_06df24a4-dc2a-44f2-887e-408d10ae1092_0
9e6ca60ffdca   rancher/pause:3.1  13 minutes ago  Up 13 minutes              k8s_POD_local-path-provisioner-58fb86bdfd-fkmg7_kube-system_c85a2dd6-ab79-4738-82fe-da5af08cf642_0
3f1fe38bd5b7   rancher/pause:3.1  13 minutes ago  Up 13 minutes              k8s_POD_coredns-d798c9dd-jnhns_kube-system_2fc16556-44c6-4613-97e3-f3aa6ec4dc79_0
af2de8b7e9da   rancher/pause:3.1  13 minutes ago  Up 13 minutes              k8s_POD_helm-install-traefik-t6q9n_kube-system_b7f439fb-3445-4710-9928-5ecb3bf92ef5_0

Container outputs:

k8s_metrics-server_metrics-server:

I0225 14:24:15.766726       1 serving.go:312] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
Error: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication: dial tcp 10.43.0.1:443: connect: no route to host

k8s_local-path-provisioner_local-path-provisioner:

time="2020-02-25T14:22:35Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get https://10.43.0.1:443/version?timeout=32s: dial tcp 10.43.0.1:443: connect: no route to host"

k8s_helm_helm-install-traefik-t6q9n_kube-system:

[storage/driver] 2020/02/25 14:27:29 list: failed to list: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/secrets?labelSelector=OWNER%3DTILLER: dial tcp 10.43.0.1:443: connect: no route to host
Error: Get https://10.43.0.1:443/api/v1/namespaces/kube-system/secrets?labelSelector=OWNER%!D(MISSING)TILLER: dial tcp 10.43.0.1:443: connect: no route to host
+ EXIST=

@Lohann

Lohann commented Mar 1, 2020

I finally got k3s working on CentOS 7 and Docker; it wasn't working even after replacing firewalld with iptables.

Facts:

So I just added the following rule to iptables to fix it:

iptables -A INPUT -s 10.42.0.0/16 -d <host_internal_ip>/32 -j ACCEPT

Where 10.42.0.0/16 is the default k3s pod network CIDR; change it if you set a different value with the --cluster-cidr flag.
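
If you are unsure which pod CIDR your cluster actually uses, something like this should confirm it (a sketch assuming a single-node k3s install):

# Print the pod CIDR assigned to the first node; compare with the 10.42.0.0/16 default
k3s kubectl get nodes -o jsonpath='{.items[0].spec.podCIDR}'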

The step-by-step solution:

# Stop and remove k3s
k3s-killall.sh
k3s-uninstall.sh

# Stop and remove all k3s containers
docker stop $(docker ps -a -q --filter "name=k8s_") | xargs docker rm

Replace firewalld with iptables:

systemctl stop firewalld
systemctl disable firewalld
yum install iptables-services
systemctl start iptables
systemctl enable iptables

Configure iptables rules; my setup is based on this article:
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-basic-iptables-firewall-on-centos-6

# Erase iptables rules
iptables -F

# Block the most common attacks
iptables -A INPUT -p tcp --tcp-flags ALL NONE -j DROP
iptables -A INPUT -p tcp ! --syn -m state --state NEW -j DROP
iptables -A INPUT -p tcp --tcp-flags ALL ALL -j DROP

# Enable outgoing connections
iptables -I INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -i lo -j ACCEPT
iptables -P OUTPUT ACCEPT

# Open Traefik HTTP and HTTPS ports
iptables -A INPUT -p tcp -m tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 443 -j ACCEPT

# Open SSH port
iptables -A INPUT -p tcp -m tcp --dport 22 -j ACCEPT

# Block everything else
iptables -P INPUT DROP

Here is where the magic happens: enable connections from k3s pods to your host's internal IP:

iptables -A INPUT -s 10.42.0.0/16 -d <host_internal_ip>/32 -j ACCEPT

Save changes and restart iptables

iptables-save | sudo tee /etc/sysconfig/iptables
service iptables restart

Install and run k3s

curl -sfL https://get.k3s.io | sh -s - --docker
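
After reinstalling, a quick sanity check (not part of the original steps) is to watch the kube-system pods until the helm job completes and the provisioner stays Running:

# Watch the bundled deployments; Ctrl-C once everything is Running or Completed
k3s kubectl get pods -n kube-system -w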

@stevef1uk

I am seeing this error on a k3s install on an RPi cluster using Raspbian:

The logs of the process:

kubectl logs local-path-provisioner-58b55cb6b6-fhn4k -n local-path-storage -f
time="2020-03-03T09:33:20Z" level=debug msg="Applied config: {"nodePathMap":[{"node":"DEFAULT_PATH_FOR_NON_LISTED_NODES","paths":["/opt/local-path-provisioner"]}]}"
time="2020-03-03T09:33:20Z" level=debug msg="Provisioner started"
I0303 09:33:20.827795 1 leaderelection.go:242] attempting to acquire leader lease local-path-storage/rancher.io-local-path...
I0303 09:33:47.163803 1 leaderelection.go:252] successfully acquired lease local-path-storage/rancher.io-local-path
I0303 09:33:47.164705 1 controller.go:773] Starting provisioner controller rancher.io/local-path_local-path-provisioner-58b55cb6b6-fhn4k_7a01cc7c-f9b5-4015-b401-efeb8a1e86a7!
I0303 09:33:47.166570 1 event.go:281] Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"local-path-storage", Name:"rancher.io-local-path", UID:"02dc6406-3334-4c75-a9c2-c388373e6403", APIVersion:"v1", ResourceVersion:"46969", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' local-path-provisioner-58b55cb6b6-fhn4k_7a01cc7c-f9b5-4015-b401-efeb8a1e86a7 became leader
I0303 09:33:47.265028 1 controller.go:822] Started provisioner controller rancher.io/local-path_local-path-provisioner-58b55cb6b6-fhn4k_7a01cc7c-f9b5-4015-b401-efeb8a1e86a7!
I0303 09:39:12.199569 1 leaderelection.go:288] failed to renew lease local-path-storage/rancher.io-local-path: failed to tryAcquireOrRenew context deadline exceeded
F0303 09:39:12.199642 1 controller.go:851] leaderelection lost

@darth-veitcher

darth-veitcher commented Mar 25, 2020

For what it's worth, I wrote a ufw application profile for this; I didn't feel like turning the firewall off was a solution. It turned out my pods needed 6443 (TCP) and 8472 (UDP), as well as access to 443 (TCP) on the host.

# file: /etc/ufw/applications.d/k3s
# Used to configure the ufw definition for Lightweight Kubernetes (k3s)

[k3s]
title=Lightweight Kubernetes (k3s)
description=K3s is a highly available, certified Kubernetes distribution designed for production workloads in unattended, resource-constrained, remote locations or inside IoT appliances.
ports=443,6443/tcp|8472/udp

To use this, make sure you've identified your CIDR, then run the following (after creating the file above):

sudo ufw app update k3s
sudo ufw allow from $CIDR to any app k3s

This will allow access within the cluster. Seems to have fixed my problems, well at least some of them... 😉
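
To double-check that ufw picked up the profile and applied the rules, something like the following should work (a minimal sketch):

# Show the parsed k3s profile and the active ruleset
sudo ufw app info k3s
sudo ufw status verbose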

@danielo515

$CIDR

What is the $CIDR var supposed to contain?

@danielo515

I just used 10.0.0.0/8

@agilob

agilob commented Sep 15, 2020

I'm running k3s on clean Raspbian with no firewall configuration at all (iptables isn't even installed), and local-path-provisioner keeps crashing anyway:

k logs -n kube-system local-path-provisioner-6d59f47c7-zpcvp
time="2020-09-15T10:38:07Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get https://10.43.0.1:443/version?timeout=32s: dial tcp 10.43.0.1:443: connect: connection refused"
pi@rpi3:~ $ k3s -version
k3s version v1.18.8+k3s1 (6b595318)

What are my options?

@brandond
Member

brandond commented Mar 8, 2021

@agilob

I'm running k3s on clean Raspbian, no firewall config at all,

Do you mean that you haven't configured it? Or that it's turned off? You should confirm that it is indeed disabled.
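
A few quick checks usually settle that question (a sketch; not every tool will be installed on a given Raspbian image):

# Report whether common firewall services are running
systemctl is-active ufw firewalld nftables 2>/dev/null

# Dump any rules that are loaded anyway
sudo iptables-save 2>/dev/null | head -n 20
sudo nft list ruleset 2>/dev/null | head -n 20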

@criscola

criscola commented Mar 16, 2021

I'm having the same problem with a Raspberry Pi 4. I tried a couple of times with a fresh install, but I'm still getting the error each time:

$ k3s kubectl logs local-path-provisioner-5ff76fc89d-7g5qk -n kube-system
time="2021-03-16T14:57:18Z" level=fatal msg="Error starting daemon: Cannot start Provisioner: failed to get Kubernetes server version: Get https://10.43.0.1:443/version?timeout=32s: dial tcp 10.43.0.1:443: connect: connection refused" 

OS is Raspberry Pi OS 64bit

PS: If I execute the kill-all script and reboot the server with k3s server start, it starts working again, but this is inconvenient.

@brandond
Member

@agilob and @criscola both of you appear to be on Raspberry Pi, whereas this issue appears to be reported against CentOS on amd64. I suggest one or both of you create new issues to track your problem; pods crash-looping is a generic symptom of many potential problems and we can't troubleshoot them all in the same place.

@wjx8382

wjx8382 commented Mar 17, 2021

I had the same problem just now and resolved it by turning off SELinux.
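
Before disabling SELinux permanently, it may be worth confirming it is actually the cause. A minimal sketch:

# Check the current mode (Enforcing / Permissive / Disabled)
getenforce

# Temporarily switch to permissive; this reverts on reboot
sudo setenforce 0

If that fixes the pods, the k3s-selinux policy package may be a less drastic option than leaving SELinux disabled.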

@magiccaptain

CentOS 7 with the firewall enabled; this worked for me:

#1556 (comment)

@remkolems

remkolems commented Jun 27, 2021

Confirmed on Ubuntu 20.04.2 LTS: Error and CrashLoopBackOff.

k3s -v
k3s version v1.21.2+k3s1 (5a67e8d)
go version go1.16.4

Solution

  1. Disable IPv6 (a sketch for persisting these settings follows after this list)
    sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1 && sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1 && sudo sysctl -w net.ipv6.conf.lo.disable_ipv6=1
  2. Disable the firewall (I do have a dedicated IDS/IPS firewall within my network)
    sudo ufw disable
  3. Reboot
    sudo reboot
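
If the IPv6 route is the one you go with, the sysctls from step 1 can be persisted across reboots roughly like this (a sketch assuming systemd reads /etc/sysctl.d):

# Persist the IPv6 sysctls so step 1 survives the reboot in step 3
cat <<'EOF' | sudo tee /etc/sysctl.d/99-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF
sudo sysctl --system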

@stale

stale bot commented Dec 24, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Dec 24, 2021
@stale stale bot closed this as completed Jan 7, 2022
@cguertin14

[quotes @Lohann's CentOS 7 step-by-step solution from earlier in this thread, in full]

This doesn't seem to be working on Ubuntu 22.04. Any ideas on how to proceed?
