K3s Install on Raspberry Pi 4b failed (TLS Handshake Timeout pi3, pi4, etc) #970

Closed
gm12367 opened this issue Oct 27, 2019 · 58 comments

@gm12367

gm12367 commented Oct 27, 2019

Thanks for helping us to improve k3s! We welcome all bug reports. Please fill out each area of the template so we can better help you. You can delete this message portion of the bug report.

Version:
Provide the output from k3s -v and provide the flags used to install or run k3s server.

root@raspberrypi:/home/pi# k3s -v
k3s version v0.10.0 (f9888ca3)

OS version:
Linux raspberrypi 4.19.75-v7l+ #1270 SMP Tue Sep 24 18:51:41 BST 2019 armv7l
bootloader version:

root@raspberrypi:~# vcgencmd bootloader_version
Sep 10 2019 10:41:50
version f626c772b15ba1b7e0532a8d50a761b3ccbdf3bb (release)
timestamp 1568112110

Describe the bug
A clear and concise description of what the bug is.
After running the install command "curl -sfL https://get.k3s.io | sh -", the installation does not complete and k3s fails with a TLS handshake timeout error.

To Reproduce
Steps to reproduce the behavior:
Run the command 'curl -sfL https://get.k3s.io | sh -' on a Raspberry Pi 4B with 4 GB of memory.

Expected behavior
A clear and concise description of what you expected to happen.

Actual behavior
A clear and concise description of what actually happened.
TLS handshake timeout

Additional context
Add any other context about the problem here.
I have included some error logs below; I hope they help:

root@raspberrypi:/home/pi# journalctl -u k3s.service
-- Logs begin at Thu 2019-09-26 01:24:23 BST, end at Sun 2019-10-27 01:22:17 GMT. --
Oct 27 01:19:58 raspberrypi systemd[1]: Starting Lightweight Kubernetes...
Oct 27 01:19:58 raspberrypi k3s[3688]: time="2019-10-27T01:19:58Z" level=info msg="Preparing data dir /var/lib/rancher/k3s/data/3f43b16ca97dbb7ba58868cdb2137a72ad7215762a2852ed944237bf45d44f07"
Oct 27 01:20:13 raspberrypi k3s[3688]: time="2019-10-27T01:20:13.437098936Z" level=info msg="Starting k3s v0.10.0 (f9888ca3)"
Oct 27 01:20:13 raspberrypi k3s[3688]: time="2019-10-27T01:20:13.945042885Z" level=info msg="Kine listening on unix://kine.sock"
Oct 27 01:20:13 raspberrypi k3s[3688]: time="2019-10-27T01:20:13.947965657Z" level=info msg="Fetching bootstrap data from etcd"
Oct 27 01:20:15 raspberrypi k3s[3688]: time="2019-10-27T01:20:15.186636567Z" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --basic-auth-file=/var/lib
Oct 27 01:20:15 raspberrypi k3s[3688]: Flag --basic-auth-file has been deprecated, Basic authentication mode is deprecated and will be removed in a future release. It is not recommended for production environments.
Oct 27 01:20:15 raspberrypi k3s[3688]: I1027 01:20:15.189751    3688 server.go:650] external host was not specified, using 192.168.199.80
Oct 27 01:20:15 raspberrypi k3s[3688]: I1027 01:20:15.191063    3688 server.go:162] Version: v1.16.2-k3s.1
Oct 27 01:20:19 raspberrypi k3s[3688]: I1027 01:20:19.782703    3688 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultT
Oct 27 01:20:19 raspberrypi k3s[3688]: I1027 01:20:19.782801    3688 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeCl
Oct 27 01:20:19 raspberrypi k3s[3688]: I1027 01:20:19.785373    3688 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultT
Oct 27 01:20:19 raspberrypi k3s[3688]: I1027 01:20:19.785425    3688 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeCl
Oct 27 01:20:19 raspberrypi k3s[3688]: I1027 01:20:19.856982    3688 master.go:259] Using reconciler: lease
Oct 27 01:20:19 raspberrypi k3s[3688]: I1027 01:20:19.966350    3688 rest.go:115] the default service ipfamily for this cluster is: IPv4
Oct 27 01:20:20 raspberrypi k3s[3688]: W1027 01:20:20.788011    3688 genericapiserver.go:404] Skipping API batch/v2alpha1 because it has no resources.
Oct 27 01:20:20 raspberrypi k3s[3688]: W1027 01:20:20.853703    3688 genericapiserver.go:404] Skipping API node.k8s.io/v1alpha1 because it has no resources.
Oct 27 01:20:20 raspberrypi k3s[3688]: W1027 01:20:20.919549    3688 genericapiserver.go:404] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
Oct 27 01:20:20 raspberrypi k3s[3688]: W1027 01:20:20.931880    3688 genericapiserver.go:404] Skipping API scheduling.k8s.io/v1alpha1 because it has no resources.
Oct 27 01:20:20 raspberrypi k3s[3688]: W1027 01:20:20.973747    3688 genericapiserver.go:404] Skipping API storage.k8s.io/v1alpha1 because it has no resources.
Oct 27 01:20:21 raspberrypi k3s[3688]: W1027 01:20:21.043638    3688 genericapiserver.go:404] Skipping API apps/v1beta2 because it has no resources.
Oct 27 01:20:21 raspberrypi k3s[3688]: W1027 01:20:21.043695    3688 genericapiserver.go:404] Skipping API apps/v1beta1 because it has no resources.
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.078307    3688 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultT
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.078434    3688 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeCl
Oct 27 01:20:21 raspberrypi k3s[3688]: time="2019-10-27T01:20:21.096613858Z" level=info msg="Running kube-scheduler --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --leader-elect=false --port=10251 --secure-port=0"
Oct 27 01:20:21 raspberrypi k3s[3688]: time="2019-10-27T01:20:21.098945424Z" level=info msg="Running kube-controller-manager --allocate-node-cidrs=true --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-cert-file=/var/lib/rancher/k3s/server/tls/s
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.119387    3688 controllermanager.go:161] Version: v1.16.2-k3s.1
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.121660    3688 deprecated_insecure_serving.go:53] Serving insecurely on [::]:10252
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.127479    3688 server.go:143] Version: v1.16.2-k3s.1
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.127709    3688 defaults.go:91] TaintNodesByCondition is enabled, PodToleratesNodeTaints predicate is mandatory
Oct 27 01:20:21 raspberrypi k3s[3688]: W1027 01:20:21.139439    3688 authorization.go:47] Authorization is disabled
Oct 27 01:20:21 raspberrypi k3s[3688]: W1027 01:20:21.139494    3688 authentication.go:79] Authentication is disabled
Oct 27 01:20:21 raspberrypi k3s[3688]: I1027 01:20:21.139527    3688 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
Oct 27 01:20:31 raspberrypi k3s[3688]: time="2019-10-27T01:20:31.111017958Z" level=fatal msg="starting tls server: Get https://127.0.0.1:6444/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions: net/http: TLS handshake timeout"
Oct 27 01:20:31 raspberrypi systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Oct 27 01:20:31 raspberrypi systemd[1]: k3s.service: Failed with result 'exit-code'.
Oct 27 01:20:31 raspberrypi systemd[1]: Failed to start Lightweight Kubernetes.
Oct 27 01:20:36 raspberrypi systemd[1]: k3s.service: Service RestartSec=5s expired, scheduling restart.
Oct 27 01:20:36 raspberrypi systemd[1]: k3s.service: Scheduled restart job, restart counter is at 1.
Oct 27 01:20:36 raspberrypi systemd[1]: Stopped Lightweight Kubernetes.
Oct 27 01:20:36 raspberrypi systemd[1]: Starting Lightweight Kubernetes...
@kaihendry

I upgraded yesterday with curl -sfL https://get.k3s.io | sh - and I think I see the same issue. The logs are pretty intense: https://s.natalian.org/2019-10-28/k3s.txt

@kaihendry

The workaround is to downgrade with curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v0.9.1 sh -, thanks to https://twitter.com/ibuildthecloud/status/1188640874642563072

@mordredz

I have the same problem on a Raspberry Pi 3B+ with k3s v0.10.0, but v0.9.1 works.

@dan-mcm

dan-mcm commented Oct 28, 2019

Same as above ^ This was my first attempt at setting up; v0.10.0 bugged out, and downgrading to 0.9.1 worked 👍

@m0wlheld

Same for Raspberry Pi 3 / 3B with v0.10.1, but 0.9.1 works. Somebody please adjust the issue's title: "K3S Install on Raspberry Pi fails since v0.10.0"

@m0wlheld

Related to #869? Spotted the same error message there.

@erikwilson
Contributor

And #556 as already linked here.

Haven't really been able to find a reproducible case.
Does cat /proc/sys/kernel/random/entropy_avail show sufficient entropy?
Would not be surprised if it is some golang arm issue, if possible might be worth trying out a 64-bit OS.
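
The entropy check asked for above can be scripted so that it is easy to repeat across nodes. This is only a minimal sketch: the function name entropy_status is hypothetical, and the default threshold of 1000 is an arbitrary value chosen for illustration, not a k3s requirement.

```shell
#!/bin/sh
# Report whether the kernel entropy pool looks healthy.
# $1: path to the entropy counter (defaults to the standard procfs file).
# $2: minimum acceptable value (1000 is an illustrative assumption).
entropy_status() {
    file="${1:-/proc/sys/kernel/random/entropy_avail}"
    min="${2:-1000}"
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 2
    fi
    avail="$(cat "$file")"
    if [ "$avail" -ge "$min" ]; then
        echo "ok ($avail)"
    else
        echo "low ($avail)"
        return 1
    fi
}
```

Called with no arguments, entropy_status reads the live counter on the machine it runs on.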

@m0wlheld

cat /proc/sys/kernel/random/entropy_avail returns 3233, but the RPi3 architecture is not 64-bit AFAIK. The OS is Raspbian:

lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 9.11 (stretch)
Release:        9.11
Codename:       stretch

@erikwilson
Contributor

@gocursor

gocursor commented Oct 30, 2019

Same issue with k3s version v0.10.1 (7d650d3) on Intel/Amd64. (Manually copied k3s v0.10.1 from releases into VirtualBox VM with Ubuntu 18.04.3) The exact error message is:

FATA[2019-10-30T16:52:21.768049354+01:00] starting tls server: Get https://127.0.0.1:6444/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions: net/http: TLS handshake timeout

k3s v0.9.1 starts on the same VM without this error.

@cjellick
Contributor

@gocursor your problem could be a general networking problem, maybe not directly related to this arm issue.

@gm12367
Author

gm12367 commented Oct 31, 2019

@erikwilson I re-imaged my Raspberry Pi 4 with 64-bit Ubuntu 19.10 and tried again; the result is the same as before: "TLS handshake timeout".

Below is the information you will probably need:

K3s version:

root@ubuntu:~# k3s -version
k3s version v0.10.1 (7d650d32)

OS version:

root@ubuntu:~# uname -a
Linux ubuntu 5.3.0-1008-raspi2 #9-Ubuntu SMP Fri Oct 18 13:26:35 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

Distribution version:

root@ubuntu:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 19.10
Release:        19.10
Codename:       eoan

entropy_avail:

root@ubuntu:~# cat /proc/sys/kernel/random/entropy_avail
401

Error Logs:

root@ubuntu:~# journalctl -u k3s.service
-- Logs begin at Thu 2019-04-11 16:28:37 UTC, end at Thu 2019-10-31 13:18:39 UTC. --
Oct 31 13:04:45 ubuntu systemd[1]: Starting Lightweight Kubernetes...
Oct 31 13:04:45 ubuntu k3s[1884]: time="2019-10-31T13:04:45Z" level=info msg="Preparing data dir /var/lib/rancher/k3s/data/11f1b1f5f9884701e429998dc51d3b6df601985460dc405a0ad74bd87c99d1ea"
Oct 31 13:04:51 ubuntu k3s[1884]: time="2019-10-31T13:04:51.893934370Z" level=info msg="Starting k3s v0.10.1 (7d650d32)"
Oct 31 13:04:51 ubuntu k3s[1884]: time="2019-10-31T13:04:51.996804834Z" level=info msg="Kine listening on unix://kine.sock"
Oct 31 13:04:51 ubuntu k3s[1884]: time="2019-10-31T13:04:51.998663740Z" level=info msg="Fetching bootstrap data from etcd"
Oct 31 13:04:54 ubuntu k3s[1884]: time="2019-10-31T13:04:54.241480500Z" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --basic-auth-file=/var/lib/ranc
Oct 31 13:04:54 ubuntu k3s[1884]: Flag --basic-auth-file has been deprecated, Basic authentication mode is deprecated and will be removed in a future release. It is not recommended for production environments.
Oct 31 13:04:54 ubuntu k3s[1884]: I1031 13:04:54.244334    1884 server.go:650] external host was not specified, using 192.168.199.79
Oct 31 13:04:54 ubuntu k3s[1884]: I1031 13:04:54.245661    1884 server.go:162] Version: v1.16.2-k3s.1
Oct 31 13:05:00 ubuntu k3s[1884]: I1031 13:05:00.630222    1884 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolera
Oct 31 13:05:00 ubuntu k3s[1884]: I1031 13:05:00.630305    1884 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeClass,R
Oct 31 13:05:00 ubuntu k3s[1884]: I1031 13:05:00.632921    1884 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolera
Oct 31 13:05:00 ubuntu k3s[1884]: I1031 13:05:00.632981    1884 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeClass,R
Oct 31 13:05:00 ubuntu k3s[1884]: I1031 13:05:00.702617    1884 master.go:259] Using reconciler: lease
Oct 31 13:05:00 ubuntu k3s[1884]: I1031 13:05:00.828329    1884 rest.go:115] the default service ipfamily for this cluster is: IPv4
Oct 31 13:05:02 ubuntu k3s[1884]: W1031 13:05:02.720128    1884 genericapiserver.go:404] Skipping API batch/v2alpha1 because it has no resources.
Oct 31 13:05:02 ubuntu k3s[1884]: W1031 13:05:02.842611    1884 genericapiserver.go:404] Skipping API node.k8s.io/v1alpha1 because it has no resources.
Oct 31 13:05:02 ubuntu k3s[1884]: W1031 13:05:02.966471    1884 genericapiserver.go:404] Skipping API rbac.authorization.k8s.io/v1alpha1 because it has no resources.
Oct 31 13:05:02 ubuntu k3s[1884]: W1031 13:05:02.989855    1884 genericapiserver.go:404] Skipping API scheduling.k8s.io/v1alpha1 because it has no resources.
Oct 31 13:05:03 ubuntu k3s[1884]: W1031 13:05:03.066864    1884 genericapiserver.go:404] Skipping API storage.k8s.io/v1alpha1 because it has no resources.
Oct 31 13:05:03 ubuntu k3s[1884]: W1031 13:05:03.195220    1884 genericapiserver.go:404] Skipping API apps/v1beta2 because it has no resources.
Oct 31 13:05:03 ubuntu k3s[1884]: W1031 13:05:03.195351    1884 genericapiserver.go:404] Skipping API apps/v1beta1 because it has no resources.
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.259508    1884 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolera
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.259642    1884 plugins.go:161] Loaded 7 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,RuntimeClass,R
Oct 31 13:05:03 ubuntu k3s[1884]: time="2019-10-31T13:05:03.289835152Z" level=info msg="Running kube-scheduler --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --leader-elect=false --port=10251 --secure-port=0"
Oct 31 13:05:03 ubuntu k3s[1884]: time="2019-10-31T13:05:03.292533883Z" level=info msg="Running kube-controller-manager --allocate-node-cidrs=true --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-cert-file=/var/lib/rancher/k3s/server/tls/server
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.379460    1884 controllermanager.go:161] Version: v1.16.2-k3s.1
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.390324    1884 deprecated_insecure_serving.go:53] Serving insecurely on [::]:10252
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.401644    1884 server.go:143] Version: v1.16.2-k3s.1
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.403409    1884 defaults.go:91] TaintNodesByCondition is enabled, PodToleratesNodeTaints predicate is mandatory
Oct 31 13:05:03 ubuntu k3s[1884]: W1031 13:05:03.412646    1884 authorization.go:47] Authorization is disabled
Oct 31 13:05:03 ubuntu k3s[1884]: W1031 13:05:03.412819    1884 authentication.go:79] Authentication is disabled
Oct 31 13:05:03 ubuntu k3s[1884]: I1031 13:05:03.412886    1884 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
Oct 31 13:05:13 ubuntu k3s[1884]: time="2019-10-31T13:05:13.355859773Z" level=fatal msg="starting tls server: Get https://127.0.0.1:6444/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions: net/http: TLS handshake timeout"
Oct 31 13:05:13 ubuntu systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
Oct 31 13:05:13 ubuntu systemd[1]: k3s.service: Failed with result 'exit-code'.
Oct 31 13:05:13 ubuntu systemd[1]: Failed to start Lightweight Kubernetes.
Oct 31 13:05:18 ubuntu systemd[1]: k3s.service: Service RestartSec=5s expired, scheduling restart.
Oct 31 13:05:18 ubuntu systemd[1]: k3s.service: Scheduled restart job, restart counter is at 1.
Oct 31 13:05:18 ubuntu systemd[1]: Stopped Lightweight Kubernetes.
Oct 31 13:05:18 ubuntu systemd[1]: Starting Lightweight Kubernetes...
Oct 31 13:05:22 ubuntu k3s[1921]: time="2019-10-31T13:05:22.126716301Z" level=info msg="Starting k3s v0.10.1 (7d650d32)"
Oct 31 13:05:22 ubuntu k3s[1921]: time="2019-10-31T13:05:22.136877861Z" level=info msg="Kine listening on unix://kine.sock"
Oct 31 13:05:22 ubuntu k3s[1921]: time="2019-10-31T13:05:22.138038312Z" level=info msg="Fetching bootstrap data from etcd"
Oct 31 13:05:22 ubuntu k3s[1921]: time="2019-10-31T13:05:22.271186322Z" level=info msg="Running kube-apiserver --advertise-port=6443 --allow-privileged=true --anonymous-auth=false --api-audiences=unknown --authorization-mode=Node,RBAC --basic-auth-file=/var/lib/ranc
Oct 31 13:05:22 ubuntu k3s[1921]: Flag --basic-auth-file has been deprecated, Basic authentication mode is deprecated and will be removed in a future release. It is not recommended for production environments.
Oct 31 13:05:22 ubuntu k3s[1921]: I1031 13:05:22.273924    1921 server.go:650] external host was not specified, using 192.168.199.79

@sttts
Contributor

sttts commented Oct 31, 2019

Same on a recent https://github.com/hypriot.

@gm12367
Author

gm12367 commented Oct 31, 2019

I also tried installing the old version v0.9.1. The first attempt failed with a cgroup error:

Oct 31 14:09:12 ubuntu k3s[2377]: time="2019-10-31T14:09:12.021942176Z" level=error msg="Failed to find memory cgroup, you may need to add \"cgroup_memory=1 cgroup_enable=memory\" to your linux cmdline (/boot/cmdline.txt on a Raspberry Pi)"
Oct 31 14:09:12 ubuntu k3s[2377]: time="2019-10-31T14:09:12.022021433Z" level=fatal msg="failed to find memory cgroup, you may need to add \"cgroup_memory=1 cgroup_enable=memory\" to your linux cmdline (/boot/cmdline.txt on a Raspberry Pi)"

After adding the two cgroup options to the /boot/firmware/config.txt file and trying again, it succeeded.

root@ubuntu:~# kubectl get node
NAME     STATUS   ROLES    AGE     VERSION
ubuntu   Ready    master   3m55s   v1.15.4-k3s.1
root@ubuntu:~# kubectl get pod -A
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   local-path-provisioner-5b8648d6f6-7fgm5   1/1     Running     0          3m52s
kube-system   coredns-66f496764-cjg7q                   1/1     Running     0          3m52s
kube-system   helm-install-traefik-szt4n                0/1     Completed   0          3m52s
kube-system   svclb-traefik-9b7cv                       3/3     Running     0          51s
kube-system   traefik-d869575c8-4gf95                   1/1     Running     0          51s

After that, I tried upgrading K3s to the latest version, and it succeeded this time:

root@ubuntu:~# k3s -version
k3s version v0.9.1 (755bd1c6)
root@ubuntu:~# curl -sfL https://get.k3s.io | sh -
[INFO]  Finding latest release
[INFO]  Using v0.10.1 as release
[INFO]  Downloading hash https://github.com/rancher/k3s/releases/download/v0.10.1/sha256sum-arm64.txt
[INFO]  Downloading binary https://github.com/rancher/k3s/releases/download/v0.10.1/k3s-arm64
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/crictl symlink to k3s, already exists
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, already exists
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s
root@ubuntu:~# kubectl get node
NAME     STATUS   ROLES    AGE   VERSION
ubuntu   Ready    master   7m    v1.15.4-k3s.1
root@ubuntu:~# kubectl get pod -A
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   helm-install-traefik-szt4n                0/1     Completed   0          6m53s
kube-system   local-path-provisioner-5b8648d6f6-7fgm5   0/1     Error       0          6m53s
kube-system   coredns-66f496764-cjg7q                   1/1     Running     0          6m53s
kube-system   svclb-traefik-9b7cv                       3/3     Running     0          3m52s
kube-system   traefik-d869575c8-4gf95                   0/1     Running     0          3m52s
root@ubuntu:~# kubectl get node
NAME     STATUS   ROLES    AGE     VERSION
ubuntu   Ready    master   7m14s   v1.16.2-k3s.1

root@ubuntu:~# kubectl get pod -A
NAMESPACE     NAME                                      READY   STATUS      RESTARTS   AGE
kube-system   helm-install-traefik-szt4n                0/1     Completed   0          7m40s
kube-system   local-path-provisioner-5b8648d6f6-7fgm5   1/1     Running     1          7m40s
kube-system   coredns-66f496764-cjg7q                   1/1     Running     1          7m40s
kube-system   traefik-d869575c8-4gf95                   1/1     Running     1          4m39s
kube-system   svclb-traefik-vq8nb                       3/3     Running     0          32s
root@ubuntu:~# uname -a
Linux ubuntu 5.3.0-1008-raspi2 #9-Ubuntu SMP Fri Oct 18 13:26:35 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux
root@ubuntu:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 19.10
Release:        19.10
Codename:       eoan
root@ubuntu:~# k3s -version
k3s version v0.10.1 (7d650d32)

If I have time, I will try installing the latest version of K3s directly, with the two cgroup options set, on a fresh Ubuntu 19.10 install. At least I can now run the latest K3s on my Raspberry Pi 4.
For now, I still don't know whether the problem is a golang ARM issue or something else.

@erikwilson
Contributor

Thanks for testing & the data points @gm12367! Interesting, I would expect k3s v0.10.1 to error out with the same memory cgroup message as v0.9.1.

401 bits of entropy is pretty low. I would expect a crypto error rather than a handshake timeout, but if possible please try to reproduce with the haveged package installed.

@f2hex

f2hex commented Oct 31, 2019

Same on a Rock64 with Armbian:

...
I1031 20:14:08.702543    6977 controllermanager.go:161] Version: v1.16.2-k3s.1
I1031 20:14:08.707560    6977 deprecated_insecure_serving.go:53] Serving insecurely on [::]:10252
I1031 20:14:08.708128    6977 server.go:143] Version: v1.16.2-k3s.1
I1031 20:14:08.708814    6977 defaults.go:91] TaintNodesByCondition is enabled, PodToleratesNodeTaints predicate is mandatory
W1031 20:14:08.715137    6977 authorization.go:47] Authorization is disabled
W1031 20:14:08.715536    6977 authentication.go:79] Authentication is disabled
I1031 20:14:08.715755    6977 deprecated_insecure_serving.go:51] Serving healthz insecurely on [::]:10251
FATA[2019-10-31T20:14:18.691839727Z] starting tls server: Get https://127.0.0.1:6444/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions: net/http: TLS handshake timeout
root@gaia:~# k3s --version
k3s version v0.10.1 (7d650d32)
root@gaia:~# cat /proc/sys/kernel/random/entropy_avail
2564

root@gaia:~# uname -a
Linux gaia 4.4.192-rockchip64 #1 SMP Tue Oct 8 18:39:24 CEST 2019 aarch64 GNU/Linux

haveged is running by default on Armbian.

After downgrading to k3s version v0.9.1 it worked.

@drbugfinder

Same problem here on openSUSE 15.1 ARM64 (RPi3)

@zimme

zimme commented Nov 1, 2019

When I do a get request to the api server on the secure port I get the following output.

renegade [~]$ curl -v https://127.0.0.1:6444
* Expire in 0 ms for 6 (transfer 0x5591476360)
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x5591476360)
* Connected to 127.0.0.1 (127.0.0.1) port 6444 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:6444
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:6444

@gm12367
Author

gm12367 commented Nov 2, 2019

@erikwilson Today I found that k3s has been updated to v0.10.2, and it now installs successfully on Raspbian Buster; I don't know whether the new version includes a fix. I also tried v0.10.1 and that succeeded as well, so I don't know whether something else changed.
I checked /proc/sys/kernel/random/entropy_avail: on Raspbian it is always above 2000, but on Ubuntu it is quite low, sometimes even below 100. Even so, K3s installs successfully on Ubuntu once the cgroup options are added, so maybe this is not a crypto issue?

@m0wlheld

m0wlheld commented Nov 2, 2019

> @erikwilson Today I found that k3s has been updated to v0.10.2, and it now installs successfully on Raspbian Buster; I don't know whether the new version includes a fix. I also tried v0.10.1 and that succeeded as well, so I don't know whether something else changed.

Still no success with 0.10.2 on a RPi 3B+. Same TLS handshake timeout error as above. What "cgroup option" are you referring to?

@gm12367
Author

gm12367 commented Nov 2, 2019

@m0wlheld "cgroup_memory=1 cgroup_enable=memory", which I mentioned in my previous reply. You can add it to config.txt and try again.

@m0wlheld

m0wlheld commented Nov 2, 2019

@gm12367 Okay, I have that in my /boot/cmdline.txt (see below), still no success with any version > 0.9.1

dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 root=PARTUUID=a0df87db-02 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory
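
Since several reports in this thread depend on whether the cgroup options actually took effect, it can help to check the command line the running kernel booted with rather than the file that was edited. A minimal sketch, assuming a grep that supports -w (GNU, BSD and BusyBox grep do); the function name has_memory_cgroup_opts is hypothetical:

```shell
#!/bin/sh
# Check that both memory cgroup options named in the k3s error message
# are present on a kernel command line.
# $1: file holding the command line (defaults to /proc/cmdline, which
#     reflects what the running kernel was actually booted with).
has_memory_cgroup_opts() {
    file="${1:-/proc/cmdline}"
    missing=0
    for opt in cgroup_memory=1 cgroup_enable=memory; do
        # -F treats the option as a literal string; -w requires a whole-word match.
        if ! grep -qwF -- "$opt" "$file"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

With no argument the function inspects /proc/cmdline; passing /boot/cmdline.txt instead checks the file before a reboot. It prints nothing and returns 0 when both options are present.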

@mcanevet

mcanevet commented Nov 2, 2019

Same here. Couldn't get any v0.10.x working on rpi3b+ with raspbian up-to-date (even with cgroup_memory=1 cgroup_enable=memory)

@squishykid

Running v0.10.2 on an RPi 3B+, also with cgroup_memory=1 cgroup_enable=memory. I have the same issue with k3s exiting after the "TLS handshake timeout" message.

@gvanderberg

gvanderberg commented Nov 3, 2019

downgrading to k3s version 0.9.1 worked for me too.

Running on RPi 3B+ with OS:

Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 10 (buster)
Release:        10
Codename:       buster

The error I got on version 0.10.2 and 0.10.0 was starting tls server: Get https://127.0.0.1:6444/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions: net/http: TLS handshake timeout

@larmog
Contributor

larmog commented Nov 4, 2019

There's a race condition between starting the apiserver and waiting for the CRDs to be created. In pkg/server/context.go:41 the call to create the CRDs fails because of a timeout while waiting for the CRDs in pkg/server/context.go:69. The CRDs take time because the apiserver is not yet available. Adding a simple sleep (not a suggested solution) after pkg/daemons/control/server.go:89 seems to resolve the issue.
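
A fixed sleep only hides a race like this; the usual alternative is to poll for readiness with a bounded number of attempts. The sketch below shows that general pattern in shell rather than in the k3s Go code, and it is not the fix that k3s adopted. The function name wait_until and its arguments are hypothetical.

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempt budget is exhausted.
# $1: maximum number of attempts
# $2: delay in seconds between attempts
# remaining arguments: the readiness check to run
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, something like wait_until 30 2 kubectl get crd would keep retrying for roughly a minute before giving up, instead of failing on the first timeout.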

@xiaods
Contributor

xiaods commented Nov 10, 2019

$ k3s --version
k3s version v0.11.0-alpha2 (405f85a)

failed on RPI3.

INFO[2019-11-10T12:46:55.473870979Z] Done waiting for CRD helmcharts.helm.cattle.io to become available 
FATA[2019-11-10T12:46:55.476566942Z] starting tls server: timed out waiting for the condition 

@pierremahot

> $ k3s --version
> k3s version v0.11.0-alpha2 (405f85a)
>
> failed on RPI3.
>
> INFO[2019-11-10T12:46:55.473870979Z] Done waiting for CRD helmcharts.helm.cattle.io to become available
> FATA[2019-11-10T12:46:55.476566942Z] starting tls server: timed out waiting for the condition

@xiaods I have installed rng-tools and set swapaccount=1, as @zimme commented.
That may make the difference, because it is working for me:

# k3s --version
k3s version v0.11.0-alpha2 (405f85aa)
# kubectl get pod -A
NAMESPACE      NAME                                      READY   STATUS      RESTARTS   AGE
kube-system    metrics-server-6d684c7b5-sjh44            1/1     Running     0          132m
kube-system    local-path-provisioner-58fb86bdfd-f4cjr   1/1     Running     0          132m
kube-system    coredns-d798c9dd-8wj8x                    1/1     Running     0          132m
kube-system    helm-install-traefik-pwp9g                0/1     Completed   0          132m
kube-system    svclb-traefik-h7tcv                       3/3     Running     0          131m
kube-system    traefik-65bccdc4bd-vt9hd                  1/1     Running     0          131m
cert-manager   cert-manager-687f47b874-x4jk5             1/1     Running     0          124m
cert-manager   cert-manager-cainjector-f44b4b959-h27xh   1/1     Running     0          124m
cert-manager   cert-manager-webhook-7f8bdb755f-qqcw4     1/1     Running     1          124m
tick           influxdb-deployment-c7cb599b4-txgh5       1/1     Running     0          90m
tick           chronograf-deployment-7c48d8b5dc-c72jf    1/1     Running     0          84m
tick           telegraf-deployment-889755bb-sgkfs        1/1     Running     0          82m
tick           kapacitor-deployement-6cff699c4d-bv8jh    1/1     Running     6          86m

@erikwilson
Contributor

For what it is worth, it is recommended for kubernetes nodes to have swap disabled, but probably especially important for the RPi3 with poor i/o, as once the system starts swapping it can slow to a crawl.

@xiaods
Contributor

xiaods commented Nov 11, 2019

@pierremahot My OS is HypriotOS, which does not include rng-tools by default. I followed these instructions to install rng-tools: https://www.nico-maas.de/?p=1562

It works on the RPi 4 but not on the RPi 3. If I run the RPi 3 as an agent, it raises an error:

ERRO[2019-11-11T04:48:23.412769327Z] failed to get CA certs at https://127.0.0.1:46457/cacerts: Get https://127.0.0.1:46457/cacerts: read tcp 127.0.0.1:47170->127.0.0.1:46457: read: connection reset by peer 

@davidnuzik
Contributor

No movement on this in a while (Nov 2019) @erikwilson any thoughts on next steps on this issue?

@pierremahot

Hi, just an update: I am now using k3s version v1.17.3+k3s1 (5b17a17), which works very well, and I haven't had this problem since v1.13, if I remember correctly.

@rcarmo

rcarmo commented May 1, 2020

v1.17.4+k3s1 does not work for me in similar circumstances, so this isn't fixed in modern releases.

@KsanterX

v1.18.2+k3s1, 5.6.14-300.fc32.aarch64, rpi3 - same problem. I also added "cgroup_enable=cpuset cgroup_memory=1 cgroup_enable=memory" but this makes no difference.

@aceslick911

I'm having this issue too on ubuntu 64 on Pi 3 and 3+

@brandond
Contributor

@aceslick911 can you open a new issue and fill out the template so that we can assist in diagnosing your problem?

@stale

stale bot commented Jul 31, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Jul 31, 2021
@stale stale bot closed this as completed Aug 14, 2021