After running the site.yml playbook, the task that starts the services on the nodes hangs indefinitely:
TASK [k3s/node : Enable and check K3s service] **************************************************************************************************************************************************************************************************************************************************
Sunday 08 January 2023 11:27:39 -0500 (0:00:04.668) 0:03:45.715 ********
After inspecting the k3s-node service logs on the nodes, I found repeated timeouts when fetching the CA certs from the master node (via the load balancer):
node@node1:~ $ journalctl -fu k3s-node.service
...
Jan 08 11:39:45 node1 k3s[2662]: time="2023-01-08T11:39:45-05:00" level=error msg="failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
Diagnostics
I then tried to curl the cacerts endpoint from the node1 system and found that the curl command stalls here:
It seems the master node has a server-side issue. To double-check, I verified that port 6444 is listening on the master node:
node@master:~ $ netstat -lupt
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 356 0 master:sge-qmaster 0.0.0.0:* LISTEN -
tcp 0 0 master:10010 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN -
tcp 0 0 master:10249 0.0.0.0:* LISTEN -
tcp 0 0 master:10248 0.0.0.0:* LISTEN -
tcp 0 0 master:10259 0.0.0.0:* LISTEN -
tcp 0 0 master:10258 0.0.0.0:* LISTEN -
tcp 0 0 master:10257 0.0.0.0:* LISTEN -
tcp 0 0 master:10256 0.0.0.0:* LISTEN -
tcp 0 0 master:ipp 0.0.0.0:* LISTEN -
tcp6 0 0 [::]:10251 [::]:* LISTEN -
tcp6 0 0 [::]:10250 [::]:* LISTEN -
tcp6 0 0 [::]:ssh [::]:* LISTEN -
tcp6 0 0 localhost:ipp [::]:* LISTEN -
tcp6 825 0 [::]:6443 [::]:* LISTEN -
udp 0 0 0.0.0.0:bootpc 0.0.0.0:* -
udp 0 0 0.0.0.0:631 0.0.0.0:* -
udp 0 0 0.0.0.0:mdns 0.0.0.0:* -
udp 0 0 0.0.0.0:8472 0.0.0.0:* -
udp 0 0 0.0.0.0:47457 0.0.0.0:* -
udp6 0 0 [::]:mdns [::]:* -
udp6 0 0 [::]:59701 [::]:* -
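Without -n, netstat prints service names instead of port numbers, so a port can be hard to spot in a listing like the one above (on many systems /etc/services maps 6444 to the name sge-qmaster, though that mapping is an assumption about this host). The following is a minimal sketch of a numeric check; the helper name port_is_listening is hypothetical and not part of k3s or the playbook. It reads `netstat -ltn`-style output on stdin and succeeds only if a LISTEN line has a local address ending in the given port.

```shell
# Hypothetical helper: succeeds if stdin (netstat -ltn style output)
# contains a LISTEN line whose local address ends in ":PORT".
# Columns assumed: Proto Recv-Q Send-Q Local-Address Foreign-Address State
port_is_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}
```

On the master node this could be used as `netstat -ltn | port_is_listening 6443 && echo listening`. Note that a listening socket only shows the port is bound; it does not prove the server behind it is answering requests.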
This is the extent of my troubleshooting as I lack expertise on the inner workings of k3s. Any guidance would be appreciated. Happy to provide more information if required. I'm also willing to dedicate my setup to solve this issue so I will leave it in the aforementioned state.
I ran into what I believe is the same problem; for me it was a network connectivity issue between the agent nodes and the master node.
I think the error message is misleading because it shows a localhost address (127.0.0.1) in the connection URL, but once I unblocked the agents' access to the master node on port 6443, it started working.
Yeah, I think the requests on the nodes go through a locally hosted load balancer before being forwarded to the master node. I'm not sure what would cause a connection problem between my nodes.
Typically, issues around port 6443 are firewall-related. It is recommended to disable firewalls or open only the required ports. See https://docs.k3s.io/advanced#ubuntu--debian. Additionally, this is being tracked in #234.
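To tell a firewall or routing problem apart from a k3s problem, it can help to test plain TCP reachability of the master's port 6443 from an agent node. Below is a minimal sketch, assuming bash and coreutils timeout are installed on the node; the function name check_port is hypothetical, and the host name master is taken from the shell prompts earlier in this thread.

```shell
# Hypothetical helper: succeeds if a TCP connection to HOST:PORT can be
# opened within TIMEOUT seconds (default 3). Uses bash's /dev/tcp
# pseudo-device, so bash must be available even if the caller is sh.
check_port() {
  host=$1
  port=$2
  secs=${3:-3}
  # Inside bash -c, $0 is the host and $1 is the port.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (run on an agent node; adjust the host name to your inventory):
#   check_port master 6443 && echo reachable || echo blocked
```

If this fails from an agent while the port is listening on the master, a firewall or network path between the two is the likelier culprit. If it succeeds but the cacerts request still times out, the problem is more likely on the server side.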
Context
I have a Turing Pi cluster running 7 RPi3+ Compute Modules. Each of them is freshly installed with 64-bit Raspberry Pi OS (5.15.84-v8+):
Here is my hosts.ini for my deployment: