agent node install error #9009

Closed
songxychn opened this issue Dec 8, 2023 · 1 comment

songxychn commented Dec 8, 2023

Environmental Info:
K3s Version: v1.27.7+k3s2 (575bce7)

Node(s) CPU architecture, OS, and Version: Linux C20231107117427 5.15.0-30-generic #31-Ubuntu SMP Thu May 5 10:00:34 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration: 1 server, 1 agent

All ports are open on both the server and agent machines.
The server and agent machines have unique hostnames.

Describe the bug:

The agent install fails: it hangs at "[INFO] systemd: Starting k3s-agent".
systemctl status k3s-agent.service returns:

k3s-agent.service - Lightweight Kubernetes
     Loaded: loaded (/etc/systemd/system/k3s-agent.service; enabled; vendor preset: enabled)
     Active: activating (start) since Fri 2023-12-08 12:39:04 CST; 31s ago
       Docs: https://k3s.io
    Process: 1990 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service (code=exited, status=0/SUCCESS)
    Process: 1992 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
    Process: 1993 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 1994 (k3s-agent)
      Tasks: 7
     Memory: 206.3M
        CPU: 2.400s
     CGroup: /system.slice/k3s-agent.service
             └─1994 "/usr/local/bin/k3s agent"
Dec 08 12:39:04 iZbp1ih3jaqpntkqf51og7Z systemd[1]: Starting Lightweight Kubernetes...
Dec 08 12:39:04 iZbp1ih3jaqpntkqf51og7Z sh[1990]: + /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service
Dec 08 12:39:04 iZbp1ih3jaqpntkqf51og7Z sh[1991]: Failed to get unit file state for nm-cloud-setup.service: No such file or directory
Dec 08 12:39:04 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:04+08:00" level=info msg="Acquiring lock file /var/lib/rancher/k3s/data/.lock"
Dec 08 12:39:04 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:04+08:00" level=info msg="Preparing data dir /var/lib/rancher/k3s/data/bf3548384eaabb3435bf08112f1b0cba1afc5add6a6f2f2372aa2906a598fd04"
Dec 08 12:39:07 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:07+08:00" level=info msg="Starting k3s agent v1.27.7+k3s2 (575bce76)"
Dec 08 12:39:07 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:07+08:00" level=info msg="Adding server to load balancer k3s-agent-load-balancer: 38.207.179.174:6443"
Dec 08 12:39:07 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:07+08:00" level=info msg="Running load balancer k3s-agent-load-balancer 127.0.0.1:6444 -> [38.207.179.174:6443] [default: 38.207.179.174:6443]"
Dec 08 12:39:21 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:21+08:00" level=error msg="CA cert validation failed: Get \"https://127.0.0.1:6444/cacerts\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
Dec 08 12:39:31 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:31+08:00" level=info msg="Waiting to retrieve agent configuration; server is not ready: Node password rejected, duplicate hostname or contents of '/etc/rancher/node/password' may not match server node-passwd entry, try enabling a unique node name with the --with-node-id flag"

I tried adding the --with-node-id flag when installing the agent, but I still get error logs like the ones below:

E1208 12:32:21.063418   87489 reflector.go:148] k8s.io/client-go@v1.27.7-k3s1/tools/cache/reflector.go:231: Failed to watch *v1.Endpoints: failed to list *v1.Endpoints: Get "https://127.0.0.1:6444/api/v1/namespaces/default/endpoints?fieldSelector=metadata.name%3Dkubernetes&limit=500&resourceVersion=0": net/http: TLS handshake timeout
time="2023-12-08T12:32:33Z" level=info msg="Waiting to retrieve kube-proxy configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": context deadline exceed (Client.Timeout exceeded while awaiting headers)"
time="2023-12-08T12:32:34Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 10.19.177.2:6443: connect: connection timed out"
time="2023-12-08T12:32:34Z" level=error msg="Remotedialer proxy error" error="dial tcp 10.19.177.2:6443: connect: connection timed out"
time="2023-12-08T12:32:39Z" level=info msg="Connecting to proxy" url="wss://10.19.177.2:6443/v1-k3s/connect"

Steps To Reproduce:

  • Install K3s on the server: curl -sfL https://get.k3s.io | sh -
  • Get the token on the server: cat /var/lib/rancher/k3s/server/node-token
  • Install the K3s agent on another machine: curl -sfL https://get.k3s.io | K3S_URL=https://xxx:6443 K3S_TOKEN=xxx sh -
  • The agent install hangs at "[INFO] systemd: Starting k3s-agent"

Expected behavior:

The agent node installs and joins the cluster successfully.

Actual behavior:

The agent node install fails and the node never joins the cluster.

Additional context / logs:

brandond (Contributor) commented Dec 8, 2023

Dec 08 12:39:31 iZbp1ih3jaqpntkqf51og7Z k3s[1994]: time="2023-12-08T12:39:31+08:00" level=info msg="Waiting to retrieve agent configuration; server is not ready: Node password rejected, duplicate hostname or contents of '/etc/rancher/node/password' may not match server node-passwd entry, try enabling a unique node name with the --with-node-id flag"

See: https://docs.k3s.io/architecture#how-agent-node-registration-works
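
For reference, the linked page describes the agent keeping its node password in /etc/rancher/node/password, while the server stores a matching secret named <node>.node-password.k3s in the kube-system namespace. A minimal sketch of the server-side cleanup for a stale registration follows. It assumes the node name is passed in exactly as kubectl get nodes shows it. Both function names are hypothetical helpers, and the commands are only printed so they can be reviewed before running:

```shell
# Build the name of the secret that holds a node's password on the server.
# $1: node name as listed by `kubectl get nodes` (assumption: passed in as-is).
node_password_secret() {
  printf '%s.node-password.k3s\n' "$1"
}

# Print (do not run) the server-side commands that remove a stale registration.
print_cleanup_commands() {
  echo "kubectl delete node $1"
  echo "kubectl -n kube-system delete secret $(node_password_secret "$1")"
}

# Example (run on the server, then review the printed commands):
#   print_cleanup_commands my-agent-node
```

Deleting the node is documented to clean up its password secret as well, so the second command only matters when a secret has been left behind without a node entry. After the cleanup, restarting k3s-agent should let the node register again with its current password.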

If that does not help, confirm that you can access the server from the agent: curl -vks https://10.19.177.2:6443
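
The same connectivity check can be repeated for every server address that appears in the agent logs. In the output above those are 38.207.179.174:6443 (the load balancer target) and 10.19.177.2:6443 (the remotedialer target). A rough sketch, assuming curl is available on the agent. cacerts_url and check_server are hypothetical helper names, and /cacerts is the path the agent log shows it requesting:

```shell
# Build the CA bundle URL for a server given as host:port.
cacerts_url() {
  printf 'https://%s/cacerts\n' "$1"
}

# Return 0 if the server answers within 5 seconds, non-zero otherwise.
# -k skips certificate verification, as in the curl -vks check above.
check_server() {
  if curl -ks --max-time 5 -o /dev/null "$(cacerts_url "$1")"; then
    echo "$1 reachable"
  else
    echo "$1 NOT reachable from this host"
    return 1
  fi
}

# Example (run on the agent machine):
#   check_server 38.207.179.174:6443
#   check_server 10.19.177.2:6443
```

If only one of the two addresses answers, the agent is being pointed at an address it cannot route to, which would be consistent with the "connection timed out" errors in the log.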

brandond closed this as completed Dec 8, 2023