playbook for ubuntu server 4.2 fails to setup kubernetes #16

Closed
mattf opened this issue Feb 19, 2022 · 5 comments

mattf commented Feb 19, 2022

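The playbook at https://github.com/NVIDIA/egx-platform/blob/master/playbooks/Ubuntu_Server_v4.2.md fails to install successfully on a g4dn.2xlarge EC2 instance running Ubuntu 20.04.3 LTS. The run fails at the "Run kubeadm join" task with "connection refused" on 172.31.24.228:6443.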

$ cat hosts 
[master]
172.31.24.228 ansible_ssh_user=ubuntu ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[nodes]
172.31.24.228 ansible_ssh_user=ubuntu ansible_ssh_common_args='-o StrictHostKeyChecking=no'
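
In the inventory above, the same address is listed under both [master] and [nodes]. Whether that is what breaks the run is not established here; it is only an observation worth checking, since the log further down shows a "Reset Kubernetes component" task running in the [nodes] play on that same host shortly before the join. A minimal Python sketch for spotting such overlaps in an INI-style inventory follows. The helper names `hosts_by_group` and `hosts_in_both` are hypothetical, and the parser is deliberately simple: it does not interpret `:vars` or `:children` sections.

```python
from collections import defaultdict


def hosts_by_group(inventory_text):
    """Map each host (first token on a line) to the set of INI groups that list it."""
    groups = defaultdict(set)
    current = None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # new group header, e.g. [master]
            continue
        if current is not None:
            groups[line.split()[0]].add(current)
    return groups


def hosts_in_both(inventory_text, a="master", b="nodes"):
    """Return the sorted hosts that appear in both group a and group b."""
    return sorted(
        host
        for host, found in hosts_by_group(inventory_text).items()
        if a in found and b in found
    )


# Inventory copied from the hosts file shown above.
INVENTORY = """\
[master]
172.31.24.228 ansible_ssh_user=ubuntu ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[nodes]
172.31.24.228 ansible_ssh_user=ubuntu ansible_ssh_common_args='-o StrictHostKeyChecking=no'
"""

if __name__ == "__main__":
    print(hosts_in_both(INVENTORY))  # prints ['172.31.24.228']
```

An empty list would mean no host is shared between the two groups.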


$ bash setup.sh install
Ansible Already Installed


EGX DIY Stack Version 4.2

Installing EGX Stack

PLAY [all] **********************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Checking Nouveau is disabled] *********************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [unload nouveau] ***********************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [blacklist nouveau] ********************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Add an Kubernetes apt signing key for Ubuntu] *****************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Adding Kubernetes apt repository for Ubuntu] ******************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Install kubernetes components for Ubuntu on EGX Stack 1.2 or 1.3] *********************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install kubernetes components for Ubuntu on EGX Stack 2.0] ****************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install kubernetes components for Ubuntu on EGX Stack 3.1] ****************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install kubernetes components for Ubuntu on EGX Stack 4.x] ****************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install kubernetes components for Ubuntu on EGX Stack 4.2] ****************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Hold the installed Packages] **********************************************************************************************************************************************************************************************************
changed: [172.31.24.228] => (item=kubelet)
changed: [172.31.24.228] => (item=kubectl)
changed: [172.31.24.228] => (item=kubeadm)

TASK [Creating a Kubernetes repository file for RHEL/CentOS] ********************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Adding repository details in Kubernetes repo file for RHEL/CentOS] ********************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Installing required packages for RHEL/CentOS] *****************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Validate whether Kubernetes cluster installed] ****************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Add Docker GPG key for Ubuntu] ********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Add Docker APT repository for Ubuntu] *************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install Docker-CE Engine for Ubuntu 20.04 on EGX Stack 3.1 or 1.3] ********************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install Docker-CE Engine for Ubuntu 18.04 on EGX Stack 3.1 or 1.3] ********************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install Docker-CE Engine for Ubuntu on EGX Stack 2.0] *********************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install Docker-CE Engine for Ubuntu on EGX Stack 1.2] *********************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Configuring Docker-CE repo for RHEL/CentOS] *******************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install Docker-CE Engine on RHEL/CentOS] **********************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Adding Docker to Current User] ********************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [SetEnforce for RHEL/CentOS] ***********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [SELinux for RHEL/CentOS] **************************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Enable Firewall Service for RHEL/CentOS] **********************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Allow Network Ports in Firewalld for RHEL/CentOS] *************************************************************************************************************************************************************************************
skipping: [172.31.24.228] => (item=6443/tcp) 
skipping: [172.31.24.228] => (item=10250/tcp) 

TASK [Remove swapfile from /etc/fstab] ******************************************************************************************************************************************************************************************************
ok: [172.31.24.228] => (item=swap)
ok: [172.31.24.228] => (item=none)

TASK [Disable swap] *************************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Create containerd.conf] ***************************************************************************************************************************************************************************************************************
changed: [172.31.24.228] => (item=overlay)
changed: [172.31.24.228] => (item=br_netfilter)

TASK [Modprobe for overlay and br_netfilter] ************************************************************************************************************************************************************************************************
ok: [172.31.24.228] => (item=overlay)
ok: [172.31.24.228] => (item=br_netfilter)

TASK [Add sysctl parameters to /etc/sysctl.conf] ********************************************************************************************************************************************************************************************
ok: [172.31.24.228] => (item={'name': 'net.bridge.bridge-nf-call-ip6tables', 'value': '1', 'reload': False})
ok: [172.31.24.228] => (item={'name': 'net.bridge.bridge-nf-call-iptables', 'value': '1', 'reload': False})
ok: [172.31.24.228] => (item={'name': 'net.ipv4.ip_forward', 'value': '1', 'reload': True})

TASK [Install libseccomp2] ******************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Create /etc/containerd] ***************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Create /etc/default/kubelet] **********************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Download cri-containerd-cni] **********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Untar cri-containerd-cni] *************************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Get defaults from containerd] *********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Write defaults to config.toml] ********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [restart containerd] *******************************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Download cri-containerd-cni] **********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Untar cri-containerd-cni] *************************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Get defaults from containerd] *********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Write defaults to config.toml] ********************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [restart containerd] *******************************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Download cri-containerd-cni] **********************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Untar cri-containerd-cni] *************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Get defaults from containerd] *********************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Write defaults to config.toml] ********************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [restart containerd] *******************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Starting and enabling the required services] ******************************************************************************************************************************************************************************************
ok: [172.31.24.228] => (item=docker)
ok: [172.31.24.228] => (item=kubelet)
ok: [172.31.24.228] => (item=containerd)

PLAY RECAP **********************************************************************************************************************************************************************************************************************************
172.31.24.228              : ok=24   changed=12   unreachable=0    failed=0    skipped=29   rescued=0    ignored=0   


PLAY [master] *******************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Validate whether Kubernetes cluster installed] ****************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Reset Kubernetes component] ***********************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [remove etcd directory] ****************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Iniitialize the Kubernetes cluster using kubeadm and containerd] **********************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Initialize the Kubernetes cluster using kubeadm] **************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Create kube directory] ****************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [admin permissions] ********************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Copy kubeconfig to home] **************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Install networking plugin to kubernetes cluster on EGX DIY Stack 3.1 or 4.0 or 4.1 or 4.2] ********************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Install networking plugin to kubernetes cluster on EGX DIY Stack 2.0] *****************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Install networking plugin to kubernetes cluster on EGX DIY Stack 1.2 or 1.3] **********************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Update Networking Plugin for  on EGX DIY Stack 1.2 or 1.3] ****************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Taint the Kubernetes Control Plane node] **********************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Generate join token] ******************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [set_fact] *****************************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Store join command] *******************************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

PLAY [nodes] ********************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Reset Kubernetes component] ***********************************************************************************************************************************************************************************************************
changed: [172.31.24.228]

TASK [Create kube directory] ****************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Copy kubeadm-join command to node] ****************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Get the Active Mellanox NIC on nodes] *************************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

TASK [Copy Mellanox NIC Active File to master] **********************************************************************************************************************************************************************************************
skipping: [172.31.24.228]

PLAY [nodes] ********************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************************************************************************************
ok: [172.31.24.228]

TASK [Run kubeadm join] *********************************************************************************************************************************************************************************************************************
fatal: [172.31.24.228]: FAILED! => {"changed": true, "cmd": "kubeadm join 172.31.24.228:6443 --token ddhvan.6rb88wwwx33gxre0 --discovery-token-ca-cert-hash sha256:d3a8c4144c5b08c1c34ef20696db23063682fc5b74dd4d42d184cd33fbba0d2f ", "delta": "0:05:00.161901", "end": "2022-02-19 19:40:31.153453", "msg": "non-zero return code", "rc": 1, "start": "2022-02-19 19:35:30.991552", "stderr": "error execution phase preflight: couldn't validate the identity of the API Server: Get \"https://172.31.24.228:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s\": dial tcp 172.31.24.228:6443: connect: connection refused\nTo see the stack trace of this error execute with --v=5 or higher", "stderr_lines": ["error execution phase preflight: couldn't validate the identity of the API Server: Get \"https://172.31.24.228:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s\": dial tcp 172.31.24.228:6443: connect: connection refused", "To see the stack trace of this error execute with --v=5 or higher"], "stdout": "[preflight] Running pre-flight checks", "stdout_lines": ["[preflight] Running pre-flight checks"]}

PLAY RECAP **********************************************************************************************************************************************************************************************************************************
172.31.24.228              : ok=18   changed=10   unreachable=0    failed=1    skipped=6    rescued=0    ignored=0   
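
The join fails with "connection refused" on 172.31.24.228:6443, which means nothing was accepting TCP connections on that port at the time. A minimal Python sketch for re-checking this from the node is below. The function name `api_port_open` is hypothetical, and the check covers TCP reachability only, not TLS or API server health.

```python
import socket


def api_port_open(host, port=6443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `api_port_open("172.31.24.228")` on the instance shows whether anything is listening on the API server port before `kubeadm join` is retried.
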
$ systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/etc/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2022-02-19 19:34:38 UTC; 13min ago
       Docs: https://containerd.io
   Main PID: 37046 (containerd)
      Tasks: 24
     Memory: 923.5M
     CGroup: /system.slice/containerd.service
             └─37046 /usr/local/bin/containerd

Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.140155035Z" level=info msg="cleaning up dead shim"
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.150767759Z" level=warning msg="cleanup warnings time=\"2022-02-19T19:35:29Z\" level=info msg=\"starting signal loop\" namespace=k8s.io pid=39253\n"
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.198470266Z" level=info msg="shim disconnected" id=6a5b60d0c24e1a1748c9dd7fc98b94f957967e6a00ea4679be269c67596d18d4
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.198527288Z" level=warning msg="cleaning up after shim disconnected" id=6a5b60d0c24e1a1748c9dd7fc98b94f957967e6a00ea4679be269c67596d18d4 namespace=k8s.io
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.198540466Z" level=info msg="cleaning up dead shim"
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.209116727Z" level=warning msg="cleanup warnings time=\"2022-02-19T19:35:29Z\" level=info msg=\"starting signal loop\" namespace=k8s.io pid=39292\n"
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.209466026Z" level=info msg="TearDown network for sandbox \"6a5b60d0c24e1a1748c9dd7fc98b94f957967e6a00ea4679be269c67596d18d4\" successfully"
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.209491081Z" level=info msg="StopPodSandbox for \"6a5b60d0c24e1a1748c9dd7fc98b94f957967e6a00ea4679be269c67596d18d4\" returns successfully"
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.223992525Z" level=info msg="RemovePodSandbox for \"6a5b60d0c24e1a1748c9dd7fc98b94f957967e6a00ea4679be269c67596d18d4\""
Feb 19 19:35:29 ip-172-31-24-228 containerd[37046]: time="2022-02-19T19:35:29.230819595Z" level=info msg="RemovePodSandbox \"6a5b60d0c24e1a1748c9dd7fc98b94f957967e6a00ea4679be269c67596d18d4\" returns successfully"
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: inactive (dead) since Sat 2022-02-19 19:35:28 UTC; 12min ago
       Docs: https://kubernetes.io/docs/home/
    Process: 38103 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=0/SUCCESS)
   Main PID: 38103 (code=exited, status=0/SUCCESS)

Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340896   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/8163a2>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340945   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/930396>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340977   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/930396a63ed739fc>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.341011   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.341041   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/1ed545894a1d8d6>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.923817   38103 apiserver.go:52] "Watching apiserver"
Feb 19 19:35:28 ip-172-31-24-228 kubelet[38103]: I0219 19:35:28.147621   38103 reconciler.go:157] "Reconciler: start to sync state"
Feb 19 19:35:28 ip-172-31-24-228 systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Feb 19 19:35:28 ip-172-31-24-228 systemd[1]: kubelet.service: Succeeded.
Feb 19 19:35:28 ip-172-31-24-228 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
$ journalctl -n100 -u kubelet
-- Logs begin at Sat 2022-02-19 11:33:31 UTC, end at Sat 2022-02-19 19:47:02 UTC. --
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.304412   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.404542   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.504989   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.605633   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.706066   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.806337   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.807446   37621 nodelease.go:49] "Failed to get node when trying to set owner ref to the node lease" err="nodes \"ip-172-31-24-228\" not found" node="ip-172-31-24-228"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: E0219 19:35:18.906865   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:18 ip-172-31-24-228 kubelet[37621]: I0219 19:35:18.911712   37621 kubelet_node_status.go:74] "Successfully registered node" node="ip-172-31-24-228"
Feb 19 19:35:19 ip-172-31-24-228 kubelet[37621]: E0219 19:35:19.007803   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:19 ip-172-31-24-228 kubelet[37621]: E0219 19:35:19.108114   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:19 ip-172-31-24-228 kubelet[37621]: E0219 19:35:19.208318   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:19 ip-172-31-24-228 kubelet[37621]: E0219 19:35:19.308797   37621 kubelet.go:2291] "Error getting node" err="node \"ip-172-31-24-228\" not found"
Feb 19 19:35:20 ip-172-31-24-228 kubelet[37621]: I0219 19:35:20.342050   37621 apiserver.go:52] "Watching apiserver"
Feb 19 19:35:20 ip-172-31-24-228 kubelet[37621]: I0219 19:35:20.515471   37621 reconciler.go:157] "Reconciler: start to sync state"
Feb 19 19:35:21 ip-172-31-24-228 systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Feb 19 19:35:21 ip-172-31-24-228 systemd[1]: kubelet.service: Succeeded.
Feb 19 19:35:21 ip-172-31-24-228 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Feb 19 19:35:21 ip-172-31-24-228 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluste>
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: I0219 19:35:21.882857   38103 server.go:197] "Warning: For remote container runtime, --pod-infra-container-image is ignored in kubelet, which should be set in that remote runtime instead"
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: Flag --cgroup-driver has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluste>
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: I0219 19:35:21.891922   38103 server.go:440] "Kubelet version" kubeletVersion="v1.21.7"
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: I0219 19:35:21.892170   38103 server.go:851] "Client rotation is on, will bootstrap in background"
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: I0219 19:35:21.893803   38103 certificate_store.go:130] Loading cert/key pair from "/var/lib/kubelet/pki/kubelet-client-current.pem".
Feb 19 19:35:21 ip-172-31-24-228 kubelet[38103]: I0219 19:35:21.894711   38103 dynamic_cafile_content.go:167] Starting client-ca-bundle::/etc/kubernetes/pki/ca.crt
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922132   38103 server.go:660] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922355   38103 container_manager_linux.go:278] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922423   38103 container_manager_linux.go:283] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsNam>
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922439   38103 topology_manager.go:120] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922449   38103 container_manager_linux.go:314] "Initializing Topology Manager" policy="none" scope="container"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922456   38103 container_manager_linux.go:319] "Creating device plugin manager" devicePluginEnabled=true
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922539   38103 remote_runtime.go:62] parsed scheme: ""
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922547   38103 remote_runtime.go:62] scheme "" not registered, fallback to default scheme
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922578   38103 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{/run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922588   38103 clientconn.go:948] ClientConn switching balancer to "pick_first"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922644   38103 remote_image.go:50] parsed scheme: ""
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922651   38103 remote_image.go:50] scheme "" not registered, fallback to default scheme
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922659   38103 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{/run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922664   38103 clientconn.go:948] ClientConn switching balancer to "pick_first"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922736   38103 kubelet.go:404] "Attempting to sync node with API server"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922753   38103 kubelet.go:272] "Adding static pod path" path="/etc/kubernetes/manifests"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922812   38103 kubelet.go:283] "Adding apiserver pod source"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.922833   38103 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.933337   38103 kuberuntime_manager.go:222] "Container runtime initialized" containerRuntime="containerd" version="v1.5.8" apiVersion="v1alpha2"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.936118   38103 server.go:1190] "Started kubelet"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.937526   38103 server.go:149] "Starting to listen" address="0.0.0.0" port=10250
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.938794   38103 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.938846   38103 server.go:409] "Adding debug handlers to kubelet server"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.938881   38103 volume_manager.go:271] "Starting Kubelet Volume Manager"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.938912   38103 desired_state_of_world_populator.go:141] "Desired state populator starts to run"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: E0219 19:35:26.940346   38103 cri_stats_provider.go:369] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/containe>
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: E0219 19:35:26.940389   38103 kubelet.go:1306] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.940448   38103 client.go:86] parsed scheme: "unix"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.940800   38103 client.go:86] scheme "unix" not registered, fallback to default scheme
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.941095   38103 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{unix:///run/containerd/containerd.sock  <nil> 0 <nil>}] <nil> <nil>}
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.941242   38103 clientconn.go:948] ClientConn switching balancer to "pick_first"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.965658   38103 kubelet_network_linux.go:56] "Initialized protocol iptables rules." protocol=IPv4
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.981806   38103 kubelet_network_linux.go:56] "Initialized protocol iptables rules." protocol=IPv6
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.981830   38103 status_manager.go:157] "Starting to sync pod status with apiserver"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: I0219 19:35:26.981853   38103 kubelet.go:1846] "Starting kubelet main sync loop"
Feb 19 19:35:26 ip-172-31-24-228 kubelet[38103]: E0219 19:35:26.981906   38103 kubelet.go:1870] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be succ>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.041741   38103 kubelet_node_status.go:71] "Attempting to register node" node="ip-172-31-24-228"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.052702   38103 kubelet_node_status.go:109] "Node was previously registered" node="ip-172-31-24-228"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.052798   38103 kubelet_node_status.go:74] "Successfully registered node" node="ip-172-31-24-228"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: E0219 19:35:27.082391   38103 kubelet.go:1870] "Skipping pod synchronization" err="container runtime status check may not have completed yet"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.086107   38103 cpu_manager.go:199] "Starting CPU manager" policy="none"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.086135   38103 cpu_manager.go:200] "Reconciling" reconcilePeriod="10s"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.086153   38103 state_mem.go:36] "Initialized new in-memory state store"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.086326   38103 state_mem.go:88] "Updated default CPUSet" cpuSet=""
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.086343   38103 state_mem.go:96] "Updated CPUSet assignments" assignments=map[]
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.086352   38103 policy_none.go:44] "None policy: Start"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.090812   38103 manager.go:600] "Failed to retrieve checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.091118   38103 plugin_manager.go:114] "Starting Kubelet Plugin Manager"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.283039   38103 topology_manager.go:187] "Topology Admit Handler"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.283181   38103 topology_manager.go:187] "Topology Admit Handler"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.283244   38103 topology_manager.go:187] "Topology Admit Handler"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.283295   38103 topology_manager.go:187] "Topology Admit Handler"
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340336   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/8163a26df170b890f>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340394   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-pki\" (UniqueName: \"kubernetes.io/host-path/8163a26df170b890f5>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340424   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340538   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340592   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-pki\" (UniqueName: \"kubernetes.io/host-path/930396a63ed739fcfd>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340622   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-data\" (UniqueName: \"kubernetes.io/host-path/91c2306dae8c7092>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340650   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/8163a26df170b890>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340677   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"ca-certs\" (UniqueName: \"kubernetes.io/host-path/930396a63ed739fcf>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340722   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"flexvolume-dir\" (UniqueName: \"kubernetes.io/host-path/930396a63ed>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340753   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/930396a63ed739f>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340799   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-local-share-ca-certificates\" (UniqueName: \"kubernetes.io/host>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340842   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etcd-certs\" (UniqueName: \"kubernetes.io/host-path/91c2306dae8c709>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340896   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/8163a2>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340945   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"etc-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/930396>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.340977   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"k8s-certs\" (UniqueName: \"kubernetes.io/host-path/930396a63ed739fc>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.341011   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"usr-share-ca-certificates\" (UniqueName: \"kubernetes.io/host-path/>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.341041   38103 reconciler.go:224] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kubeconfig\" (UniqueName: \"kubernetes.io/host-path/1ed545894a1d8d6>
Feb 19 19:35:27 ip-172-31-24-228 kubelet[38103]: I0219 19:35:27.923817   38103 apiserver.go:52] "Watching apiserver"
Feb 19 19:35:28 ip-172-31-24-228 kubelet[38103]: I0219 19:35:28.147621   38103 reconciler.go:157] "Reconciler: start to sync state"
Feb 19 19:35:28 ip-172-31-24-228 systemd[1]: Stopping kubelet: The Kubernetes Node Agent...
Feb 19 19:35:28 ip-172-31-24-228 systemd[1]: kubelet.service: Succeeded.
Feb 19 19:35:28 ip-172-31-24-228 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
@mattf
Author

mattf commented Feb 19, 2022

cc @erikbohnhorst

@angudadevops
Collaborator

@mattf this could be an issue with your system firewalls. Is this a fresh setup where the master and node can communicate with each other?

@mattf
Author

mattf commented Mar 13, 2022

@angudadevops it's a fresh system. master == node, so no firewall issue.

@angudadevops
Collaborator

@mattf you don't need to add the same node to both the master and nodes groups, as the master is configured to run workloads. If you only have a master, just keep it in the master group; you can run workloads once the EGX stack installation has completed.
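
For reference, a sketch of what the single-node inventory described above might look like, reusing the address and SSH options from the `hosts` file in the original report. Keeping an empty `[nodes]` header is an assumption (the playbook may or may not expect the group to exist); this layout was not confirmed in this thread.

```ini
# hosts: single-machine setup, with the host listed under [master] only
[master]
172.31.24.228 ansible_ssh_user=ubuntu ansible_ssh_common_args='-o StrictHostKeyChecking=no'

# Assumption: left empty on purpose so the group is still defined
[nodes]
```

With this file in place, the install would be rerun the same way as in the report (`bash setup.sh install`).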

@angudadevops
Collaborator

@mattf closing the issue, as guidance has been provided above.
