
Multimaster Setup - Master 1 corrupting when issues join command on Master-2 #15637

Closed · sunvk opened this issue Aug 2, 2019 · 12 comments
Labels: kind/support (Categorizes issue or PR as a support question)

@sunvk commented Aug 2, 2019

Hi,

I am trying to set up a multi-master cluster. After configuring Master-1, when I try to join Master-2, the join times out and Master-1 also gets corrupted.

[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available

Any help on this would be appreciated.

Thanks,
Sunish

@neolit123 (Member) commented Aug 2, 2019

hi, any error logs from the API server, kubelet, or kubeadm?
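
(For reference, one way to collect those logs on a kubeadm node that runs Docker, as in this thread; the container-name filters below are an assumption, not output from this setup:)

# kubelet and kubeadm-related messages
journalctl -xeu kubelet --no-pager | tail -n 200
# control-plane containers started by the kubelet from the static pod manifests
docker ps -a --filter name=k8s_kube-apiserver
docker ps -a --filter name=k8s_etcd
# logs of a specific container (substitute an ID printed above)
docker logs <container-id>
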
/triage support

k8s-ci-robot added the kind/support label on Aug 2, 2019
@sunvk (Author) commented Aug 6, 2019

This is what is happening.

My setup contains 1 LB and 2 master nodes with local etcd.
Below is my kubeadm configuration; I am using Kubernetes 1.15.1.

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "192.168.22.49:6443"
networking:
  podSubnet: 172.16.0.0/16
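
(Aside: a quick sanity check that the controlPlaneEndpoint is actually reachable from every node, assuming 192.168.22.49 is the load balancer VIP fronting the apiservers; any HTTP answer, even a 403, shows the LB is forwarding port 6443:)

# run from each master and from the LB host itself
curl -k https://192.168.22.49:6443/version
curl -k https://192.168.22.49:6443/healthz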

After kubeadm init on the first master, kubectl get nodes responds without any issues.

When I issue the control-plane join on the second master, it gets stuck at the lines below.

[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available

At the same time, if I reissue kubectl get nodes on Master-1, I get the response below and Master-1 is corrupted.

[root@k8-1 ~]# kubectl get pods
Unable to connect to the server: EOF
[root@k8-1 ~]# kubectl get nods
Unable to connect to the server: EOF

Thanks,
Sunish

sunvk changed the title from "Multimaster Setup - Hung at kubeadm-certs retreival calls" to "Multimaster Setup - Master 1 corrupting when issues join command on Master-2" on Aug 6, 2019
@neolit123 (Member) commented Aug 6, 2019

[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available

etcd not coming up is unusual.

random questions:

  • are you always calling kubeadm reset before kubeadm init?
  • are you using a private container registry where you have a custom etcd image?
  • are you using custom certificates?
  • please share the full log of kubeadm init?

@sunvk (Author) commented Aug 6, 2019

- are you always calling kubeadm reset before kubeadm init?
No. kubeadm init is only called the first time; once the node gets corrupted, I do a reset and try again.
- are you using a private container registry where you have a custom etcd image?
No. Just a default setup, no custom image; I just followed https://octetz.com/posts/ha-control-plane-k8s-kubeadm
- are you using custom certificates?
No. I just followed https://octetz.com/posts/ha-control-plane-k8s-kubeadm
- please share the full log of kubeadm init?

[root@k8-1 ~]# kubeadm init \
    --config=/etc/kubernetes/kubeadm/kubeadm-config.yaml \
    --upload-certs

[init] Using Kubernetes version: v1.15.2
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.1. Latest validated version: 18.09
[WARNING Hostname]: hostname "k8-1" could not be reached
[WARNING Hostname]: hostname "k8-1": lookup k8-1 on 10.215.210.71:53: no such host
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8-1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.22.45 192.168.22.49]
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8-1 localhost] and IPs [192.168.22.45 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8-1 localhost] and IPs [192.168.22.45 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 18.502765 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.15" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
a908f294a2418172b9dbb69227e00ae8a193f9bc0b1c428a8f7bccc070ff0820
[mark-control-plane] Marking the node k8-1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8-1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: sz4ems.eem2wvwqafumvsf2
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.22.49:6443 --token sz4ems.eem2wvwqafumvsf2 \
    --discovery-token-ca-cert-hash sha256:32a7a42dcdfc2c77f39d2bc81b1f7fc6e6bedb2b87a55306ba5feb6f011beab1 \
    --control-plane --certificate-key a908f294a2418172b9dbb69227e00ae8a193f9bc0b1c428a8f7bccc070ff0820

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

  kubeadm join 192.168.22.49:6443 --token sz4ems.eem2wvwqafumvsf2 \
    --discovery-token-ca-cert-hash sha256:32a7a42dcdfc2c77f39d2bc81b1f7fc6e6bedb2b87a55306ba5feb6f011beab1

@neolit123 (Member) commented:

No. Just a default setup, no custom image; I just followed https://octetz.com/posts/ha-control-plane-k8s-kubeadm

please follow our official guide instead https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

/etc/kubernetes/kubeadm/kubeadm-config.yaml

what are the contents of your kubeadm config?

When I issue the control-plane join on the second master, it gets stuck at the lines below.

are you calling kubeadm reset before kubeadm join on the second master?
what is the full output of kubeadm join?
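
(For reference, a clean retry on the joining node usually looks like the sketch below; this is a generic sequence with placeholder values, not taken from this thread:)

# on the second control-plane node, before retrying
kubeadm reset -f        # non-interactive; removes the static pod manifests, generated certs and /var/lib/etcd on this node
# then re-run the join command printed by kubeadm init on the first node:
# kubeadm join <lb-endpoint>:6443 --token <token> \
#   --discovery-token-ca-cert-hash sha256:<hash> \
#   --control-plane --certificate-key <key>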

@sunvk (Author) commented Aug 6, 2019

Kubeadm Config:

[centos@k8-2 ~]$ cat /etc/kubernetes/kubeadm/kubeadm-config.yaml

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "192.168.22.45:6443"
networking:
  podSubnet: 172.16.0.0/16

are you calling kubeadm reset before kubeadm join on the second master? No.

what is the full output of kubeadm join?
[root@k8-2 ~]# kubeadm join 192.168.22.49:6443 --token 3jbtq2.5skou70aih9p4xfz \
    --discovery-token-ca-cert-hash sha256:bb6775bc262cf3b4f18c16e544ab76c555f1f3053ed11707bf0ac4d5897ab158 \
    --control-plane --certificate-key a81797d32201dda02ab1018ea335391c0a5295444383a11ebb05d1992f734499

[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.1. Latest validated version: 18.09
[WARNING Hostname]: hostname "k8-2" could not be reached
[WARNING Hostname]: hostname "k8-2": lookup k8-2 on 10.215.210.71:53: no such host
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8-2 localhost] and IPs [192.168.22.43 127.0.0.1 ::1]
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8-2 localhost] and IPs [192.168.22.43 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8-2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.22.43 192.168.22.49]
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available

@neolit123 (Member) commented:

not sure what is going on.

kubectl get nodes responds without any issues.

if the first control-plane node has initialized properly you should be able to call:
kubectl get po -A

the kubeconfig is in /etc/kubernetes/admin.conf on that first control-plane node.

so alternatively:
KUBECONFIG=/etc/kubernetes/admin.conf sudo kubectl get no -A

also please share the output of kubeadm join .... --v=2 from the second CP node.
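
(For reference, etcd health on the first control-plane node can also be checked directly, assuming the stacked etcd container is running under Docker and the kubeadm default certificate paths are in use:)

# find the etcd container created from /etc/kubernetes/manifests/etcd.yaml
docker ps --filter name=k8s_etcd --format '{{.ID}}  {{.Names}}'
# list the members the cluster currently knows about; using 'endpoint health' instead of
# 'member list' additionally tells you whether the cluster still answers, i.e. has quorum
docker exec -e ETCDCTL_API=3 <etcd-container-id> etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  member list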

@sunvk (Author) commented Aug 7, 2019

Please find the join output from the second CP node with --v=10:

[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0807 14:35:31.255265 3172 certs.go:39] creating PKI assets
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-k8-2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.22.23 192.168.22.39]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-k8-2 localhost] and IPs [192.168.22.23 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-k8-2 localhost] and IPs [192.168.22.23 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
I0807 14:35:33.534943 3172 certs.go:70] creating a new public/private key files for signing service account users
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0807 14:35:34.214731 3172 manifests.go:115] [control-plane] getting StaticPodSpecs
I0807 14:35:34.220998 3172 manifests.go:131] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0807 14:35:34.221015 3172 manifests.go:115] [control-plane] getting StaticPodSpecs
I0807 14:35:34.221764 3172 manifests.go:131] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0807 14:35:34.221779 3172 manifests.go:115] [control-plane] getting StaticPodSpecs
I0807 14:35:34.222280 3172 manifests.go:131] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[check-etcd] Checking that the etcd cluster is healthy
I0807 14:35:34.222711 3172 loader.go:359] Config loaded from file: /etc/kubernetes/admin.conf
I0807 14:35:34.223260 3172 local.go:66] [etcd] Checking etcd cluster health
I0807 14:35:34.223268 3172 local.go:69] creating etcd client that connects to etcd pods
I0807 14:35:34.223337 3172 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, /" -H "User-Agent: kubeadm/v1.15.1 (linux/amd64) kubernetes/4485c6f" 'https://192.168.22.39:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config'
I0807 14:35:34.232232 3172 round_trippers.go:438] GET https://192.168.22.39:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config 200 OK in 8 milliseconds
I0807 14:35:34.232244 3172 round_trippers.go:444] Response Headers:
I0807 14:35:34.232249 3172 round_trippers.go:447] Content-Type: application/json
I0807 14:35:34.232253 3172 round_trippers.go:447] Content-Length: 1008
I0807 14:35:34.232257 3172 round_trippers.go:447] Date: Wed, 07 Aug 2019 18:35:33 GMT
I0807 14:35:34.232277 3172 request.go:947] Response Body: {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"kubeadm-config","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/configmaps/kubeadm-config","uid":"fc7c7426-d05f-4516-bcf5-6623d2823647","resourceVersion":"155","creationTimestamp":"2019-08-07T18:26:36Z"},"data":{"ClusterConfiguration":"apiServer:\n extraArgs:\n authorization-mode: Node,RBAC\n timeoutForControlPlane: 4m0s\napiVersion: kubeadm.k8s.io/v1beta2\ncertificatesDir: /etc/kubernetes/pki\nclusterName: kubernetes\ncontrolPlaneEndpoint: 192.168.22.39:6443\ncontrollerManager: {}\ndns:\n type: CoreDNS\netcd:\n local:\n dataDir: /var/lib/etcd\nimageRepository: k8s.gcr.io\nkind: ClusterConfiguration\nkubernetesVersion: v1.15.2\nnetworking:\n dnsDomain: cluster.local\n podSubnet: 172.16.0.0/16\n serviceSubnet: 10.96.0.0/12\nscheduler: {}\n","ClusterStatus":"apiEndpoints:\n master-k8-1:\n advertiseAddress: 192.168.22.46\n bindPort: 6443\napiVersion: kubeadm.k8s.io/v1beta2\nkind: ClusterStatus\n"}}
I0807 14:35:34.232536 3172 etcd.go:106] etcd endpoints read from pods: https://192.168.22.46:2379
I0807 14:35:34.242778 3172 etcd.go:147] etcd endpoints read from etcd: https://192.168.22.46:2379
I0807 14:35:34.242796 3172 etcd.go:124] update etcd endpoints: https://192.168.22.46:2379
I0807 14:35:34.258382 3172 kubelet.go:105] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I0807 14:35:34.300918 3172 loader.go:359] Config loaded from file: /etc/kubernetes/bootstrap-kubelet.conf
I0807 14:35:34.301284 3172 kubelet.go:131] [kubelet-start] Stopping the kubelet
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
I0807 14:35:34.308330 3172 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, /" -H "User-Agent: kubeadm/v1.15.1 (linux/amd64) kubernetes/4485c6f" -H "Authorization: Bearer d23i5c.w0vqeq73kv32bi3u" 'https://192.168.22.39:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config-1.15'
I0807 14:35:34.314138 3172 round_trippers.go:438] GET https://192.168.22.39:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config-1.15 200 OK in 5 milliseconds
I0807 14:35:34.314149 3172 round_trippers.go:444] Response Headers:
I0807 14:35:34.314154 3172 round_trippers.go:447] Date: Wed, 07 Aug 2019 18:35:33 GMT
I0807 14:35:34.314158 3172 round_trippers.go:447] Content-Type: application/json
I0807 14:35:34.314161 3172 round_trippers.go:447] Content-Length: 2133
I0807 14:35:34.314184 3172 request.go:947] Response Body: {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"kubelet-config-1.15","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/configmaps/kubelet-config-1.15","uid":"3b2c49ab-f910-4029-a476-7868f978cee3","resourceVersion":"158","creationTimestamp":"2019-08-07T18:26:36Z"},"data":{"kubelet":"address: 0.0.0.0\napiVersion: kubelet.config.k8s.io/v1beta1\nauthentication:\n anonymous:\n enabled: false\n webhook:\n cacheTTL: 2m0s\n enabled: true\n x509:\n clientCAFile: /etc/kubernetes/pki/ca.crt\nauthorization:\n mode: Webhook\n webhook:\n cacheAuthorizedTTL: 5m0s\n cacheUnauthorizedTTL: 30s\ncgroupDriver: cgroupfs\ncgroupsPerQOS: true\nclusterDNS:\n- 10.96.0.10\nclusterDomain: cluster.local\nconfigMapAndSecretChangeDetectionStrategy: Watch\ncontainerLogMaxFiles: 5\ncontainerLogMaxSize: 10Mi\ncontentType: application/vnd.kubernetes.protobuf\ncpuCFSQuota: true\ncpuCFSQuotaPeriod: 100ms\ncpuManagerPolicy: none\ncpuManagerReconcilePeriod: 10s\nenableControllerAttachDetach: true\nenableDebuggingHandlers: true\nenforceNodeAllocatable:\n- pods\neventBurst: 10\neventRecordQPS: 5\nevictionHard:\n imagefs.available: 15%\n memory.available: 100Mi\n nodefs.available: 10%\n nodefs.inodesFree: 5%\nevictionPressureTransitionPeriod: 5m0s\nfailSwapOn: true\nfileCheckFrequency: 20s\nhairpinMode: promiscuous-bridge\nhealthzBindAddress: 127.0.0.1\nhealthzPort: 10248\nhttpCheckFrequency: 20s\nimageGCHighThresholdPercent: 85\nimageGCLowThresholdPercent: 80\nimageMinimumGCAge: 2m0s\niptablesDropBit: 15\niptablesMasqueradeBit: 14\nkind: KubeletConfiguration\nkubeAPIBurst: 10\nkubeAPIQPS: 5\nmakeIPTablesUtilChains: true\nmaxOpenFiles: 1000000\nmaxPods: 110\nnodeLeaseDurationSeconds: 40\nnodeStatusReportFrequency: 1m0s\nnodeStatusUpdateFrequency: 10s\noomScoreAdj: -999\npodPidsLimit: -1\nport: 10250\nregistryBurst: 10\nregistryPullQPS: 5\nresolvConf: /etc/resolv.conf\nrotateCertificates: true\nruntimeRequestTimeout: 2m0s\nserializeImagePulls: true\nstaticPodPath: /etc/kubernetes/manifests\nstreamingConnectionIdleTimeout: 4h0m0s\nsyncFrequency: 1m0s\nvolumeStatsAggPeriod: 1m0s\n"}}
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
I0807 14:35:34.375262 3172 kubelet.go:148] [kubelet-start] Starting the kubelet
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I0807 14:35:35.464439 3172 loader.go:359] Config loaded from file: /etc/kubernetes/kubelet.conf
I0807 14:35:35.477557 3172 loader.go:359] Config loaded from file: /etc/kubernetes/kubelet.conf
I0807 14:35:35.479731 3172 kubelet.go:166] [kubelet-start] preserving the crisocket information for the node
I0807 14:35:35.479746 3172 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master-k8-2" as an annotation
I0807 14:35:35.979986 3172 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, /" -H "User-Agent: kubeadm/v1.15.1 (linux/amd64) kubernetes/4485c6f" 'https://192.168.22.39:6443/api/v1/nodes/master-k8-2'
I0807 14:35:35.986900 3172 round_trippers.go:438] GET https://192.168.22.39:6443/api/v1/nodes/master-k8-2 404 Not Found in 6 milliseconds
I0807 14:35:35.986913 3172 round_trippers.go:444] Response Headers:
I0807 14:35:35.986918 3172 round_trippers.go:447] Content-Type: application/json
I0807 14:35:35.986923 3172 round_trippers.go:447] Content-Length: 192
I0807 14:35:35.986927 3172 round_trippers.go:447] Date: Wed, 07 Aug 2019 18:35:35 GMT
I0807 14:35:35.986949 3172 request.go:947] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes "master-k8-2" not found","reason":"NotFound","details":{"name":"master-k8-2","kind":"nodes"},"code":404}
I0807 14:35:36.479972 3172 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, /" -H "User-Agent: kubeadm/v1.15.1 (linux/amd64) kubernetes/4485c6f" 'https://192.168.22.39:6443/api/v1/nodes/master-k8-2'
I0807 14:35:36.481593 3172 round_trippers.go:438] GET https://192.168.22.39:6443/api/v1/nodes/master-k8-2 200 OK in 1 milliseconds
I0807 14:35:36.481605 3172 round_trippers.go:444] Response Headers:
I0807 14:35:36.481610 3172 round_trippers.go:447] Content-Type: application/json
I0807 14:35:36.481614 3172 round_trippers.go:447] Content-Length: 3537
I0807 14:35:36.481619 3172 round_trippers.go:447] Date: Wed, 07 Aug 2019 18:35:35 GMT
I0807 14:35:36.481653 3172 request.go:947] Response Body: {"kind":"Node","apiVersion":"v1","metadata":{"name":"master-k8-2","selfLink":"/api/v1/nodes/master-k8-2","uid":"61f94c38-e938-47f1-a3e1-d4ddc49e5b44","resourceVersion":"1115","creationTimestamp":"2019-08-07T18:35:35Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"master-k8-2","kubernetes.io/os":"linux"},"annotations":{"node.alpha.kubernetes.io/ttl":"0","volumes.kubernetes.io/controller-managed-attach-detach":"true"}},"spec":{"podCIDR":"172.16.1.0/24","taints":[{"key":"node.kubernetes.io/not-ready","effect":"NoSchedule"}]},"status":{"capacity":{"cpu":"2","ephemeral-storage":"41931756Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"3880388Ki","pods":"110"},"allocatable":{"cpu":"2","ephemeral-storage":"38644306266","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"3777988Ki","pods":"110"},"conditions":[{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"PIDPressure","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletHasSufficientPID","message":"kubelet has sufficient PID available"},{"type":"Ready","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletNotReady","message":"runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"}],"addresses":[{"type":"InternalIP","address":"192.168.22.23"},{"type":"Hostname","address":"master-k8-2"}],"daemonEndpoints":{"kubeletEndpoint":{"Port":10250}},"nodeInfo":{"machineID":"cfde29087fa24fefb3a2e456bc775c0b","systemUUID":"41CB3365-4DDD-4C4F-9560-6B4004577237","bootID":"191e7598-eb43-4d6b-9c84-87d2f6e45bf8","kernelVersion":"3.10.0-957.27.2.el7.x86_64","osImage":"OpenShift 
Enterprise","containerRuntimeVersion":"docker://18.9.8","kubeletVersion":"v1.15.1","kubeProxyVersion":"v1.15.1","operatingSystem":"linux","architecture":"amd64"},"images":[{"names":["k8s.gcr.io/etcd@sha256:17da501f5d2a675be46040422a27b7cc21b8a43895ac998b171db1c346f361f7","k8s.gcr.io/etcd:3.3.10"],"sizeBytes":258116302},{"names":["k8s.gcr.io/kube-apiserver@sha256:5fae387bacf1def6c3915b4a3035cf8c8a4d06158b2e676721776d3d4afc05a2","k8s.gcr.io/kube-apiserver:v1.15.2"],"sizeBytes":206823358},{"names":["k8s.gcr.io/kube-controller-manager@sha256:7d3fc48cf83aa0a7b8f129fa4255bb5530908e1a5b194be269ea8329b48e9598","k8s.gcr.io/kube-controller-manager:v1.15.2"],"sizeBytes":158718526},{"names":["k8s.gcr.io/kube-proxy@sha256:626f983f25f8b7799ca7ab001fd0985a72c2643c0acb877d2888c0aa4fcbdf56","k8s.gcr.io/kube-proxy:v1.15.2"],"sizeBytes":82408284},{"names":["k8s.gcr.io/kube-scheduler@sha256:8fd3c3251f07234a234469e201900e4274726f1fe0d5dc6fb7da911f1c851a1a","k8s.gcr.io/kube-scheduler:v1.15.2"],"sizeBytes":81107582},{"names":["k8s.gcr.io/coredns@sha256:02382353821b12c21b062c59184e227e001079bb13ebd01f9d3270ba0fcbf1e4","k8s.gcr.io/coredns:1.3.1"],"sizeBytes":40303560},{"names":["k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea","k8s.gcr.io/pause:3.1"],"sizeBytes":742472}]}}
I0807 14:35:36.484126 3172 request.go:947] Request Body: {"metadata":{"annotations":{"kubeadm.alpha.kubernetes.io/cri-socket":"/var/run/dockershim.sock"}}}
I0807 14:35:36.484175 3172 round_trippers.go:419] curl -k -v -XPATCH -H "Accept: application/json, /" -H "Content-Type: application/strategic-merge-patch+json" -H "User-Agent: kubeadm/v1.15.1 (linux/amd64) kubernetes/4485c6f" 'https://192.168.22.39:6443/api/v1/nodes/master-k8-2'
I0807 14:35:36.487913 3172 round_trippers.go:438] PATCH https://192.168.22.39:6443/api/v1/nodes/master-k8-2 200 OK in 3 milliseconds
I0807 14:35:36.487924 3172 round_trippers.go:444] Response Headers:
I0807 14:35:36.487929 3172 round_trippers.go:447] Content-Type: application/json
I0807 14:35:36.487934 3172 round_trippers.go:447] Content-Length: 3605
I0807 14:35:36.487938 3172 round_trippers.go:447] Date: Wed, 07 Aug 2019 18:35:35 GMT
I0807 14:35:36.487965 3172 request.go:947] Response Body: {"kind":"Node","apiVersion":"v1","metadata":{"name":"master-k8-2","selfLink":"/api/v1/nodes/master-k8-2","uid":"61f94c38-e938-47f1-a3e1-d4ddc49e5b44","resourceVersion":"1117","creationTimestamp":"2019-08-07T18:35:35Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"master-k8-2","kubernetes.io/os":"linux"},"annotations":{"kubeadm.alpha.kubernetes.io/cri-socket":"/var/run/dockershim.sock","node.alpha.kubernetes.io/ttl":"0","volumes.kubernetes.io/controller-managed-attach-detach":"true"}},"spec":{"podCIDR":"172.16.1.0/24","taints":[{"key":"node.kubernetes.io/not-ready","effect":"NoSchedule"}]},"status":{"capacity":{"cpu":"2","ephemeral-storage":"41931756Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"3880388Ki","pods":"110"},"allocatable":{"cpu":"2","ephemeral-storage":"38644306266","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"3777988Ki","pods":"110"},"conditions":[{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"PIDPressure","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletHasSufficientPID","message":"kubelet has sufficient PID available"},{"type":"Ready","status":"False","lastHeartbeatTime":"2019-08-07T18:35:36Z","lastTransitionTime":"2019-08-07T18:35:36Z","reason":"KubeletNotReady","message":"runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"}],"addresses":[{"type":"InternalIP","address":"192.168.22.23"},{"type":"Hostname","address":"master-k8-2"}],"daemonEndpoints":{"kubeletEndpoint":{"Port":10250}},"nodeInfo":{"machineID":"cfde29087fa24fefb3a2e456bc775c0b","systemUUID":"41CB3365-4DDD-4C4F-9560-6B4004577237","bootID":"191e7598-eb43-4d6b-9c84-87d2f6e45bf8","kernelVersion":"3.10.0-957.27.2.el7.x86_64","osImage":"OpenShift 
Enterprise","containerRuntimeVersion":"docker://18.9.8","kubeletVersion":"v1.15.1","kubeProxyVersion":"v1.15.1","operatingSystem":"linux","architecture":"amd64"},"images":[{"names":["k8s.gcr.io/etcd@sha256:17da501f5d2a675be46040422a27b7cc21b8a43895ac998b171db1c346f361f7","k8s.gcr.io/etcd:3.3.10"],"sizeBytes":258116302},{"names":["k8s.gcr.io/kube-apiserver@sha256:5fae387bacf1def6c3915b4a3035cf8c8a4d06158b2e676721776d3d4afc05a2","k8s.gcr.io/kube-apiserver:v1.15.2"],"sizeBytes":206823358},{"names":["k8s.gcr.io/kube-controller-manager@sha256:7d3fc48cf83aa0a7b8f129fa4255bb5530908e1a5b194be269ea8329b48e9598","k8s.gcr.io/kube-controller-manager:v1.15.2"],"sizeBytes":158718526},{"names":["k8s.gcr.io/kube-proxy@sha256:626f983f25f8b7799ca7ab001fd0985a72c2643c0acb877d2888c0aa4fcbdf56","k8s.gcr.io/kube-proxy:v1.15.2"],"sizeBytes":82408284},{"names":["k8s.gcr.io/kube-scheduler@sha256:8fd3c3251f07234a234469e201900e4274726f1fe0d5dc6fb7da911f1c851a1a","k8s.gcr.io/kube-scheduler:v1.15.2"],"sizeBytes":81107582},{"names":["k8s.gcr.io/coredns@sha256:02382353821b12c21b062c59184e227e001079bb13ebd01f9d3270ba0fcbf1e4","k8s.gcr.io/coredns:1.3.1"],"sizeBytes":40303560},{"names":["k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea","k8s.gcr.io/pause:3.1"],"sizeBytes":742472}]}}
I0807 14:35:36.488122 3172 local.go:118] creating etcd client that connects to etcd pods
I0807 14:35:36.488181 3172 round_trippers.go:419] curl -k -v -XGET -H "Accept: application/json, /" -H "User-Agent: kubeadm/v1.15.1 (linux/amd64) kubernetes/4485c6f" 'https://192.168.22.39:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config'
I0807 14:35:36.489662 3172 round_trippers.go:438] GET https://192.168.22.39:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config 200 OK in 1 milliseconds
I0807 14:35:36.489681 3172 round_trippers.go:444] Response Headers:
I0807 14:35:36.489686 3172 round_trippers.go:447] Content-Type: application/json
I0807 14:35:36.489691 3172 round_trippers.go:447] Content-Length: 1008
I0807 14:35:36.489695 3172 round_trippers.go:447] Date: Wed, 07 Aug 2019 18:35:35 GMT
I0807 14:35:36.489718 3172 request.go:947] Response Body: {"kind":"ConfigMap","apiVersion":"v1","metadata":{"name":"kubeadm-config","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/configmaps/kubeadm-config","uid":"fc7c7426-d05f-4516-bcf5-6623d2823647","resourceVersion":"155","creationTimestamp":"2019-08-07T18:26:36Z"},"data":{"ClusterConfiguration":"apiServer:\n extraArgs:\n authorization-mode: Node,RBAC\n timeoutForControlPlane: 4m0s\napiVersion: kubeadm.k8s.io/v1beta2\ncertificatesDir: /etc/kubernetes/pki\nclusterName: kubernetes\ncontrolPlaneEndpoint: 192.168.22.39:6443\ncontrollerManager: {}\ndns:\n type: CoreDNS\netcd:\n local:\n dataDir: /var/lib/etcd\nimageRepository: k8s.gcr.io\nkind: ClusterConfiguration\nkubernetesVersion: v1.15.2\nnetworking:\n dnsDomain: cluster.local\n podSubnet: 172.16.0.0/16\n serviceSubnet: 10.96.0.0/12\nscheduler: {}\n","ClusterStatus":"apiEndpoints:\n master-k8-1:\n advertiseAddress: 192.168.22.46\n bindPort: 6443\napiVersion: kubeadm.k8s.io/v1beta2\nkind: ClusterStatus\n"}}
I0807 14:35:36.489930 3172 etcd.go:106] etcd endpoints read from pods: https://192.168.22.46:2379
I0807 14:35:36.499874 3172 etcd.go:147] etcd endpoints read from etcd: https://192.168.22.46:2379
I0807 14:35:36.499891 3172 etcd.go:124] update etcd endpoints: https://192.168.22.46:2379
I0807 14:35:36.499898 3172 local.go:127] Adding etcd member: https://192.168.22.23:2380
[etcd] Announced new etcd member joining to the existing etcd cluster
I0807 14:35:36.511748 3172 local.go:133] Updated etcd member list: [{master-k8-2 https://192.168.22.23:2380} {master-k8-1 https://192.168.22.46:2380}]
I0807 14:35:36.511774 3172 local.go:135] Creating local etcd static pod manifest file
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
I0807 14:35:36.512488 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 1/8
I0807 14:35:41.531383 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:35:41.531403 3172 etcd.go:348] [etcd] Waiting 5s until next retry
I0807 14:35:46.532365 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 2/8
I0807 14:35:51.560031 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:35:51.560050 3172 etcd.go:348] [etcd] Waiting 5s until next retry
I0807 14:35:56.561360 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 3/8
I0807 14:36:01.594799 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:36:01.594818 3172 etcd.go:348] [etcd] Waiting 5s until next retry
I0807 14:36:06.602359 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 4/8
I0807 14:36:11.625726 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:36:11.625743 3172 etcd.go:348] [etcd] Waiting 5s until next retry
[kubelet-check] Initial timeout of 40s passed.
I0807 14:36:16.625888 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 5/8
I0807 14:36:21.642588 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:36:21.642605 3172 etcd.go:348] [etcd] Waiting 5s until next retry
I0807 14:36:26.642757 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 6/8
I0807 14:36:31.661321 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:36:31.661339 3172 etcd.go:348] [etcd] Waiting 5s until next retry
I0807 14:36:36.661436 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 7/8
I0807 14:36:41.678773 3172 etcd.go:356] [etcd] Attempt timed out
I0807 14:36:41.678792 3172 etcd.go:348] [etcd] Waiting 5s until next retry
I0807 14:36:46.679130 3172 etcd.go:351] [etcd] attempting to see if all cluster endpoints ([https://192.168.22.46:2379 https://192.168.22.23:2379]) are available 8/8
I0807 14:36:51.696464 3172 etcd.go:356] [etcd] Attempt timed out
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available

@neolit123 (Member) commented:

looks like the second etcd member needs more time to join.

please track this issue. it was logged today, but we don't have a solution yet:
kubernetes/kubeadm#1712
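
(For anyone hitting the same timeout: the usual place to look for why the new member never becomes available is the etcd container on the joining node; a sketch, with container-name filters assumed rather than taken from this thread:)

# on the second control-plane node, after the join times out
docker ps -a --filter name=k8s_etcd                                  # running, crash-looping, or never created?
docker logs --tail 50 $(docker ps -aq --filter name=k8s_etcd | head -n 1)
journalctl -u kubelet --since "15 min ago" --no-pager | grep -i etcd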

/close

@k8s-ci-robot (Contributor) commented:

@neolit123: Closing this issue.

In response to this:

looks like the second etcd member needs more time to join.

please track this issue. it was logged today, but we don't have a solution yet:
kubernetes/kubeadm#1712

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jiri-kral86 commented Jul 2, 2020

I know this is an old issue, but I had this problem for a long time and wasn't able to solve it. Then I noticed that the time was slightly different on one of my master servers, so I installed an NTP service on the node that was missing it, and after the time was synchronized on all master nodes the problem was finally solved! Nowhere (system logs, --v=5, etc.) was there any message or notice about a problem with time; only connection errors showed up in the messages log.
So try installing an NTP service, starting it, and synchronizing the clocks. Hopefully this will help someone and save several hours or days spent on an issue like this. It is logical that synchronizing masters requires synchronized time, but I was still looking for the problem elsewhere. :-)
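
(For reference, a minimal sketch of that fix on CentOS 7 nodes like the ones in this thread, using chrony; package and service names are assumptions, not taken from the thread:)

# run on every control-plane node as root
yum install -y chrony
systemctl enable --now chronyd
chronyc sources -v     # confirm the node is actually syncing against a time source
timedatectl            # should report the system clock as NTP-synchronized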
/Jiri

@neolit123 (Member) commented:
@jiri-kral86

sorry you had to debug this problem for so long. there is no way for kubeadm to detect this, because kubeadm does not e.g. SSH to the node beforehand to check whether the clock is well synced before joining. clock skew can result in issues related to certificates.

it's the responsibility of the admin to make sure the nodes are in a good state before the cluster is created, but i can see this tripping people up.

if you think this is worth a mention it can be added to this list:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#before-you-begin

something like:

  • The machines should have accurate time. NTP can be used to synchronize it.
