Multimaster Setup - Master 1 corrupting when join command is issued on Master-2 #15637
Comments
hi, any error logs from the API server, kubelet, or kubeadm?
This is what is happening. My setup contains 1 LB & 2 master nodes with local etcd.

```
apiVersion: kubeadm.k8s.io/v1beta1
```

After kubeadm init on the first master, kubectl get nodes responds without any issues. When I issue the control-plane join on the second master, it gets stuck on the line below:

```
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
```

At the same time, if you reissue kubectl get nodes on master 1, you get the response below and master 1 gets corrupted:

```
[root@k8-1 ~]# kubectl get pods
```

Thanks,
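For context, a minimal stacked-etcd HA configuration for the v1beta1 API would look roughly like the sketch below; the controlPlaneEndpoint is taken from the join command later in the thread, and the podSubnet is purely an assumption, since the full config was never posted:

```bash
# A sketch only: controlPlaneEndpoint and podSubnet are assumptions,
# not values confirmed by this thread.
cat <<EOF > /etc/kubernetes/kubeadm/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.15.2
controlPlaneEndpoint: "192.168.22.49:6443"
networking:
  podSubnet: "10.244.0.0/16"
EOF
```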
etcd not coming up is unusual. random questions:

- are you always calling kubeadm reset before kubeadm init?

```
[root@k8-1 ~]# kubeadm init \
[init] Using Kubernetes version: v1.15.2

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube

You should now deploy a pod network to the cluster.

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.22.49:6443 --token sz4ems.eem2wvwqafumvsf2

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!

Then you can join any number of worker nodes by running the following on each as root:

  kubeadm join 192.168.22.49:6443 --token sz4ems.eem2wvwqafumvsf2
```
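For completeness, the clean-state sequence that question refers to would be roughly the following sketch; the iptables/CNI cleanup lines are common practice rather than anything stated in this thread:

```bash
# Wipe any previous kubeadm state on this node before re-running init.
kubeadm reset -f
# Optional extra cleanup often done alongside a reset; adjust to your CNI.
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
rm -rf /etc/cni/net.d
```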
please follow our official guide instead https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
what are the contents of your kubeadm config?
are you calling kubeadm reset before kubeadm join on the second master?
Kubeadm Config:

```
[centos@k8-2 ~]$ cat /etc/kubernetes/kubeadm/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
```

are you calling kubeadm reset before kubeadm join on the second master? : NO

what is the full output of kubeadm join?
```
[preflight] Running pre-flight checks
```
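Since the answer above was NO, a retry on the second master would normally start from a reset; a sketch, reusing the endpoint and token from the init output earlier (the CA-cert hash and certificate key were elided in the thread, so placeholders stand in for them):

```bash
# On the second master: clear the partial join state, then retry the
# control-plane join. Replace <hash> and <key> with your real values.
kubeadm reset -f
kubeadm join 192.168.22.49:6443 --token sz4ems.eem2wvwqafumvsf2 \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key>
```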
not sure what is going on.
if the first control-plane node has initialized properly you should be able to call:

the kubeconfig is in /etc/kubernetes/admin.conf on that first control-plane node. so alternatively:

also please share the output of
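The exact commands were lost from this comment, but checking the cluster from the first control-plane node with that kubeconfig would look something like:

```bash
# On the first control-plane node, using the kubeconfig kubeadm wrote.
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get nodes
kubectl get pods -n kube-system
```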
Please find the join output from the second CP node with --v=10:

```
[certs] Using certificateDir folder "/etc/kubernetes/pki"
```
looks like the second etcd member needs more time to join. please track this issue. it was logged today, but we don't have a solution yet:

/close
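For anyone hitting this, one way to see whether the second etcd member was ever added is to query the member list from the first master; a sketch, assuming the default stacked-etcd certificate paths that kubeadm mounts into the etcd pod:

```bash
# Run on the first master; the pod name etcd-k8-1 follows the
# etcd-<node-name> convention and may differ on other setups.
kubectl -n kube-system exec etcd-k8-1 -- sh -c \
  "ETCDCTL_API=3 etcdctl \
     --endpoints=https://127.0.0.1:2379 \
     --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --cert=/etc/kubernetes/pki/etcd/peer.crt \
     --key=/etc/kubernetes/pki/etcd/peer.key \
     member list"
```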
@neolit123: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I know this is an old issue, but I had this problem for a long time and wasn't able to solve it... Then I noticed that the time was slightly different on one of my master servers... so I installed the NTP service on the one that was missing it, and after synchronizing the time on all master nodes the problem was finally solved! Nowhere (system logs, --v=5, etc.) was there any message or notice about a problem with the time... only connection troubles appeared in the messages log.
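On CentOS-family nodes like the ones in this thread, fixing this would look roughly like the following; chrony is shown as one common choice, not something the commenter specified:

```bash
# CentOS/RHEL: install chrony and confirm the clock is synchronized.
yum install -y chrony
systemctl enable --now chronyd
chronyc tracking                 # "System time" offset should be near zero
timedatectl | grep -i synchronized
```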
sorry you had to debug this problem for so long. there is no way for kubeadm to detect this, because kubeadm does not e.g. SSH to the node beforehand to check that the clock is well synced and only then "join". this can result in issues related to certificates. it's the responsibility of the admin to make sure the nodes are in a good state before the cluster is created, but i can see this tripping people up. if you think this is worth a mention it can be added to this list:

something like:
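The suggested wording itself was lost from this scrape. Separately, a hypothetical admin-side check of clock skew before joining could be as simple as:

```bash
# Hypothetical pre-join check: compare UTC epoch seconds across nodes;
# assumes SSH access and the node names used in this thread.
for host in k8-1 k8-2; do
  printf '%s: ' "$host"
  ssh "$host" date -u +%s
done
```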
Hi,
I am trying to set up a multi-master cluster. After configuring Master-1, when I try to join Master-2, a timeout happens and Master-1 also gets corrupted.
```
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[kubelet-check] Initial timeout of 40s passed.
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available
```
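When a join dies at this phase, the etcd static-pod logs on the first master usually show why the new member could not be added; a sketch, assuming the node name k8-1 used elsewhere in this thread:

```bash
# On the first master: inspect the stacked etcd static pod.
kubectl -n kube-system get pods -l component=etcd
kubectl -n kube-system logs etcd-k8-1 --tail=50
# If the API server is unreachable, fall back to the container runtime
# (Docker was typical for v1.15 clusters):
docker ps -a | grep etcd
```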
Any help on this, please.
Thanks,
Sunish