Pod calico-node Hit error connecting to datastore - i/o timeout #3092
Comments
It seems like the current problem is that the node cannot reach the API server. Do you have any idea why that would be the case?
Hi Erik, I'm also going through the same issue. The difference is that I have done the setup on bare metal, and the calico-node pods keep crashing on every node. I have already modified the Calico manifest with the auto-detection method and the pod CIDR, but nothing worked.
@Goeldeepesh Can you provide some logs? Have you tried what I suggested? Can you access the apiserver from a node?
So, I have set up Kubernetes HA with an external etcd cluster and an HAProxy in front of the masters. After searching for the Calico issue I came to know that we need to define the etcd endpoints in Calico, so I did that and applied the etcd-backed Calico manifest. After this my calico-node pods came up, but calico-kube-controllers and CoreDNS are not working.

### calico-kube-controllers logs

I0117 18:12:19.957435 1 trace.go:116] Trace[1357217039]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/client-go@v0.0.0-20191114101535-6c5935290e33/tools/cache/reflector.go:96 (started: 2020-01-17 18:11:49.94948954 +0000 UTC m=+1520.210744032) (total time: 30.007913221s)

### kubectl describe pod coredns-6955765f44-5472t -n kube-system

Name: coredns-6955765f44-5472t
Warning FailedCreatePodSandBox 45m (x40 over 66m) kubelet, n4tenl-depa0598 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "a612a10cc2e91572690e05ad7e69e6d63aa95c79558631d9fcc373212572f6e5" network for pod "coredns-6955765f44-5472t": networkPlugin cni failed to set up pod "coredns-6955765f44-5472t_kube-system" network: Get https://[10.96.0.1]:443/api/v1/namespaces/kube-system: dial tcp 10.96.0.1:443: i/o timeout

### kubectl logs kube-proxy-8j5tv -n kube-system

E0117 18:16:23.343854 1 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Endpoints: Get https://dockers.airtel.com:6443/api/v1/endpoints?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

### kubeadm config

apiVersion: kubeadm.k8s.io/v1beta2
networking:

I do not see anything running on port 443, only 6443. I am unable to understand where this 10.96.0.1 comes from; I had provided pod-network-cidr=192.168.0.0/16. Please, Erik, I really need to sort out this issue, as I have been struggling with this setup for two weeks now and am still on the same page.
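A note for readers wondering where 10.96.0.1 comes from: it is not derived from the pod CIDR at all, but from the service CIDR (`--service-cidr`, kubeadm default `10.96.0.0/12`); the apiserver's built-in `kubernetes` Service always takes the first host address of that range. A minimal sketch, assuming the kubeadm default:

```shell
# The kubernetes Service ClusterIP is the first host address of the
# service CIDR (kubeadm default 10.96.0.0/12), i.e. network address + 1.
service_cidr="10.96.0.0/12"
network="${service_cidr%/*}"                        # -> 10.96.0.0
first_ip="${network%.*}.$(( ${network##*.} + 1 ))"  # last octet + 1
echo "$first_ip"                                    # prints 10.96.0.1
```

You can confirm the real value on a live cluster with `kubectl get svc kubernetes -o jsonpath='{.spec.clusterIP}'`.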
@Goeldeepesh Your issue seems to be quite different. From your kube-proxy logs, the problem looks to be there (an x509 certificate error). Please resolve the kube-proxy issue before you try to fix any problems with Calico.
@tmjd Hmm... Thanks anyway.
Yeah, this definitely looks like a kube-proxy issue. Calico relies on a functioning kube-proxy to access the Kubernetes API server. |
I know this issue has been closed, but I am facing the same problem. I am running a k8s cluster in AWS using kops.
I know this ticket is closed, but just to help people who find this post: I solved the problem when I changed my pod-network-cidr to 10.244.0.0/16.
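For anyone applying the same fix, here is a sketch of where that CIDR is set when using a kubeadm config file (v1beta2 schema; the values below are illustrative assumptions). The pod subnet must not overlap your host or service networks, and should match the `CALICO_IPV4POOL_CIDR` value in calico.yaml:

```yaml
# Hedged example kubeadm ClusterConfiguration -- values are illustrative.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
networking:
  podSubnet: 10.244.0.0/16      # same value as --pod-network-cidr
  serviceSubnet: 10.96.0.0/12   # default; 10.96.0.1 is the kubernetes Service IP
```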
Neither of the posted solutions is working for me with Packer/Vagrant and RHEL 7, using kubeadm to bootstrap a v1.18.0 cluster. Some network information:
I can curl that endpoint from the master node but not from the worker nodes, since they communicate via the LAN CIDR mentioned above. Any ideas?
As mentioned before, each calico-node and the CNI plugin need to be able to reach the Kubernetes API server, and kube-proxy is responsible for setting up the rules so that the kubernetes Service IP redirects correctly.

@mohammadasim Are there multiple Kubernetes masters, and is it possible that they are not all reachable or healthy? That might explain why the connection sometimes works and sometimes does not. You've identified that it is not just a Calico issue, so you'll need to dig into why connecting fails. I'd suggest looking at the kube-proxy logs and the Kubernetes API service endpoints.

@zimmertr I would guess that your API server is not configured to use the proper address because Vagrant creates multiple interfaces; you'll need to change or set the IP address that the API server listens on.
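To make those checks concrete, here is a sketch that can be run from an affected node (the kube-proxy label selector and the default Service IP are assumptions; adjust for your cluster):

```shell
# Default ClusterIP of the built-in kubernetes Service; confirm with:
#   kubectl get svc kubernetes -o jsonpath='{.spec.clusterIP}'
API_SVC_IP="10.96.0.1"

if command -v kubectl >/dev/null 2>&1; then
  # kube-proxy logs: look for x509 or connection errors like the ones above
  kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=50
  # the real apiserver endpoints that the Service IP should redirect to
  kubectl get endpoints kubernetes
fi

# raw reachability of the Service IP from this node
curl -k --connect-timeout 5 "https://${API_SVC_IP}:443/version" \
  || echo "cannot reach ${API_SVC_IP} from this node"
```

If the `curl` times out while the endpoints look healthy, the problem is usually kube-proxy's rules on that node rather than Calico itself.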
If I understand the documentation correctly,
This works on CentOS 7 but does not work on RHEL 7. Alternatively, if I use kubeadm configuration files with the same arguments, it doesn't work on either operating system.
Lastly, here is the Kustomize patch I'm applying to the
In my case, after reading a lot about it, my principal problem was the need for bridge-utils (https://wiki.debian.org/es/KVM).
In my case, it's
On all nodes?
Yes. @itsecforu
@oldthreefeng Doesn't work for me.
@itsecforu What is the value of
@oldthreefeng It is set to 0. Full output:
|
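If that sysctl really is 0, bridged pod traffic bypasses iptables and kube-proxy's rules never see it; the kubeadm install prerequisites require it to be 1. A sketch of the usual persistent fix (file path is an assumption; the keys only exist once the br_netfilter module is loaded):

```
# /etc/sysctl.d/k8s.conf  (load the module first: modprobe br_netfilter)
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
```

Apply with `sysctl --system`, then restart the calico-node pods on that node.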
Are there any error logs in your calico-node pods?
One of my master nodes:
The others are OK.
CoreDNS on the master node:
Can you elaborate more on what you did, please?
Reviewing my old notes... in the end, the problem was that there was no communication between the elements (the master and the nodes). During configuration of the VMs (in my case with KVM) I set up 3: one master and 2 nodes. Each VM needs swap disabled, the bridge module loaded, and IP forwarding enabled (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/).
And the "trick" came from here: https://wiki.debian.org/KVM. If you use libvirt to manage your VMs, libvirt provides a NATed bridged network named "default" that allows the host to communicate with the guests. Hope it helps @usersina
@josejuanmontiel Thanks for sharing! I haven't been able to get it to work this way, though, since I'm using microk8s, which does most of the networking behind the scenes. More on this if anyone is interested.
Expected Behavior
Pod calico-node-xxx up and running on each node
Current Behavior
The calico-node pods on the worker nodes are in the 'CrashLoopBackOff' state.
This is the output from one of the pods:
Steps to Reproduce (for bugs)
kubectl apply -f calico.yaml
on that new nodegroup, also using nodeAffinity.

This is my edited calico.yaml:
This is customization I perform on the manifest:
Context
I am trying to bypass the AWS EKS maximum-pods-per-node limitation by disabling the AWS VPC CNI and using the Calico CNI with its own IP pool.
Your Environment