Master to Pod communication is broken in kube-flannel #535
Comments
Things seem to be working after applying https://github.com/appscode/flannel/commit/b083788405ce2bf3c34b9d4df7b5d77afc865b4e
@tamalsaha Can you share some of the kubernetes commands you were using to repro this? Where are you pinging to and from? Is it from your master node to a pod on a different node?
@tomdee, I was pinging from the master to a pod running on a different node.
@tomdee I just ran a nginx pod, then tried to wget from the master host directly.
@tamalsaha, I tried your fix but the pods still can't communicate properly with each other:
@autostatic, can you explain your test case a bit more so that I can recreate it?
I tested this on a small bare metal Ubuntu 16.04 cluster on OpenStack. One master, two nodes, K8s 1.4.6. I used
@autostatic, I am not sure you are having the same issue as I was. The issue I was facing was that pods running on the master (with hostNetwork: true in my case) could not connect to pods on regular nodes using the Pod IP. From your log, it seems that the DNS pod running on a regular node can't connect to the kube apiserver (https://10.96.0.1:443). So if I were you, I would first confirm that the flannel network is actually working as intended. One way to check is to see whether you can ping the IP address of the flannel bridge on the master from the node running the DNS pod.

FYI, I also had to make some changes to cni-conf.json. You can see my changes here: https://github.com/appscode/kubernetes/commit/ee660dc997f7ae5042033f226b4416d4513b5422 . The important thing was ensuring Kubernetes was using the bridge created by flannel. Without that, pods will be disconnected from the flannel overlay network. It will be helpful to see the result of

If you are unfamiliar with the cni conf option, you will find these docs handy:
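For reference, a cni-conf.json along the lines this comment describes might look like the following. This is a sketch in the style of the standard kube-flannel ConfigMap entry, not the exact contents of the linked commit; the `delegate` block is passed to the bridge plugin that flannel invokes, which is how the bridge (and its hairpin behaviour) ends up under Kubernetes' control.

```json
{
  "name": "cbr0",
  "type": "flannel",
  "delegate": {
    "isDefaultGateway": true,
    "hairpinMode": true
  }
}
```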
Hello @tamalsaha, thanks for the feedback. I made the changes to the CNI config and then Flannel came up successfully, DNS started working and I could deploy a working Dashboard. I don't have a Flannel bridge on my master though; could that be related to the hairpin setting?
@autostatic I am glad that your cluster is working. The flannel bridge gets created the first time the CNI plugin is called. Since kubernetes does not run regular pods on the master, it also seems that you don't need my patch. I needed this patch because we run an HAProxy-based ingress controller on the master that load balances across pods on regular nodes, so I needed HAProxy on the master to be able to connect to pods on regular nodes.
Hi @tamalsaha, I did some more tests, including a couple of fresh deployments, and without your patch the cluster is not functional: I can't ping the other nodes from the master. If I do a deployment with a patched Flannel the cluster comes up properly.
Yes, if you want to ping regular nodes from the master, you need this patch.
#560 fixes my issues.
I have the same problem: the first kubernetes node gets the net address 10.244.0.0 from the network 10.244.0.0/16 assigned. Therefore this node is not reachable from other nodes. NodePort services that I want to reach via the first node are unreachable when the service itself runs on another node. I can see packets leaving from 10.244.0.0 to the other nodes, but no returning packets, because they are not routable.
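The collision described here can be shown with nothing but the Python standard library. The /16 cluster CIDR and /24 per-node subnets below match the kube-flannel defaults mentioned in this thread:

```python
# Demonstrate the subnet-zero collision: the first node's flannel
# address equals the network address of the whole cluster CIDR.
import ipaddress

cluster_cidr = ipaddress.ip_network("10.244.0.0/16")
first_node_subnet = ipaddress.ip_network("10.244.0.0/24")

# flannel.1 on the first node is given the first address of its /24,
# which is 10.244.0.0 -- indistinguishable from the cluster network
# address, so replies to it are not routable.
flannel_ip = first_node_subnet.network_address
print(flannel_ip)                                  # 10.244.0.0
print(flannel_ip == cluster_cidr.network_address)  # True
```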
@mattenklicker, which version are you using? https://github.com/coreos/flannel/releases/tag/v0.7.0 is supposed to fix this issue.
@tamalsaha v0.7.0 |
Update: Created a GitHub project to reproduce the environment: https://github.com/samarjit/vagrant-kubeadm . I am using v0.7.0 too, but I am having the same issue: master to slave node communication fails.
Pinging 10.244.0.2 -> 10.244.1.4 (master to slave) does not work. On the master node, if I query DNS it seems to work fine:
If I try tcpdump on the slave node, no packets are received. I followed the DNS testing steps described in https://kubernetes.io/docs/admin/dns/ and that works.
I am starting kubeadm using the following script.
kube-flannel.yml:
My issue was solved. It's a Vagrant-environment-specific issue; the same logic was applied in the startup command of flannel in kubernetes. Kube-Flannel yaml:
Note:
Shows DNS resolution.
The service is reachable.
@samarjit Thanks, I ran into the same issue; specifying `--iface=` in the flannel daemon command works for me too.
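For readers hitting this on Vagrant, the `--iface` fix discussed in these comments goes into the flannel container's startup command in kube-flannel.yml. The excerpt below is a sketch, not the full manifest, and `eth1` is an assumption: on typical Vagrant boxes eth0 is the NAT interface, so flannel has to be pinned to the host-only/private one.

```yaml
# Excerpt in the style of kube-flannel.yml; "eth1" is the assumed
# private interface on a Vagrant box -- adjust for your environment.
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.7.0-amd64
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=eth1" ]
```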
This issue is fixed for me with v0.7.0.
@samarjit I ran into the same issue in a Vagrant environment; specifying `--iface=` in the flannel daemon works for me. Thanks.
I had the same issue on Vagrant; solved now with `--iface`. Thanks much.
I am trying to set up a Kubernetes cluster in kube-flannel mode with the vxlan backend. Node to node communication is working, but master to pod communication is not. I am not a Linux networking expert, but I see that the master's flannel.1 is assigned the network address, which seems to be causing issues with ARP.
The problem seems to be that the master's flannel.1 is assigned the first IP of subnet zero, which is indistinguishable from the network address. Can you please confirm that this will break master to pod communication? I am thinking about using the next subnet of Node.Spec.PodCIDR in kubeSubnetManager. Will that fix this issue?
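The proposed workaround can be sketched in a few lines: when carving per-node subnets out of the cluster PodCIDR, skip subnet zero so no node's flannel.1 ever receives the cluster network address. The function name and structure here are illustrative only, not the actual kubeSubnetManager code:

```python
# Hedged sketch of the "skip subnet zero" idea using the stdlib.
import ipaddress

def node_subnets(pod_cidr: str, new_prefix: int = 24):
    """Yield per-node subnets of pod_cidr, starting from the second one."""
    subnets = ipaddress.ip_network(pod_cidr).subnets(new_prefix=new_prefix)
    next(subnets)   # discard 10.244.0.0/24 (subnet zero) so no flannel.1
                    # IP collides with the cluster network address
    yield from subnets

gen = node_subnets("10.244.0.0/16")
first = next(gen)
print(first)   # 10.244.1.0/24 -- first subnet actually handed out
```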
cc: @mikedanese