vxlan interface always down, causing calico-node to fail to add routes for pods on other nodes #3271
Comments
@weizhoublue
After that, restart NetworkManager and check whether the vxlan.calico interface comes up.
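For reference, the commonly documented way to keep NetworkManager from fighting over Calico's interfaces is a keyfile drop-in that marks them unmanaged (the path `/etc/NetworkManager/conf.d/calico.conf` and the interface patterns below follow Calico's troubleshooting guidance; adjust for your distribution):

```ini
# /etc/NetworkManager/conf.d/calico.conf
# Tell NetworkManager to leave Calico-managed interfaces alone,
# then restart NetworkManager for this to take effect.
[keyfile]
unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico
```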
caseydavenport pushed a commit that referenced this issue on Jun 10, 2020
nightkr added a commit to Appva/kubespray that referenced this issue on Dec 15, 2020
See projectcalico/calico#3271 Otherwise Calico can get into a fight with NM about who "owns" the vxlan.calico interface, breaking all pod traffic. Cherry-pick of 2715041c1bbef92cbc2b796eee40c1fc4a51f523
nightkr added a commit to Appva/kubespray that referenced this issue on Dec 22, 2020
See projectcalico/calico#3271 Otherwise Calico can get into a fight with NM about who "owns" the vxlan.calico interface, breaking all pod traffic.
k8s-ci-robot pushed a commit to kubernetes-sigs/kubespray that referenced this issue on Dec 23, 2020
See projectcalico/calico#3271 Otherwise Calico can get into a fight with NM about who "owns" the vxlan.calico interface, breaking all pod traffic.
LuckySB pushed a commit to southbridgeio/kubespray that referenced this issue on Jan 17, 2021
…gs#7037) See projectcalico/calico#3271 Otherwise Calico can get into a fight with NM about who "owns" the vxlan.calico interface, breaking all pod traffic.
calico version: 3.11.2
k8s version: 1.17
I have a k8s cluster with 3 nodes, which uses the Calico CNI.
Two Calico IP pools are set up; both use VXLAN mode.
The strange thing is that the vxlan.calico interface on one node always stays down. I tried to bring it up, but after about 3 seconds the tunnel interface goes down again. That makes calico-node fail to add routes to the other two nodes.
After I deleted the calico-node DaemonSet, I could bring the vxlan.calico interface up. However, once I applied the calico-node DaemonSet again, the vxlan.calico interface on that node stayed down again, and I could not bring it up. So I think calico-node detected something wrong and keeps taking the vxlan.calico interface down.
However, the other two nodes work fine.
BTW, my other cluster with the same Calico configuration does not have this issue.
Finally, I found a workaround: delete the tunnel interface, which triggers calico-node to recreate it, and then everything works fine again.
This cluster may have had nodes scaled up or Calico reinstalled, so perhaps the tunnel interface was created earlier with VXLAN settings that are wrong for the current configuration. Deleting it triggers Calico to regenerate a correct one.
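The workaround above amounts to the following (run as root on the affected node; these are standard iproute2 commands, and the interface name assumes Calico's default VXLAN tunnel name):

```shell
# Delete the stale tunnel interface; calico-node (Felix) recreates it
# with the current VXLAN settings within a few seconds.
ip link delete vxlan.calico

# Verify the recreated interface comes up and stays up.
ip -d link show vxlan.calico
```

Note this briefly disrupts cross-node pod traffic through the tunnel until calico-node recreates it.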
The calico-node log from the bad node is below.
The route table of the bad node is below; it only has local pod routes and is missing the subnet routes for pods on the other nodes.
NetworkManager has no configuration for this tunnel.
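For anyone hitting the same symptom, a quick way to compare a bad node against a healthy one (read-only iproute2 commands, run on the node itself; the interface name assumes Calico's default):

```shell
# Show the tunnel state and its VXLAN parameters (VNI, dstport, local IP),
# so they can be compared against a healthy node.
ip -d link show vxlan.calico

# List routes going over the tunnel. On a healthy node, each remote node's
# pod subnet should appear here; on the bad node this list is empty.
ip route show | grep vxlan.calico
```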
Expected Behavior
The tunnel interface should stay up, and routes should be added for pods on other nodes.
Current Behavior
The tunnel interface stays down, and pods on the bad node cannot reach pods on other nodes.
Possible Solution
Steps to Reproduce (for bugs)
Context
Your Environment