layer2 mode doesn't receive broadcast packets on VM unless promiscuous mode is enabled #253
This is interesting. This is the first time I've heard of any setup that requires promiscuous mode to make MetalLB layer2 mode work. It specifically does not rely on anything other than broadcast ethernet packets, which all correct ethernet NICs and drivers should make visible to the OS. Based on your kernel version, it looks like you're running on Proxmox. My dev environment is on Proxmox, with a bunch of clusters that work fine without the need for promiscuous mode... So there is definitely some way to make it work, now we just need to figure out what's different between your Proxmox and mine :) What version of PVE are you running? Can you paste the output of ip addr?
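(As a quick check when comparing setups: whether an interface is currently in promiscuous mode can be read from ip itself; the interface name vmbr0 below is just an example.)
# "-d" prints details, including a "promiscuity" counter; 1 or more means promisc is active
$ ip -d link show vmbr0
# An administratively set promisc mode also shows up as PROMISC in the interface flags
$ ip link show vmbr0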
Hi, sorry for the delay... The kubelet is running directly on PVE via systemd, not a VM. Here is my PVE version from the web UI at https://192.168.1.40:8006/:
and ip addr output:
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
link/ether 6c:3b:e5:2d:32:a5 brd ff:ff:ff:ff:ff:ff
3: vmbr0: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 6c:3b:e5:2d:32:a5 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.40/24 brd 192.168.1.255 scope global vmbr0
valid_lft forever preferred_lft forever
inet6 fe80::6e3b:e5ff:fe2d:32a5/64 scope link
valid_lft forever preferred_lft forever
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:8d:52:28:db brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
5: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 0a:58:c0:a8:e0:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.224.1/24 scope global cni0
valid_lft forever preferred_lft forever
inet6 fe80::cc3e:6bff:fea5:e73c/64 scope link
valid_lft forever preferred_lft forever
6: vethc93787bb@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether a6:70:3c:83:35:95 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet6 fe80::a470:3cff:fe83:3595/64 scope link
valid_lft forever preferred_lft forever
7: veth43403364@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 2e:65:06:66:13:8a brd ff:ff:ff:ff:ff:ff link-netnsid 1
inet6 fe80::2c65:6ff:fe66:138a/64 scope link
valid_lft forever preferred_lft forever
8: veth8c19b389@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether e2:1c:5c:31:0b:5c brd ff:ff:ff:ff:ff:ff link-netnsid 2
inet6 fe80::e01c:5cff:fe31:b5c/64 scope link
valid_lft forever preferred_lft forever
9: vethed5ca8c3@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether ee:ca:71:db:62:84 brd ff:ff:ff:ff:ff:ff link-netnsid 3
inet6 fe80::ecca:71ff:fedb:6284/64 scope link
valid_lft forever preferred_lft forever
10: vethac465a18@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 86:5b:d7:32:a0:91 brd ff:ff:ff:ff:ff:ff link-netnsid 4
inet6 fe80::845b:d7ff:fe32:a091/64 scope link
valid_lft forever preferred_lft forever
12: vethfafa9b33@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 32:6e:8d:6c:8a:0c brd ff:ff:ff:ff:ff:ff link-netnsid 6
inet6 fe80::306e:8dff:fe6c:8a0c/64 scope link
valid_lft forever preferred_lft forever
14: veth2c501807@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 46:b7:81:b4:27:90 brd ff:ff:ff:ff:ff:ff link-netnsid 8
inet6 fe80::44b7:81ff:feb4:2790/64 scope link
valid_lft forever preferred_lft forever
15: veth56f5577c@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 8a:e6:18:d4:f1:3f brd ff:ff:ff:ff:ff:ff link-netnsid 9
inet6 fe80::88e6:18ff:fed4:f13f/64 scope link
valid_lft forever preferred_lft forever
17: vethb1aef139@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 46:cc:dc:ff:a0:fd brd ff:ff:ff:ff:ff:ff link-netnsid 11
inet6 fe80::44cc:dcff:feff:a0fd/64 scope link
valid_lft forever preferred_lft forever
18: vethc9584dbb@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether 2a:22:b0:4b:81:45 brd ff:ff:ff:ff:ff:ff link-netnsid 12
inet6 fe80::2822:b0ff:fe4b:8145/64 scope link
valid_lft forever preferred_lft forever
19: vethcf81ee97@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni0 state UP group default
link/ether a2:95:af:4d:53:15 brd ff:ff:ff:ff:ff:ff link-netnsid 13
inet6 fe80::a095:afff:fe4d:5315/64 scope link
valid_lft forever preferred_lft forever
22: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
link/ether 3e:df:93:54:5b:9d brd ff:ff:ff:ff:ff:ff
inet 192.168.224.0/32 scope global flannel.1
valid_lft forever preferred_lft forever
inet6 fe80::3cdf:93ff:fe54:5b9d/64 scope link
valid_lft forever preferred_lft forever
$ cat /etc/network/interfaces
auto lo
iface lo inet loopback
iface eno1 inet manual
auto vmbr0
iface vmbr0 inet static
address 192.168.1.40
netmask 255.255.255.0
gateway 192.168.1.1
bridge_ports eno1
bridge_stp off
bridge_fd 0
up /bin/ip link set vmbr0 promisc on
I'm using flannel for my CNI plugin, if that helps. Michael.
Thanks for the info. My hunch is that this is a weirdness of how Linux implements bridges: either it's filtering ARP requests based on L3 knowledge, or it's filtering inbound traffic that doesn't match local IPs (but that makes no sense, it would break IP forwarding)... Just to be paranoid: once the IP is working with promisc on, if you turn promisc off again, does it keep working for a few minutes, or does it stop working immediately? When it's broken, can you run a packet capture on the client side? I'll try to reproduce this locally, but my available time to work on MetalLB right now is very limited, so I can't promise quick progress :(
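(A sketch of the kind of client-side check meant here, assuming the service IP is 192.168.1.39 and the client's interface is eth0 — both assumptions:)
# From a client on the same LAN: send ARP who-has requests for the service IP and watch for replies
$ sudo arping -I eth0 -c 3 192.168.1.39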
So, when I turn off promisc, it does still work. I'll try again tomorrow from the office where I was able to get it to fail before. Will keep you updated.
As referenced in #284, I got this as well on my Raspberry Pi cluster running MetalLB, with all nodes using wlan0 to connect to the local LAN (both inter-node traffic and the rest of the network). All nodes within the cluster correctly resolved the IP, and I was able to see the nginx landing page from the layer 2 tutorial. But PCs on the local network were not able to do the same; no ARP requests were answered by MetalLB. After enabling promiscuous mode on wlan0, all computers on the LAN were able to see the nginx landing page correctly. Here's my log:
The odd thing is that MetalLB doesn't initially say it has started an ARP responder on wlan0. Interfaces on node01, which hosts the pod running the example nginx service:
I'm going to close this out as stale since it hasn't been updated in a couple of years.
I can confirm that this is still an issue on my Raspberry Pi 4 model B. |
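(For reference, the workaround discussed in this thread amounts to something like the following; wlan0 is the uplink in the Pi setups above, vmbr0 in the Proxmox ones:)
# Enable promiscuous mode on the node's uplink interface
$ sudo ip link set wlan0 promisc on
# To revert:
$ sudo ip link set wlan0 promisc off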
I am encountering this issue as well on k3s nodes running on Proxmox, so I don't think this is a Pi-specific issue (I'm not using a Pi). More recent reports that I think describe the same issue are available here: #284
The same issue happens for me. I am going to try setting the nodes' NICs to promiscuous mode to see if that changes anything. If this is a known issue, even if it's outside the scope of MetalLB, it's something that many people seem to have encountered, and it could be added to the documentation to help people who are stuck. I don't know the root cause of this problem, however, so I can't say "in XYZ circumstances, this is how to fix it."
@virgil9306 - we tripped over this locally as well. Here's the most succinct description of the problem I can come up with:
Reported workarounds:
I believe this was recently re-triggered by #976 in MetalLB 0.11.0 because, with that code merged, the requirement that the load-balanced IP be on the same host as a pod was relaxed. (Fun story: this one was a pain to debug because the cluster would magically start working whenever I ran tcpdump, since tcpdump puts the interface into promiscuous mode.) Hopefully this summary + workaround is helpful for whoever hits this issue next.
I just ran into this issue as well. Enabling promiscuous mode works. |
Is this a bug report or a feature request?: Feature request, for documentation.
What happened:
I couldn't connect to my LoadBalancer IP from outside the host, and the documentation provided no hints.
What you expected to happen:
Traffic would flow correctly, or some kind of hint would be present in the documentation.
How to reproduce it (as minimally and precisely as possible):
I installed the stable/metallb Helm chart with the following resulting configmap:
The node host's vmbr0 interface is 192.168.1.40/24. I have a second host using 192.168.1.41/24, which I expect to be able to take over the load balancer IPs.
When I do "nc -v -z 192.168.1.39 80" from another host, it just hangs. However, when I run 'tcpdump' on the node, the netcats and HTTP connections magically start working. This is because tcpdump puts the interface into promiscuous mode.
I solved this by putting the following into my Debian host's /etc/network/interfaces script, under the configuration for the interface in question:
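(The snippet itself seems to have been lost in formatting, but judging from the /etc/network/interfaces output earlier in this thread it is presumably the up hook that forces promiscuous mode:)
# added under the "iface vmbr0 inet static" stanza
up /bin/ip link set vmbr0 promisc on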
Environment:
uname -a: Linux pve2 4.13.16-1-pve #1 SMP PVE 4.13.16-43 (Fri, 16 Mar 2018 19:41:43 +0100) x86_64 GNU/Linux