BPF Host Routing: do not skip fib lookups #28264
This patch removes the |
/test
Force-pushed from 0dc1e30 to 5716ea5.
/test
config change seems fine to me. No idea whether ENABLE_SKIP_FIB is great or not, but I'm all for reducing the number of code paths we have. Unless we have a strong indication that it's hurtful I'd say go for it!
Ripping out Is there no way to only exclude |
The config can be described as follows: So, does this work if we only undef |
For this specific case I can do a check like this (and only apply to lxc, not host so that nodeport is not affected):
Then if we have more cases like this, we can add an option or extend the check. WDYT @julianwiedmann
Force-pushed from 5716ea5 to 22a8fe9.
I've posted a new version which only disables this on EKS. I tested that this actually works.
/test
pkg/datapath/linux/config/config.go
@@ -1073,6 +1074,11 @@ func (h *HeaderfileWriter) WriteEndpointConfig(w io.Writer, e datapath.EndpointC
 	return h.writeTemplateConfig(fw, e)
 }

+func enforceFibLookup(e datapath.EndpointConfiguration) bool {
Could you put this on the type of option.Config instead? Then we wouldn't add another dependency on the global variable.
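A minimal sketch of what this suggestion could look like, not the code from the PR. `DaemonConfig` here is a hypothetical stand-in for the type of option.Config, and the field and method names are assumptions for illustration. The rule itself (enforce the FIB lookup when endpoint routes are enabled; otherwise allow the shortcut only with a direct routing device and a single device) is taken from the commit message.

```go
package main

import "fmt"

// DaemonConfig is a hypothetical stand-in for the type of option.Config.
// The field names are assumptions made for this sketch.
type DaemonConfig struct {
	EnableEndpointRoutes bool
	Devices              []string
	DirectRoutingDevice  string
}

// EnforceFIBLookup reports whether the datapath should always perform a FIB
// lookup. Assumed rule (from the commit message): enforce it when endpoint
// routes are enabled.
func (c *DaemonConfig) EnforceFIBLookup() bool {
	return c.EnableEndpointRoutes
}

// SkipFIB reports whether the ENABLE_SKIP_FIB shortcut may be used: a direct
// routing device is set, exactly one device is configured, and the FIB lookup
// is not enforced.
func (c *DaemonConfig) SkipFIB() bool {
	return c.DirectRoutingDevice != "" &&
		len(c.Devices) == 1 &&
		!c.EnforceFIBLookup()
}

func main() {
	cfg := &DaemonConfig{
		EnableEndpointRoutes: true,
		Devices:              []string{"eth0"},
		DirectRoutingDevice:  "eth0",
	}
	fmt.Println("skip FIB:", cfg.SkipFIB())
}
```

Because the check is a method on the config value, callers and tests can pass their own config instead of reading the global variable.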
Would this situation be prevented if we had proper device detection at runtime, so that Cilium notices when additional devices get added automatically on EKS and turns this optimization off automatically when the number of devices grows from 1 to 2+? Cc @joamaki
Yep, I'd expect with runtime device detection enabled we wouldn't need this workaround (as then we'd reload if we get more devices at runtime). @bimmlerd and I are working on enabling it by default for v1.15. I would still go ahead with this fix (and backport it) and then remove it once it's no longer needed.
@aspsk this doesn't look like a backport to v1.14. Did you mean to add the |
Can this be merged?
Force-pushed from 24d8ed8 to dd1ae0f.
/test
When Cilium is running in BPF Host Routing mode, DirectRoutingDevice is set, and Cilium is configured to use only a single device, we skip the FIB lookup for packets leaving the pod and forward them directly to the DirectRoutingDevice network interface.

This doesn't work with native routing when other, non-Cilium-managed interfaces are present on a node. In that case we never send traffic to such interfaces, even for a reply. For example, when the DirectRoutingDevice interface is eth0 and another eth1 interface is present, then the following happens:

One known case where this happens in practice is EKS. A node can have multiple interfaces, e.g., eth0 and eth1, and packets from some pods (to the VPC private range) are routed through eth1 instead of eth0. For example,

    # ip rule
    20: from all to 192.168.91.2 lookup main
    100: from all lookup local
    111: from 192.168.86.135 to 192.168.0.0/16 lookup 10
    111: from 192.168.80.149 to 192.168.0.0/16 lookup 11
    32766: from all lookup main
    32767: from all lookup default

In this [simplified] output, packets from pod 192.168.86.135 to 192.168.0.0/16 (the VPC private network) should be routed through eth0 (table 10 contains a default route through eth0), and similar packets from pod 192.168.80.149 should be routed through eth1. This breaks with BPF host routing because, as described above, all packets go to eth0.

To fix this we want to disable the ENABLE_SKIP_FIB macro in the BPF code. This macro is a big optimization, so in this patch we only disable it for configs which are expected to be broken by the change in v1.14.0-snapshot.2 (which allowed endpoint routes and host routing to be enabled simultaneously). So, we enforce the FIB lookup when endpoint routes are enabled.

In general, for such setups users should be using `devices=eth+` and `enableRuntimeDeviceDetection=true`. In the future we should make this behaviour the default and, once that is done, revert this patch.

Fixes: cilium#27343
Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Force-pushed from dd1ae0f to 3dcffe8.
/test
See the commit message.
Fixes: #27343