Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BPF Host Routing: do not skip fib lookups #28264

Merged

Conversation

aspsk
Copy link
Contributor

@aspsk aspsk commented Sep 25, 2023

See the commit message.

Fixes: #27343

Do not skip FIB lookup when running in BPF Host Routing when Endpoint Routes enabled

@aspsk aspsk requested review from a team as code owners September 25, 2023 15:47
@aspsk aspsk requested a review from lmb September 25, 2023 15:47
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 25, 2023
@aspsk
Copy link
Contributor Author

aspsk commented Sep 25, 2023

This patch removes the ENABLE_SKIP_FIB completely. Won't it be useful to have a flag to enable it for setups when only one interface is present? (cc: @julianwiedmann)

@aspsk aspsk added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Sep 25, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 25, 2023
@aspsk
Copy link
Contributor Author

aspsk commented Sep 25, 2023

/test

@aspsk
Copy link
Contributor Author

aspsk commented Sep 25, 2023

/test

Copy link
Contributor

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config change seems fine to me. No idea whether ENABLE_SKIP_FIB is great or not, but I'm all for reducing the number of code paths we have. Unless we have a strong indication that it's hurtful I'd say go for it!

@aspsk aspsk added backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Sep 26, 2023
@julianwiedmann
Copy link
Member

Ripping out ENABLE_SKIP_FIB completely feels like a very big hammer. And for instance also impacts unrelated areas like N/S LB forwarding to remote backends.

Is there no way to only exclude ENABLE_SKIP_FIB for the regressing config? Or even teach the problematic FIB lookup path (in bpf_lxc) to ignore ENABLE_SKIP_FIB ?

@julianwiedmann julianwiedmann added the sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. label Sep 26, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 26, 2023
@aspsk
Copy link
Contributor Author

aspsk commented Sep 26, 2023

Is there no way to only exclude ENABLE_SKIP_FIB for the regressing config?

The config can be described as follows: no tunnel & there are interfaces not controlled by cilium, so I am not sure how to automate this decision. We currently experience this only on EKS, as far as we know, but this is, of course, applicable to any config as defined above.

So, does this work if we only undef ENABLE_SKIP_FIB for lxc, and leave it as is for other callers?

@aspsk
Copy link
Contributor Author

aspsk commented Sep 27, 2023

So, does this work if we only undef ENABLE_SKIP_FIB for lxc, and leave it as is for other callers?

For this specific case I can do a check like this (and only apply to lxc, not host so that nodeport is not affected):

option.Config.IPAM == ipamOption.IPAMENI

Then if we have more cases like this, we can add an option or extend the check. WDYT @julianwiedmann

@aspsk aspsk force-pushed the aspsk/pr/bpf-routing-do-not-skip-fib-lookup branch from 5716ea5 to 22a8fe9 Compare September 27, 2023 16:25
@aspsk aspsk requested a review from lmb September 27, 2023 16:26
@aspsk
Copy link
Contributor Author

aspsk commented Sep 27, 2023

I've posted a new version which only disables this on EKS. Tested that this actually works.

@aspsk
Copy link
Contributor Author

aspsk commented Sep 28, 2023

/test

@@ -1073,6 +1074,11 @@ func (h *HeaderfileWriter) WriteEndpointConfig(w io.Writer, e datapath.EndpointC
return h.writeTemplateConfig(fw, e)
}

func enforceFibLookup(e datapath.EndpointConfiguration) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you put this on the type of option.Config instead? Then we wouldn't add another dependency on the global variable.

@borkmann
Copy link
Member

Would this situation be prevented if we had proper device detection at runtime, so that Cilium considers multiple devices when they get automatically added on EKS and therefore it needs to turn it off automatically when number grows from 1 to 2+? Cc @joamaki

@joamaki
Copy link
Contributor

joamaki commented Sep 29, 2023

Would this situation be prevented if we had proper device detection at runtime, so that Cilium considers multiple devices when they get automatically added on EKS and therefore it needs to turn it off automatically when number grows from 1 to 2+? Cc @joamaki

Yep, I'd expect with runtime device detection enabled we wouldn't need this workaround (as then we'd reload if we get more devices at runtime). @bimmlerd and I are working on enabling it by default for v1.15. I would still go ahead with this fix (and backport it) and then remove it once it's no longer needed.

@bimmlerd
Copy link
Member

bimmlerd commented Sep 29, 2023

@aspsk this doesn't look like a backport to v1.14. Did you mean to add the needs-backport/1.14 label instead of backport/1.14?

@aspsk aspsk removed the backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. label Sep 29, 2023
@sylr
Copy link

sylr commented Nov 7, 2023

Can this be merged ?

@aspsk aspsk added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. and removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Nov 7, 2023
@aspsk aspsk force-pushed the aspsk/pr/bpf-routing-do-not-skip-fib-lookup branch 2 times, most recently from 24d8ed8 to dd1ae0f Compare November 7, 2023 10:08
@aspsk
Copy link
Contributor Author

aspsk commented Nov 7, 2023

/test

@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Nov 8, 2023
When cilium is running in the BPF Host Routing mode and DirectRoutingDevice is
set and cilium is configured to only use a single device, then we have skipped
the FIB lookup for packets leaving the pod and forwarded them directly to the
aforementioned DirectRoutingDevice network interface.

This doesn't work in case of native routing when other, non-cilium-managed,
interfaces are present on a node. In such case we, obviously never send traffic
to such interfaces, even for a reply, for example, when the DirectRoutingDevice
interface is eth0 and another eth1 interface is present, then the following
happens:

One known case when this happens on practice is EKS. It can happen that a node
has multiple interfaces, e.g., eth0, eth1, and packets from some pods (to the
VPC private range) are routed through eth1 instead of eth0. For example,

    # ip rule
    20:     from all to 192.168.91.2 lookup main
    100:    from all lookup local
    111:    from 192.168.86.135 to 192.168.0.0/16 lookup 10
    111:    from 192.168.80.149 to 192.168.0.0/16 lookup 11
    32766:  from all lookup main
    32767:  from all lookup default

In this [simplified] output packets from pod 192.168.86.135 to 192.168.0.0/16
(VPC private network) should be routed through the eth0 (table 10 contains a
default route through eth0), and similar packets from pod 192.168.80.149 should
be routed through eth1. This breaks with BPF host routing because, as described
above, all packets go to eth0.

In order to fix this we want to disable the ENABLE_SKIP_FIB macro in BPF code.
This macro is a big optimization, so in this patch we only try to disable it
for configs which are expected to be broken by the change in v1.14.0-snapshot.2
(which allowed to enable endpoint routes and host routing simultaneously). So,
we enforce the FIB lookup when endpoint routes are enabled.

In general, for such setups users should be using `devices=eth+` and
`enableRuntimeDeviceDetection=true`. In future we should make this behaviour
default, and once done, to revert this patch.

Fixes: cilium#27343

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
@aspsk aspsk force-pushed the aspsk/pr/bpf-routing-do-not-skip-fib-lookup branch from dd1ae0f to 3dcffe8 Compare November 8, 2023 11:59
@aspsk
Copy link
Contributor Author

aspsk commented Nov 8, 2023

/test

@aspsk aspsk added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 8, 2023
@thorn3r thorn3r added this to Needs backport from main in 1.14.5 Nov 9, 2023
@thorn3r thorn3r removed this from Needs backport from main in 1.14.4 Nov 9, 2023
@squeed squeed merged commit f40efb5 into cilium:main Nov 10, 2023
61 checks passed
@maintainer-s-little-helper maintainer-s-little-helper bot removed ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Nov 10, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.4 Nov 10, 2023
@gandro gandro mentioned this pull request Nov 15, 2023
6 tasks
@gandro gandro added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Nov 15, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.4 Nov 15, 2023
@github-actions github-actions bot added backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Nov 16, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.14 in 1.14.4 Nov 16, 2023
@nebril nebril moved this from Needs backport from main to Backport done to v1.14 in 1.14.5 Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.14.5
Backport done to v1.14
Development

Successfully merging this pull request may close these issues.

Broken connectivity when using BFP masquerade, no tunnel and BPF Host Routing on EKS with 1 device