Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High-scale IPcache: Chapter 5 #25601

Merged
merged 6 commits into from May 25, 2023
Merged

Conversation

pchaigno
Copy link
Member

This is in the context of the high-scale ipcache feature described at cilium/design-cfps#7.

This pull request adds support for endpoint routes in high-scale ipcache mode. See commits for details.

Updates: #25243.

Support `enable-endpoint-routes` with `enable-high-scale-ipcache`.

@pchaigno pchaigno added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. feature/high-scale-ipcache Relates to the high-scale ipcache feature. labels May 22, 2023
@pchaigno pchaigno force-pushed the hs-ipcache-endpoint-routes branch from 9e102bb to 3c8b31f Compare May 22, 2023 20:19
@pchaigno pchaigno marked this pull request as ready for review May 23, 2023 08:57
@pchaigno pchaigno requested review from a team as code owners May 23, 2023 08:57
bpf/bpf_host.c Outdated Show resolved Hide resolved
Reported-by: Gray Lian <gray.liang@isovalent.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
This commit moves the decapsulation of the special tunnel for high-scale
IPcache mode to the native device. We can't rely on cilium_host to
decapsulate because we won't be going through cilium_host when endpoint
routes are enabled.

As a consequence, native devices are required for Cilium to function in
this mode. Therefore, Cilium agents will attempt to auto-detect the
devices and fatal if they can't.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
In a subsequent commit, we will rely on the endpoints map to retrieve
the security identity of local endpoints. This commit therefore
populates the endpoints map with the security identity of each endpoint.
The map stays the same size because we can reuse the existing padding.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
In high-scale IPcache mode, we can't rely on the IPcache to retrieve the
security identity of local endpoints. When endpoint routes are disabled,
we jump directly from the source to the destination BPF programs and can
therefore pass the security identity via the skb->cb. However, when
endpoint routes are enabled, we must go through the Linux stack.

In that last case, we can rely on the endpoints map to retrieve security
IDs at the destination BPF program. This commit implements that work
around.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
In high-scale IPcache mode, pods are excluded from the IPcache and we
therefore can't rely on the IPcache to know when to encapsulate traffic.
We decide to encapsulate or not based on a user-provided set of CIDRs.

Of course, we never want to encapsulate traffic going to local pods. If
endpoint routes are disabled, that is already the case: we will never go
through the encapsulation logic as we will directly deliver packets to
their destination endpoint with a bpf_redirect.

If endpoint routes are enabled however, we need to check if the
destination is a local endpoint, in which case we don't want to
encapsulate. We do that in this commit using the local endpoint map
which contains only endpoints hosted on the node.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Previous commits fixed our support for endpoint routes with high-scale
IPcache so we can now cover it in tests.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
@pchaigno pchaigno force-pushed the hs-ipcache-endpoint-routes branch from 8df9be0 to 718b8cf Compare May 24, 2023 14:51
@pchaigno
Copy link
Member Author

/test

if (!revalidate_data(ctx, &data, &data_end, &ip6))
return DROP_INVALID;

ep = __lookup_ip6_endpoint((union v6addr *)&ip6->saddr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be a stupid question: should it be ip6->daddr?

If the traffic is from remote, local endpoint map doesn't have an entry for the saddr, right? So this lookup can only be useful for local-to-local traffic. In that case, can we perform policy check at from-container's handle_ipv4_from_lxc, where we can get the dst_sec_id from local endpoint map if ipcache doesn't return an entry:

cilium/bpf/bpf_lxc.c

Lines 832 to 834 in 4535f13

info = lookup_ip4_remote_endpoint(ip4->daddr, cluster_id);
if (info && info->sec_identity) {
*dst_sec_identity = info->sec_identity;

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 25, 2023
@squeed squeed merged commit 96dffce into cilium:main May 25, 2023
58 checks passed
@pchaigno
Copy link
Member Author

@squeed Better not trust those MLH labeling blindly. There was an open question from Gray above.

@pchaigno pchaigno deleted the hs-ipcache-endpoint-routes branch May 25, 2023 08:26
@squeed
Copy link
Contributor

squeed commented May 25, 2023

@pchaigno argh, good catch, sorry.

@pchaigno
Copy link
Member Author

No big deal. Everything here is behind a feature flag so we won't break anything else.
I just thought I'd warn given MLH is often sneaky like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature/high-scale-ipcache Relates to the high-scale ipcache feature. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants