Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loader: explicitly insert endpoint policy programs into policy maps #29307

Merged
merged 1 commit into from
Nov 22, 2023

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Nov 21, 2023

BPF ELF loaders generally populate maps in a non-deterministic order when it comes to the order of the maps themselves. During some loads, cilium_calls may be populated first, sometimes cilium_(egress)call_policy gets populated first. The latter case is a problem since existing endpoints will constantly get their cilium_call_policy entry invoked for incoming packets. If its tail call map(s) are not yet populated, that will result in missed tail calls leading to packet drops.

This commit zeroes the call maps' MapSpec.Contents and manually extracts the endpoint id from the ELF section name of the policy progs, both for ingress and egress (l7) handlers.

This calls for major refactoring -- inserting the policy program is equal to attaching an entrypoint to a BPF hook and should be done explicitly by an endpoint manager that has access to the Endpoint object. This would make extracting the endpoint ID from the ELF section name redundant.

Fixes: #27720
Fixes: #26739

Avoid missed tail calls due to inserting policy programs too early during endpoint regeneration

@joe @jrajahalme MBOI

@ti-mo ti-mo requested a review from brb November 21, 2023 10:43
@ti-mo ti-mo requested a review from a team as a code owner November 21, 2023 10:43
@ti-mo ti-mo requested a review from rgo3 November 21, 2023 10:43
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 21, 2023
@ti-mo ti-mo added kind/bug This is a bug in the Cilium logic. sig/loader Impacts the loading of BPF programs into the kernel. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Nov 21, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 21, 2023
@ti-mo ti-mo added needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Nov 21, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.5 Nov 21, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.13.10 Nov 21, 2023
@ti-mo
Copy link
Contributor Author

ti-mo commented Nov 21, 2023

/test

@ti-mo
Copy link
Contributor Author

ti-mo commented Nov 21, 2023

/test

BPF ELF loaders generally populate maps in a non-deterministic order when it
comes to the order of the maps themselves. During some loads, cilium_calls
may be populated first, sometimes cilium_(egress)call_policy gets populated
first. The latter case is a problem since existing endpoints will constantly
get their cilium_call_policy entry invoked for incoming packets. If its tail
call map(s) are not yet populated, that will result in missed tail calls
leading to packet drops.

This commit zeroes the call maps' MapSpec.Contents and manually resolves
[]MapKV to the referenced policy progs, both for ingress and egress (l7)
handlers. Policy programs are explicitly inserted into the respective call
maps after the entrypoint(s) have been attached.

This calls for major refactoring -- inserting the policy program is equal to
attaching an entrypoint to a BPF hook and should be done explicitly by an
endpoint manager that has access to the Endpoint object. This would make
extracting the endpoint ID from the ELF section name redundant.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/manually-insert-policy-progs branch from cb4ee05 to 4098439 Compare November 21, 2023 14:20
@ti-mo
Copy link
Contributor Author

ti-mo commented Nov 21, 2023

/test

@brb
Copy link
Member

brb commented Nov 22, 2023

During some loads, cilium_calls may be populated first, sometimes cilium_(egress)call_policy gets populated first. The latter case is a problem since existing endpoints will constantly get their cilium_call_policy entry invoked for incoming packets. If its tail call map(s) are not yet populated, that will result in missed tail calls leading to packet drops.

@ti-mo Could you elaborate more? In particular, is the following correct?

  1. During reloading, cilium_call_policy global map is recreated. Same goes for cilium_calls global map.
  2. The new cilium_calls references the new cilium_call_policy.
  3. The new cilium_call_policy entries get populated during the async endpoint regeneration.
  4. The new cilium_calls map is used by reloaded bpf_host which might do a tail call via the new cilium_call_policy[ into yet not populated entry.

Your change changes the 3. step in a way that the cilium_call_policy get populated before referenced by the cilium_calls. Is it correct?

both for ingress and egress (l7) handlers

What do you mean by "(l7)"?

@ti-mo
Copy link
Contributor Author

ti-mo commented Nov 22, 2023

@brb Thanks for the questions!

  1. During reloading, cilium_call_policy global map is recreated. Same goes for cilium_calls global map.

cilium_(egress)call_policy are not recreated unless they're resized (based on ENDPOINTS_MAP_SIZE). They're shared across bpf_host, bpf_overlay and bpf_lxc, so resizing means we need to replace all Cilium progs on the system. Currently there's no mechanism that tries to make this seamless (we don't claim it to be), since the agent doesn't explicitly manage these maps. After the first ELF with updated ENDPOINTS_MAP_SIZE is loaded, the old one will be unpinned, removing all endpoints' entries. It's undefined behaviour what will and won't work, this is out of scope of this PR/issue.

cilium_calls_01234 is per-endpoint/device and not global.

  1. The new cilium_calls references the new cilium_call_policy.

There's only supposed to be one cilium_(egress)call_policy, so both old and new endpoint progs will point to the same during endpoint regenerations.

  1. The new cilium_call_policy entries get populated during the async endpoint regeneration.

Not completely sure what you mean by this, but this patch makes it so insertions into cilium_(egress)call_policy are delayed until after the ELF's entrypoints are attached.

  1. The new cilium_calls map is used by reloaded bpf_host which might do a tail call via the new cilium_call_policy[ into yet not populated entry.

Not sure if still relevant due to points above.

Your change changes the 3. step in a way that the cilium_call_policy get populated before referenced by the cilium_calls. Is it correct?

The other way around! cilium_calls is internal to the ELF and needs to be fully populated in order for the programs' control flow to be sound. When this is completed, entrypoints can be attached and cilium_(egress)call_policy can be upserted.

both for ingress and egress (l7) handlers

What do you mean by "(l7)"?

I'm not quite sure what cilium_egresscall_policy's purpose is, but my understanding is these programs are called for packets egressing Envoy. Someone else might need to fill in the gaps here.

@ti-mo ti-mo added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch labels Nov 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.14 in 1.14.5 Nov 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.13 in 1.13.10 Nov 22, 2023
Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation and the patch! LGTM.

@brb
Copy link
Member

brb commented Nov 22, 2023

This calls for major refactoring -- inserting the policy program is equal to attaching an entrypoint to a BPF hook and should be done explicitly by an endpoint manager that has access to the Endpoint object. This would make extracting the endpoint ID from the ELF section name redundant.

@ti-mo 👍 Let's create an issue for it, so that we could prioritize it for v1.16.

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 22, 2023
@ti-mo ti-mo merged commit f66a6c2 into cilium:main Nov 22, 2023
61 checks passed
@ti-mo ti-mo deleted the tb/manually-insert-policy-progs branch November 22, 2023 10:58
@ti-mo ti-mo added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Nov 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.14 to Backport done to v1.14 in 1.14.5 Nov 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.10 Nov 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.10 Nov 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/author The backport will be carried out by the author of the PR. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. sig/loader Impacts the loading of BPF programs into the kernel. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
No open projects
1.13.10
Backport done to v1.13
1.14.5
Backport done to v1.14
3 participants