loader: explicitly insert endpoint policy programs into policy maps #29307
/test
8cf8a66 to cb4ee05
/test
BPF ELF loaders generally populate maps in no deterministic order relative to one another. During some loads, cilium_calls is populated first; during others, cilium_(egress)call_policy is populated first. The latter case is a problem, since existing endpoints continuously have their cilium_call_policy entry invoked for incoming packets. If that program's tail call map(s) are not yet populated, the result is missed tail calls and dropped packets.

This commit zeroes the call maps' MapSpec.Contents and manually resolves []MapKV to the referenced policy progs, both for ingress and egress (l7) handlers. Policy programs are explicitly inserted into the respective call maps after the entrypoint(s) have been attached.

This calls for major refactoring: inserting the policy program is equivalent to attaching an entrypoint to a BPF hook and should be done explicitly by an endpoint manager that has access to the Endpoint object. This would make extracting the endpoint ID from the ELF section name redundant.

Signed-off-by: Timo Beckers <timo@isovalent.com>
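To illustrate the "zero MapSpec.Contents and keep the entries for later" idea from the commit message, here is a minimal sketch. It is not the code from this PR: `MapSpec` and `MapKV` below are simplified local stand-ins that only mirror the shape of the cilium/ebpf types of the same name, and `detachContents` is a hypothetical helper.

```go
package main

import "fmt"

// MapKV is a simplified stand-in for an initial map entry.
// (Assumption: concrete key/value types are used here for brevity.)
type MapKV struct {
	Key   uint32 // slot in the program array
	Value string // name of the program referenced by this slot
}

// MapSpec is a simplified stand-in for a map definition parsed from an ELF.
type MapSpec struct {
	Name     string
	Contents []MapKV
}

// detachContents (hypothetical) clears spec.Contents and returns the entries
// that were there. A loader given the modified spec creates the map empty,
// and the caller decides when the returned entries are inserted.
func detachContents(spec *MapSpec) []MapKV {
	deferred := spec.Contents
	spec.Contents = nil
	return deferred
}

func main() {
	spec := &MapSpec{
		Name:     "cilium_call_policy",
		Contents: []MapKV{{Key: 42, Value: "handle_policy"}},
	}

	deferred := detachContents(spec)

	// The spec no longer populates the map at load time; the entries are
	// held back until the entrypoints have been attached.
	fmt.Println("entries left in spec:", len(spec.Contents))
	fmt.Println("entries deferred:", len(deferred))
}
```

The point of the design is that the loader can no longer insert the policy program at an arbitrary moment; the insertion becomes an explicit, ordered step.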
cb4ee05 to 4098439
/test
@ti-mo Could you elaborate more? In particular, is the following correct?
Your change alters step 3 so that cilium_call_policy gets populated before it is referenced by cilium_calls. Is that correct?
What do you mean by "(l7)"?
@brb Thanks for the questions!
There's only supposed to be one cilium_(egress)call_policy, so both old and new endpoint progs will point to the same map during endpoint regenerations.
Not completely sure what you mean by this, but this patch delays insertions into cilium_(egress)call_policy until after the ELF's entrypoints are attached.
Not sure if still relevant due to points above.
The other way around! cilium_calls is internal to the ELF and needs to be fully populated in order for the programs' control flow to be sound. When this is completed, entrypoints can be attached and cilium_(egress)call_policy can be upserted.
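The ordering described above can be sketched as a toy model. This is an assumption-laden illustration, not Cilium's loader: the maps are plain Go maps, and `loader`, `populateCalls`, `attachEntrypoint`, `upsertPolicy`, and the program names are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// progArray is a toy stand-in for a BPF program array (tail call map).
type progArray map[uint32]string

// loader models the load sequence; every name here is hypothetical.
type loader struct {
	calls    progArray // internal to the ELF (cilium_calls)
	policy   progArray // shared between endpoints (cilium_call_policy)
	attached bool
	steps    []string // records the order in which steps ran
}

func newLoader() *loader {
	return &loader{calls: progArray{}, policy: progArray{}}
}

// populateCalls fills the ELF-internal tail call map.
func (l *loader) populateCalls(progs map[uint32]string) {
	for slot, name := range progs {
		l.calls[slot] = name
	}
	l.steps = append(l.steps, "populate cilium_calls")
}

// attachEntrypoint refuses to attach while the internal map is empty,
// since the programs' control flow would not be sound yet.
func (l *loader) attachEntrypoint() error {
	if len(l.calls) == 0 {
		return errors.New("cilium_calls is empty; tail calls would miss")
	}
	l.attached = true
	l.steps = append(l.steps, "attach entrypoint")
	return nil
}

// upsertPolicy makes the policy program reachable by other endpoints,
// and is only allowed once the entrypoint is attached.
func (l *loader) upsertPolicy(endpointID uint32, prog string) error {
	if !l.attached {
		return errors.New("entrypoint not attached yet")
	}
	l.policy[endpointID] = prog
	l.steps = append(l.steps, "upsert cilium_call_policy")
	return nil
}

// load runs the three steps in the order described in the comment above.
func load(l *loader, calls map[uint32]string, endpointID uint32, policyProg string) error {
	l.populateCalls(calls)
	if err := l.attachEntrypoint(); err != nil {
		return err
	}
	return l.upsertPolicy(endpointID, policyProg)
}

func main() {
	l := newLoader()
	err := load(l, map[uint32]string{1: "tail_handle_ipv4"}, 42, "handle_policy")
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	for i, s := range l.steps {
		fmt.Printf("%d. %s\n", i+1, s)
	}
}
```

In this model, an out-of-order upsert fails loudly instead of exposing a policy program whose tail calls are not yet wired up.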
I'm not quite sure what |
Thanks for the explanation and the patch! LGTM.
@ti-mo 👍 Let's create an issue for it, so that we can prioritize it for v1.16.
BPF ELF loaders generally populate maps in no deterministic order relative to one another. During some loads, cilium_calls is populated first; during others, cilium_(egress)call_policy is populated first. The latter case is a problem, since existing endpoints continuously have their cilium_call_policy entry invoked for incoming packets. If that program's tail call map(s) are not yet populated, the result is missed tail calls and dropped packets.
This commit zeroes the call maps' MapSpec.Contents and manually extracts the endpoint ID from the ELF section name of the policy progs, both for ingress and egress (l7) handlers.
This calls for major refactoring: inserting the policy program is equivalent to attaching an entrypoint to a BPF hook and should be done explicitly by an endpoint manager that has access to the Endpoint object. This would make extracting the endpoint ID from the ELF section name redundant.
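For readers unfamiliar with the section-name workaround mentioned above, a rough sketch of what extracting an endpoint ID from a section name could look like. The `policy/<id>` format, the `policyPrefix` constant, the 16-bit ID width, and the `endpointIDFromSection` helper are all assumptions made for illustration; the actual section naming used by the datapath is not shown in this PR description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// policyPrefix is a hypothetical section-name prefix, used only for this sketch.
const policyPrefix = "policy/"

// endpointIDFromSection parses an endpoint ID out of a section name of the
// assumed form "policy/<decimal id>". IDs are assumed to fit in 16 bits.
func endpointIDFromSection(section string) (uint16, error) {
	if !strings.HasPrefix(section, policyPrefix) {
		return 0, fmt.Errorf("section %q is not a policy program", section)
	}
	raw := strings.TrimPrefix(section, policyPrefix)
	id, err := strconv.ParseUint(raw, 10, 16)
	if err != nil {
		return 0, fmt.Errorf("section %q: invalid endpoint ID: %w", section, err)
	}
	return uint16(id), nil
}

func main() {
	for _, s := range []string{"policy/1234", "tc/entry", "policy/abc"} {
		id, err := endpointIDFromSection(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> endpoint %d\n", s, id)
	}
}
```

Parsing like this is exactly the kind of indirection that goes away if an endpoint manager with access to the Endpoint object performs the insertion itself.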
Fixes: #27720
Fixes: #26739
@joe @jrajahalme MBOI