bpf,init.sh: make netdev bpf filter cleanup less eager #24336
/test
Job 'Cilium-PR-K8s-1.25-kernel-4.19' failed. If it is a flake and a GitHub issue doesn't already exist to track it, comment |
While we are here, let's extend the K8sUpdates test to check whether flows got interrupted. In particular, we can reuse https://github.com/cilium/cilium/blob/master/test/k8s/updates.go#L328, and call the function after restarting Cilium.
Do we mean to supplant |
Due to an oversight when updating init.sh to deal with the new tc filter names for bpf progs after the introduction of Go-based loader/netlink attach, all interfaces in the host namespace that didn't contain the word 'cilium' would have their egress and ingress filters stripped. This included lxc interfaces and many others. lxc programs in particular would only be reattached when the endpoint got regenerated, which can take a while on nodes with many Pods. This caused connectivity interruptions in the meantime.

This commit changes the tc filter naming convention to converge on the changes introduced in 2e40d67 ("bpf: Finish rename of BPF programs to cil_ prefix"), using the bpf program (function) name containing the 'cil_' prefix. The 'cilium_' prefix is no longer included explicitly, instead opting for the program name suffixed by the interface name, e.g. cil_from_netdev-eth0.

init.sh no longer uses the term 'cilium' to trigger a removal of the interface's tc filters. Also switched over to a regex that acts on a word boundary to reduce the chance of a false positive (e.g. a filter pencil_foo installed by another tool should not trigger removal).

Fixes commit 2a7cef4 ("init,cleanup: remove TC filters containing 'cilium' in their names").

Signed-off-by: Timo Beckers <timo@isovalent.com>
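To illustrate the word-boundary matching described in the commit message, here is a minimal Python sketch. It is not the pattern used in init.sh, which is not shown in this conversation. The regex `\bcil_`, the helper name `is_cilium_filter`, and the sample strings are assumptions chosen for illustration.

```python
import re

# Assumption: Cilium's tc filter names start with the 'cil_' prefix, as in the
# example from the commit message (cil_from_netdev-eth0). The exact pattern
# used by init.sh is not shown in this conversation.
CIL_FILTER = re.compile(r"\bcil_")


def is_cilium_filter(line: str) -> bool:
    """Return True if 'cil_' appears at a word boundary in the given line."""
    return CIL_FILTER.search(line) is not None


# Hypothetical inputs, used only to exercise the pattern.
print(is_cilium_filter("cil_from_netdev-eth0"))  # True: 'cil_' starts a word
print(is_cilium_filter("pencil_foo"))            # False: 'cil_' is mid-word
```

A plain substring test for `cil_` would also match `pencil_foo`. The `\b` anchor rejects it because `n` and `c` are both word characters, so there is no boundary between them.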
3a7116f to 2a4402c
We would only test service interruptions using up/downgrades, but agent restarts should also be clean.

Signed-off-by: Timo Beckers <timo@isovalent.com>
Co-authored-by: Martynas Pumputis <m@lambda.lt>
2a4402c to 1d29bda
/test
Can confirm, no RST or client/server disruption is present after this PR.
Due to an oversight when updating init.sh to deal with the new tc filter names for bpf progs after the introduction of Go-based loader/netlink attach, all interfaces in the host namespace that didn't contain the word 'cilium' would have their egress and ingress filters stripped. This included lxc interfaces and many others. lxc programs in particular would only be reattached when the endpoint got regenerated, which can take a while on nodes with many Pods. This caused connectivity interruptions in the meantime.
This commit changes the tc filter naming convention to converge on the changes introduced in 2e40d67 ("bpf: Finish rename of BPF programs to cil_ prefix"), using the bpf program (function) name containing the 'cil_' prefix. The 'cilium_' prefix is no longer included explicitly, instead opting for the program name suffixed by the interface name, e.g. cil_from_netdev-eth0.
init.sh no longer uses the term 'cilium' to trigger a removal of the interface's tc filters.
Fixes commit 2a7cef4 ("init,cleanup: remove TC filters containing 'cilium' in their names").
Fixes: #24191
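The naming convention from the description, program name suffixed by interface name, can be sketched as follows. The helper `tc_filter_name` is hypothetical and exists only for illustration. The actual name is built by Cilium's Go-based loader, which is not shown here.

```python
def tc_filter_name(prog: str, iface: str) -> str:
    """Build a tc filter name as '<program name>-<interface name>'.

    Assumption: the program name already carries the 'cil_' prefix, so no
    separate 'cilium_' prefix is added.
    """
    if not prog.startswith("cil_"):
        raise ValueError(f"expected a 'cil_'-prefixed program name, got {prog!r}")
    return f"{prog}-{iface}"


# Reproduces the example given in the description.
print(tc_filter_name("cil_from_netdev", "eth0"))  # cil_from_netdev-eth0
```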