Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v1.13] Always migrate cilium_calls_* during ELF load #28829

Merged
merged 3 commits into from
Nov 8, 2023

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Oct 27, 2023

Manual backport of #28740.

Added a bit to init.sh since iproute2 is still used for loading some of the host datapath here.

[ upstream commit 9420217 ]

See code comments, as the change is subtle. This sets things up for subsequent
commits. As a minor benefit, make the code slightly less unreadable.

Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 94ec45a ]

Note: this is obviously a hack. There's a long-standing issue with Cilium's
loader infrastructure where program arrays like cilium_calls_* are reused if
the map's properties in the ELF are compatible with the pinned version present
on the system. This has been the case since the early iproute2 days, and
ebpf-go has the same behaviour by default. This is unsound, but doesn't always
cause issues in practice.

However, as Cilium's eBPF code base grew, along with its contributor base,
increasing code churn and the need to backport fixes, this method started
falling apart. Pinned tail call maps are actively used for handling traffic,
and should be considered fixed, as they are an integral part of the program's
control flow. When another BPF object needs to be attached to a kernel hook,
it's incorrect to replace entries in an existing program array used by a prior
version of the BPF object, since logic in some tail calls may have been enabled
or disabled in the new one, or may have been removed altogether.

A prog array's contents cannot be replaced atomically, meaning that while the
prog array is being repopulated by a new ELF, a packet may be processed by a
chain of programs that are partially old and partially new. This is bad.

This patch fixes the specific case of cilium_calls_*, since it's the only
prog array across all ELFs that consistently acts as the central tail call map.
The main motivation for this approach is backportability as far back as 1.12,
since we're seeing increased connectivity interruptions (leading to test flakes)
during up/downgrades and restarts in that branch. This is caused by other
backports creating code churn, as well as the stark difference in the contents
of tail calls between major Cilium versions.

This code will be replaced by a 'commit' system that does away with the re-
pinning behaviour and keeps new map fd's in memory; only committing them to
bpffs when the necessary endpoint prog(s) were successfully replaced.

Fixes: cilium#20691

Signed-off-by: Timo Beckers <timo@isovalent.com>
[ no upstream commit ]

See previous commit. This one does the same but for the remaining BPF loads
done from init.sh

Signed-off-by: Timo Beckers <timo@isovalent.com>
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. labels Oct 27, 2023
@ti-mo
Copy link
Contributor Author

ti-mo commented Oct 27, 2023

/test-backport-1.13

@margamanterola margamanterola added the dont-merge/waiting-for-user-feedback Waiting for feedback from user before the PR should be merged. label Oct 27, 2023
@margamanterola
Copy link
Member

Marking as "Don't merge", we'll let the changes bake in main for a few days to check that they are safe.

@ti-mo ti-mo removed the dont-merge/waiting-for-user-feedback Waiting for feedback from user before the PR should be merged. label Nov 7, 2023
@ti-mo ti-mo marked this pull request as ready for review November 7, 2023 09:41
@ti-mo ti-mo requested a review from a team as a code owner November 7, 2023 09:41
@ti-mo ti-mo requested review from a team and dylandreimerink and removed request for a team November 7, 2023 12:28
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 8, 2023
@squeed squeed merged commit 8da97f9 into cilium:v1.13 Nov 8, 2023
62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.

None yet

4 participants