Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.13: backport final batch of missed tail call-related fixes #30315

Merged
merged 5 commits into from
Jan 23, 2024

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Jan 18, 2024

Note: the topmost commit of #30248 was dropped since we can't guarantee no missed tail calls on 1.13 and earlier. In these versions, iproute2 is used for loading all BPF, which reuses tail call maps, which potentially causes missed tail calls if an upgrade introduces a new tail call, or if some code just moves around enough.

Once this PR is merged, a GitHub action will update the labels of these PRs:

 29482 30214 30248

@ti-mo ti-mo added kind/backports This PR provides functionality previously merged into master. backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. labels Jan 18, 2024
@ti-mo ti-mo changed the title v1.13 Backports 2024-01-18 v1.13: backport final batch of missed tail call-related fixes Jan 18, 2024
[ upstream commit aa51114 ]
[ backporter's notes: The 3 return sites in bpf_lxc were moved in 1.14,
  so these had to be amended. No functional differences. ]

bpf_lxc.c contains multiple tail calls into POLICY_CALL_MAP at the HOST_EP_ID
slot. The program in this slot is provided by bpf_host.c. During first agent
startup, there are often multiple Pods pending creation due to no CNI being
available. As soon as the agent's local API becomes available, these outstanding
CNI requests have a chance to be accepted by the API handler.

If one such request is serviced before the host datapath controller has a chance
to grab the compilation lock, the endpoint program will compile and attach first.
If the host firewall is enabled, and this new workload endpoint sends a packet
to the host, the host's ingress policy needs to be enforced. However, because
bpf_host hasn't been loaded yet, this policy program is not yet present in
POLICY_CALL_MAP, resulting in a missed tail call.

One potential solution to this problem would be making sure the host datapath
always attaches before workload endpoints do. There's one problem with this
solution: clustermesh requires data from other clusters in order to correctly
populate the local ipcache, and the ipcache currently needs to be populated for
the host endpoint to finish attaching. It obtains this information through
clustermesh-apiserver, typically deployed onto the local node as a regular Pod.
This means workload endpoints must be able to deploy before the host endpoint.

As a stop gap, tolerate these kinds of drops and assign them a specific meaning,
without letting them spill over into the generic 'missed tail call' bucket. To
stabilize end-to-end tests, we're aiming to enforce zero dropped tail calls in
all CI scenarios, since it leads to packets that mysteriously go missing,
introducing chaos that's impossible to troubleshoot.

Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit a15a463 ]
[ backporter's notes: added a v1.13-specific comment to make sure
  the missed tail call exception is not removed on this branch ]

Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 385dbe5 ]

Allowing replaceDatapath() to be cancelled in the middle of an ongoing map
migration is a potential source of chaos. We've recently seen some flakes
with errors like `Removed pending pinned map, did the agent die unexpectedly?`,
so let's remove this context check to reduce the likelyhood of that happening.

Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 760a109 ]

This has been making ci-ginkgo fail recently. With the removal of map
migrations around the corner (#29333),
and having declared bankruptcy on the Ginkgo test suite, let's not waste more
time chasing this bugbear.

Signed-off-by: Timo Beckers <timo@isovalent.com>
[ upstream commit 4651129 ]

See code comments for a detailed description of the problem. This commit
installs policy programs before attaching tc/xdp hooks since doing things
in the wrong order means dropping tail calls when handling traffic if the
policy programs aren't inserted.

Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the pr/v1.13-backport-2024-01-18 branch from ac0d719 to e67ca86 Compare January 18, 2024 13:27
@ti-mo
Copy link
Contributor Author

ti-mo commented Jan 18, 2024

/test-backport-1.13

@ti-mo ti-mo marked this pull request as ready for review January 18, 2024 15:33
@ti-mo ti-mo requested review from a team as code owners January 18, 2024 15:33
Copy link
Contributor

@brlbil brlbil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI change ✅

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 23, 2024
@squeed squeed merged commit f4f0ff7 into v1.13 Jan 23, 2024
140 checks passed
@squeed squeed deleted the pr/v1.13-backport-2024-01-18 branch January 23, 2024 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.

None yet

5 participants