Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loader: attach programs using tcx #30103

Merged
merged 6 commits into from
May 2, 2024
Merged

loader: attach programs using tcx #30103

merged 6 commits into from
May 2, 2024

Conversation

rgo3
Copy link
Contributor

@rgo3 rgo3 commented Jan 4, 2024

For more detailed descriptions, please refer to the individual commits.

On a high level, this PR:

  • attaches TC progs using bpf_link (tcx) respecting upgrade and downgrade paths
  • uses per-endpoint bpffs dirs, at e.g. /sys/fs/bpf/cilium/endpoints/12345/links/cil_to_container
  • adds netns-driven tests for attaching skb progs via tcx
  • adds a Helm flag (enableTXC) to optionally disable the feature to ease integration with other tools

Closes #27632

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 4, 2024
pkg/datapath/loader/netlink.go Outdated Show resolved Hide resolved
@rgo3
Copy link
Contributor Author

rgo3 commented Jan 5, 2024

Connectivity tests fail because currently the TCX API in cilium/ebpf doesn't do a feature test, so on kernels without TCX support creating links fails and we don't fall back to the old behavior. Opened cilium/ebpf#1294 to add a feature test in ebpf-go.

The build every commit action will continue to fail because I've separate commits for vendoring the lib changes and fixing the query programs API breakage. I'll keep them separate so the fix can easily be applied to the renovate PR that updates the ebpf-go dependency.

@rgo3
Copy link
Contributor Author

rgo3 commented Feb 27, 2024

/test

@rgo3 rgo3 changed the title Tcx for cilium loader: attach TC programs using bpf_link/TCX Feb 27, 2024
@rgo3 rgo3 force-pushed the tcx-for-cilium branch 2 times, most recently from 07ae5db to 876668e Compare February 27, 2024 10:02
@rgo3
Copy link
Contributor Author

rgo3 commented Feb 27, 2024

/test

@ti-mo ti-mo added sig/loader Impacts the loading of BPF programs into the kernel. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Mar 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Mar 6, 2024
@ti-mo
Copy link
Contributor

ti-mo commented Mar 6, 2024

/test

@ti-mo ti-mo added backport/author The backport will be carried out by the author of the PR. needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch needs-backport/1.14 This PR / issue needs backporting to the v1.14 branch needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Mar 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.15.2 Mar 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.13.13 Mar 6, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.14.8 Mar 6, 2024
@ti-mo ti-mo changed the title loader: attach TC programs using bpf_link/TCX loader: attach programs using tcx Mar 6, 2024
@jrajahalme jrajahalme removed this from Needs backport from main in 1.15.2 Mar 13, 2024
@thorn3r thorn3r removed this from Needs backport from main in 1.13.13 Mar 13, 2024
@thorn3r thorn3r added this to Needs backport from main in 1.13.13 Mar 13, 2024
@thorn3r thorn3r added this to Needs backport from main in 1.13.14 Mar 13, 2024
@jrfastab
Copy link
Contributor

Some minor comments/questions about the code otherwise lgtm.

Copy link
Contributor

@learnitall learnitall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API changes LGTM, thanks!

rgo3 and others added 5 commits May 2, 2024 11:05
This commit adds the necessary infrastructure to attach bpf programs operating
on sk_buff using the kernel's new tcx hook.

Enabling the functionality in the agent's endpoint attachment path happens in
a follow-up commit.

Signed-off-by: Robin Gögge <r.goegge@isovalent.com>
Co-authored-by: Timo Beckers <timo@isovalent.com>
This commit puts the tcx logic in the endpoint attachment path and gates it
behind a new --enable-tcx agent flag. A follow-up commit will use the flag
in the Helm charts' configmap.

attachSKBProgram() now takes a bool to indicate if the user has requested tcx
attachments and seamlessly migrates programs between tcx and legacy attachment
modes in both directions. Of course, this process is contingent on no other
tcx programs being attached to the interface, as that disables legacy tc
execution.

Signed-off-by: Timo Beckers <timo@isovalent.com>
This commit adds the 'bpf.enableTCX' Helm value to allow disabling tcx
attachments if external tooling integrating with Cilium hasn't caught up
yet, as attaching a tcx program to an interface disables the legacy tc
ingress/egress pipelines.

The agent upgrades and downgrades interfaces seamlessly based on tcx being
enabled or not, so any existing workloads are migrated automatically at
runtime if the config flag is changed and the agent restarted. Rebooting
the node is not necessary.

Signed-off-by: Timo Beckers <timo@isovalent.com>
Extend the agent API to indicate whether Cilium is actually using tcx or
relying on legacy tc so that this can be displayed in `cilium status`.

Status when tcx is active:

  # kubectl exec cilium-4m7nq -- cilium-dbg status
  [...]
  BandwidthManager:        Disabled
  Routing:                 Network: Tunnel [geneve]   Host: Legacy
  Attach Mode:             TCX
  Masquerading:            IPTables [IPv4: Enabled, IPv6: Enabled]
  [...]

Status when inactive:

  # kubectl exec cilium-4m7nq -- cilium-dbg status
  [...]
  BandwidthManager:        Disabled
  Routing:                 Network: Tunnel [geneve]   Host: Legacy
  Attach Mode:             Legacy TC
  Masquerading:            IPTables [IPv4: Enabled, IPv6: Enabled]
  [...]

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add a note on tcx for the 1.16 upgrade guide.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@ti-mo
Copy link
Contributor

ti-mo commented May 2, 2024

/test

@ti-mo ti-mo enabled auto-merge May 2, 2024 09:08
@ti-mo ti-mo added this pull request to the merge queue May 2, 2024
Merged via the queue into cilium:main with commit 3f22794 May 2, 2024
63 of 64 checks passed
@ti-mo ti-mo deleted the tcx-for-cilium branch May 2, 2024 10:21
@joestringer
Copy link
Member

I wonder if this is something we should elevate to release-note/major given the potential performance impacts. This seems like a great thing to tout in the release blog.

@julianwiedmann
Copy link
Member

julianwiedmann commented May 3, 2024

👋 looking at https://github.com/cilium/cilium/actions/runs/8931592322. What's the current upgrade/downgrade impact of this feature - in particular without #32228 backported? Do we need to switch some of the bpf-next downgrade CI tests to bpf.enableTCX=false, until v1.15 is sorted?

@borkmann borkmann added release-note/major This PR introduces major new functionality to Cilium. and removed release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels May 3, 2024
@borkmann
Copy link
Member

borkmann commented May 3, 2024

👋 looking at https://github.com/cilium/cilium/actions/runs/8931592322. What's the current upgrade/downgrade impact of this feature - in particular without #32228 backported? Do we need to switch some of the bpf-next downgrade CI tests to bpf.enableTCX=false, until v1.15 is sorted?

I just did the backport of mentioned PR in #32337 . The ci-ipsec-upgrade test I've seen sporadically fail on various kernels though, not only bpf-next.

@borkmann
Copy link
Member

borkmann commented May 3, 2024

What's the current upgrade/downgrade impact of this feature - in particular without #32228 backported?

Without the backport there is a tiny race for downgrade where tcx gets removed and legacy tc installed so for a short window there is nothing attached. With the backport it's atomic. The upgrade is also atomic, that is, tcx is installed and then tc legacy removed (if tcx is installed successfully it takes precedence and legacy tc is not run). Lets see if the backport helps.

@borkmann
Copy link
Member

borkmann commented May 3, 2024

I just did the backport of mentioned PR in #32337 . The ci-ipsec-upgrade test I've seen sporadically fail on various kernels though, not only bpf-next.

(fyi, merged to v1.15 branch now.)

@julianwiedmann
Copy link
Member

julianwiedmann commented May 3, 2024

What's the current upgrade/downgrade impact of this feature - in particular without #32228 backported?

Without the backport there is a tiny race for downgrade where tcx gets removed and legacy tc installed so for a short window there is nothing attached. With the backport it's atomic. The upgrade is also atomic, that is, tcx is installed and then tc legacy removed (if tcx is installed successfully it takes precedence and legacy tc is not run). Lets see if the backport helps.

I'm at attempt 7 for the bpf-next downgrade from main to v1.15 in https://github.com/cilium/cilium/actions/runs/8927670014, which includes the txc work. And as the workflow is meant to be sensitive enough for such tiny blips, I'm very inclined to blame this race window ;).

Can we please flip the default to bpf.enableTCX=false, until the next v1.15 patch release is through and we can offer a smooth downgrade?

@borkmann
Copy link
Member

borkmann commented May 3, 2024

What's the current upgrade/downgrade impact of this feature - in particular without #32228 backported?

Without the backport there is a tiny race for downgrade where tcx gets removed and legacy tc installed so for a short window there is nothing attached. With the backport it's atomic. The upgrade is also atomic, that is, tcx is installed and then tc legacy removed (if tcx is installed successfully it takes precedence and legacy tc is not run). Lets see if the backport helps.

I'm at attempt 7 for the bpf-next downgrade from main to v1.15 in https://github.com/cilium/cilium/actions/runs/8927670014, which includes the txc work. And as the workflow is meant to be sensitive enough for such tiny blips, I'm very inclined to blame this race window ;).

Can we please flip the default to bpf.enableTCX=false, until the next v1.15 patch release is through and we can offer a smooth downgrade?

Ah yes, I thought the downgrade tests would test against latest v1.15 branch, not patch release. Cc @ti-mo lets default to false until then wdyt?

@borkmann
Copy link
Member

borkmann commented May 3, 2024

(PR is here for checking #32342 )

@rgo3
Copy link
Contributor Author

rgo3 commented May 3, 2024

@julianwiedmann I've retriggered the job you referenced previously and it is now passing: https://github.com/cilium/cilium/actions/runs/8927670014/job/24559664765. Your 7 runs prior to that pulled a v1.15 image before we merged the seamless downgrade path 4h ago. So I don't think bpf.enableTCX=false is necessary anymore. I'll also rerun it a couple times more to gather some additional data points.

@julianwiedmann
Copy link
Member

@julianwiedmann I've retriggered the job you referenced previously and it is now passing: https://github.com/cilium/cilium/actions/runs/8927670014/job/24559664765. Your 7 runs prior to that pulled a v1.15 image before we merged the seamless downgrade path 4h ago. So I don't think bpf.enableTCX=false is necessary anymore. I'll also rerun it a couple times more to gather some additional data points.

🤷 I give up on understanding the downgrade tests. So the minor downgrade is running against the previous minor's branch tip, and not the previous minor's current release. And from a quick git archaeology, that was always the case? Then we should be good indeed. Sorry for the noise. Thank you @rgo3 for keeping me honest!

@rgo3
Copy link
Contributor Author

rgo3 commented May 3, 2024

I feel the same 😅.

But yes, I believe and have seen in the CI logs that we install previous minor's branch tip, in this case v1.15. To me that also makes the most sense for the minor tests, given the previous minor's branch tip will be the code we support minor downgrades to once released. Otherwise, we would have to backport things, wait until it's merged and released, and only then can move forward with the PR for main (something we did in the past, including this PR), which always felt quite tedious to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/major This PR introduces major new functionality to Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. sig/loader Impacts the loading of BPF programs into the kernel.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Attach TC programs using bpf_link tcx instead of netlink
10 participants