Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cilium: IPv4 BIG TCP support #26172

Merged
merged 12 commits into from Jun 16, 2023
Merged

cilium: IPv4 BIG TCP support #26172

merged 12 commits into from Jun 16, 2023

Conversation

borkmann
Copy link
Member

@borkmann borkmann commented Jun 13, 2023

IPv4 BIG TCP support for Cilium on 6.3+ kernels.

Link: https://lore.kernel.org/netdev/cover.1674921359.git.lucien.xin@gmail.com/

@borkmann borkmann added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/major This PR introduces major new functionality to Cilium. release-blocker/1.14 This issue will prevent the release of the next version of Cilium. labels Jun 13, 2023
@borkmann borkmann force-pushed the pr/big-tcp-ipv4 branch 8 times, most recently from ca412ff to 38430dd Compare June 13, 2023 21:57
@joestringer joestringer added the kind/feature This introduces new functionality. label Jun 14, 2023
@borkmann borkmann force-pushed the pr/big-tcp-ipv4 branch 9 times, most recently from 9a4963c to 86a7c53 Compare June 15, 2023 15:34
@borkmann
Copy link
Member Author

/test

@borkmann borkmann marked this pull request as ready for review June 15, 2023 15:41
@borkmann borkmann requested review from a team as code owners June 15, 2023 15:41
@borkmann
Copy link
Member Author

/test

@borkmann
Copy link
Member Author

Given that the last commit is also in #26205, and seems to be causing some CI issues in there, should we leave it out of this one?

That actually went through fine, see the l4lb where big tcp is part of.

Turns out vishvananda/netlink@acdc658 causes a regression in Cilium. I've reverted that specific commit in our local copy for now.

@borkmann
Copy link
Member Author

/test

@borkmann borkmann requested a review from jrfastab June 15, 2023 18:21
@borkmann borkmann added kind/performance There is a performance impact of this. kind/enhancement This would improve or streamline existing functionality. labels Jun 15, 2023
Copy link
Member

@sayboras sayboras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for vendor changes 💯

Copy link
Member

@dylandreimerink dylandreimerink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for my code owners. Thanks for nicely splitting it up into easy-to-review and well-documented commits.

Given we have IPv6 BIG TCP support already, lets also support it
for IPv4.

From the IPv4 BIG TCP work:

  Different from IPv6, IPv4 tot_len is 16-bit long only, and IPv4
  header doesn't have exthdrs(options) for the BIG TCP packets'
  length. To make it simple, as David and Paolo suggested, we set
  IPv4 tot_len to 0 to indicate this might be a BIG TCP packet and
  use skb->len as the real IPv4 total length.

  This will work safely, as all BIG TCP packets are GSO/GRO packets
  and processed on the same host as they were created; There is no
  padding in GSO/GRO packets, and skb->len - network_offset is
  exactly the IPv4 packet total length; Also, before implementing
  the feature, all those places that may get iph tot_len from BIG
  TCP packets are taken care with some new APIs.

The device settings are different from IPv6 ones, so we also need
to explicitly set them via IFLA_GSO_IPV4_MAX_SIZE and
IFLA_GRO_IPV4_MAX_SIZE for all involved devices to take effect.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/netdev/cover.1674921359.git.lucien.xin@gmail.com/
Extend the veth setup to also configure IPv4 GRO/GSO max size
configuration for every new veth for a given Pod.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add status dump field, so that this can be introspected via sysdump.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add generated API code for the status exposure.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Hook up the status field from agent and client side.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add Helm setting for IPv4 BIG TCP (enableIPv4BIGTCP) which defaults
to false. Used "make -C install/kubernetes cilium/values.yaml" to
autogenerate the values.yaml file.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add documentation for the tuning guide. Also improve the IPv6 section
slightly.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Add the IPv4 BIG TCP kernel requirements to the table.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Now that we have BPF IPv6 masquerading, lift the restriction that it
needs to be disabled specifically.

Also remove nodePort.enabled=true as this is redundant with KPR=strict.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
@borkmann
Copy link
Member Author

(CI all green, pushing small improvement)

@borkmann
Copy link
Member Author

/test

Copy link
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good to me, nice work!

@borkmann borkmann added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 16, 2023
@borkmann borkmann merged commit c3eb7b0 into main Jun 16, 2023
61 of 62 checks passed
@borkmann borkmann deleted the pr/big-tcp-ipv4 branch June 16, 2023 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement This would improve or streamline existing functionality. kind/feature This introduces new functionality. kind/performance There is a performance impact of this. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-blocker/1.14 This issue will prevent the release of the next version of Cilium. release-note/major This PR introduces major new functionality to Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants