Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support DSR with Geneve dispatch #23890

Merged
merged 8 commits into from Apr 4, 2023
Merged

Conversation

ysksuzuki
Copy link
Member

@ysksuzuki ysksuzuki commented Feb 20, 2023

For reviewers,

  1. Currently, this implementation doesn't support IPv6 with XDP. It works with IPv4 either in XDP or TC, relying on XDP to TC punt. To support IPv6 in XDP, we need to implement the XDP native encap, which planning to do as a follow-up.
  2. This implementation is compatible with the Geneve tunneling mode, which means that it works with either the direct routing mode or the Geneve tunneling mode. but unfortunately, it doesn't work with VXlan tunnel because there's no space to store service IP/port in the VXlan header.
  3. Picked up an unassigned Geneve option class and type, referencing We have been allocated 0x014B https://www.iana.org/assignments/nvo3/nvo3.xhtml#geneve-option-class.
  4. Trying to mitigate the MTU overhead by encapsulating only SYN packets as the IP option mode does Encapsulate the whole connection, but only add the GENEVE option to the TCP-SYN packets

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Thanks for contributing!

Fixes: #22955

Support DSR with Geneve dispatch in CNI mode

@ysksuzuki ysksuzuki requested review from a team as code owners February 20, 2023 17:08
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Feb 20, 2023
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Feb 20, 2023
@ysksuzuki ysksuzuki marked this pull request as draft February 20, 2023 17:09
@ysksuzuki
Copy link
Member Author

Hi @julianwiedmann! Here is the draft PR on top of #22978. I will update docs tomorrow.

@julianwiedmann julianwiedmann added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. labels Feb 20, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Feb 20, 2023
@julianwiedmann
Copy link
Member

Hi @julianwiedmann! Here is the draft PR on top of #22978. I will update docs tomorrow.

awesome, thanks @ysksuzuki ! Will try to take a first look tomorrow :)

@julianwiedmann
Copy link
Member

/test

Copy link
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ysksuzuki - just a few comments from a quick look. Haven't poked at the XDP-to_TC logic yet. Looking really good so far! Please consider if you can split the patches up somewhat - maybe one for the DSR insertion, one for the DSR extraction, one for the agent ...?

My initial expectation was that we would encapsulate the whole DSR connection in Geneve, not just the TCP-SYN. Seeing this mixed pattern in network traces will be fun when debugging :).

bpf/lib/nodeport.h Outdated Show resolved Hide resolved
bpf/lib/common.h Outdated Show resolved Hide resolved
bpf/lib/nodeport.h Outdated Show resolved Hide resolved
bpf/bpf_overlay.c Outdated Show resolved Hide resolved
@ysksuzuki
Copy link
Member Author

ysksuzuki commented Feb 22, 2023

@julianwiedmann Thank you for the review!

Please consider if you can split the patches up somewhat - maybe one for the DSR insertion, one for the DSR extraction, and one for the agent ...?

Sure, I will reorganize my commits.

My initial expectation was that we would encapsulate the whole DSR connection in Geneve, not just the TCP-SYN. Seeing this mixed pattern in network traces will be fun when debugging :).

I'm trying to mitigate the MTU overhead, but it might be a premature optimization. What do you think? Should we encapsulate the whole DSR connection and keep it simple?

I implemented this because

bpf/bpf_host.c Outdated Show resolved Hide resolved
@julianwiedmann
Copy link
Member

My initial expectation was that we would encapsulate the whole DSR connection in Geneve, not just the TCP-SYN. Seeing this mixed pattern in network traces will be fun when debugging :).

I'm trying to mitigate the MTU overhead, but it might be a premature optimization. What do you think? Should we encapsulate the whole DSR connection and keep it simple?

I implemented this because

* The current IP option-based DSR is doing the same optimization

* There's a need to mitigate tunneling overhead. e.g. [CFP: Implement hybrid tunnel mode  #22898](https://github.com/cilium/cilium/issues/22898)

Avoiding the MTU overhead is definitely a good motivation for this approach, agreed.

Besides the general "packet flow won't look homogenous" thought, I'm a bit concerned about encountering connection-tracking in middle boxes, and them not seeing our TCP-SYN (as it's encapsulated).

@ysksuzuki
Copy link
Member Author

I'm a bit concerned about encountering connection-tracking in middle boxes, and them not seeing our TCP-SYN (as it's encapsulated).

Yeah, I'm worried about it too. So how about this, we encapsulate the whole connection and keep it simple for now, and save this approach until we get feedback from users that the MTU overhead is pretty high and want to avoid it.

@julianwiedmann
Copy link
Member

I'm a bit concerned about encountering connection-tracking in middle boxes, and them not seeing our TCP-SYN (as it's encapsulated).

Yeah, I'm worried about it too. So how about this, we encapsulate the whole connection and keep it simple for now, and save this approach until we get feedback from users that the MTU overhead is pretty high and want to avoid it.

Yep, sounds good!

Copy link
Member

@julianwiedmann julianwiedmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ysksuzuki ! Just a few smaller things, still need to digest the rest. So maybe hold off for a bit ...

Do we need to reduce the MTU accordingly to the max GENEVE option-size? 🤔

bpf/lib/common.h Outdated Show resolved Hide resolved
bpf/lib/nodeport.h Outdated Show resolved Hide resolved
bpf/lib/nodeport.h Outdated Show resolved Hide resolved
bpf/lib/encap.h Outdated Show resolved Hide resolved
@ysksuzuki
Copy link
Member Author

we're trying to get away from ginkgo-based testing

I didn't know about it. Do you have an issue or design docs about it? I'm happy to work on testing DSR with Geneve functionality using the new test framework.

@ysksuzuki ysksuzuki force-pushed the dsr-geneve-on-22978 branch 2 times, most recently from 345dd9b to e951684 Compare April 1, 2023 15:19
@julianwiedmann
Copy link
Member

/test

@ysksuzuki
Copy link
Member Author

k8s-1.26-kernel-net-next failed. Flake: #24687
https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/1614/testReport/junit/Suite-k8s-1/26/K8sUpdates_Tests_upgrade_and_downgrade_from_a_Cilium_stable_image_to_master/

runtime failed. Flake: #24326
https://jenkins.cilium.io/job/Cilium-PR-Runtime-net-next/5851/testReport/junit/(root)/Suite-runtime/RuntimeDatapathPrivilegedUnitTests_Run_Tests/

	 [13:40:21] FAIL: node_linux_test.go:2247: linuxPrivilegedIPv4OnlyTestSuite.TestArpPingHandling
	 [13:40:21] 
	 [13:40:21] node_linux_test.go:2451:
	 [13:40:21]     // deleteNeighbor is invoked async too
	 [13:40:21]     wait(nodev1.Identity(), "veth0", nil, true)
	 [13:40:21] node_linux_test.go:2368:
	 [13:40:21]     c.Assert(err, check.IsNil)
	 [13:40:21] ... value *errors.errorString = &errors.errorString{s:"timeout reached while waiting for condition"} ("timeout reached while waiting for condition")

@ysksuzuki
Copy link
Member Author

Need to rebase or just rerun?

This commit encapsulates the DSR flow with Geneve, inserts DSR specific
option, the service IP/port, into the Geneve option and redirects to
the selected backend. Currently, it doesn't support IPv6 with XDP.

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
This commit optimizes the tunneling overhead by only adding the GENEVE
option to the TCP-SYN packets, sending the whole connection over
the tunnel

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
This commit checks the expanded packet length when encapsulating the dsr
flow with GENEVE header and if it exceeds the device MTU, then returns
ICMP Destination Unreachable/Fragmentation needed.

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
Let the dsr code extract the DSR specific option, the service IP/port,
from the Geneve header and integrate with the DSR implementation on the
receiver side.

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
This commit adds DSR with Geneve dispatch mode. Geneve dispatch mode is
compatible with the Geneve tunneling mode, which means that it works with
either the direct routing mode or the Geneve tunneling mode. but unfortunately,
it doesn't work with vxlan tunnel because there's no space to store service
IP/port in the vxlan header.

Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
Copy link
Contributor

@ldelossa ldelossa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work.

@joestringer
Copy link
Member

/test

@ysksuzuki
Copy link
Member Author

All Green!

@julianwiedmann
Copy link
Member

@ldelossa was requested for cilium/cli, but for some reason GH didn't resolve that codeowner on his review 🤨.

All reviews in & tests pass, off we go.

@julianwiedmann julianwiedmann added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 4, 2023
@squeed squeed merged commit 40d9787 into cilium:master Apr 4, 2023
57 checks passed
@joestringer
Copy link
Member

@ldelossa was requested for cilium/cli, but for some reason GH didn't resolve that codeowner on his review

Maybe he removed himself from cilium/cli? I don't see him in that github team currently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/major This PR introduces major new functionality to Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CFP: Support DSR with Geneve in CNI mode
10 participants