Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contrib/kind: enable XDP_TX from pod veth #24250

Merged
merged 1 commit into from Mar 22, 2023
Merged

Conversation

lmb
Copy link
Contributor

@lmb lmb commented Mar 8, 2023

Both veth in a pair require an XDP program installed for XDP_TX
to work. Since the host side veth created by kind doesn't have
an XDP program attached we can't run any tests in CI that require
XDP_TX.

The workaround itself is just an ip link set and ethtool away,
the problem is figuring out which interfaces we need to do the
magic to.

Use the approach used by kind-network-plugins and create our own
docker network with a specific name for the bridge device. We
can then iterate all children of the bridge and do our fixups.

We tell kind to use our own network by setting the (undocumented?)
KIND_EXPERIMENTAL_DOCKER_NETWORK environment variable.

See https://github.com/aojea/kind-networking-plugins

Required for #24151.

Enable testing of BPF programs requiring XDP_TX in CI

@lmb lmb added area/CI Continuous Integration testing issue or flake release-note/ci This PR makes changes to the CI. labels Mar 8, 2023
@lmb
Copy link
Contributor Author

lmb commented Mar 8, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 8, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 8, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 8, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 9, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 13, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 14, 2023

/ci-datapath

1 similar comment
@lmb
Copy link
Contributor Author

lmb commented Mar 14, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 14, 2023

/ci-datapath

@brb
Copy link
Member

brb commented Mar 14, 2023

Please rebase your PR against the latest master. Also, you might want to update .github/workflows/conformance-datapath.yaml instead of the v1.13 equivalent.

@lmb
Copy link
Contributor Author

lmb commented Mar 20, 2023

/ci-datapath

Both veth in a pair require an XDP program installed for XDP_TX
to work. Since the host side veth created by kind doesn't have
an XDP program attached we can't run any tests in CI that require
XDP_TX.

The workaround itself is just an ip link set and ethtool away,
the problem is figuring out which interfaces we need to do the
magic to.

Use the approach used by kind-network-plugins and create our own
docker network with a specific name for the bridge device. We
can then iterate all children of the bridge and do our fixups.

We tell kind to use our own network by setting the (undocumented?)
KIND_EXPERIMENTAL_DOCKER_NETWORK environment variable.

See https://github.com/aojea/kind-networking-plugins

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
@lmb
Copy link
Contributor Author

lmb commented Mar 20, 2023

/ci-datapath

@lmb
Copy link
Contributor Author

lmb commented Mar 20, 2023

I think this works: https://github.com/cilium/cilium/actions/runs/4466990416/jobs/7845896844?pr=24250

The master ci-datapath is failing due to an issue with cilium-cli. P.S.: See #24470

@lmb lmb marked this pull request as ready for review March 20, 2023 13:23
@lmb lmb requested review from a team as code owners March 20, 2023 13:23
@lmb lmb requested review from a team as code owners March 20, 2023 13:23
@lmb lmb requested review from aspsk, nebril and nbusseneau March 20, 2023 13:23
@@ -286,15 +286,15 @@ jobs:
provision: 'false'
cmd: |
cd /host/
./contrib/scripts/kind.sh "" 3 "" "" "${{ matrix.kube-proxy }}" "dual"
./contrib/scripts/kind.sh --xdp "" 3 "" "" "${{ matrix.kube-proxy }}" "dual"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if changing it for 1.13 is sensible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#24470 got merged, so the master failure should be fixed. We can change the v1.13 later on.

Anyway, I'd suggest to add xdp: true to the configuration matrix, and then conditionally to enable it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a downside to always enabling it on CI? My thinking was that I'd add something like lb-acceleration: testing-only to the matrix instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any downside enabling it unconditionally, so yeah I'm fine either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change in 1.13 workflows breaks datapath test in 1.13 backports: https://github.com/cilium/cilium/actions/runs/4544793510

Turns out backporting just this PR to fix it is not enough. Thus, I've sent a partial revert for now: #24611

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to back out this change.

Copy link
Member

@nebril nebril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI changes LGTM, on question left inline.

@@ -389,7 +389,7 @@ jobs:
--sysdump-output-filename "cilium-sysdump-${{ matrix.name }}-<ts>"
./cilium-cli connectivity test --collect-sysdump-on-failure \
--sysdump-output-filename "cilium-sysdump-${{ matrix.name }}-<ts>"
kind delete cluster
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we even need to delete the cluster here. This is the last job in the workflow AFAIU, so we can just skip deletion as node is going to be discarded anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the workflow well enough to make that call. Would it not affect the Fetch artifacts step below it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Fetch artifacts step is going to be executed only if any of the connectivity tests above fails. In that case, the kind delete will be bypassed.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmb Could you remove the test commit? Once it's removed, I will ACK and mark as ready-to-merge.

@lmb
Copy link
Contributor Author

lmb commented Mar 22, 2023

Done.

Copy link
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@brb brb added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 22, 2023
@youngnick youngnick merged commit 4310b5e into master Mar 22, 2023
42 checks passed
@youngnick youngnick deleted the lmb/kind-enable-xdp-tx branch March 22, 2023 23:12
@tklauser tklauser added the needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch label Mar 28, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.13.2 Mar 28, 2023
@tklauser tklauser removed the needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch label Mar 30, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Needs backport from master in 1.13.2 Mar 30, 2023
@julianwiedmann
Copy link
Member

Innocent question - should we also wire this up into the various kind targets, or is there a downside to having it enabled by default?

@lmb
Copy link
Contributor Author

lmb commented Apr 3, 2023

That's what I had done initially, but it's kind of difficult to figure out whether this would break anybodies workflow. If you want to try you could add --xdp to make kind and then ask people to try that branch locally?

@brb brb added the needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch label May 12, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 12, 2023
@brb brb mentioned this pull request May 12, 2023
2 tasks
@brb brb added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels May 12, 2023
@julianwiedmann julianwiedmann added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Jul 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. release-note/ci This PR makes changes to the CI.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants