contrib/kind: enable XDP_TX from pod veth #24250
Conversation
/ci-datapath
force-pushed from 1df3d5d to e9806a8
/ci-datapath
force-pushed from e9806a8 to 5c43611
/ci-datapath
force-pushed from 5c43611 to f695fbc
/ci-datapath
force-pushed from f695fbc to 033f6b7
/ci-datapath
force-pushed from 033f6b7 to d17154e
/ci-datapath
force-pushed from d17154e to b3c11cd
/ci-datapath
1 similar comment
/ci-datapath
force-pushed from b3c11cd to cd67d50
/ci-datapath
Please rebase your PR against the latest master. Also, you might want to update |
force-pushed from cd67d50 to b6882ec
/ci-datapath
Both veth in a pair require an XDP program installed for XDP_TX to work. Since the host side veth created by kind doesn't have an XDP program attached, we can't run any tests in CI that require XDP_TX.

The workaround itself is just an ip link set and ethtool away; the problem is figuring out which interfaces we need to do the magic to.

Use the approach used by kind-network-plugins and create our own docker network with a specific name for the bridge device. We can then iterate all children of the bridge and do our fixups. We tell kind to use our own network by setting the (undocumented?) KIND_EXPERIMENTAL_DOCKER_NETWORK environment variable.

See https://github.com/aojea/kind-networking-plugins

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
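The docker-network trick described above can be sketched roughly as follows. The network name `kind-xdp` and bridge name `br-kind-xdp` are illustrative, not the names the PR actually uses:

```shell
# Create a docker network whose Linux bridge device gets a
# predictable name (com.docker.network.bridge.name is a
# standard docker bridge-driver option).
docker network create \
  --driver bridge \
  --opt com.docker.network.bridge.name=br-kind-xdp \
  kind-xdp

# Point kind at our network instead of the default "kind"
# network via the (undocumented) environment variable.
export KIND_EXPERIMENTAL_DOCKER_NETWORK=kind-xdp
kind create cluster
```

Because the bridge device name is now known in advance, the host-side veth interfaces can be found by listing the bridge's ports rather than guessing.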
force-pushed from b6882ec to d9be701
/ci-datapath
I think this works: https://github.com/cilium/cilium/actions/runs/4466990416/jobs/7845896844?pr=24250 The master ci-datapath is failing due to an issue with cilium-cli. P.S.: See #24470
@@ -286,15 +286,15 @@ jobs:
           provision: 'false'
           cmd: |
             cd /host/
-            ./contrib/scripts/kind.sh "" 3 "" "" "${{ matrix.kube-proxy }}" "dual"
+            ./contrib/scripts/kind.sh --xdp "" 3 "" "" "${{ matrix.kube-proxy }}" "dual"
Not sure if changing it for 1.13 is sensible?
#24470 got merged, so the master failure should be fixed. We can change the v1.13 later on.
Anyway, I'd suggest adding xdp: true to the configuration matrix, and then enabling it conditionally.
Is there a downside to always enabling it on CI? My thinking was that I'd add something like lb-acceleration: testing-only to the matrix instead.
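For reference, the two options being discussed could look roughly like this in the workflow. The matrix entry names and the conditional expression are hypothetical, not taken from the actual workflow file:

```yaml
strategy:
  matrix:
    include:
      # option 1: a boolean flag, enabled only for selected entries
      - name: 'dual-xdp'
        kube-proxy: 'none'
        xdp: true
      # option 2: an acceleration mode that only affects testing
      - name: 'dual-lb-accel'
        kube-proxy: 'none'
        lb-acceleration: 'testing-only'
steps:
  - name: Create kind cluster
    run: |
      # pass --xdp only when the matrix entry opts in
      ./contrib/scripts/kind.sh ${{ matrix.xdp && '--xdp' || '' }} "" 3 "" "" "${{ matrix.kube-proxy }}" "dual"
```

Either way, the per-entry matrix key keeps the XDP fixups scoped to the configurations that actually exercise XDP_TX.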
I don't see any downside to enabling it unconditionally, so yeah, I'm fine either way.
This change in 1.13 workflows breaks datapath test in 1.13 backports: https://github.com/cilium/cilium/actions/runs/4544793510
Turns out backporting just this PR to fix it is not enough. Thus, I've sent a partial revert for now: #24611
Sorry, I forgot to back out this change.
CI changes LGTM, one question left inline.
@@ -389,7 +389,7 @@ jobs:
             --sysdump-output-filename "cilium-sysdump-${{ matrix.name }}-<ts>"
           ./cilium-cli connectivity test --collect-sysdump-on-failure \
             --sysdump-output-filename "cilium-sysdump-${{ matrix.name }}-<ts>"
           kind delete cluster
I wonder if we even need to delete the cluster here. This is the last job in the workflow AFAIU, so we can just skip deletion as node is going to be discarded anyway.
I don't understand the workflow well enough to make that call. Would it not affect the Fetch artifacts step below it?
The Fetch artifacts step is only executed if any of the connectivity tests above fail. In that case, the kind delete will be bypassed.
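The ordering being described can be expressed with step-level conditions, roughly like this (step names and commands are illustrative, not copied from the workflow):

```yaml
- name: Run connectivity test
  run: ./cilium-cli connectivity test --collect-sysdump-on-failure

- name: Delete kind cluster
  # runs only when all previous steps succeeded (the default condition)
  run: kind delete cluster

- name: Fetch artifacts
  # runs only when some previous step failed, in which case the
  # delete step above was already skipped
  if: failure()
  run: kubectl get pods --all-namespaces -o wide
```

So deleting the cluster and fetching artifacts are mutually exclusive: on failure the cluster is left in place for the artifact collection to inspect.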
@lmb Could you remove the test commit? Once it's removed, I will ACK and mark as ready-to-merge.
force-pushed from d9be701 to 19d3892
Done.
🚀
Innocent question - should we also wire this up into the various |
That's what I had done initially, but it's kind of difficult to figure out whether this would break anybody's workflow. If you want to try you could add |
Both veth in a pair require an XDP program installed for XDP_TX
to work. Since the host side veth created by kind doesn't have
an XDP program attached we can't run any tests in CI that require
XDP_TX.
The workaround itself is just an ip link set and ethtool away,
the problem is figuring out which interfaces we need to do the
magic to.
Use the approach used by kind-network-plugins and create our own
docker network with a specific name for the bridge device. We
can then iterate all children of the bridge and do our fixups.
We tell kind to use our own network by setting the (undocumented?)
KIND_EXPERIMENTAL_DOCKER_NETWORK environment variable.
See https://github.com/aojea/kind-networking-plugins
Required for #24151.
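The "ip link set and ethtool away" fixup could look something like the sketch below. The bridge name, the XDP object file, and the exact offload toggled are all assumptions for illustration; the real script in the PR may differ:

```shell
# Hypothetical fixup loop: for every veth port attached to our
# kind bridge, attach a trivial pass-all XDP program so that the
# pod-side peer can use XDP_TX, and adjust offloads with ethtool.
bridge="br-kind-xdp"          # bridge created for the kind docker network

# /sys/class/net/<bridge>/brif lists the bridge's port interfaces
for iface in $(ls "/sys/class/net/${bridge}/brif"); do
  # attach a minimal XDP program (e.g. one that just returns XDP_PASS);
  # xdp_pass.o is a placeholder object file name
  ip link set dev "$iface" xdpgeneric obj xdp_pass.o sec xdp

  # example offload toggle; the specific features to disable depend
  # on what the datapath tests require
  ethtool -K "$iface" tx-checksum-ip-generic off
done
```

Iterating the bridge's brif directory is what makes the predictable bridge name valuable: without it there is no reliable way to tell kind's host-side veths apart from any other veth on the machine.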