test: add cluster mesh conformance tests with Kind #23496
I'm marking this ready for review, since all new tests passed: https://github.com/cilium/cilium/actions/runs/4235266535/jobs/7358672515 I'm also adding the @reviewers. I personally feel it could be appropriate to overwrite the old clustermesh tests (based on GCP), rather than creating a new suite alongside them. Still, for now I've kept them separate for ease of review. Let me know your preference.
This is awesome ✔️
LGTM assuming the second commit is removed.
Is the plan to remove the existing clustermesh workflow once this one is deemed stable?
Personally I would do so. I wonder if it is better to remove that in this PR or in a later one.
I would remove it in a subsequent PR after giving it a bit of time. New tests are often more flaky than existing ones, so it's maybe best to wait a bit and see.
IIUC, the IPv6 tests were fixed by our PR cilium/cilium-cli#1414. If yes, we could tag cilium-cli v0.13.1 with that fix included. The other option would be to temporarily remove the and remember to add it back once cilium-cli is fixed.
@tklauser IMHO tagging cilium-cli v0.13.1 with the fix would be the best solution, if possible. Then I could also drop the workarounds that are currently present due to failing tests in IPv6-only clusters. Otherwise I'll drop the renovate comment.
We'll probably tag v0.13.1 later this week. Currently waiting for a few PRs to get reviewed and merged.
Last push bumped the Cilium CLI version to the newly released v0.13.1, and removed the associated temporary fixes.
This commit introduces the new kind-based conformance tests for cluster mesh, implementing a test matrix to validate (a subset of) the following combinations:
* encryption: "none" | "ipsec" | "wireguard"
* tunnel: "disabled" | "vxlan"
* ipfamily: "ipv4-only" | "dual-stack" | "ipv6-only"
* kube-proxy: "iptables" | "kpr"
Additional context can be found in the enhancement proposal: #23322
Co-authored-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
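For a sense of scale, the full cross product of the four dimensions named in the commit message can be enumerated with a few nested loops. This is only an illustrative sketch: the function name `matrix_combinations` is hypothetical, the values are copied from the commit message, and the actual workflow validates only a subset of these rows.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every combination of the four matrix dimensions
# listed in the commit message, one combination per line.
matrix_combinations() {
  local encryption tunnel ipfamily kubeproxy
  for encryption in none ipsec wireguard; do
    for tunnel in disabled vxlan; do
      for ipfamily in ipv4-only dual-stack ipv6-only; do
        for kubeproxy in iptables kpr; do
          printf '%s %s %s %s\n' "$encryption" "$tunnel" "$ipfamily" "$kubeproxy"
        done
      done
    done
  done
}

# Count the rows of the full cross product (3 * 2 * 3 * 2 combinations).
matrix_combinations | wc -l
```

The full product has 36 rows, which is one reason to test a hand-picked subset rather than every combination on each run.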
Changes since last reviews:
--- a/.github/workflows/conformance-clustermesh.yaml
+++ b/.github/workflows/conformance-clustermesh.yaml
@@ -274,7 +274,8 @@ jobs:
CLUSTERMESH_ENABLE_DEFAULTS="--apiserver-image=quay.io/${{ env.QUAY_ORGANIZATION_DEV }}/clustermesh-apiserver-ci \
--apiserver-version=${SHA} --service-type=NodePort"
- CONNECTIVITY_TEST_DEFAULTS="--context=${{ env.contextName1 }} \
+ CONNECTIVITY_TEST_DEFAULTS="--hubble=false \
+ --flow-validation=disabled \
--multi-cluster=${{ env.contextName2 }} \
--external-target=google.com \
--collect-sysdump-on-failure"
--- a/.github/workflows/conformance-clustermesh.yaml
+++ b/.github/workflows/conformance-clustermesh.yaml
config: ./.github/kind-config-cluster2.yaml
wait: 0 # The control-plane never becomes ready, since no CNI is present
+ # Make sure that coredns uses IPv4-only upstream DNS servers also in case of clusters
+ # with IP family dual, since IPv6 ones are not reachable and cause spurious failures.
+ - name: Configure the coredns nameservers
+ if: matrix.ipfamily == 'dual'
+ run: |
+ COREDNS_PATCH="
+ spec:
+ template:
+ spec:
+ dnsPolicy: None
+ dnsConfig:
+ nameservers:
+ - 8.8.4.4
+ - 8.8.8.8
+ "
+
+ kubectl --context ${{ env.contextName1 }} patch deployment -n kube-system coredns --patch="$COREDNS_PATCH"
+ kubectl --context ${{ env.contextName2 }} patch deployment -n kube-system coredns --patch="$COREDNS_PATCH"
+
- name: Wait for images to be available
timeout-minutes: 10
shell: bash
Link to successful run: https://github.com/cilium/cilium/actions/runs/4342212579
@tklauser, @pchaigno, @sayboras, @aanm, @YutaroHayakawa Please have a final look if you'd like. From my side, this is ready to be marked as ready to merge.
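To confirm that the coredns patch from the diff above took effect, the deployment can be queried with `kubectl` and a JSONPath template. The sketch below is not part of the workflow: the helper names `coredns_dns_settings` and `coredns_is_patched` are hypothetical, and the expected string assumes the nameservers are reported in the order given in the patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the dnsPolicy followed by the configured
# nameservers of the coredns deployment in the given kubeconfig context.
coredns_dns_settings() {
  local context="$1"
  kubectl --context "$context" -n kube-system get deployment coredns \
    -o jsonpath='{.spec.template.spec.dnsPolicy}{" "}{.spec.template.spec.dnsConfig.nameservers[*]}{"\n"}'
}

# Hypothetical check: succeed only if the settings match the patch
# (dnsPolicy None, IPv4-only upstream servers 8.8.4.4 and 8.8.8.8).
coredns_is_patched() {
  [ "$(coredns_dns_settings "$1")" = "None 8.8.4.4 8.8.8.8" ]
}
```

It could be called once per cluster, for example `coredns_is_patched "$CONTEXT_NAME"`, where `CONTEXT_NAME` stands in for the context names the workflow keeps in `env.contextName1` and `env.contextName2`.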
Ingress-related tests failed due to known flake #23960. I'm not re-running them, since this PR does not introduce any code changes.
It should be fixed in master now, but we don't need to re-run it here.
Still LGTM ✔️, this is good stuff 😲
All required reviews are in, and test failures are unrelated, since this PR is only introducing new tests.
Looks good to me; the issue with the Ingress GHA failure is already addressed in the master branch.
The new kind-based clustermesh workflow introduced in cilium#23496 has now run for a while in parallel with the old one, and it seems comparably stable while covering more scenarios. Hence, let's drop the old one. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit adds a copy of the new clustermesh conformance test initially introduced in #23496 for each of the currently active stable branches (i.e., v1.11, v1.12 and v1.13). Trigger phrases and Kubernetes versions are updated accordingly. In the case of v1.11, the direct-routing IPv6 and IPv4+IPv6 tests have been disabled, since that configuration is currently affected by a bug causing incorrect masquerading of traffic directed to remote pods. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
do we want to backport this to stable branches? 👀 edit: ... never mind, I see we already have workflows for stable branches ✅
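This PR introduces the new kind-based conformance tests for cluster mesh, implementing a test matrix to validate (a subset of) the following combinations:
* encryption: "none" | "ipsec" | "wireguard"
* tunnel: "disabled" | "vxlan"
* ipfamily: "ipv4-only" | "dual-stack" | "ipv6-only"
* kube-proxy: "iptables" | "kpr"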
Additional context can be found in the enhancement proposal: #23322
Fixes: #23322
Fixes: #9994
Co-authored-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>