GHA: Add clustermesh upgrade and downgrade tests #27232

Merged: 2 commits, Oct 10, 2023

@giorio94 giorio94 commented Aug 3, 2023

This PR introduces a new GitHub Actions workflow which validates the Cilium upgrade and downgrade paths when clustermesh is enabled, ensuring that this process does not disrupt long-lived connections.

More specifically, the test initially deploys a mixed-version cluster mesh, with one cluster running the latest stable version of Cilium and the other running the current tip of main. It performs a subset of connectivity tests and deploys an application sensitive to connection disruption. At this point, the first cluster is upgraded to the tip of main, enabling kvstoremesh at the same time. Connectivity tests are executed again, additionally checking that no long-lived connection was dropped. Finally, the first cluster is downgraded back to the latest stable version of Cilium, disabling kvstoremesh. Connectivity tests and connection disruption checks are executed one more time.

To reduce the total amount of time required by this workflow, only a limited subset of tests is enabled, while the full suite is run in the conformance-clustermesh workflow.
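
For readers skimming the description, the sequence of phases can be sketched roughly as below. This is only an illustrative model in Python, not the workflow itself: the `ClusterState`, `Phase`, and `build_plan` names are hypothetical, the `"stable"`/`"main"` labels stand in for the actual versions, and the assumption that kvstoremesh starts out disabled in the first cluster is inferred from the upgrade step enabling it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """State of the first cluster (hypothetical model, not a Cilium API)."""
    version: str       # "stable" or "main"; illustrative labels only
    kvstoremesh: bool  # whether kvstoremesh is enabled


@dataclass(frozen=True)
class Phase:
    name: str
    cluster1: ClusterState
    cluster2_version: str   # the second cluster stays on main throughout
    check_disruption: bool  # verify that no long-lived connection dropped


def build_plan() -> list[Phase]:
    """Return the three phases described in the PR text, in order."""
    # Assumption: kvstoremesh is disabled in the first cluster at the start.
    stable = ClusterState(version="stable", kvstoremesh=False)
    upgraded = ClusterState(version="main", kvstoremesh=True)
    return [
        # Mixed-version mesh; the disruption-sensitive app is deployed here,
        # so there is nothing to check for drops yet.
        Phase("initial", stable, "main", check_disruption=False),
        # Upgrade the first cluster to main and enable kvstoremesh.
        Phase("upgrade", upgraded, "main", check_disruption=True),
        # Downgrade back to stable and disable kvstoremesh.
        Phase("downgrade", stable, "main", check_disruption=True),
    ]


if __name__ == "__main__":
    for phase in build_plan():
        print(phase.name, phase.cluster1, "disruption check:", phase.check_disruption)
```

Each phase is followed by the reduced subset of connectivity tests; only the phases after the disruption-sensitive application has been deployed include the dropped-connection check.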

Link to successful run: https://github.com/cilium/cilium/actions/runs/6379977851

@giorio94 giorio94 added area/CI Continuous Integration testing issue or flake area/clustermesh Relates to multi-cluster routing functionality in Cilium. release-note/ci This PR makes changes to the CI. labels Aug 3, 2023
@giorio94 giorio94 self-assigned this Aug 7, 2023
@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch 4 times, most recently from 7616fb5 to 70059c4 Compare August 14, 2023 08:59
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from 70059c4 to 6430f43 Compare August 21, 2023 12:50
@giorio94

/ci-multicluster

@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from 14a0a78 to 6770c5b Compare August 21, 2023 13:29
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from 6770c5b to 7290657 Compare August 21, 2023 13:49
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from c960d3e to 67215a2 Compare August 21, 2023 14:13
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from 67215a2 to 5eea16a Compare August 21, 2023 14:35
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from 5eea16a to c514a96 Compare August 21, 2023 14:44
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from c514a96 to 2282ab9 Compare August 21, 2023 15:45
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from 2282ab9 to e53397d Compare August 21, 2023 16:33
@giorio94

/ci-multicluster

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from e53397d to 187a75c Compare August 22, 2023 08:38
@giorio94

giorio94 commented Oct 4, 2023

@aanm @brb I'd personally propose to merge this PR, to start collecting signals about the new workflow and prevent regressions, and re-evaluate the usage of the --short suite in a subsequent phase. This workflow with the current configuration completed in ~16 minutes, which is comparable to the other ones. Does it make sense to you?

@giorio94 giorio94 added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 4, 2023
@giorio94

giorio94 commented Oct 4, 2023

Marking as ready to merge, as approvals are in, and this PR only touches a GHA workflow (which passed successfully).

@giorio94 giorio94 removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 5, 2023
@giorio94 giorio94 marked this pull request as draft October 5, 2023 07:50
@giorio94

giorio94 commented Oct 5, 2023

Converting back to draft while integrating with #28180

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from dbd29dc to c5888ef Compare October 6, 2023 07:31
@giorio94

giorio94 commented Oct 6, 2023

/ci-clustermesh

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from a10cfc6 to e0ab9d5 Compare October 6, 2023 07:36
@giorio94

giorio94 commented Oct 6, 2023

/ci-clustermesh

@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from e0ab9d5 to 976c924 Compare October 6, 2023 08:12
@giorio94

giorio94 commented Oct 6, 2023

/ci-clustermesh

@giorio94

giorio94 commented Oct 6, 2023

Converting back to draft while integrating with #28180

@aanm PTAL if the integration looks OK to you. I'm keeping this one in draft until #28434 gets merged, as it currently also includes those commits.

@aanm aanm left a comment

Perfect

.github/actions/helm-default/action.yaml (outdated, resolved)
@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch 2 times, most recently from b411212 to e5b6752 Compare October 6, 2023 13:46
@giorio94

giorio94 commented Oct 6, 2023

/test

Allow overriding the default chart-dir, in case it is located
outside of the standard path. This can happen, for instance, when
checking out multiple Cilium versions in upgrade/downgrade workflows.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit introduces a new GitHub Actions workflow which validates the
Cilium upgrade and downgrade paths when clustermesh is enabled, ensuring
that this process does not disrupt long-lived connections.

More specifically, the test initially deploys a mixed version cluster
mesh, with one cluster running Cilium from the tip of the latest stable
branch, and the other the current tip of main. It performs a subset of
connectivity tests and deploys the application sensitive to connection
disruption.

At this point, the first cluster is upgraded to the tip of main,
enabling kvstoremesh at the same time. Connectivity tests are executed
again, checking that no long-lived connection was dropped. As an
additional stress test, the clustermesh-apiserver deployment in both
clusters is scaled to zero replicas, all agents are restarted, and the
deployment is then scaled back to one replica, while checking that no
long-lived connection was dropped.

Finally, the first cluster is downgraded again to the tip of the
latest stable version, disabling kvstoremesh. Connectivity tests and
connection disruption checks are executed one more time.

To reduce the total amount of time required by this workflow, only a
limited subset of tests is enabled, while the full suite is run in
the conformance-clustermesh workflow.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
@giorio94 giorio94 force-pushed the pr/giorio94/gha-clustermesh-upgrade branch from e5b6752 to a459251 Compare October 9, 2023 08:55
@giorio94

giorio94 commented Oct 9, 2023

Rebased onto main to pick CI fixes.

@giorio94

giorio94 commented Oct 9, 2023

/test

@giorio94

Travis CI hit #24216

@giorio94

Marking as ready to merge: reviews are in, and the CI failures are unrelated, as no code changes are introduced.

@giorio94 giorio94 added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 10, 2023
@squeed squeed merged commit a608db6 into main Oct 10, 2023
207 of 209 checks passed
@squeed squeed deleted the pr/giorio94/gha-clustermesh-upgrade branch October 10, 2023 08:32
Labels
area/CI Continuous Integration testing issue or flake area/clustermesh Relates to multi-cluster routing functionality in Cilium. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/ci This PR makes changes to the CI.
4 participants