GHA: Add clustermesh upgrade and downgrade tests #27232
7616fb5 to 70059c4
/ci-multicluster
70059c4 to 6430f43
/ci-multicluster
6430f43 to 14a0a78
/ci-multicluster
14a0a78 to 6770c5b
/ci-multicluster
6770c5b to 7290657
/ci-multicluster
c960d3e to 67215a2
/ci-multicluster
67215a2 to 5eea16a
/ci-multicluster
5eea16a to c514a96
/ci-multicluster
c514a96 to 2282ab9
/ci-multicluster
2282ab9 to e53397d
/ci-multicluster
e53397d to 187a75c
@aanm @brb I'd personally propose to merge this PR, to start collecting signals about the new workflow and prevent regressions, and re-evaluate the usage of the |
Marking as ready to merge, as approvals are in, and this PR only touches a GHA workflow (which passed successfully).
Converting back to draft while integrating with #28180.
dbd29dc to c5888ef
/ci-clustermesh
a10cfc6 to e0ab9d5
/ci-clustermesh
e0ab9d5 to 976c924
/ci-clustermesh
Perfect
b411212 to e5b6752
/test
Let's allow overriding the default chart-dir, in case the chart is located outside of the standard path. This can happen, for instance, when checking out multiple Cilium versions in upgrade/downgrade workflows.
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
This commit introduces a new GitHub Actions workflow which validates the Cilium upgrade and downgrade paths when clustermesh is enabled, ensuring that this process does not disrupt long-lived connections.
More specifically, the test initially deploys a mixed-version cluster mesh, with one cluster running Cilium from the tip of the latest stable branch, and the other the current tip of main. It performs a subset of connectivity tests and deploys the application sensitive to connection disruption.
At this point, the first cluster is upgraded to the tip of main, enabling kvstoremesh at the same time. Connectivity tests are executed again, checking that no long-lived connection was dropped.
As an additional stress test, the clustermesh-apiserver deployment in both clusters is scaled to zero replicas, all agents are restarted, and the deployment is then scaled back to one replica, while checking that no long-lived connection was dropped.
Finally, the first cluster is downgraded again to the tip of the latest stable version, disabling kvstoremesh. Connectivity tests and connection disruption checks are executed one more time.
To reduce the total amount of time required by this workflow, only a limited subset of tests is enabled, while the full suite is run in the conformance-clustermesh workflow.
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
e5b6752 to a459251
Rebased onto main to pick up CI fixes.
/test
Travis CI hit #24216.
Marking as ready to merge: reviews are in, and the CI failures are unrelated, as no code changes are introduced.
This PR introduces a new GitHub Actions workflow which validates the Cilium upgrade and downgrade paths when clustermesh is enabled, ensuring that this process does not disrupt long-lived connections.
More specifically, the test initially deploys a mixed-version cluster mesh, with one cluster running the latest stable version of Cilium, and the other the current tip of main. It performs a subset of connectivity tests and deploys the application sensitive to connection disruption. At this point, the first cluster is upgraded to the tip of main, enabling kvstoremesh at the same time. Connectivity tests are executed again, additionally checking that no long-lived connection was dropped. Finally, the first cluster is downgraded again to the latest stable version of Cilium, disabling kvstoremesh. Connectivity tests and connection disruption checks are executed one more time.
To reduce the total amount of time required by this workflow, only a limited subset of tests is enabled, while the full suite is run in the conformance-clustermesh workflow.
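For readers skimming the description, the phase sequence can be sketched as plain data. This is only an illustrative model, not the workflow itself: the `Phase` class, the `STABLE`/`MAIN` labels, and the step strings are hypothetical names, and the sketch assumes kvstoremesh starts out disabled on the first cluster, which the description implies but does not state outright.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical labels for illustration; the real workflow resolves concrete
# image tags and chart locations for each version.
STABLE = "latest-stable"
MAIN = "main"


@dataclass(frozen=True)
class Phase:
    name: str
    cluster1: str            # Cilium version running in the first cluster
    cluster2: str            # Cilium version running in the second cluster
    kvstoremesh: bool        # kvstoremesh setting applied to the first cluster
    check_disruption: bool   # verify that no long-lived connection was dropped


def build_plan() -> List[Phase]:
    """Return the three phases in the order the description lists them."""
    return [
        # Assumption: kvstoremesh is disabled before the upgrade enables it.
        Phase("initial", STABLE, MAIN, kvstoremesh=False, check_disruption=False),
        Phase("upgrade", MAIN, MAIN, kvstoremesh=True, check_disruption=True),
        Phase("downgrade", STABLE, MAIN, kvstoremesh=False, check_disruption=True),
    ]


def steps_for(phase: Phase) -> List[str]:
    """List the checks run once the phase's versions are in place."""
    steps = ["run connectivity test subset"]
    if phase.check_disruption:
        steps.append("check long-lived connections were not dropped")
    else:
        # The disruption-sensitive application is only deployed once, up front.
        steps.append("deploy connection-disruption test application")
    return steps


if __name__ == "__main__":
    for phase in build_plan():
        print(f"{phase.name}: cluster1={phase.cluster1}, cluster2={phase.cluster2}, "
              f"kvstoremesh={phase.kvstoremesh}")
        for step in steps_for(phase):
            print(f"  - {step}")
```

Only the first cluster changes version across phases; the second stays on main throughout, so the mesh is mixed-version at the start and again after the downgrade.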
Link to successful run: https://github.com/cilium/cilium/actions/runs/6379977851