New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow operators to settle after MCPs are done upgrading #25755
allow operators to settle after MCPs are done upgrading #25755
Conversation
test/e2e/upgrade/upgrade.go
Outdated
@@ -439,6 +439,29 @@ func clusterUpgrade(f *framework.Framework, c configv1client.Interface, dc dynam | |||
return err | |||
} | |||
|
|||
framework.Logf("Waiting for operators to settle after upgrade") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrap in a
disruption.RecordJUnit( if err := disruption.RecordJUnit(
f, f,
"[sig-mco] Machine config pools complete upgrade", "[sig-mco] Machine config pools complete upgrade",
func() error {
block like above please
test/e2e/upgrade/upgrade.go
Outdated
// should match failure cases checked in versionMonitor Describe() in monitor.go | ||
available := findCondition(co.Status.Conditions, configv1.OperatorAvailable) | ||
degraded := findCondition(co.Status.Conditions, configv1.OperatorDegraded) | ||
if !conditionHasStatus(available, configv1.ConditionTrue) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
track all the ones not settled for the junit message.
ugly, but workable. put it in the junit block so you get info on how long it takes and I would take it. |
5926dd3
to
83dc9fa
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k, sjenning The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@sjenning: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
The upgrade test currently just waits for the MCPs to complete their rollout before immediately checking if all ClusterOperators are available and not degraded. This is causing quite a few false failures in CI,
authentication
andopenshift-apiserver
in particular.Example job
https://prow.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-azure-upgrade-4.7/1336793136516567040
The PR checks the ClusterOperators every 10s for up to 5m to see if they have settled after the MCP rollout. If they don't settle within 5m, the test fails.
@deads2k @fabianvf