[release-4.13] OCPBUGS-23016: Introduce upgrading label to block concurrent upgrades #1931
Conversation
Skipping CI for Draft Pull Request.
/jira cherrypick OCPBUGS-22984
@jrvaldes: Jira Issue OCPBUGS-22984 has been cloned as Jira Issue OCPBUGS-23016. Will retitle bug to link to clone. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@jrvaldes: This pull request references Jira Issue OCPBUGS-23016, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
/test azure-e2e-upgrade
azure-e2e-upgrade failed with a timeout while waiting for Windows nodes to come up, as shown in the WMCO logs.
06667a3 to 6187bb9
/test azure-e2e-upgrade
/hold
This change introduces a maximum number of Windows nodes that can be upgraded in parallel during reconciliation. The `windowsmachineconfig.openshift.io/upgrading` label is proposed as the locking mechanism among the Windows nodes: it accounts for how many instances are performing an upgrade at once, so that the count stays within a threshold, MaxParallelUpgrades, which is fixed at 1. (cherry picked from commit f1dd8f5) (cherry picked from commit f7dd118)
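A minimal sketch of the label-as-lock idea in this commit message, assuming nodes are modelled as in-memory objects. The names `Node`, `count_upgrading`, `try_mark_upgrading`, and `clear_upgrading`, and the label value `"true"`, are hypothetical and not taken from the WMCO code; only the label key and the limit of 1 come from the commit message. A real controller would read and patch labels through the Kubernetes API, where counting and labeling are separate calls, so this only illustrates the counting logic.

```python
from dataclasses import dataclass, field

# Label key and limit as stated in the commit message.
UPGRADING_LABEL = "windowsmachineconfig.openshift.io/upgrading"
MAX_PARALLEL_UPGRADES = 1


@dataclass
class Node:
    """Hypothetical stand-in for a Windows node in the cluster."""
    name: str
    labels: dict = field(default_factory=dict)


def count_upgrading(nodes):
    """Return how many nodes currently carry the upgrading label."""
    return sum(1 for n in nodes if UPGRADING_LABEL in n.labels)


def try_mark_upgrading(node, nodes, limit=MAX_PARALLEL_UPGRADES):
    """Label `node` as upgrading unless the limit is already reached.

    Returns True if the node holds the label afterwards, False if it
    has to wait for another node to finish.
    """
    if UPGRADING_LABEL in node.labels:
        return True  # this node already holds the "lock"
    if count_upgrading(nodes) >= limit:
        return False  # threshold reached; retry on a later reconcile
    node.labels[UPGRADING_LABEL] = "true"  # label value is an assumption
    return True


def clear_upgrading(node):
    """Remove the label once the upgrade is done, releasing the slot."""
    node.labels.pop(UPGRADING_LABEL, None)
```

With two nodes and a limit of 1, the second call to `try_mark_upgrading` returns False until `clear_upgrading` is called on the first node.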
This commit introduces a test that checks the maximum allowed number of Windows nodes upgrading in parallel. The test is divided into two phases, 1) setup and 2) test. The setup phase deploys a job with a fixed name that constantly fetches the number of Windows nodes with the `windowsmachineconfig.openshift.io/upgrading` label and fails if it is greater than the maximum allowed. The polling frequency is set to 5 seconds. The test phase checks the number of failed pods for the checker job and requires no failures; otherwise the e2e test fails. A new service account is proposed in the test namespace to hold the RBAC required by the checker job to list the nodes in the test cluster. The test is designed to run as a separate job because the new upgrade test in vSphere (vsphere-e2e-upgrade) is scattered between the steps in the release repo and code in the WMCO test suite. (cherry picked from commit e03c792) (cherry picked from commit 48f08a8)
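The checker loop described in this commit message could be sketched roughly as follows. This is an illustration only, not the job the PR deploys: `run_checker` and its `fetch_labels` callable (returning one label dict per Windows node) are hypothetical, and the loop is bounded by `polls` so it terminates, whereas the commit message describes a job that polls constantly. The label key, the 5-second interval, and the limit of 1 come from the commit messages.

```python
import time

UPGRADING_LABEL = "windowsmachineconfig.openshift.io/upgrading"
MAX_PARALLEL_UPGRADES = 1
POLL_INTERVAL_SECONDS = 5


def run_checker(fetch_labels, polls,
                max_allowed=MAX_PARALLEL_UPGRADES,
                interval=POLL_INTERVAL_SECONDS,
                sleep=time.sleep):
    """Poll the node labels `polls` times.

    `fetch_labels` is a hypothetical callable returning a list of label
    dicts, one per Windows node. Returns False as soon as more than
    `max_allowed` nodes carry the upgrading label (the checker "fails"),
    and True if every poll stays within the limit.
    """
    for i in range(polls):
        node_labels = fetch_labels()
        upgrading = sum(1 for labels in node_labels
                        if UPGRADING_LABEL in labels)
        if upgrading > max_allowed:
            return False
        if i < polls - 1:
            sleep(interval)  # wait before the next poll
    return True
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; the test phase described above would then only need to confirm that the checker never reported a failure.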
fc57982 to d10c7fb
/jira refresh
@jrvaldes: This pull request references Jira Issue OCPBUGS-23016, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jrvaldes, mansikulkarni96 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm
/retest-required
/jira refresh
@jrvaldes: This pull request references Jira Issue OCPBUGS-23016, which is valid. The bug has been moved to the POST state. 6 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/hold cancel
/test remaining-required
/unassign
@jrvaldes: all tests passed!
Merged commit 72dd116 into openshift:release-4.13
@jrvaldes: Jira Issue OCPBUGS-23016: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-23016 has been moved to the MODIFIED state. In response to this:
This is a manual cherry-pick of #1901