New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-24705: consider currentImage and desiredImage annotations #4135
OCPBUGS-24705: consider currentImage and desiredImage annotations #4135
Conversation
@cheesesashimi: This pull request references Jira Issue OCPBUGS-24705, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Skipping CI for Draft Pull Request. |
92d3818
to
c9a46b8
Compare
/jira refresh |
@cheesesashimi: This pull request references Jira Issue OCPBUGS-24705, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
The `isNodeUnavailable()` code was not taking the `currentImage` / `desiredImage` annotations into consideration when it determines if the node should be considered unavailable. While opting into on-cluster builds, the following would happen: 1. A node gets updated with the `desiredImage` annotation, indicating that the MCD should transition it to an image. 2. There is a lag between when this annotation is applied and the MCD transitions from "Done" -> "Working". 3. During this lag period, the NodeController may re-enter the sync loop. 4. Because the MCD state has not transitioned yet to "Working" yet and we were not considering the currentImage / desiredImage annotations, the NodeController will apply the `desiredImage` annotation to another node instead. This adds a check for the presence of the `currentImage` / `desiredImage` annotations. If neither annotation is present, the NodeController will not take these annotations into consideration. However, if either of them is present, it will test their equality before making a determination that a node is "Done". This PR also adds additional test cases in an attempt to verify additional unforeseen edge-cases.
c9a46b8
to
124dd45
Compare
/jira refresh |
@cheesesashimi: This pull request references Jira Issue OCPBUGS-24705, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
/jira refresh |
@cheesesashimi: This pull request references Jira Issue OCPBUGS-24705, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@cheesesashimi: This pull request references Jira Issue OCPBUGS-24705, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good! just one nit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Logic is very clear, and great test coverage!
/override ci/prow/e2e-gcp-op-single-node pre-emptively |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cdoern, cheesesashimi, yuqi-zhang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@yuqi-zhang: Overrode contexts on behalf of yuqi-zhang: ci/prow/e2e-gcp-op-single-node In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold Revision 124dd45 was retested 3 times: holding |
/hold cancel |
/hold cancel |
@cheesesashimi: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
523ea84
into
openshift:master
@cheesesashimi: Jira Issue OCPBUGS-24705: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-24705 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
[ART PR BUILD NOTIFIER] This PR has been included in build openshift-proxy-pull-test-container-v4.16.0-202402020041.p0.g523ea84.assembly.stream for distgit openshift-proxy-pull-test. |
Fix included in accepted release 4.16.0-0.nightly-2024-02-02-224339 |
- What I did
The
isNodeUnavailable()
code was not taking thecurrentImage
/desiredImage
annotations into consideration when it determines if the node should be considered unavailable. While opting into on-cluster builds, the following would happen:desiredImage
annotation, indicating that the MCD should transition it to an image.desiredImage
annotation to another node instead.This adds a check for the presence of the
currentImage
/desiredImage
annotations. If neither annotation is present, the NodeController will not take these annotations into consideration. However, if either of them is present, it will test their equality before making a determination that a node is "Done". This PR also adds additional test cases in an attempt to verify additional unforeseen edge-cases.- How to verify it
onclustertesting
helpers.master
pool into on-cluster builds by adding the appropriate label:$ oc label mcp/master 'machineconfiguration.openshift.io/layering-enabled='
.- Description for the changelog
Consider currentImage / desiredImage annotations when determining if a node is unavailable