OCPBUGS-32980: [release-4.14]:fix extra-reboot on upgrade with paused mcp worker #1053
This PR fixes an extra reboot when upgrading OpenShift with the "worker" MCP paused, for example during an EUS-to-EUS upgrade. The extra reboot occurred because the PerformanceProfile controller was reconciling against the rendered MC that appears in the MCP status. While this MC reflects the current MCP conditions, it does not reflect the latest planned state when the MCP is paused.
Unpausing the MCP led to one reboot when applying the target MC, and one additional reboot after the performance profile was reconciled against it.
This change eliminates the additional reboot because the PerformanceProfile is now reconciled against the latest planned MC before the MCP is unpaused.
Signed-off-by: Vitaly Grinberg <v.g@redhat.com>
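To illustrate the idea described above, here is a minimal sketch. It is not the actual diff of this PR. The types and the `targetRenderedConfig` helper are hypothetical, simplified stand-ins. The sketch assumes that, for a paused pool, `spec.configuration` of the MachineConfigPool already names the latest rendered MC while `status.configuration` still names the previously applied one.

```go
package main

import "fmt"

// configRef is a hypothetical, simplified reference to a rendered MachineConfig.
type configRef struct{ Name string }

// machineConfigPool is a hypothetical stand-in for the MCP fields relevant here.
type machineConfigPool struct {
	Paused       bool
	SpecConfig   configRef // assumed: latest planned rendered MC (spec.configuration)
	StatusConfig configRef // assumed: rendered MC last applied by the pool (status.configuration)
}

// targetRenderedConfig picks the rendered MC to reconcile against.
// It prefers the planned (spec) config so that a paused pool is reconciled
// against the state it will roll out once unpaused, and falls back to the
// status config only when no planned config is recorded.
func targetRenderedConfig(p machineConfigPool) string {
	if p.SpecConfig.Name != "" {
		return p.SpecConfig.Name
	}
	return p.StatusConfig.Name
}

func main() {
	// A paused worker pool mid-upgrade: status lags behind the planned state.
	pool := machineConfigPool{
		Paused:       true,
		SpecConfig:   configRef{Name: "rendered-worker-new"},
		StatusConfig: configRef{Name: "rendered-worker-old"},
	}
	fmt.Println(targetRenderedConfig(pool)) // prints "rendered-worker-new"
}
```

Reconciling against the planned config means the performance profile changes are already folded into the target state before unpausing, so the nodes only need to converge once.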
@vitus133: This pull request references Jira Issue OCPBUGS-32980, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/cc @mrniranjan
/test e2e-upgrade
/jira refresh
@mrniranjan: This pull request references Jira Issue OCPBUGS-32980, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
/jira refresh
@yanirq: This pull request references Jira Issue OCPBUGS-32980, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: vitus133, yanirq The full list of commands accepted by this bot can be found here. The pull request process is described here
/cc @mrniranjan
/label cherry-pick-approved
/test e2e-upgrade
/test e2e-upgrade
@vitus133: all tests passed!
/lgtm
Merged commit d969426 into openshift:release-4.14
@vitus133: Jira Issue OCPBUGS-32980: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-32980 has been moved to the MODIFIED state. In response to this:
[ART PR BUILD NOTIFIER] This PR has been included in build cluster-node-tuning-operator-container-v4.14.0-202405081737.p0.gd969426.assembly.stream.el9 for distgit cluster-node-tuning-operator.
This is a manual cherry-pick from #1049
/cc @MarSik @yanirq