New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1917537: Fix time comparison in CSV reconcile loop #1974
Bug 1917537: Fix time comparison in CSV reconcile loop #1974
Conversation
@hasbro17: This pull request references Bugzilla bug 1917537, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
@hasbro17: This pull request references Bugzilla bug 1917537, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest |
@@ -1097,7 +1097,7 @@ func (a *Operator) syncClusterServiceVersion(obj interface{}) (syncError error) | |||
} | |||
|
|||
// status changed, update CSV | |||
if !(outCSV.Status.LastUpdateTime == clusterServiceVersion.Status.LastUpdateTime && | |||
if !(outCSV.Status.LastUpdateTime.Equal(clusterServiceVersion.Status.LastUpdateTime) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot that this could be a nil dereference if outCSV.Status.LastUpdateTime == nil
. Going to add a check for that first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, never mind. I keep forgetting a method is orthogonal to the underlying data and LastUpdateTime.Equal()
is fine even if LastUpdateTime == nil
.
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ecordell, hasbro17 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick release-4.6 |
@ecordell: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
/retest |
/retest Please review the full test history for this PR and help us cut down flakes. |
@hasbro17: All pull requests linked via external trackers have merged: Bugzilla bug 1917537 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@ecordell: new pull request created: #1987 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Description of the change:
Fix the comparison check for
csv.status.LastUpdateTime
in the CSV reconcile from==
to Time.Equal().Motivation for the change:
See the following bug for background on the reconcile hotloop: https://bugzilla.redhat.com/show_bug.cgi?id=1917537
After installing an operator on an openshift
4.6.z
cluster (e.g4.6.13
) the OLM operator can get into a loop where the CSV object is being continuously being reconciled and updated roughly every second.This results in increased CPU usage and requests to the APIserver.
While it's unclear how the operator gets into this reconcile hotloop when installing certain operators (OCS, ACM) and not others (etcd, cert-manager), the comparison check
outCSV.Status.LastUpdateTime == clusterServiceVersion.Status.LastUpdateTime
is incorrect as it always resultsfalse
since the values being compared are*metav1.Time
pointer addresses.This results in the OLM operator always updating the CSV object with no real status change and triggering another reconcile event.
With the proposed fix the reconcile hotloop does go away since the CSV is no longer being updated when there are no status changes.
TODO:
Reviewer Checklist
/docs