New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1889921: Reporting degraded if not available #644
Bug 1889921: Reporting degraded if not available #644
Conversation
@ricardomaraschini: This pull request references Bugzilla bug 1889921, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
pkg/operator/status.go
Outdated
@@ -294,6 +295,13 @@ func (c *Controller) syncStatus(cr *imageregistryv1.Config, deploy *appsapi.Depl | |||
|
|||
updateCondition(cr, operatorapiv1.OperatorStatusTypeProgressing, operatorProgressing) | |||
|
|||
if isDeploymentStatusAvailable(deploy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we look at operatorAvailable.Status
here, instead of calling out to isDeploymentStatusAvailable
again? And do we need .deployUnavailableSince
when we should already have LastTransitionTime
on the Available
status (although maybe not on operatorAvailable
until we use some tooling syncs it into the existing Conditions
object).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this makes total sense. I will try to adjust according to your input, thanks for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, PTAL.
pkg/operator/status.go
Outdated
if time.Since(c.deployIncompleteSince) > time.Minute { | ||
operatorDegraded.Status = operatorapiv1.ConditionTrue | ||
operatorDegraded.Message = "The registry has minimum availability" | ||
operatorDegraded.Reason = "MinimumAvailability" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reason/message sounds positive. If we are going to set Degraded=True, we should say something negative to motivate the unhappy status. Something like Registry deployment has timed out progressing
. But... can we lean on the Deployment controller for this and look at ProgressDeadlineExceeded
? See my rotted-due-to-lack-of-interest openshift/library-go#575 for one possible implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, PTAL.
Report this operator as Degraded if there is no replica or less replicas then specified on the configuration.
Covering Controller.syncStatus with tests.
/assign @dmage |
pkg/operator/status.go
Outdated
} else if deploy == nil { | ||
operatorDegraded.Status = operatorapiv1.ConditionTrue | ||
operatorDegraded.Message = "The deployment does not exist" | ||
operatorDegraded.Reason = "DeploymentNotFound" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we shouldn't immediately react on absence of the deployment. Most likely the operator will fix this problem in a second.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are absolutely right. I have adjusted it by means of removing it, now it will take one minute as well. PTAL.
if cond.Type != appsapi.DeploymentProgressing { | ||
continue | ||
} | ||
if cond.Reason != "ProgressDeadlineExceeded" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to rely on progressDeadlineSeconds
of the registry deployment, I'd say we should be more opinionated and set it to 1 minute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, PTAL.
Setting a default value of 60 seconds before setting the deployment as failed with ProgressDeadlineExceeded.
If the deployment does not exist we wait one minute before setting the operator to degraded.
/retest |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dmage, ricardomaraschini The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest Please review the full test history for this PR and help us cut down flakes. |
5 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
@ricardomaraschini: All pull requests linked via external trackers have merged: Bugzilla bug 1889921 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Report this operator as Degraded if there is no replica or less replicas
then specified on the configuration for more than one minute.