New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-20024: Ignore max unavailable for status #386
OCPBUGS-20024: Ignore max unavailable for status #386
Conversation
@candita: This pull request references Jira Issue OCPBUGS-20024, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
d062cb9
to
3bc984b
Compare
/test e2e-aws-ovn |
/test e2e-aws-ovn |
/assign |
/test e2e-aws-ovn |
/test e2e-aws-ovn |
@Miciah PTAL when you get a chance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fundamental change is good. I have some questions about tests and error handling.
Also, regarding the title, maxUnavailable
was never modifiable by the cluster-admin. Maybe it would be clearer to say, "Ignore max unavailable for status" or, "Report degraded if >10% unavailable".
{ | ||
name: "node-resolver invalid MaxUnavailable is ok", | ||
clusterIP: "172.30.0.10", | ||
dnsDaemonset: makeDaemonSet(10, 9, intstr.FromString("10%")), | ||
nrDaemonset: makeDaemonSet(6, 0, intstr.FromString("TEST")), | ||
expected: operatorv1.ConditionFalse, | ||
}, | ||
{ | ||
name: "DNS invalid MaxUnavailable (string with digits without a percent sign)", | ||
clusterIP: "172.30.0.10", | ||
dnsDaemonset: makeDaemonSet(6, 6, intstr.IntOrString{Type: intstr.String, StrVal: "10"}), | ||
nrDaemonset: makeDaemonSet(6, 6, intstr.FromString("33%")), | ||
expected: operatorv1.ConditionUnknown, | ||
}, | ||
{ | ||
name: "node-resolver invalid MaxUnavailable (string with digits without a percent sign) is ok", | ||
clusterIP: "172.30.0.10", | ||
dnsDaemonset: makeDaemonSet(6, 6, intstr.FromString("10%")), | ||
nrDaemonset: makeDaemonSet(6, 6, intstr.IntOrString{Type: intstr.String, StrVal: "33"}), | ||
expected: operatorv1.ConditionFalse, | ||
}, | ||
{ | ||
name: "DNS invalid MaxUnavailable (string with letters)", | ||
clusterIP: "172.30.0.10", | ||
dnsDaemonset: makeDaemonSet(6, 6, intstr.IntOrString{Type: intstr.String, StrVal: "TEST"}), | ||
nrDaemonset: makeDaemonSet(6, 6, intstr.FromString("33%")), | ||
expected: operatorv1.ConditionUnknown, | ||
}, | ||
{ | ||
name: "node-resolver invalid MaxUnavailable (string with letters) is ok", | ||
clusterIP: "172.30.0.10", | ||
dnsDaemonset: makeDaemonSet(6, 6, intstr.FromString("10%")), | ||
nrDaemonset: makeDaemonSet(6, 6, intstr.IntOrString{Type: intstr.String, StrVal: "TEST"}), | ||
expected: operatorv1.ConditionFalse, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's interesting that we still had these test cases for invalid maxUnavailable
values for the node-resolver daemonset. We kept these test cases when we changed the status reporting to ignore the node-resolver daemon set in #273. I would guess the reasoning was that we wanted the test cases to verify that the status reporting code really was not basing status on the daemon set'smaxUnavailable
parameter. Does the same reasoning still apply? If so, it would make sense to keep (some of) these test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be an oversight that I didn't clean these units tests then. We entirely removed the checking of node resolver daemonset and its maxUnavailable
parameter. It doesn't make sense anymore to check anything having to do with node resolver when computing the degraded condition: https://github.com/openshift/cluster-dns-operator/pull/273/files#diff-32495132facf7e0819a407af732514958b198d9e40236656bc43b093345d1539L111
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, I should remove the nrDaemonset
field from the test cases. Does that make more sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be an oversight that I didn't clean these units tests then. We entirely removed the checking of node resolver daemonset and its
maxUnavailable
parameter. It doesn't make sense anymore to check anything having to do with node resolver when computing the degraded condition
Are you sure it was an oversight? My speculation was that it was intentional: If parameter X did affect result Y, and you are changing the logic so that X doesn't affect Y, then it does make sense for tests to verify that X doesn't affect Y, right? More generally, I'd rather err on the side of keeping possibly redundant test cases over risking removing test cases that could possibly detect faults. However, I'll defer to your judgment if you decide these test cases really should be removed and the test simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am quite sure it was an oversight. I removed the parameters haveNodeResolverDaemonset
and nodeResolverDaemonset
from the signature of computeDNSDegradedCondition
, because we no longer use them in the function. We still use them in computeDNSProgressingCondition
, but not computeDNSDegradedCondition
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right, I can live with that. Test coverage is still good overall.
0e2c2aa
to
5759e70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple minor things.
case intstrErr != nil: | ||
// This should not happen, but is included just to safeguard against future changes. | ||
degradedReasons = append(degradedReasons, "InvalidDNSMaxUnavailable") | ||
messages = append(messages, fmt.Sprintf("The DNS daemonset has an invalid MaxUnavailable value: %v", intstrErr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason to move this case up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, forgot to push.
@@ -14,6 +14,10 @@ import ( | |||
"k8s.io/apimachinery/pkg/util/intstr" | |||
) | |||
|
|||
var ( | |||
maxUnavailable = intstr.FromInt(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to declare maxUnavailable
here instead of inside makeDaemonSet
? We usually tend towards making tests and helpers self-contained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used in multiple functions and I removed it as a parameter to makeDaemonSet, so I declared it globally. It's used in TestDNSStatusConditions, TestComputeDNSDegradedCondition, TestComputeDNSProgressingCondition, and TestSkippingStatusUpdates.
For consistency, removed an extra declaration of it in TestDNSStatusConditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right, I won't insist on changing it.
@@ -201,7 +205,8 @@ func TestDNSStatusConditions(t *testing.T) { | |||
// TestComputeDNSDegradedCondition verifies the computeDNSDegradedCondition has | |||
// the expected behavior. | |||
func TestComputeDNSDegradedCondition(t *testing.T) { | |||
makeDaemonSet := func(desired, available int, maxUnavailable intstr.IntOrString) *appsv1.DaemonSet { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra blank line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Forgot to push.
Followup to PR openshift#384. - pkg/operator/controller/controller_dns_node_resolver_daemonset.go - small update to a comment - pkg/operator/controller/dns_status.go - hardcode maxUnavailable to 10% of desiredNumberScheduled - pkg/operator/controller/dns_status_test.go - remove maxUnavailable format testing and cleanup noderesolver testing
5759e70
to
9155787
Compare
Thanks! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Miciah The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@candita: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@candita: Jira Issue OCPBUGS-20024: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-20024 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Fix included in accepted release 4.15.0-0.nightly-2023-11-01-040931 |
[ART PR BUILD NOTIFIER] This PR has been included in build ose-cluster-dns-operator-container-v4.15.0-202311202349.p0.g25020f4.assembly.stream for distgit ose-cluster-dns-operator. |
/cherry-pick release-4.14 |
@candita: new pull request created: #400 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@Miciah: Jira Issue OCPBUGS-20024: Some pull requests linked via external trackers have merged:
The following pull requests linked via external trackers have not merged: These pull request must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with Jira Issue OCPBUGS-20024 has not been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@Miciah: Jira Issue OCPBUGS-20024: All pull requests linked via external trackers have merged:
Jira Issue OCPBUGS-20024 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Fix included in accepted release 4.15.0-0.nightly-2024-01-05-151121 |
Followup to PR #384.