New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1915079: Disable canary route churn by default #525
Bug 1915079: Disable canary route churn by default #525
Conversation
@sgreene570: This pull request references Bugzilla bug 1915079, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
must-gather failed |
d87f3ae
to
f3ca919
Compare
/retest |
/test e2e-aws-operator |
/test e2e-aws |
/test e2e-aws-operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/hold
I made a few small suggestions, but feel free to release the hold and let it merge as is (assuming CI starts passing).
/retest |
f3ca919
to
14cd9ee
Compare
/hold cancel |
@Miciah I pushed fixes for your suggestions despite CI being green since recently merged changes to this repo would have triggered retests anyways. PTAL when you can. |
@@ -44,6 +45,12 @@ const ( | |||
// canaryCheckFailureCount is how many successive failing canary checks should | |||
// be observed before the default ingress controller goes degraded. | |||
canaryCheckFailureCount = 5 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a user-facing annotation, should this annoation be documented elsewhere? Also, please add that the user must set the value as true or false, i.e. the annotation implies false if its value is not set to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enhanced the go-doc per your request.
We don't expect users to want to opt-in to this functionality given the performance impact on the default cluster routes, so I think the documentation here is sufficient. This annotation would be useful for debugging a customer case via bugzilla or our support engineers, or for internal debugging / testing.
14cd9ee
to
48ab4ea
Compare
// from the default ingress controller. | ||
ic := &operatorv1.IngressController{} | ||
if err := r.client.Get(context.TODO(), request.NamespacedName, ic); err != nil { | ||
errors = append(errors, fmt.Errorf("failed to get ingress controller %s: %v", ic.Name, err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ic.Name
necessarily set if the get failed? (Doesn't really matter much given that there's only one ingresscontroller that it could be.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im going to assume it does not get set......fixed.
Thanks!
test/e2e/canary_test.go: Add an e2e test to verify that annotating the default ingress controller with the canary route rotation annotation works as intended.
Forcing the router to reload every 5 minutes for canary testing has a heavy performance impact for larger clusters. By default, disable the canary route rotation logic. Canary route rotation can be enabled for internal use or debugging by setting the new "ingress.operator.openshift.io/rotate-canary-route" annotation to "true" on the default ingress controller. pkg/operator/controller/canary/controller.go: Add a mutex to the canary controller's reconciler type to make the new reconciler `enableCanaryRouteRotation` field go-routine safe. Set this field to true when the default ingress controller has the new canary route rotation annotation set to true. Use this field in the canary check polling loop to determine if the canary route endpoint should be rotated.
48ab4ea
to
d5ac2b9
Compare
Latest changes look good. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Minor nits/observations but nothing to hold LGTM for.
} | ||
|
||
func setDefaultIngressControllerRotationAnnotation(t *testing.T, val bool) error { | ||
t.Helper() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary given we only appear to access t
for logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Miciah suggested adding this:
You can call t.Helper() at the start of this function to cause any log output that this function generates to have the caller's line number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was only useful when you want to fail or error. No big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log messages will have line numbers, right? The line numbers of the call sites are more specific than line numbers in the helper.
} | ||
// If the canaryRoute and updatedCanaryRoute do not have the same targetPort, | ||
// then the canary route rotation annotation is working. | ||
if cmp.Equal(canaryRoute.Spec.Port.TargetPort, updatedCanaryRoute.Spec.Port.TargetPort) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: collapse the multiple returns to the result of the Equal
expression.
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: frobware, Miciah, sgreene570 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel |
/retest Please review the full test history for this PR and help us cut down flakes. |
hitting VPC limits |
/retest Please review the full test history for this PR and help us cut down flakes. |
3 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/test e2e-aws-operator |
/retest Please review the full test history for this PR and help us cut down flakes. |
11 similar comments
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest Please review the full test history for this PR and help us cut down flakes. |
@sgreene570: All pull requests linked via external trackers have merged: Bugzilla bug 1915079 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
canary: Disable canary route rotation by default
Forcing the router to reload every 5 minutes for canary testing has a
heavy performance impact for larger clusters. By default, disable the
canary route rotation logic. Canary route rotation can be enabled for
internal use or debugging by setting the new
"ingress.operator.openshift.io/rotate-canary-route" annotation to
"true" on the default ingress controller.
pkg/operator/controller/canary/controller.go:
Add a mutex to the canary controller's reconciler type
to make the new reconciler
enableCanaryRouteRotation
fieldgo-routine safe. Set this field to true when the default ingress
controller has the new canary route rotation annotation
set to true. Use this field in the canary check polling loop
to determine if the canary route endpoint should be rotated.
test/e2e: Add canary route rotation annotation test
test/e2e/canary_test.go: Add an e2e test to verify
that annotating the default ingress controller with the
canary route rotation annotation works as intended.