OCPBUGS-18054: Emit node events only when retry failure #1837
Conversation
Node objects are configured by distributed software components, and prior to this patch we were sending numerous Kubernetes events at warning level when in fact everything was proceeding normally. Only emit warning events when we fail to configure a node, which is after 15 retry attempts (~7m currently). We continue to log every node add/update/delete failure. Signed-off-by: Martin Kennelly <mkennell@redhat.com> (cherry picked from commit 8889f47) (cherry picked from commit dada90d)
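A minimal sketch of the behavior the description outlines, written in Python purely for illustration; it is not the code from this patch. The names `Recorder`, `reconcile_node`, and `make_flaky` are hypothetical, "15 retry attempts" is assumed to mean 15 attempts in total, and the wait between attempts (which is what makes the real window roughly 7m) is left out so the example runs instantly.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Taken from the PR description; assumed here to mean 15 attempts in total.
MAX_FAILED_ATTEMPTS = 15


@dataclass
class Recorder:
    """Hypothetical stand-in for a logger plus a Kubernetes event recorder."""
    logs: List[str] = field(default_factory=list)
    warning_events: List[str] = field(default_factory=list)


def reconcile_node(name: str, configure: Callable[[], None], rec: Recorder,
                   max_attempts: int = MAX_FAILED_ATTEMPTS) -> bool:
    """Try to configure a node; log every failure, but emit a single
    warning event only once all attempts have failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            configure()
            return True
        except Exception as err:
            # Every individual failure still goes to the logs.
            rec.logs.append(f"node {name}: attempt {attempt} failed: {err}")
            # A real controller would wait before retrying; omitted here.
    # Retries exhausted: this is the only place a warning event is emitted.
    rec.warning_events.append(
        f"node {name}: failed to configure after {max_attempts} attempts")
    return False


def make_flaky(failures: int) -> Callable[[], None]:
    """Return a configure function that fails `failures` times, then succeeds."""
    remaining = [failures]

    def configure() -> None:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("node not ready")

    return configure
```

With this sketch, a node that fails twice and then succeeds produces two log lines and no warning event, while a node that never succeeds produces 15 log lines and exactly one warning event.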
@martinkennelly: This pull request references Jira Issue OCPBUGS-18054, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/jira refresh
@martinkennelly: This pull request references Jira Issue OCPBUGS-18054, which is valid. The bug has been moved to the POST state. 6 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
/retest-required
/assign @ricky-rav
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dcbw, martinkennelly, ricky-rav The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/label backport-risk-assessed
/test e2e-aws-ovn-hypershift
Failed because it was unable to find a container image. Retesting as it looks transient.
Looks like the mode migration jobs have been permafailing for a few weeks. I will ask the team if they know about this.
The mode migration jobs are failing, and it's unrelated to this PR; however, I should look into why before looking for an override. This will take some time as I am currently busy.
@jluhrsen I hear you're fixing the mode migration jobs in CNO in 4.12.
I am going to look for an override. It is pointless to wait any longer and waste $ on failing jobs.
/override ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration
@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration, ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration In response to this:
Lots of image pull issues:
/retest
pull-ci-openshift-ovn-kubernetes-release-4.12-4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade: Failed because the disruption budget limit is 1s and in the CI run it was 2s.
4.12-upgrade-from-stable-4.11-local-gateway-e2e-aws-ovn-upgrade: When the kapi client attempted to pull a CRD to check openshift.io-specific CRDs, it got back a bad reply from the API server (?):
Both errors are unrelated to this PR.
/retest-required
/hold
Revision f0f710b was retested 3 times: holding
/test e2e-aws-ovn-upgrade-local-gateway
/unhold
I don't know why the overrides are now gone. @dcbw, can you reapply?
GW mode migration will be fixed by https://issues.redhat.com/browse/OCPBUGS-17391.
/override ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration
@jcaamano: Overrode contexts on behalf of jcaamano: ci/prow/e2e-aws-ovn-local-to-shared-gateway-mode-migration, ci/prow/e2e-aws-ovn-shared-to-local-gateway-mode-migration In response to this:
/retest
BM CI was bad this morning.
@martinkennelly: The following tests failed, say
Merged commit 87440f4 into openshift:release-4.12
@martinkennelly: Jira Issue OCPBUGS-18054: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-18054 has been moved to the MODIFIED state. In response to this:
Fix included in accepted release 4.12.0-0.nightly-2023-09-28-010903