Bug 2005464: [4.8z] Fixes skipping pods accidentally in retry #758
Conversation
We have seen instances where the pod delete event gets queued up and the pod's DeleteFunc() callback is delayed. In the meantime, we keep attempting pod setup for the failed pod. Since the pod is gone, setting the pod annotations in addLogicalPort() fails, so we add the pod back to the retryLoop and retry again after 1 minute. This continues until the pod's DeleteFunc() finally runs and removes the deleted pod from the retryPod map. (cherry picked from commit f5ec566)
The code was using the initially stored pod object to determine if the pod was scheduled or not, rather than the current state of the pod. Signed-off-by: Tim Rozet <trozet@redhat.com> (cherry picked from commit 151eb77)
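The fix described above boils down to re-fetching the pod from the informer cache on each retry instead of consulting the stale object stored when the retry entry was created. A minimal sketch of that idea, with hypothetical names (`Pod`, `lister`, `shouldRetry`, `errNotFound` are illustrative, not the actual ovn-kubernetes types or functions):

```go
package main

import (
	"errors"
	"fmt"
)

// Pod is a minimal stand-in for corev1.Pod; only the fields the sketch needs.
type Pod struct {
	Name     string
	NodeName string // non-empty once the scheduler has assigned a node
}

// lister models an informer cache lookup; it returns an error when the pod is gone.
type lister func(name string) (*Pod, error)

var errNotFound = errors.New("pod not found")

// shouldRetry re-fetches the pod from the cache rather than trusting the
// initially stored object: a deleted pod is dropped from the retry loop,
// and a still-unscheduled pod is skipped until it has a node.
func shouldRetry(stale *Pod, get lister) (bool, error) {
	current, err := get(stale.Name)
	if errors.Is(err, errNotFound) {
		return false, nil // pod was deleted; stop retrying
	}
	if err != nil {
		return false, err
	}
	return current.NodeName != "", nil // only retry once scheduled
}

func main() {
	cache := map[string]*Pod{
		"scheduled": {Name: "scheduled", NodeName: "node-1"},
		"pending":   {Name: "pending"},
	}
	get := func(name string) (*Pod, error) {
		p, ok := cache[name]
		if !ok {
			return nil, errNotFound
		}
		return p, nil
	}
	for _, name := range []string{"scheduled", "pending", "deleted"} {
		retry, _ := shouldRetry(&Pod{Name: name}, get)
		fmt.Printf("%s retry=%v\n", name, retry)
	}
}
```

The key point is that `shouldRetry` receives the stale object only for its key; every decision is made against the current cache state, which is what the second cherry-picked commit changes.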
@trozet: This pull request references Bugzilla bug 2005464, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla refresh |
@trozet: This pull request references Bugzilla bug 2005464, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dcbw, trozet The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
10 similar comments
/bugzilla refresh Recalculating validity in case the underlying Bugzilla bug has changed. |
@openshift-bot: This pull request references Bugzilla bug 2005464, which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retest-required Please review the full test history for this PR and help us cut down flakes. |
2 similar comments
/retest-required Please review the full test history for this PR and help us cut down flakes. |
12 similar comments
/hold @trozet Is the e2e-metal-ipi-ovn-dualstack job important for this patch or do we want to override that job and merge it? I'm adding a hold for now to stop the bot from retrying indefinitely. |
[patch-manager] ⌛ This pull request was not picked by the patch manager for the current z-stream window and has to wait for the next window. skipped for today
NOTE: This message was automatically generated, if you have questions please ask on #forum-release |
@dhellmann we don't need to override dualstack; it's not a required job, right? This PR is not specific to dualstack. The failures in the dualstack job: [sig-cli] oc explain should contain proper spec+status for CRDs [Suite:openshift/conformance/parallel] look like they have nothing to do with ovn-kube or this PR. I'm not sure why the job was removed as required. @stbenjam are you aware of this? |
I'm not sure why the bot kept retrying, or whether the PR would merge if I approved it without skipping the failing job. I'm happy to apply whatever labels we need, if someone confirms it's safe to take manual action. |
/retest |
@dhellmann it's just retrying because it doesn't have cherry-pick-approved. It's a bug with the bot. Even if all jobs had passed, it would keep printing /retest-required forever. |
OK, let's see if this goes through, then. |
@trozet: All pull requests linked via external trackers have merged: Bugzilla bug 2005464 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/bugzilla cc-qa |
@anuragthehatter: Bugzilla bug 2005464 is in an unrecognized state (MODIFIED) and will not be moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |