Bug 2005464: [4.8z] Fixes skipping pods accidentally in retry #758

trozet · 2021-09-17T18:09:44Z

The code was using the initially stored pod object to determine if the
pod was scheduled or not, rather than the current state of the pod.

we have seen instances where the pod delete event gets queued up and the pod's DeleteFunc() callback is delayed. in the meantime, we are trying to do pod setup of the failed pod again and again. since the pod is gone we fail in setting pod annotations in addLogicalPort(). we add the pod back to the retryLoop and retry again after 1 minute. we continue this until the pod's DeleteFunc() gets called and it removes the deleted pod from the retryPod map. (cherry picked from commit f5ec566)

The code was using the initially stored pod object to determine if the pod was scheduled or not, rather than the current state of the pod. Signed-off-by: Tim Rozet <trozet@redhat.com> (cherry picked from commit 151eb77)

openshift-ci · 2021-09-17T18:09:49Z

@trozet: This pull request references Bugzilla bug 2005464, which is invalid:

expected dependent Bugzilla bug 2005462 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 2005464: [4.8z] Fixes skipping pods accidentally in retry

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

trozet · 2021-09-17T21:22:11Z

/bugzilla refresh

openshift-ci · 2021-09-17T21:22:14Z

@trozet: This pull request references Bugzilla bug 2005464, which is invalid:

expected dependent Bugzilla bug 2005462 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

dcbw · 2021-09-18T19:10:17Z

/lgtm
/retest

openshift-ci · 2021-09-18T19:10:40Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [dcbw,trozet]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2021-09-18T20:59:23Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T21:47:22Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T22:48:26Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T23:01:24Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T23:14:23Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T23:27:22Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T23:40:24Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-18T23:53:23Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-19T00:06:21Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-19T00:45:22Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-19T00:58:23Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-19T01:00:29Z

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

openshift-ci · 2021-09-19T01:00:32Z

@openshift-bot: This pull request references Bugzilla bug 2005464, which is invalid:

expected dependent Bugzilla bug 2005462 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is ON_QA instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-bot · 2021-09-19T01:24:25Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-19T01:50:23Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-19T02:03:23Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T09:57:11Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T10:22:05Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T10:46:07Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T10:58:10Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T11:22:06Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T11:59:07Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T12:36:36Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T12:48:04Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T14:01:14Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T14:50:06Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T15:16:21Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T16:59:06Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-09-23T17:51:07Z

/retest-required

Please review the full test history for this PR and help us cut down flakes.

dhellmann · 2021-09-23T19:44:42Z

/hold

@trozet Is the e2e-metal-ipi-ovn-dualstack job important for this patch or do we want to override that job and merge it?

I'm adding a hold for now to stop the bot from retrying indefinitely.

dhellmann · 2021-09-23T19:58:10Z

[patch-manager] ⌛ This pull request was not picked by the patch manager for the current z-stream window and have to wait for the next window.

skipped for today

Score: 1.70
Reason: priority request from OVN team, but CI is not passing

NOTE: This message was automatically generated, if you have questions please ask on #forum-release

trozet · 2021-09-23T20:55:43Z

@dhellmann we dont need to override dualstack, its not a required job right? This PR is not specific to dualstack. The failures in the dualstack job:

[sig-cli] oc explain should contain proper spec+status for CRDs [Suite:openshift/conformance/parallel]
[Conformance][sig-api-machinery][Feature:APIServer] local kubeconfig "localhost.kubeconfig" should be present on all masters and work [Suite:openshift/conformance/parallel/minimal]

looks like they have nothing to do with ovn-kube or this PR. I'm not sure why the job was removed as required. @stbenjam are you aware of this?

trozet · 2021-09-23T20:56:27Z

@jluhrsen ^

dhellmann · 2021-09-23T20:57:41Z

I'm not sure why the bot kept retrying, or whether the PR would merge if I approved it without skipping the failing job. I'm happy to apply whatever labels we need, if someone confirms it's safe to take manual action.

trozet · 2021-09-23T21:07:21Z

/retest

trozet · 2021-09-23T21:10:45Z

@dhellmann its just retrying cause it doesnt have cherry-pick approved. Its a bug with the bot. Even if all jobs were passed, it would keep just printing /retest-required forever.

dhellmann · 2021-09-23T21:18:57Z

OK, let's see if this goes through, then.

openshift-ci · 2021-09-23T21:19:41Z

@trozet: All pull requests linked via external trackers have merged:

openshift/ovn-kubernetes#758

Bugzilla bug 2005464 has been moved to the MODIFIED state.

In response to this:

Bug 2005464: [4.8z] Fixes skipping pods accidentally in retry

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

anuragthehatter · 2021-09-23T21:25:46Z

/bugzilla cc-qa

openshift-ci · 2021-09-23T21:25:49Z

@anuragthehatter: Bugzilla bug 2005464 is in an unrecognized state (MODIFIED) and will not be moved to the MODIFIED state.

In response to this:

/bugzilla cc-qa

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

anuragthehatter · 2021-09-23T21:26:24Z

/lgtm
/label qe-approved

girishmg and others added 2 commits September 17, 2021 13:55

Fixes skipping pods accidentally in retry

94341b4

The code was using the initially stored pod object to determine if the pod was scheduled or not, rather than the current state of the pod. Signed-off-by: Tim Rozet <trozet@redhat.com> (cherry picked from commit 151eb77)

openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Sep 17, 2021

openshift-ci bot requested review from abhat and JacobTanenbaum September 17, 2021 18:10

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 17, 2021

openshift-ci bot assigned dcbw Sep 18, 2021

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 18, 2021

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 23, 2021

dhellmann added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Sep 23, 2021

openshift-merge-robot merged commit f1ee037 into openshift:release-4.8 Sep 23, 2021

openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Sep 23, 2021

Bug 2005464: [4.8z] Fixes skipping pods accidentally in retry #758

Bug 2005464: [4.8z] Fixes skipping pods accidentally in retry #758

Conversation

trozet commented Sep 17, 2021

openshift-ci bot commented Sep 17, 2021

trozet commented Sep 17, 2021

openshift-ci bot commented Sep 17, 2021

dcbw commented Sep 18, 2021

openshift-ci bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 18, 2021

openshift-bot commented Sep 19, 2021

openshift-bot commented Sep 19, 2021

openshift-bot commented Sep 19, 2021

openshift-bot commented Sep 19, 2021

openshift-ci bot commented Sep 19, 2021

openshift-bot commented Sep 19, 2021

openshift-bot commented Sep 19, 2021

openshift-bot commented Sep 19, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

openshift-bot commented Sep 23, 2021

dhellmann commented Sep 23, 2021

dhellmann commented Sep 23, 2021

trozet commented Sep 23, 2021

trozet commented Sep 23, 2021

dhellmann commented Sep 23, 2021

trozet commented Sep 23, 2021

trozet commented Sep 23, 2021

dhellmann commented Sep 23, 2021

openshift-ci bot commented Sep 23, 2021

anuragthehatter commented Sep 23, 2021

openshift-ci bot commented Sep 23, 2021

anuragthehatter commented Sep 23, 2021