[release-4.11] OCPBUGS-4475: Handle expired entry while handling dns update #1417

pperiyasamy
Member

When the nextQueryTime for a DNS entry has already expired, the update goroutine tries to reset the ticker with a negative duration. This crashes the process on the ovnk master. This change avoids the process crash and resets the ticker to the shortest possible interval (1 ms), so that the update happens immediately for an expired DNS entry.

Signed-off-by: Periyasamy Palanisamy pepalani@redhat.com
(cherry picked from commit 3784254) (cherry picked from commit c68ba05)

@openshift-ci
Contributor

openshift-ci bot commented Dec 5, 2022

@pperiyasamy: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot added labels Dec 5, 2022:
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
@openshift-ci-robot
Contributor

@pperiyasamy: This pull request references Jira Issue OCPBUGS-4475, which is valid. The bug has been moved to the POST state.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.11.z) matches configured target version for branch (4.11.z)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
  • dependent bug Jira Issue OCPBUGS-3977 is in the state Verified, which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE))
  • dependent Jira Issue OCPBUGS-3977 targets the "4.12.0" version, which is one of the valid target versions: 4.12.0
  • bug has dependents

Requesting review from QA contact:
/cc @huiran0826

The bug has been updated to refer to the pull request using the external bug tracker.


@dcbw
Contributor

dcbw commented Dec 7, 2022

/lgtm
/approve

@openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) Dec 7, 2022
@openshift-ci
Contributor

openshift-ci bot commented Dec 7, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, pperiyasamy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Dec 7, 2022
@dcbw
Contributor

dcbw commented Dec 7, 2022

AWS install problems...

/retest-required

@dcbw added the backport-risk-assessed label (indicates a PR to a release branch has been evaluated and considered safe to accept) Dec 7, 2022
@pperiyasamy
Member Author

/retest-required

@pperiyasamy
Member Author

/retest-required

@anuragthehatter

/label cherry-pick-approved

@openshift-ci bot added the cherry-pick-approved label (indicates a cherry-pick PR into a release branch has been approved by the release branch manager) Dec 8, 2022
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 086de98 and 2 for PR HEAD ad0a282 in total

@pperiyasamy
Member Author

/retest-required

@dcbw
Contributor

dcbw commented Dec 13, 2022

/override ci/prow/e2e-metal-ipi-ovn-dualstack

https://coreos.slack.com/archives/CFP6ST0A3/p1670860628776879

@openshift-ci
Contributor

openshift-ci bot commented Dec 13, 2022

@dcbw: Overrode contexts on behalf of dcbw: ci/prow/e2e-metal-ipi-ovn-dualstack


@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 54d844f and 1 for PR HEAD ad0a282 in total

@openshift-ci
Contributor

openshift-ci bot commented Dec 13, 2022

@pperiyasamy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-ovn | ad0a282 | link | false | /test e2e-azure-ovn |
| ci/prow/e2e-hypershift | ad0a282 | link | false | /test e2e-hypershift |
| ci/prow/e2e-vsphere-windows | ad0a282 | link | false | /test e2e-vsphere-windows |
| ci/prow/4.11-upgrade-from-stable-4.10-local-gateway-e2e-aws-ovn-upgrade | ad0a282 | link | false | /test 4.11-upgrade-from-stable-4.10-local-gateway-e2e-aws-ovn-upgrade |

Full PR test history. Your PR dashboard.


@dcbw
Contributor

dcbw commented Dec 13, 2022

/override ci/prow/4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade

Transient error pulling image metadata from Cloudfront by the node:

ns/openshift-etcd pod/revision-pruner-8-ip-10-0-186-233.us-west-1.compute.internal node/ip-10-0-186-233.us-west-1.compute.internal - 289.14 seconds after deletion - reason/FailedCreatePodSandBox Failed to create pod sandbox: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_revision-pruner-8-ip-10-0-186-233.us-west-1.compute.internal_openshift-etcd_046bf0f7-4c8f-4fc6-aa6c-6e301f3284ed_0": parsing image configuration: Get "https://d11wsypyfft1aa.cloudfront.net/docker/registry/v2/blobs/sha256/50/505bef2cb0cc4d2c825de4a8307ab9081cf77ced68f8cb6e68d0bb63d907149c/data?Expires=1670912940&Signature=wr0addOiMh2g1ORAlwTY1nO0YEd5RB4lYZT2pPXXy3MN8l7P2bLzzio6hNEmWWFcg9L9gdi0fXWy0tz-oPY00W24SmohI7gYfJseoXw0iGmZafqD~XuTsFKtTIejQQMbd8ghsuLO8S0YFjZHBp62RYh4bZVOW5VwgeiaPFdO7cDsDWtI~i~UYpv9z2uz4Ma96LtZMxX8aF~FodUA9LEolJRjXSUy5AGjwPVXB7cxdJneGd-dJYrEPc48bBW71zI5N8vt645RCHp2DA-dlMAjf6qYX1MJrNw5GuirdiNAGCnyG4lxoCAiOFEi9tRwKnqkvIxi9sB0343~YISr7MAcdA__&Key-Pair-Id=K35IN2XBI3W7KZ": read tcp 10.0.186.233:60060->18.155.204.58:443: read: connection reset by peer}

@openshift-ci
Contributor

openshift-ci bot commented Dec 13, 2022

@dcbw: Overrode contexts on behalf of dcbw: ci/prow/4.11-upgrade-from-stable-4.10-e2e-aws-ovn-upgrade


@openshift-merge-robot openshift-merge-robot merged commit fbc34a5 into openshift:release-4.11 Dec 13, 2022
@openshift-ci-robot
Contributor

@pperiyasamy: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-4475 has been moved to the MODIFIED state.


Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • cherry-pick-approved: Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.