Bug 2001364: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor #719

Merged

Contributor

@trozet trozet commented Sep 5, 2021

Under heavy load, the OVSDB monitor can be dropped silently, and the client stops receiving update messages from ovsdb-server. The cache then goes stale and diverges from what is in the DB, which leads to transaction errors. Although this is originally an ovsdb-server bug, a workaround here is to trigger a reconnect when a transaction error occurs, which recreates the monitor so that update messages start arriving again.

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Co-Authored-By: trozet@redhat.com
(cherry picked from commit 231bffe)
(cherry picked from commit a3b90dd)

Adds support for handling the monitor_cancel message.

Otherwise, a monitor might be closed by ovsdb-server without the client
knowing, leading to missed updates.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit eab2abc)
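
As a rough illustration of the second commit, a client can watch server notifications for a monitor cancellation and flag the affected monitor for re-creation. The sketch below is not the code from this PR. It assumes the notification method name `monitor_canceled` as documented in ovsdb-server(7), and it assumes the monitor ID is a JSON string; `notification`, `monitorState`, and `handle` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// notification is the JSON-RPC envelope for a server-initiated message.
type notification struct {
	Method string            `json:"method"`
	Params []json.RawMessage `json:"params"`
}

// monitorState is a hypothetical holder for the client's active monitors.
type monitorState struct {
	active      map[string]bool
	needsResync bool
}

// handle inspects one raw notification. It returns true if the message
// canceled a monitor that this client owns, in which case the monitor is
// dropped from the active set and a resync is requested.
func (m *monitorState) handle(raw []byte) (bool, error) {
	var n notification
	if err := json.Unmarshal(raw, &n); err != nil {
		return false, err
	}
	// Assumption: cancellation arrives as method "monitor_canceled" with the
	// monitor ID as the first parameter.
	if n.Method != "monitor_canceled" || len(n.Params) == 0 {
		return false, nil
	}
	var id string // assumption: the monitor ID is a JSON string
	if err := json.Unmarshal(n.Params[0], &id); err != nil {
		return false, err
	}
	if !m.active[id] {
		return false, nil
	}
	delete(m.active, id)
	m.needsResync = true
	return true, nil
}

func main() {
	m := &monitorState{active: map[string]bool{"nb-monitor": true}}
	msg := []byte(`{"method":"monitor_canceled","params":["nb-monitor"],"id":null}`)
	canceled, err := m.handle(msg)
	fmt.Println(canceled, err, m.needsResync)
}
```

In a real client, `needsResync` would feed the reconnect or re-monitor path so that the cache is rebuilt before further transactions are issued.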
@openshift-ci
Contributor

openshift-ci bot commented Sep 5, 2021

@trozet: This pull request references Bugzilla bug 2001364, which is invalid:

  • expected dependent Bugzilla bug 2001363 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 2001364: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Sep 5, 2021
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 5, 2021
@trozet
Contributor Author

trozet commented Sep 5, 2021

@dcbw

@trozet
Contributor Author

trozet commented Sep 5, 2021

/assign @tssurya

@trozet
Contributor Author

trozet commented Sep 5, 2021

/retest

@dcbw
Member

dcbw commented Sep 6, 2021

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 6, 2021
@openshift-ci
Contributor

openshift-ci bot commented Sep 6, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcbw, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

@openshift-ci
Contributor

openshift-ci bot commented Sep 7, 2021

@openshift-bot: This pull request references Bugzilla bug 2001364, which is invalid:

  • expected dependent Bugzilla bug 2001363 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

@trozet
Contributor Author

trozet commented Sep 7, 2021

/test ci/prow/images

@openshift-ci
Contributor

openshift-ci bot commented Sep 7, 2021

@trozet: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test e2e-aws-ovn
  • /test e2e-aws-ovn-windows
  • /test e2e-azure-ovn
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upgrade
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-ovn-hybrid-step-registry
  • /test e2e-vsphere-ovn
  • /test e2e-vsphere-windows
  • /test images
  • /test okd-images

The following commands are available to trigger optional jobs:

  • /test 4.7-upgrade-from-stable-4.6-e2e-aws-ovn-upgrade
  • /test e2e-metal-ipi-ovn-dualstack
  • /test okd-e2e-gcp-ovn

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-aws-ovn
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-aws-ovn-windows
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-azure-ovn
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-gcp-ovn
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-gcp-ovn-upgrade
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-metal-ipi-ovn-dualstack
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-metal-ipi-ovn-ipv6
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-ovn-hybrid-step-registry
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-vsphere-ovn
  • pull-ci-openshift-ovn-kubernetes-release-4.7-e2e-vsphere-windows
  • pull-ci-openshift-ovn-kubernetes-release-4.7-images
  • pull-ci-openshift-ovn-kubernetes-release-4.7-okd-e2e-gcp-ovn
  • pull-ci-openshift-ovn-kubernetes-release-4.7-okd-images

@trozet
Contributor Author

trozet commented Sep 7, 2021

/test images

@abhat
Contributor

abhat commented Sep 8, 2021

/bugzilla cc-qa

@openshift-ci
Contributor

openshift-ci bot commented Sep 8, 2021

@abhat: This pull request references Bugzilla bug 2001364, which is invalid:

  • expected dependent Bugzilla bug 2001363 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

@openshift-bot
Contributor

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

@openshift-ci
Contributor

openshift-ci bot commented Sep 11, 2021

@openshift-bot: This pull request references Bugzilla bug 2001364, which is invalid:

  • expected dependent Bugzilla bug 2001363 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is ON_QA instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

@openshift-bot
Contributor

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Sep 14, 2021
@openshift-ci
Contributor

openshift-ci bot commented Sep 14, 2021

@openshift-bot: This pull request references Bugzilla bug 2001364, which is valid.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.z) matches configured target release for branch (4.7.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 2001363 is in the state VERIFIED, which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 2001363 targets the "4.8.z" release, which is one of the valid target releases: 4.8.0, 4.8.z
  • bug has dependents

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (murali@redhat.com), skipping review request.

@openshift-ci openshift-ci bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Sep 14, 2021
@hardys hardys added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Sep 15, 2021
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 9d6f153 into openshift:release-4.7 Sep 16, 2021
@openshift-ci
Contributor

openshift-ci bot commented Sep 16, 2021

@trozet: Bugzilla bug 2001364 is in an unrecognized state (VERIFIED) and will not be moved to the MODIFIED state.

@tssurya
Contributor

tssurya commented Feb 25, 2022

/cherry-pick release-4.6

@openshift-cherrypick-robot

@tssurya: new pull request created: #973

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-urgent: Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • cherry-pick-approved: Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
  • lgtm: Indicates that a PR is ready to be merged.
8 participants