[release-4.6] Bug 2058705: Ensure client handling of canceled/dropped OVSDB monitor #973

openshift-cherrypick-robot

This is an automated cherry-pick of #719

/assign tssurya

Under heavy load, the OVSDB monitor gets dropped silently and we
stop receiving update messages from ovsdb-server. This leads
to a stale cache that is inconsistent with what's in the DB,
which in turn causes transaction errors. While this is originally
an ovsdb-server bug, a workaround here is to trigger a reconnect
when a transaction error occurs, which recreates the monitor and
resumes update messages.
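
A minimal sketch of the workaround described above, in Go. The `conn` interface, `transactWithRecovery`, and `fakeConn` are hypothetical names used only for illustration and are not the go-ovn API; the sketch shows just the control flow of forcing a reconnect when a transaction fails.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for an OVSDB client connection.
type conn interface {
	Transact(op string) error
	Reconnect() error // assumed to re-dial and re-create the monitor
}

// transactWithRecovery runs op. On failure it forces a reconnect so the
// monitor is re-created, then returns the original error so the caller
// can decide whether to retry.
func transactWithRecovery(c conn, op string) error {
	err := c.Transact(op)
	if err == nil {
		return nil
	}
	if rerr := c.Reconnect(); rerr != nil {
		return fmt.Errorf("transaction failed: %v; reconnect failed: %w", err, rerr)
	}
	return err
}

// fakeConn simulates a client whose cache is stale until it reconnects.
type fakeConn struct {
	stale      bool
	reconnects int
}

func (f *fakeConn) Transact(op string) error {
	if f.stale {
		return errors.New("transaction error caused by stale cache")
	}
	return nil
}

func (f *fakeConn) Reconnect() error {
	f.reconnects++
	f.stale = false
	return nil
}

func main() {
	c := &fakeConn{stale: true}
	fmt.Println("first attempt failed:", transactWithRecovery(c, "add port") != nil)
	fmt.Println("retry succeeded:", transactWithRecovery(c, "add port") == nil)
	fmt.Println("reconnects:", c.reconnects)
}
```

Returning the original error rather than retrying internally keeps the retry policy with the caller, which is one possible design and not necessarily what the patch does.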

Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
Co-Authored-By: trozet@redhat.com
(cherry picked from commit 231bffe)
(cherry picked from commit a3b90dd)
Adds support for handling monitor_cancel msg
Otherwise a monitor might be closed by ovsdb-server without the client
knowing, leading to missed updates.
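
As a rough illustration of the idea, here is a sketch in Go of a client noticing a server-side cancellation. It assumes the server sends a JSON-RPC notification whose method is `monitor_canceled` (the name used in Open vSwitch's ovsdb-server documentation; the commit message calls it the monitor_cancel msg). The `client` type, its fields, and `handle` are hypothetical and do not reflect the actual patch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// methodMonitorCanceled is an assumption about the notification name.
const methodMonitorCanceled = "monitor_canceled"

// message holds the JSON-RPC notification fields this sketch inspects.
type message struct {
	Method string            `json:"method"`
	Params []json.RawMessage `json:"params"`
}

// client is a hypothetical OVSDB client that tracks whether its monitor is live.
type client struct {
	monitorActive bool
	updates       int
}

// handle dispatches one incoming message. On a cancellation it marks the
// monitor inactive so the caller knows it must be re-created.
func (c *client) handle(raw []byte) error {
	var m message
	if err := json.Unmarshal(raw, &m); err != nil {
		return fmt.Errorf("decode message: %w", err)
	}
	switch m.Method {
	case "update":
		c.updates++ // a real client would apply the row changes to its cache
	case methodMonitorCanceled:
		c.monitorActive = false
	}
	return nil
}

func main() {
	c := &client{monitorActive: true}
	msgs := []string{
		`{"method":"update","params":[null,{}],"id":null}`,
		`{"method":"monitor_canceled","params":[null],"id":null}`,
	}
	for _, raw := range msgs {
		if err := c.handle([]byte(raw)); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("updates:", c.updates, "monitor active:", c.monitorActive)
}
```

Without a case for the cancellation method, the message would fall through the switch unnoticed and the client would keep trusting a cache that no longer receives updates.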

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit eab2abc)

openshift-ci bot commented Feb 25, 2022

@openshift-cherrypick-robot: Bugzilla bug 2001364 has been cloned as Bugzilla bug 2058705. Retitling PR to link against new bug.
/retitle [release-4.6] Bug 2058705: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor

In response to this:

[release-4.6] Bug 2001364: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot changed the title from "[release-4.6] Bug 2001364: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor" to "[release-4.6] Bug 2058705: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor" Feb 25, 2022
@openshift-ci openshift-ci bot added bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Feb 25, 2022

openshift-ci bot commented Feb 25, 2022

@openshift-cherrypick-robot: This pull request references Bugzilla bug 2058705, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.z) matches configured target release for branch (4.6.z)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 2001364 is in the state CLOSED (ERRATA), which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 2001364 targets the "4.7.z" release, which is one of the valid target releases: 4.7.0, 4.7.z
  • bug has dependents

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (murali@redhat.com), skipping review request.

In response to this:

[release-4.6] Bug 2058705: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor


tssurya commented Feb 25, 2022

/retitle [release-4.6] Bug 2058705: Ensure client handling of canceled/dropped OVSDB monitor

@openshift-ci openshift-ci bot changed the title from "[release-4.6] Bug 2058705: [4.7z] Ensure client handling of canceled/dropped OVSDB monitor" to "[release-4.6] Bug 2058705: Ensure client handling of canceled/dropped OVSDB monitor" Feb 25, 2022

openshift-ci bot commented Feb 25, 2022

@openshift-cherrypick-robot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/4.6-upgrade-from-stable-4.5-e2e-aws-ovn-upgrade | 2e70284 | link | false | /test 4.6-upgrade-from-stable-4.5-e2e-aws-ovn-upgrade |


squeed commented Feb 28, 2022

There doesn't seem to be an update in go.mod? I don't think we can cherry-pick this...


trozet commented Feb 28, 2022

> There doesn't seem to be an update in go.mod? I don't think we can cherry-pick this...

@squeed it was never merged upstream eBay/go-ovn#153

We have been carrying it since 4.9 downstream only. eBay/go-ovn is pretty much dead now anyway. If we do need to update the lib via go mod then we will need to get this fixed upstream.


trozet commented Feb 28, 2022

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Feb 28, 2022

trozet commented Feb 28, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 28, 2022

openshift-ci bot commented Feb 28, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-cherrypick-robot, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 28, 2022
@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@zhaozhanqi

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Mar 2, 2022
@openshift-merge-robot openshift-merge-robot merged commit b9dd381 into openshift:release-4.6 Mar 2, 2022

openshift-ci bot commented Mar 2, 2022

@openshift-cherrypick-robot: All pull requests linked via external trackers have merged:

Bugzilla bug 2058705 has been moved to the MODIFIED state.

In response to this:

[release-4.6] Bug 2058705: Ensure client handling of canceled/dropped OVSDB monitor
