Bug 2014332: [4.8z] Scale fixes for pods/exgws #798

Merged
merged 5 commits into openshift:release-4.8 on Nov 5, 2021

Conversation

trozet
Contributor

@trozet trozet commented Oct 15, 2021

Performance and scale fixes with pods and multiple external gateways.

trozet and others added 2 commits October 14, 2021 21:35
Address set operations like add and remove are idempotent. We can get
away with taking only a read lock (RLock) there, which greatly improves
pod add performance. There is also no need to store the IPs in the
addressSet struct.

Signed-off-by: Tim Rozet <trozet@redhat.com>
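
The read-lock idea in the address set commit above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `fakeBackend`, `newAddressSet`, `AddIP`, `DeleteIP`, `Destroy`, and `errDestroyed` are hypothetical names, and the in-memory backend is assumed to stand in for whatever store actually holds the addresses and to serialize its own writes. Under that assumption, add and remove only need a shared lock to keep the set from being destroyed underneath them.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeBackend is a hypothetical stand-in for the store that holds the
// address set contents. It serializes its own writes, so callers do not
// need an exclusive lock to mutate it.
type fakeBackend struct {
	mu  sync.Mutex
	ips map[string]struct{}
}

func (b *fakeBackend) add(ip string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.ips[ip] = struct{}{} // adding the same IP twice leaves the same state
}

func (b *fakeBackend) remove(ip string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.ips, ip) // deleting a missing IP is a no-op
}

func (b *fakeBackend) count() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return len(b.ips)
}

var errDestroyed = errors.New("address set destroyed")

// addressSet keeps no local copy of the IPs; it only guards its lifecycle.
type addressSet struct {
	sync.RWMutex
	destroyed bool
	backend   *fakeBackend
}

func newAddressSet() *addressSet {
	return &addressSet{backend: &fakeBackend{ips: map[string]struct{}{}}}
}

// AddIP takes only the read lock: many adds may run concurrently.
func (as *addressSet) AddIP(ip string) error {
	as.RLock()
	defer as.RUnlock()
	if as.destroyed {
		return errDestroyed
	}
	as.backend.add(ip)
	return nil
}

// DeleteIP also takes only the read lock, for the same reason.
func (as *addressSet) DeleteIP(ip string) error {
	as.RLock()
	defer as.RUnlock()
	if as.destroyed {
		return errDestroyed
	}
	as.backend.remove(ip)
	return nil
}

// Destroy is the one operation that needs exclusive access.
func (as *addressSet) Destroy() {
	as.Lock()
	defer as.Unlock()
	as.destroyed = true
}

func main() {
	as := newAddressSet()
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			_ = as.AddIP(fmt.Sprintf("10.0.0.%d", n%5))
		}(i)
	}
	wg.Wait()
	fmt.Println("distinct IPs:", as.backend.count())
}
```

The trade-off in this sketch is that correctness of concurrent adds depends entirely on the backend handling duplicate or overlapping mutations safely.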
This happens when the pod was already created but a new event for the pod
is generated. I managed to see it after a manual ovnkube-master restart.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
(cherry picked from commit 7828dff)
@openshift-ci openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Oct 15, 2021
@openshift-ci
Contributor

openshift-ci bot commented Oct 15, 2021

@trozet: This pull request references Bugzilla bug 2014332, which is invalid:

  • expected dependent Bugzilla bug 1997072 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead
  • expected dependent Bugzilla bug 1997072 to target a release in 4.9.0, but it targets "4.9.z" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 2014332: [4.8z] Scale fixes for pods/exgws

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@trozet
Contributor Author

trozet commented Oct 15, 2021

/assign @dcbw

Dan please review this closely, there were quite a few merge conflicts.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 15, 2021
fedepaol and others added 3 commits October 15, 2021 10:22
When a gateway pod gets the external gateway annotation, routes to the
external gateway are added for existing pods, but the SNAT that was
added when each pod was created is not removed.

Signed-off-by: Federico Paolinelli <fpaoline@redhat.com>
(cherry picked from commit 8783628)
Previously, nsInfo held not only a map of gateways per namespace but also
all of the routes per pod in an external-gateway-enabled namespace.
This meant that nsInfo had to be locked during every external gateway
route add or delete, which creates heavy contention, specifically in
clusters that use external gateway functionality.

This breaks out the pod routes portion into its own cache, which has
individual locks on a per pod basis. This allows exgw routes to be added
and removed without needing nsInfo lock. Additionally, since locks are
on a per pod basis, it provides less overall contention across the
cache.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit c6db422)
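
A rough sketch of the per-pod locking described in the commit above, under stated assumptions: `routeCache`, `podRoutes`, `entry`, `addRoute`, and `routeCount` are hypothetical names chosen for illustration and are not taken from the PR. The cache-wide lock is held only long enough to look up or insert a pod's entry; the route changes themselves happen under that pod's own mutex, so work on different pods does not contend.

```go
package main

import (
	"fmt"
	"sync"
)

// podRoutes holds the external gateway routes for one pod and carries
// its own lock (hypothetical type).
type podRoutes struct {
	sync.Mutex
	gateways map[string]struct{}
}

// routeCache maps a pod key to its entry. The RWMutex protects only the
// map itself, never the contents of an entry.
type routeCache struct {
	mu   sync.RWMutex
	pods map[string]*podRoutes
}

func newRouteCache() *routeCache {
	return &routeCache{pods: map[string]*podRoutes{}}
}

// entry returns the pod's entry, creating it if needed.
func (c *routeCache) entry(pod string) *podRoutes {
	c.mu.RLock()
	e, ok := c.pods[pod]
	c.mu.RUnlock()
	if ok {
		return e
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check: another goroutine may have created it in between.
	if e, ok = c.pods[pod]; ok {
		return e
	}
	e = &podRoutes{gateways: map[string]struct{}{}}
	c.pods[pod] = e
	return e
}

// addRoute records a gateway for the pod under the pod's own lock.
func (c *routeCache) addRoute(pod, gateway string) {
	e := c.entry(pod)
	e.Lock()
	defer e.Unlock()
	e.gateways[gateway] = struct{}{}
}

// routeCount reports how many gateways are recorded for the pod.
func (c *routeCache) routeCount(pod string) int {
	c.mu.RLock()
	e, ok := c.pods[pod]
	c.mu.RUnlock()
	if !ok {
		return 0
	}
	e.Lock()
	defer e.Unlock()
	return len(e.gateways)
}

func main() {
	c := newRouteCache()
	var wg sync.WaitGroup
	for i := 0; i < 20; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			pod := fmt.Sprintf("ns/pod-%d", n%4)
			c.addRoute(pod, fmt.Sprintf("172.16.0.%d", n))
		}(i)
	}
	wg.Wait()
	fmt.Println("gateways for ns/pod-0:", c.routeCount("ns/pod-0"))
}
```

This sketch leaves out removing entries from the cache, which needs extra care so that a deleted entry is not modified by a goroutine that still holds a pointer to it.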
When a pod has n gateways, there are n calls to create the same 501
policy. This commit reduces that to a single call.

Signed-off-by: Tim Rozet <trozet@redhat.com>
(cherry picked from commit dd836a7)
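
The call reduction in the commit above can be illustrated with a small sketch. Everything here is hypothetical: `fakeOVN`, `ensurePolicy`, `addRoute`, and both helper functions are invented for the example, and the counters only stand in for real calls. The point is just that the policy depends on the pod and not on the gateway, so it can be created once, outside the per-gateway loop.

```go
package main

import "fmt"

// fakeOVN counts calls instead of talking to a real database (hypothetical).
type fakeOVN struct {
	policyCalls int
	routeCalls  int
}

// ensurePolicy stands in for creating the per-pod policy.
func (o *fakeOVN) ensurePolicy(podIP string) { o.policyCalls++ }

// addRoute stands in for adding one route from the pod to one gateway.
func (o *fakeOVN) addRoute(podIP, gateway string) { o.routeCalls++ }

// addGatewaysPerGateway shows the old shape: the same policy is
// requested once per gateway.
func addGatewaysPerGateway(o *fakeOVN, podIP string, gateways []string) {
	for _, gw := range gateways {
		o.ensurePolicy(podIP)
		o.addRoute(podIP, gw)
	}
}

// addGatewaysOnce shows the new shape: one policy call per pod,
// followed by one route call per gateway.
func addGatewaysOnce(o *fakeOVN, podIP string, gateways []string) {
	if len(gateways) == 0 {
		return
	}
	o.ensurePolicy(podIP)
	for _, gw := range gateways {
		o.addRoute(podIP, gw)
	}
}

func main() {
	gateways := []string{"172.16.0.1", "172.16.0.2", "172.16.0.3"}

	before := &fakeOVN{}
	addGatewaysPerGateway(before, "10.128.0.5", gateways)

	after := &fakeOVN{}
	addGatewaysOnce(after, "10.128.0.5", gateways)

	fmt.Println("policy calls before:", before.policyCalls)
	fmt.Println("policy calls after:", after.policyCalls)
}
```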
@trozet
Contributor Author

trozet commented Oct 15, 2021

/assign @fedepaol

@trozet
Contributor Author

trozet commented Oct 16, 2021

/retest

@fedepaol
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 18, 2021
@openshift-ci
Contributor

openshift-ci bot commented Oct 18, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fedepaol, trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

@openshift-ci
Contributor

openshift-ci bot commented Oct 19, 2021

@openshift-bot: This pull request references Bugzilla bug 2014332, which is invalid:

  • expected dependent Bugzilla bug 1997072 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is POST instead

@anuragthehatter

/bugzilla cc-qa

@mffiedler

Failed verification on a cluster-bot cluster built from this PR (similar to the 4.9 bug 1997072 - see that bz for must-gather).

The cluster was a 120-node OVN cluster on AWS, and the workload was node-density light. Many FailedCreatePodSandBox events with reason "timed out waiting for annotations" were seen, and it took a long time for all pods to reach Running.

On the latest 4.10 nightly, the issue cannot be reproduced: there are no annotation timeout events for node-density light in the same cluster configuration.

@mffiedler

Tested on 4.8 cluster-bot cluster using the workload from https://bugzilla.redhat.com/show_bug.cgi?id=2014332#c6
/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Nov 4, 2021
@openshift-bot
Contributor

/bugzilla refresh

Recalculating validity in case the underlying Bugzilla bug has changed.

@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2021

@openshift-bot: This pull request references Bugzilla bug 2014332, which is invalid:

  • expected dependent Bugzilla bug 1997072 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE), but it is ON_QA instead

@dcbw
Contributor

dcbw commented Nov 5, 2021

/bugzilla refresh

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Nov 5, 2021
@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2021

@dcbw: This pull request references Bugzilla bug 2014332, which is valid.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.z) matches configured target release for branch (4.8.z)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 1997072 is in the state VERIFIED, which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), CLOSED (CURRENTRELEASE))
  • dependent Bugzilla bug 1997072 targets the "4.9.z" release, which is one of the valid target releases: 4.9.0, 4.9.z
  • bug has dependents

Requesting review from QA contact:
/cc @anuragthehatter

@openshift-ci openshift-ci bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Nov 5, 2021
@trozet
Contributor Author

trozet commented Nov 5, 2021

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Nov 5, 2021
@mffiedler

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Nov 5, 2021
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit f1d74e3 into openshift:release-4.8 Nov 5, 2021
@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2021

@trozet: All pull requests linked via external trackers have merged:

Bugzilla bug 2014332 has been moved to the MODIFIED state.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
backport-risk-assessed: Indicates a PR to a release branch has been evaluated and considered safe to accept.
bugzilla/severity-medium: Referenced Bugzilla bug's severity is medium for the branch this PR is targeting.
bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
cherry-pick-approved: Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
lgtm: Indicates that a PR is ready to be merged.
qe-approved: Signifies that QE has signed off on this PR.