[release-4.11] [manual] OCPBUGS-3182: set RPS for veth on host level only #508

@Tal-or Tal-or commented Nov 6, 2022

Handling RPS at the pod container level with crio-hooks causes long delays when the low latency script runs to set the RPS mask (https://bugzilla.redhat.com/show_bug.cgi?id=2109965).

For the RAN low latency solution, it might be sufficient to set RPS at the host level only and avoid setting it at the container level, while relying on RSS behavior.

The low latency hook was originally extended with additional RPS settings for virtual devices because starting and shutting down a large number of pods triggered the creation of the systemd service that updates the rps_cpus mask of each new interface, which could put additional CPU load on the cluster (openshift-kni/performance-addon-operators#659).
This might no longer be the case, so we need to examine how reverting the aforementioned PR behaves now.

Co-authored-by: Yanir Quinn <yquinn@redhat.com>
Signed-off-by: Talor Itzhak <titzhak@redhat.com>
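
As a rough illustration of what setting RPS at the host level involves, here is a minimal Python sketch. It is not the script shipped by this PR: the function names, the device name, and the CPU list are hypothetical and chosen only for the example. It assumes the standard per-queue sysfs file /sys/class/net/<dev>/queues/rx-<n>/rps_cpus, which takes a hexadecimal CPU bitmap written as comma-separated 32-bit groups.

```python
from pathlib import Path
from typing import Iterable, List


def cpus_to_rps_mask(cpus: Iterable[int]) -> str:
    """Encode CPU ids as comma-separated 32-bit hex groups (rps_cpus format)."""
    bits = 0
    for cpu in cpus:
        bits |= 1 << cpu
    groups = []
    while True:
        # Emit the lowest 32 bits first; the groups are reversed at the end
        # because the most significant group is written first.
        groups.append(f"{bits & 0xFFFFFFFF:08x}")
        bits >>= 32
        if bits == 0:
            break
    return ",".join(reversed(groups))


def rps_files(device: str, sysfs_net: Path = Path("/sys/class/net")) -> List[Path]:
    """Return the rps_cpus file of every receive queue of a host-side device."""
    return sorted((sysfs_net / device / "queues").glob("rx-*/rps_cpus"))


def set_host_rps(
    device: str,
    cpus: Iterable[int],
    sysfs_net: Path = Path("/sys/class/net"),
) -> List[Path]:
    """Write the same mask to each receive queue of a device (hypothetical helper)."""
    mask = cpus_to_rps_mask(cpus)
    written = []
    for path in rps_files(device, sysfs_net):
        path.write_text(mask)
        written.append(path)
    return written
```

On a real node this would be called with a host-side veth name and the CPUs that should process received packets, and it needs root privileges. The sysfs_net parameter exists only so the sketch can be exercised against a temporary directory instead of the real sysfs tree.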

openshift-ci bot commented Nov 6, 2022

@Tal-or: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

In response to this:

[release-4.11] [manual] OCPBUGS-3182: set RPS for veth on host level only

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Nov 6, 2022
@openshift-ci-robot

@Tal-or: This pull request references Jira Issue OCPBUGS-3182, which is invalid:

  • expected the bug to target the "4.11.z" version, but no target version was set
  • expected Jira Issue OCPBUGS-3182 to depend on a bug targeting a version in 4.12.0 and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci openshift-ci bot requested review from MarSik and yanirq November 6, 2022 15:24

Tal-or commented Nov 6, 2022

/jira refresh

@openshift-ci-robot

@Tal-or: This pull request references Jira Issue OCPBUGS-3182, which is invalid:

  • expected Jira Issue OCPBUGS-3182 to depend on a bug targeting a version in 4.12.0 and in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), but no dependents were found

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@Tal-or Tal-or force-pushed the manual_backport_oci_hook_bug branch from e3b05f6 to 611b268 Compare November 6, 2022 16:18

yanirq commented Nov 22, 2022

/retest

@Tal-or Tal-or force-pushed the manual_backport_oci_hook_bug branch 2 times, most recently from 1787603 to 9359fb3 Compare November 23, 2022 14:48

Tal-or commented Nov 23, 2022

/jira refresh

@openshift-ci-robot

@Tal-or: This pull request references Jira Issue OCPBUGS-3182, which is invalid:

  • bug is open, matching expected state (open)
  • bug target version (4.11.z) matches configured target version for branch (4.11.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
  • bug has dependents
  • dependent bug OCPBUGSM-47141 is not in the required OCPBUGS project

All dependent bugs must be part of the OCPBUGS project. If you are backporting a fix that was originally tracked in Bugzilla, follow these steps to handle the backport:

  1. Create a new bug in the OCPBUGS Jira project to match the original bugzilla bug. The important fields that should match are the title, description, target version, and status.
  2. Use the Jira UI to clone the Jira bug, then in the clone bug:
    a. Set the target version to the release you are cherrypicking to.
    b. Add an issue link “is blocked by”, which links to the original jira bug
  3. Use the cherrypick github command to create the cherrypicked PR. Once that new PR is created, retitle the PR and replace the BUG XXX: with OCPBUGS-XXX: to match the new Jira story.

Note that the mirrored bug in OCPBUGSM should not be involved in this process at all.

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Tal-or commented Nov 23, 2022

/jira refresh

@openshift-ci-robot

@Tal-or: This pull request references Jira Issue OCPBUGS-3182, which is invalid:

  • expected dependent Jira Issue OCPBUGS-4043 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), but it is ON_QA instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

@Tal-or Tal-or force-pushed the manual_backport_oci_hook_bug branch from 0fa2bef to 0ab3c5a Compare November 23, 2022 17:26

Tal-or commented Nov 24, 2022

/test e2e-upgrade

yanirq commented Nov 24, 2022

/label backport-risk-assessed

@openshift-ci openshift-ci bot added the backport-risk-assessed Indicates a PR to a release branch has been evaluated and considered safe to accept. label Nov 24, 2022

yanirq commented Nov 24, 2022

If there are relevant e2e tests that we can backport in the same PR, that would be good for coverage.

Tal-or commented Nov 24, 2022

/hold
I want to see if we can add some e2e tests.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 24, 2022

Tal-or commented Nov 24, 2022

/hold cancel
The e2e tests have not been merged to the master branch yet: #501

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 24, 2022

yanirq commented Nov 24, 2022

/approve

openshift-ci bot commented Nov 24, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Tal-or, yanirq

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 24, 2022

Tal-or commented Nov 24, 2022

/jira refresh

@openshift-ci-robot

@Tal-or: This pull request references Jira Issue OCPBUGS-3182, which is invalid:

  • expected dependent Jira Issue OCPBUGS-4043 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), but it is ON_QA instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

openshift-ci bot commented Nov 24, 2022

@Tal-or: all tests passed!

Tal-or commented Nov 24, 2022

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Nov 24, 2022
@openshift-ci-robot

@Tal-or: This pull request references Jira Issue OCPBUGS-3182, which is valid. The bug has been moved to the POST state.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.11.z) matches configured target version for branch (4.11.z)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)
  • dependent bug Jira Issue OCPBUGS-4043 is in the state Verified, which is one of the valid states (VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE))
  • dependent Jira Issue OCPBUGS-4043 targets the "4.12.0" version, which is one of the valid target versions: 4.12.0
  • bug has dependents

yanirq commented Nov 24, 2022

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 24, 2022
@mrniranjan

/label cherry-pick-approved

@openshift-ci openshift-ci bot added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Nov 24, 2022
@openshift-merge-robot openshift-merge-robot merged commit 9c8fd3d into openshift:release-4.11 Nov 24, 2022
@openshift-ci-robot

@Tal-or: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-3182 has been moved to the MODIFIED state.

@Tal-or Tal-or deleted the manual_backport_oci_hook_bug branch November 24, 2022 13:37