Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNF-8808: perfruntime: add shared cpus to container's cgroup #7502

Merged
merged 6 commits into from Dec 7, 2023

Conversation

Tal-or
Copy link
Contributor

@Tal-or Tal-or commented Nov 21, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

Given an annotation CRI-O appends additional CPUs (named shared CPUs)
into the container's cgroups.

The operation is done via a performance runtime hook.

This will allow the container to run light-weight tasks
on the shared cpus, while all the other task that
requires full isolation will keep running on the guaranteed CPUs.

It also injects env variables in to the container
in order to distinguish between the isolated and shared CPUs.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

Related to:

Depends on:

EP: openshift/enhancements@76ea48a

Does this PR introduce a user-facing change?

@Tal-or Tal-or requested a review from mrunalp as a code owner November 21, 2023 14:55
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 21, 2023
@openshift-ci-robot
Copy link

@Tal-or: This pull request references CNF-8808 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

What type of PR is this?

/kind feature

What this PR does / why we need it:

Given an annotation CRI-O appends additional CPUs (named shared CPUs)
into the container's cgroups.

The operation is done via a performance runtime hook.

This will allow the container to run light-weight tasks
on the shared cpus, while all the other task that
requires full isolation will keep running on the guaranteed CPUs.

It also injects env variables in to the container
in order to distinguish between the isolated and shared CPUs.

Which issue(s) this PR fixes:

None

Special notes for your reviewer:

Related to:

Depends on:

EP: openshift/enhancements@76ea48a

Does this PR introduce a user-facing change?

None

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. labels Nov 21, 2023
@openshift-ci openshift-ci bot added kind/feature Categorizes issue or PR as related to a new feature. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 21, 2023
Copy link
Contributor

openshift-ci bot commented Nov 21, 2023

Hi @Tal-or. Thanks for your PR.

I'm waiting for a cri-o member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@Tal-or
Copy link
Contributor Author

Tal-or commented Nov 21, 2023

Depends on: #7485

Copy link

codecov bot commented Nov 26, 2023

Codecov Report

Merging #7502 (14e6af6) into main (87c53ac) will decrease coverage by 0.18%.
Report is 13 commits behind head on main.
The diff coverage is 12.50%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7502      +/-   ##
==========================================
- Coverage   48.69%   48.52%   -0.18%     
==========================================
  Files         145      145              
  Lines       15926    16055     +129     
==========================================
+ Hits         7755     7790      +35     
- Misses       7236     7327      +91     
- Partials      935      938       +3     

@Tal-or Tal-or force-pushed the lbv2_mixed_cpus branch 9 times, most recently from 36db7f4 to 2dd2faa Compare November 27, 2023 12:37
@ffromani
Copy link
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 29, 2023
@Tal-or Tal-or force-pushed the lbv2_mixed_cpus branch 2 times, most recently from 9bdb2a6 to d16cec9 Compare November 29, 2023 13:54
@@ -26,6 +27,38 @@ var (
cgroupIsV2Err error
)

// CgroupHierarchy is a cgroup version agnostic struct that allows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is best here. This package has a pretty particular use (for initializing variables that CRI-O needs to react to on the node). If it's not being used anywhere else, I'd prefer it in runtime_handler_hooks

The common case is that mixed CPUs will be used
along side load-balancing disable.

prep up a cgroup-child as part of the `setSharedCPUs` to
save extra work and more conditioning when the `setCPULoadBalancing`
function is being called.

Signed-off-by: Talor Itzhak <titzhak@redhat.com>
@Tal-or
Copy link
Contributor Author

Tal-or commented Dec 6, 2023

/retest

2 similar comments
@Tal-or
Copy link
Contributor Author

Tal-or commented Dec 7, 2023

/retest

@haircommander
Copy link
Member

/retest

@haircommander
Copy link
Member

haircommander commented Dec 7, 2023

Failed to download metadata for repo 'rhel-9-for-x86_64-appstream-rpms': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried

looks unrelated

@haircommander
Copy link
Member

needs #7566

@haircommander
Copy link
Member

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 7, 2023
Copy link
Contributor

openshift-ci bot commented Dec 7, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, Tal-or

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed lgtm Indicates that a PR is ready to be merged. labels Dec 7, 2023
@haircommander
Copy link
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 7, 2023
@haircommander
Copy link
Member

/hold cancel

@openshift-ci openshift-ci bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note-none Denotes a PR that doesn't merit a release note. labels Dec 7, 2023
@haircommander
Copy link
Member

/retest

@haircommander
Copy link
Member

/override ci/prow/ci-fedora-integration

Copy link
Contributor

openshift-ci bot commented Dec 7, 2023

@haircommander: Overrode contexts on behalf of haircommander: ci/prow/ci-fedora-integration

In response to this:

/override ci/prow/ci-fedora-integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@haircommander
Copy link
Member

/override ci/prow/e2e-gcp-ovn

aws passed

Copy link
Contributor

openshift-ci bot commented Dec 7, 2023

@haircommander: Overrode contexts on behalf of haircommander: ci/prow/e2e-gcp-ovn

In response to this:

/override ci/prow/e2e-gcp-ovn

aws passed

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-bot openshift-merge-bot bot merged commit f40da54 into cri-o:main Dec 7, 2023
88 of 95 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has DCO signed all their commits. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants