Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add initial support for updating systemd slices #88404

Closed
wants to merge 3 commits into from

Conversation

odinuge
Copy link
Member

@odinuge odinuge commented Feb 21, 2020

What type of PR is this?

/kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #88197

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Add support for updating resources parameters for the different qos classes when using cgroup driver systemd

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 21, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: odinuge
To complete the pull request process, please assign thockin
You can assign the PR to them by writing /assign @thockin in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/dependency Issues or PRs related to dependency changes area/kubelet area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 21, 2020
@k8s-ci-robot k8s-ci-robot requested review from dchen1107, resouer and a team February 21, 2020 11:18
@odinuge
Copy link
Member Author

odinuge commented Feb 21, 2020

/retest

@rphillips
Copy link
Member

cc @sjenning

@rphillips
Copy link
Member

I think the update should be in the opencontainers/runc Manager. Is that not possible?

This makes it possible to update existing systemd slices. This was
previously not working, and the "top" slices for the different qos
levels were always set to nodeCapacity. This will update the correct
slice values using a dbus connection to systemd.

We still have to run "setSupportedSubsystems" since systemd slices don't
support all the resouces we use. This currently only applies to huge
pages.
@odinuge
Copy link
Member Author

odinuge commented Feb 23, 2020

@rphillips I think the update should be in the opencontainers/runc Manager. Is that not possible?

I do agree that this code may fit better outside the k/k repo, but I am not sure if it would fit in runc/libcontainer. The way we use runc/libcontainer (in this case) for controling cgroups and systemd slices is outside the scope of runc as I understand it, so adding it there would require some time and extra work for the initial implementation. From my experience getting things like this into runc is hard and cumbersome, and we already do quite a lot of the work ourselves inside cgroup_manager_linux. Eg. we have to update the cgroup subsystems manually ref. comment https://github.com/kubernetes/kubernetes/pull/88404/files#diff-fa76bb6ae2d9b4bb3d023737fe5e6029L347.

Some of the code in cgroup_manager_linux can now be migrated into code using run/libcontainer, but I think the best approach would be to fix it in k/k and write some proper e2e tests for it, making it easier to switch to using vendored code, and adding the functionality to runc/libcontainer. There is currently no e2e tests running with --cgroup-driver=systemd (unsure about openshift tho), so imo. that would be the first place to start.

But that is just my thoughts tho.

@odinuge
Copy link
Member Author

odinuge commented Feb 26, 2020

/cc derekwaynecarr

@odinuge
Copy link
Member Author

odinuge commented Feb 27, 2020

Ref. cgroupv2 implemenetation discussions:
/cc @AkihiroSuda @giuseppe

@k8s-ci-robot
Copy link
Contributor

@odinuge: GitHub didn't allow me to request PR reviews from the following users: AkihiroSuda, giuseppe.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Ref. cgroupv2 implemenetation discussions:
/cc AkihiroSuda giuseppe

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 26, 2020
@k8s-ci-robot
Copy link
Contributor

@odinuge: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@AkihiroSuda
Copy link
Member

The amount of the code copied from runc is relatively huge, so I prefer this PR to be applied to runc repo as well, but not a blocker, I think.

newProp("MemoryLimit", uint64(c.Resources.Memory)))
}

if c.Resources.CpuShares != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't work on cgroup v2.

We would need to set the CPUWeight= property (#89567 moves the conversion to a new function so should be easier to use)

newProp("CPUQuotaPerSecUSec", cpuQuotaPerSecUSec))
}

if c.Resources.BlkioWeight != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it currently used by the kubelet or can we drop it? For cgroup v2 we would need to convert to a different range and use the IOWeight= property

@odinuge
Copy link
Member Author

odinuge commented Apr 25, 2020

Fixed in runc, closing in favor of #90489

@odinuge odinuge closed this Apr 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/dependency Issues or PRs related to dependency changes area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kubelet incorrectly configures kubepods.slice systemd unit when kube-reserved and/or system-reserved is used
5 participants