Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kubelet emits warning rather than exiting on invalid cgroup setup #79245

Merged
merged 1 commit into from
Jun 21, 2019

Conversation

dashpole
Copy link
Contributor

What type of PR is this?
/kind bug

What this PR does / why we need it:
This maintains the prior behavior of the kubelet when the cgroup setup is unsupported. It emits an error rather than exiting. This is important for those who run kubelets locally to test changes unrelated to monitoring or cgroup setup.

Which issue(s) this PR fixes:
Fixes #78950

Does this PR introduce a user-facing change?:

NONE

cc @mattjmcnaughton @roycaihw

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 20, 2019
@dashpole
Copy link
Contributor Author

/sig node
/priority important-soon

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/kubelet and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 20, 2019
@dashpole
Copy link
Contributor Author

/retest

@roycaihw
Copy link
Member

Thanks @dashpole. I can confirm this fixes #78950!

Copy link
Contributor

@mattjmcnaughton mattjmcnaughton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks for this quick fix @dashpole !

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 21, 2019
@stealthybox
Copy link
Member

I just ran into this in an LXC container running systemd where I've split up the cpu and cpuacct cgroup mountpoints.
I am struggling to get systemd to assign the unit to matching slices, so I can't run the kubelet service.

Here's more context on this error:
https://github.com/kubernetes/kubernetes/blob/d30fbab/pkg/kubelet/cm/container_manager_linux.go#L802-L815
It seems this was done intentionally.

How critical is this functionality?
Does this disable CPU/MemoryAccounting for all Pods?

@dashpole
Copy link
Contributor Author

@stealthybox I responded in slack as well, but i'll repeat it here for anyone else who happens across this.
This specific functionality is only used for monitoring. If you rely on the kubelet or runtime system containers in the /stats/summary API and see the errors this PR is fixing or adding, then those metrics will be missing from the summary API.

@derekwaynecarr
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, derekwaynecarr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 21, 2019
@stealthybox
Copy link
Member

stealthybox commented Jun 21, 2019

I can confirm I built this for arm64 and it solved my issue.

@stealthybox
Copy link
Member

Please note that this also affects docker:

W0621 18:17:18.104998    1600 server.go:628] failed to get the kubelet's cgroup: cpu and memory cgroup hierarchy not unified.  cpu: /, memory: /user.slice.  Kubelet system container metrics may be missing.
W0621 18:17:18.105420    1600 server.go:635] failed to get the container runtime's cgroup: failed to get container name for docker process: cpu and memory cgroup hierarchy not unified.  cpu: /, memory: /system.slice/docker.service. Runtime system container metrics may be missing.
I0621 18:17:18.107885    1600 manager.go:146] cAdvisor running in container: "/sys/fs/cgroup/cpu"

@dashpole
Copy link
Contributor Author

@stealthybox yes, this issue is independent of the container runtime used.

@k8s-ci-robot k8s-ci-robot merged commit f8d5ff2 into kubernetes:master Jun 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hack/local-up-cluster fails to bring up kubelet and a node
6 participants