Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only collect metrics for cgroups required by the summary API #72787

Merged
merged 1 commit into from May 31, 2019

Conversation

@dashpole
Copy link
Contributor

commented Jan 10, 2019

What type of PR is this?
/kind feature

What this PR does / why we need it:
This makes the kubelet more efficient, as it only collects metrics for cgroups within:

  • The runtime cgroup
  • The kubelet cgroup
  • The system (misc) cgroup
  • The allocatable (/kubepods) cgroup

This includes all pod and container cgroups, as those are within the allocatable cgroup. The node cgroup is always collected.

Which issue(s) this PR fixes:
Fixes google/cadvisor#2129

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

The kubelet only collects metrics for the node, container runtime, kubelet, pods, and containers.
@dashpole

This comment has been minimized.

Copy link
Contributor Author

commented Jan 10, 2019

/priority important-soon

@dashpole

This comment has been minimized.

Copy link
Contributor Author

commented Jan 10, 2019

/assign @andyxning

@dashpole dashpole force-pushed the dashpole:cadvisor_prefix_whitelist branch from efbefaf to 4569cca Jan 10, 2019

@dashpole dashpole force-pushed the dashpole:cadvisor_prefix_whitelist branch from 4569cca to 85c1a2f Jan 18, 2019

@fejta-bot

This comment has been minimized.

Copy link

commented Apr 18, 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@BenTheElder

This comment has been minimized.

Copy link
Member

commented Apr 18, 2019

/remove-lifecycle stale
poke!
/test pull-kubernetes-e2e-kind

@dashpole

This comment has been minimized.

Copy link
Contributor Author

commented Apr 25, 2019

/assign @tallclair
This makes cAdvisor only collect metrics for cgroups subtrees it manages.

@feiskyer
Copy link
Member

left a comment

lgtm

s.RuntimeCgroups,
s.SystemCgroups,
s.KubeletCgroups,
s.CgroupRoot,

This comment has been minimized.

Copy link
@tallclair

tallclair May 1, 2019

Member

So if !CgroupsPerQOS, this could be ""? Is that OK?

Or should --cgroup-root always default to /?

This comment has been minimized.

Copy link
@dashpole

dashpole May 1, 2019

Author Contributor

I decided to leave s.CgroupRoot alone, and just fall back to collecting all cgroups if s.CgroupRoot isn't specified.

@dashpole dashpole force-pushed the dashpole:cadvisor_prefix_whitelist branch from 85c1a2f to 8fe7134 May 1, 2019

@dashpole dashpole force-pushed the dashpole:cadvisor_prefix_whitelist branch from 8fe7134 to f8dff6b May 10, 2019

@k8s-ci-robot k8s-ci-robot added size/M and removed size/S labels May 10, 2019

@dashpole

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

@tallclair I went ahead and followed the logic of the container manager implementation to get the correct cgroup values. This should correctly handles all cases where flags are not specified.

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 10, 2019

@dashpole: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-local-e2e-containerized 85c1a2f link /test pull-kubernetes-local-e2e-containerized

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@tallclair

This comment has been minimized.

Copy link
Member

commented May 21, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label May 21, 2019

@dashpole

This comment has been minimized.

Copy link
Contributor Author

commented May 29, 2019

/assign @yujuhong
for approval

@dashpole

This comment has been minimized.

Copy link
Contributor Author

commented May 30, 2019

/retest

@yujuhong

This comment has been minimized.

Copy link
Member

commented May 30, 2019

/lgtm
/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 30, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit f49fe2a into kubernetes:master May 31, 2019

21 checks passed

cla/linuxfoundation dashpole authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Job succeeded.
Details
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-node-e2e-containerd Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details

@dashpole dashpole deleted the dashpole:cadvisor_prefix_whitelist branch May 31, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.