Fix slice sharing bug in cgroup manager #70678

Merged: 1 commit, Nov 7, 2018

@dashpole dashpole commented Nov 6, 2018

What type of PR is this?
/kind bug

What this PR does / why we need it:
See https://play.golang.org/p/QQOELuf56v8 for an example of what can happen when slices share an underlying array. append only allocates a new backing array when the existing capacity cannot hold the additional elements, and for small slices the Go runtime roughly doubles the capacity when it does. So appending to a slice may or may not return a slice that shares memory with the input slice.

We just happened to be lucky that the default for --cgroup-root didn't trigger this behavior, because getting pod cgroup slices ([]string{"kubepods", "burstable"} -> []string{"kubepods", "burstable", "<pod_uid>"}) always triggered a reallocation of the former slice, as the capacity goes from 2 -> 4 each time. But when you specify a cgroup root ([]string{"root", "kubepods", "burstable"} -> []string{"root", "kubepods", "burstable", "<pod_uid>"}), the initial slice already has capacity 4 (because it was constructed by appending to a capacity-2 slice), so the append does not reallocate and every child slice re-uses the parent's backing array. Each child therefore writes its pod UID into the same fourth slot, and a later child overwrites the UID seen by an earlier one. Only a further append to the now-full slice, producing []string{"root", "kubepods", "burstable", "<pod_uid_1>", "<pod_uid_2>"}, triggers a reallocation (4->8).
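The aliasing described above can be reproduced in isolation. The following is a minimal sketch, not the kubelet code: appendShared is a hypothetical helper that mimics appending directly to the parent, and the parent is built with an explicit capacity of 4 so the result does not depend on the runtime's growth policy.

```go
package main

import "fmt"

// appendShared is a hypothetical helper (not from the kubelet) that mimics
// the buggy pattern: it appends straight onto the parent slice.
func appendShared(parent []string, child string) []string {
	return append(parent, child)
}

func main() {
	// Parent with len 3 and cap 4: one free slot in the backing array.
	parent := make([]string, 0, 4)
	parent = append(parent, "root", "kubepods", "burstable")

	pod1 := appendShared(parent, "pod_uid_1")
	pod2 := appendShared(parent, "pod_uid_2")

	// Both children share the parent's backing array, so the second
	// append overwrote the slot that pod1 also points at.
	fmt.Println(pod1) // [root kubepods burstable pod_uid_2]
	fmt.Println(pod2) // [root kubepods burstable pod_uid_2]
}
```

With a parent whose length equals its capacity, the same two calls would each allocate a fresh array and the children would stay distinct, which is why the default configuration hid the problem.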

All of this is to say that we should create a new slice and copy the parent into it when we create a new cgroup. I included my reproduction case as a test case to ensure we don't break this in the future.
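As an illustration of the copy-then-append approach, here is a minimal sketch. newChildName is a hypothetical name chosen for this example, and the code is not the change made in this PR; it only shows that copying the parent first keeps children independent even when the parent has spare capacity.

```go
package main

import "fmt"

// newChildName is a hypothetical helper: it copies the parent into a freshly
// allocated slice before appending, so the result never shares the parent's
// backing array.
func newChildName(parent []string, components ...string) []string {
	child := make([]string, len(parent), len(parent)+len(components))
	copy(child, parent)
	return append(child, components...)
}

func main() {
	// Same parent as the failing case: len 3, cap 4.
	parent := make([]string, 0, 4)
	parent = append(parent, "root", "kubepods", "burstable")

	pod1 := newChildName(parent, "pod_uid_1")
	pod2 := newChildName(parent, "pod_uid_2")

	fmt.Println(pod1) // [root kubepods burstable pod_uid_1]
	fmt.Println(pod2) // [root kubepods burstable pod_uid_2]
}
```

The extra allocation per child is small next to the cost of two pods resolving to the same cgroup path.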

Which issue(s) this PR fixes:
Fixes #68416

Does this PR introduce a user-facing change?:

Fixes a bug in previous releases where a pod could be placed inside another pod's cgroup when specifying --cgroup-root

cc @sreis @derekwaynecarr @filbranden @dchen1107 @Random-Liu @yujuhong
/assign @derekwaynecarr @Random-Liu

@k8s-ci-robot added the release-note label Nov 6, 2018
@k8s-ci-robot added the kind/bug, size/M, cncf-cla: yes, needs-sig, area/kubelet, needs-priority, and sig/node labels and removed the needs-sig label Nov 6, 2018

dashpole commented Nov 6, 2018

/sig node
/priority important-soon

@k8s-ci-robot added the priority/important-soon label and removed the needs-priority label Nov 6, 2018

dashpole commented Nov 6, 2018

this should have the 1.13 milestone

@Random-Liu (Member)

/lgtm

@k8s-ci-robot added the lgtm label Nov 6, 2018
feiskyer (Member) left a comment

LGTM. Please create a cherry pick PR to branch-1.12 after this.


dashpole commented Nov 6, 2018

/retest

filbranden (Contributor) left a comment

Thanks @dashpole for tracking this one down!

Ugh, what an ugly interface in Golang... Time to write another Golang rant about it...

I have a tiny nitpick, but what you have is definitely fine, so OK to ignore my suggestion.

Cheers,
Filipe

Review thread on pkg/kubelet/cm/cgroup_manager_linux.go (resolved)

yujuhong commented Nov 7, 2018

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, yujuhong

@k8s-ci-robot added the approved label Nov 7, 2018

yujuhong commented Nov 7, 2018

What supported versions are affected by this bug?


dashpole commented Nov 7, 2018

this affects 1.11 and 1.12

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit f1bf9be into kubernetes:master Nov 7, 2018

pires commented Nov 7, 2018

Why wasn't this marked for backport?


nikopen commented Nov 7, 2018

/lgtm
/milestone v1.13

@pires opening cherry picks for 1.11 and 1.12 soon

Is 1.10 perhaps affected as well? Three minor versions are supported at a time.


pires commented Nov 8, 2018

@nikopen according to our testing only 1.11+ is affected.

@dashpole dashpole deleted the fix_cgroup_manager branch November 9, 2018 18:01
k8s-ci-robot added a commit that referenced this pull request Dec 3, 2018
…8-upstream-release-1.11

Automated cherry pick of #70678: fix slice sharing bug in cgroup manager
k8s-ci-robot added a commit that referenced this pull request Jan 5, 2019
…8-upstream-release-1.12

Automated cherry pick of #70678: fix slice sharing bug in cgroup manager
Labels
approved, area/kubelet, cncf-cla: yes, kind/bug, lgtm, priority/important-soon, release-note, sig/node, size/M
Development

Successfully merging this pull request may close these issues.

Cgroups per qos not working
10 participants