
Add --cgroup_parent flag to Kubelet to set the parent cgroup for pods #7277

Merged: 1 commit merged into kubernetes:master from the cgroup-parent branch on May 1, 2015

Conversation

@guenter (Contributor) commented Apr 24, 2015

@bgrant0607 (Member) commented:

cc @vishh @vmarmol

@@ -511,6 +511,9 @@ func (dm *DockerManager) runContainer(pod *api.Pod, container *api.Container, op
if len(opts.DNSSearch) > 0 {
hc.DNSSearch = opts.DNSSearch
}
if len(opts.CgroupParent) > 0 {
Review comment (Contributor):

I assume this option will be silently ignored with older versions of Docker.

Reply from @guenter (Contributor, Author):

Yep. I tested with Docker 1.4.1. Containers just launch under the default /docker cgroup.
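
For context, a minimal sketch of the kind of guard under discussion, assuming a HostConfig-style struct with a CgroupParent field; the types, field names, and the /mesos/... path below are stand-ins, not the exact Kubernetes or Docker definitions:

package main

import "fmt"

// hostConfig stands in for the container runtime's host configuration;
// only the field relevant to this PR is shown.
type hostConfig struct {
    CgroupParent string
}

// runContainerOptions stands in for the options the kubelet passes down.
type runContainerOptions struct {
    CgroupParent string
}

// applyCgroupParent sets the parent cgroup only when one was requested, so a
// daemon that ignores the field (as older Docker versions do) behaves as before.
func applyCgroupParent(hc *hostConfig, opts runContainerOptions) {
    if len(opts.CgroupParent) > 0 {
        hc.CgroupParent = opts.CgroupParent
    }
}

func main() {
    hc := &hostConfig{}
    applyCgroupParent(hc, runContainerOptions{CgroupParent: "/mesos/task-123"}) // hypothetical path
    fmt.Println(hc.CgroupParent)
}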

@vmarmol (Contributor) commented Apr 24, 2015

Overall looks good, only big issue is handling older versions of Docker.

@guenter (Contributor, Author) commented Apr 24, 2015

@vmarmol see above, older versions of Docker silently ignore it.

@yifan-gu (Contributor) commented:

/cc @jonboulle

@guenter (Contributor, Author) commented Apr 24, 2015

Updated naming according to comments.

IpcMode: ipcMode,
NetMode: netMode,
IpcMode: ipcMode,
CgroupRoot: kl.cgroupRoot,
Review comment from @thockin (Member):

If we are introducing this, should we also introduce the pod-level cgroup?

/kubernetes/pod-uid/container-id

And do we want to distinguish pods that asked for limits from pods that didn't? And is that CPU, memory, or the tuple of both?

/kubernetes/t1/pod-uid/container-id

Let's also put thought into how to future-proof this. Changing this schema has been an unending source of pain internally.

Also, let's carefully consider how important this is before 1.0. It's fun, but we want to be careful about what we undertake in the near-term.

Reply from @guenter (Contributor, Author):

@thockin I think the pod-level cgroup should be added in a separate PR, and probably post-1.0 since it needs some discussion and is orthogonal to this change. In its current form the PR is a minimal change that we need for k8s on Mesos and is backward (and forward) compatible.

The only other question is, should the kubelet resource container live under the cgroup root as well? I'd say yes, to help with resource accounting. So the hierarchy could look like:

/kubernetes/kubelet
/kubernetes/pods/pod-uid/container-id
...

If there is agreement I can make that change.

Reply (Contributor):

+1 towards moving kubelet to be under a configurable root cgroup.

Reply (Member):

Yes, let's do cgroups for multiple QoS tiers and for pods in subsequent PRs. I don't think we need much more discussion on them, but we will need an (additive) API change, to add resources at the pod level.

Given where we are, I don't think we need to be that rigorous about changing the hierarchy. If we evacuate the node when we update the kubelet, which we need to be able to do for kernel upgrades, we'll be fine.

As for Kubelet, we're working on making it run as a container. Eventually I'd like to even run it as a pod of the highest QoS tier, but putting it into a custom cgroup root seems fine for now.

@vmarmol (Contributor) commented Apr 24, 2015

@thockin AFAIK this feature right now is primarily to allow the Kubelet to run well in Mesos, where all containers have to be created within the allocation given to the Kubelet. This mechanism should be generic enough to support the future additions we need. In the future I see us using:

<cgroup_parent>/[<tier cgroups>]/<pod cgroup>/<container>

For most users, cgroup_parent=/, i.e. the root cgroup.

+1 on not for v1.0, but this PR shouldn't move us further from those goals.
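
For illustration, a minimal sketch of how a container's cgroup path could be assembled under the layout sketched above; the function, argument names, and example paths are illustrative, not taken from the Kubelet code:

package main

import (
    "fmt"
    "path"
)

// containerCgroupPath joins <cgroup_parent>/[<tier>]/<pod>/<container>.
// The tier segment is optional, matching the bracketed part of the layout above.
func containerCgroupPath(cgroupParent, tier, podUID, containerID string) string {
    parts := []string{cgroupParent}
    if tier != "" {
        parts = append(parts, tier)
    }
    parts = append(parts, podUID, containerID)
    return path.Join(parts...)
}

func main() {
    // Default case, cgroup_parent=/ (the root cgroup):
    fmt.Println(containerCgroupPath("/", "", "pod-uid", "container-id"))
    // Under a hypothetical Mesos allocation, with a QoS tier:
    fmt.Println(containerCgroupPath("/mesos/task-abc", "t1", "pod-uid", "container-id"))
}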

@vmarmol (Contributor) commented Apr 24, 2015

LGTM, only that one nit on the naming. @thockin let me know if you have any strong feelings against this. From your comment, it seems like we're okay to go.

@vishh (Contributor) commented Apr 24, 2015

+1 to @vmarmol's comments. This PR should help provide resource isolation in Mesos.
LGTM from my side as well!

@ConnorDoyle (Contributor) commented:

Thanks for the speedy and thorough reviews, folks; we're pretty excited for this one to land 👍

@guenter (Contributor, Author) commented Apr 24, 2015

+1 @vmarmol
I just updated the commit message with the cgroup_root naming to avoid confusion. Should be ready to merge now.
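
As a rough illustration of the flag itself, here is a minimal sketch using Go's standard flag package; the real Kubelet wires its options up differently, and the flag name and messages below simply follow the naming discussed in this thread:

package main

import (
    "flag"
    "fmt"
)

func main() {
    // Optional root cgroup under which pod containers are created;
    // empty means the container runtime's default is used.
    cgroupRoot := flag.String("cgroup_root", "", "Optional root cgroup to use for pods.")
    flag.Parse()

    if *cgroupRoot != "" {
        fmt.Printf("creating pod containers under cgroup root %q\n", *cgroupRoot)
    } else {
        fmt.Println("no cgroup root configured; using the runtime default")
    }
}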

@vishh (Contributor) commented Apr 30, 2015

LGTM. Will merge once tests pass!

@vishh added the lgtm label ("Looks good to me"; indicates that a PR is ready to be merged) on Apr 30, 2015.
@vishh added a commit that referenced this pull request on May 1, 2015: "Add --cgroup_parent flag to Kubelet to set the parent cgroup for pods"
@vishh merged commit cadfde0 into kubernetes:master on May 1, 2015.
@guenter deleted the cgroup-parent branch on May 1, 2015 at 17:48.
Labels: area/platform/mesos, lgtm
9 participants