
Kubelet: Create a top-level container for pods #5671

Closed
vmarmol opened this issue Mar 19, 2015 · 24 comments
Assignees
Labels
area/isolation area/kubelet priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.
Milestone

Comments

@vmarmol (Contributor) commented Mar 19, 2015

Thanks to @vishh's awesome work, moby/moby#11428 went in with cgroup_parent support! With Docker 1.6 we'll be able to make top-level containers for pods.
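The flag in question is `--cgroup-parent` on `docker run`. A minimal sketch of how a per-pod parent cgroup could be derived and handed to Docker — the `/kubelet` prefix and the pod UID here are illustrative, not a settled Kubernetes convention:

```shell
# Sketch only: derive a per-pod parent cgroup path from a pod UID.
# The /kubelet prefix and the UID are illustrative.
POD_UID="1234-abcd"
CGROUP_PARENT="/kubelet/pod-${POD_UID}"
echo "${CGROUP_PARENT}"

# Each container in the pod would then be started along the lines of:
#   docker run -d --cgroup-parent="${CGROUP_PARENT}" <image> <cmd>
```

Starting every container in the pod under the same `--cgroup-parent` is what gives the pod a single point for resource accounting and limits.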

@vmarmol vmarmol added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 19, 2015
@vmarmol vmarmol added this to the v1.0 milestone Mar 19, 2015
@dchen1107 (Member)

Great! I assume you are talking about one top-level kubernetes container as the parent of all kubernetes containers, to protect our daemons, not a per-pod container here.

@vishh (Contributor) commented Mar 19, 2015

A top-level cgroup for placing a bounding bucket across all pods will most definitely help with node stability.


@vmarmol (Contributor, Author) commented Mar 19, 2015

I was referring to the per-pod containers :) I think if we have those, we won't need the container for all kubernetes containers.

@dchen1107 (Member)

Then this is not for v1.0. In v1.0 there is no resource limit on pods, so it doesn't help with node stability at all.

@dchen1107 dchen1107 removed this from the v1.0 milestone Mar 19, 2015
@dchen1107 dchen1107 added priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Mar 19, 2015
@vishh (Contributor) commented Mar 19, 2015 via email

@vmarmol (Contributor, Author) commented Mar 19, 2015

The bounding box is going to be a nightmare to migrate off of when we do implement per-pod limits. We can set aside space today for daemons and allocate the remaining space to the containers.

I don't disagree that this is not v1, but I don't think we need the bounding box for v1 either.

@bgrant0607 (Member)

@vmarmol Would we need to eliminate the bounding box when implementing per-pod limits?

I'd also like a bounding box to protect pods/containers with limits from those without, by putting the latter into a box.

@bgrant0607 (Member)

Ref #147, #168.

@vishh (Contributor) commented Mar 20, 2015

I discussed with @vmarmol offline and explained the rationale behind nested bounding boxes.


@vmarmol (Contributor, Author) commented Mar 20, 2015

For representing QoS tiers, the bounding boxes make sense. Having a top-level container for all containers doesn't buy us much in terms of protecting the node, and only partially aligns with our longer-term strategies. I just want us to be careful about rolling out a hierarchy plan and then having it change in the medium term.

I also spoke with @dchen1107 and we agreed that short term this doesn't buy us much. The model with limit and non-limit (batch) containers makes sense as our first step (the model you mentioned). As far as we know, that is not a v1 work item and there are a few things missing from the scheduler to enable it.

@bgrant0607 no we shouldn't need to remove the bounding boxes for pod-level limits. I think whether those are container or pod-level won't affect the overall plans.

@ConnorDoyle (Contributor)

Here's one more use case to consider. When Kubernetes is run as an Apache Mesos framework, we're currently spinning up at most one Kubelet per host in the same process as a custom Mesos executor (framework component responsible for interpreting and executing tasks). Mesos already manages a container that encloses the executor process, and that container's resource limits grow and shrink as tasks (pods) come and go. From our perspective, it would be great if we could pass the executor's cgroup to the Kubelet to use as its parent. That way the Kubelet could still manage a hierarchy of containers, and Mesos could provide accurate global resource accounting and isolation. Could this (or something like it) be worked into the design?

@jdef (Contributor) commented Apr 9, 2015

xref mesosphere/kubernetes-mesos#68

@vishh (Contributor) commented Apr 9, 2015

@ConnorDoyle: IIUC you want the kubelet to be the parent of all the docker containers it runs. As of now we were thinking of running all system daemons, including the kubelet, in a separate hierarchy from that of the docker containers.

@jdef (Contributor) commented Apr 9, 2015

@vishh @ConnorDoyle I think what would work really well for the kubernetes-mesos integration is to be able to specify a parent cgroup in the kubelet's configuration, so that the containers launched by the kubelet are parented under that cgroup. The current integration blends the kubelet and mesos executor implementations into a single executable: the mesos executor actually configures the Kubelet instance directly.

It would be great if we could pass in some sort of parentCgroup parameter there.
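As a sketch of what that knob could look like, a single kubelet flag taking a cgroup path would suffice — the flag name `--cgroup-root` is the one this eventually shipped as (defaulting to `/`), and the executor path below is illustrative:

```shell
# Sketch: parent all kubelet-managed containers under the Mesos executor's
# cgroup. The executor path is illustrative; --cgroup-root is the flag name
# this eventually shipped under.
EXECUTOR_CGROUP="/mesos/executor-1"
KUBELET_CMD="kubelet --cgroup-root=${EXECUTOR_CGROUP}"
echo "${KUBELET_CMD}"
```

With that in place, the kubelet still manages its own hierarchy of per-pod containers, but the whole subtree rolls up under the executor's cgroup for global accounting.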

@vmarmol (Contributor, Author) commented Apr 9, 2015

@jdef being able to specify where the Kubelet makes top-level containers seems perfectly reasonable (defaulting to root "/"). That may be the same cgroup the Kubelet runs in, or another one. That shouldn't be a problem.

@ConnorDoyle (Contributor)

@jdef, yes that's what I had in mind but your explanation is clearer.
@vishh, that would be OK as long as everything (k8s daemons, and the top-level container for the pod containers) could be in the executor's cgroup hierarchy.

@vishh (Contributor) commented Apr 9, 2015

@jdef @ConnorDoyle: SGTM

@dchen1107 dchen1107 added this to the v1.1 milestone Jul 31, 2015
@dchen1107 dchen1107 added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Jul 31, 2015
@jdef (Contributor) commented Aug 31, 2015

We've started implementing something along these lines. The currently exposed kubelet flags aren't quite sufficient, since not all kubelet-configured container paths (for example, DockerDaemonContainer) are exposed. We've hacked around this for now, but it would be nice if future changes to the kubelet kept in mind that the kubelet may not be the only service on the host, and that other management/provisioning tooling may want the ultimate say regarding host-level configuration changes.

@dchen1107 dchen1107 modified the milestones: v1.2, v1.1 Sep 25, 2015
@bgrant0607 bgrant0607 modified the milestones: v1.2, v1.2-candidate Oct 9, 2015
@vishh vishh self-assigned this May 9, 2016
@bgrant0607 (Member)

cc @derekwaynecarr

@vishh (Contributor) commented May 23, 2016

cc @dubstack Your pending proposal on pod-level cgroups is tied to this upstream issue.

@vishh vishh modified the milestones: v1.4, next-candidate Jun 21, 2016
k8s-github-robot pushed a commit that referenced this issue Jul 25, 2016
Automatic merge from submit-queue

Kubelet: Pod level Resource Management

This proposal outlines our plan for improving resource management in Kubernetes by having a Cgroup hierarchy with QoS and Pod level Cgroups. 

This is the initial proposal which broadly covers our goals and how we plan to achieve it. At this point we would really appreciate feedback from the community. 

This is tied to the upstream issue #5671, so I would request
@vishh @dchen1107 @bgrant0607 @jdef PTAL.
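For reference, the hierarchy this proposal led to (in later releases, with the cgroupfs driver and the default `/kubepods` root) nests pods under per-QoS parents, with Guaranteed pods directly under the root. A sketch of the path scheme, with an illustrative pod UID:

```shell
# Sketch of the QoS + pod cgroup path scheme as later implemented; the
# /kubepods root and the pod UID are illustrative of the default layout.
pod_cgroup() {  # usage: pod_cgroup <qos-class> <pod-uid>
  case "$1" in
    guaranteed) echo "/kubepods/pod$2" ;;      # Guaranteed pods sit directly under the root
    *)          echo "/kubepods/$1/pod$2" ;;   # Burstable/BestEffort get a QoS-level parent
  esac
}
pod_cgroup guaranteed abc-123   # → /kubepods/podabc-123
pod_cgroup burstable  abc-123   # → /kubepods/burstable/podabc-123
pod_cgroup besteffort abc-123   # → /kubepods/besteffort/podabc-123
```

Per-container cgroups then hang off each pod cgroup, giving the bounding buckets discussed above at the node, QoS, and pod levels.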

@goltermann (Contributor)

@vishh in plan for 1.4, or punt to 1.5?

@vishh vishh modified the milestones: v1.5, v1.4 Sep 7, 2016
@vishh (Contributor) commented Sep 7, 2016

@goltermann Yes. This feature has been punted to v1.5

@dims (Member) commented Nov 15, 2016

Looks like we have to punt to 1.6 :)

@dims dims modified the milestones: v1.6, v1.5 Nov 15, 2016
@vishh (Contributor) commented Mar 9, 2017

This feature is completed as of v1.6. cc @derekwaynecarr
