Kubelet: Create a top-level container for pods #5671
Great! I assume you are talking about one top-level kubernetes container as a parent for all kubernetes containers to protect our daemons, not a per-pod container here.
A top-level cgroup for placing a bounding bucket across all pods will most …
I was referring to the per-pod containers :) I think if we have that, we won't need the container for all kubernetes containers.
Then this is not for v1.0. In v1.0 there is no resource limit on pods, so it doesn't help node stability at all.
If users don't set any limits on pods, then per-pod cgroups don't provide much benefit. Reserving some resources for the system daemons will benefit node stability in the short term, though. The bounding box will help us implement QoS tiers in the long run.
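To make the reservation idea concrete, here is a minimal sketch of carving node capacity into a system reservation plus a bounding box for all pods. The function name, units, and the notion of a single "pods" cgroup limit are illustrative assumptions, not the kubelet's actual implementation:

```python
# Hypothetical sketch: reserve some capacity for system daemons
# (kubelet, docker, sshd, ...) and give the remainder to a single
# bounding cgroup that encloses all pods. Names/values are illustrative.

def pod_bounding_box(capacity_mb: int, system_reserved_mb: int) -> int:
    """Return the memory limit for a top-level 'pods' cgroup after
    reserving some capacity for system daemons."""
    if system_reserved_mb >= capacity_mb:
        raise ValueError("reservation exceeds node capacity")
    return capacity_mb - system_reserved_mb

# On a 16 GiB node, reserving 1 GiB for daemons leaves 15 GiB for pods.
print(pod_bounding_box(16384, 1024))  # 15360
```

Even without per-pod limits, this reservation alone protects the daemons from runaway pods, which is the short-term stability benefit described above.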
The bounding box is gonna be a nightmare to migrate off of when we do implement per-pod limits. We can set aside space today for daemons and allocate the other space to the containers. I don't disagree that this is not v1, but I don't think we need the bounding box for v1 either.
@vmarmol Would we need to eliminate the bounding box when implementing per-pod limits? I'd also like a bounding box to protect pods/containers with limits from those without, by putting the latter into a box.
I discussed with @vmarmol offline and explained the rationale behind nested …
For representing QoS tiers, the bounding boxes make sense. Having a top-level container for all containers doesn't buy us much in terms of protecting the node and only partially aligns with our longer-term strategies. I just want us to be careful rolling out a hierarchy plan and having that change in the medium term. I also spoke with @dchen1107 and we agreed that short term this doesn't buy us much. The model with limit and non-limit (batch) containers makes sense as our first step (the model you mentioned). As far as we know, that is not a v1 work item, and there are a few things missing from the scheduler to enable it. @bgrant0607: no, we shouldn't need to remove the bounding boxes for pod-level limits. I think whether those are container- or pod-level won't affect the overall plans.
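The "limit and non-limit (batch)" split mentioned above could be sketched like this. The subtree names and the pod/container dict shapes are assumptions for illustration only (the eventual design landed later as QoS classes):

```python
# Hypothetical sketch: place pods whose containers all have resource
# limits under one subtree of the bounding cgroup, and unlimited
# ("batch") pods under another, so limited pods are protected from
# unlimited ones. Paths and dict shapes are illustrative.

def pod_subtree(pod: dict) -> str:
    containers = pod.get("containers", [])
    has_limits = bool(containers) and all(c.get("limits") for c in containers)
    return "/pods/limited" if has_limits else "/pods/batch"

web = {"containers": [{"name": "nginx", "limits": {"memory": "512Mi"}}]}
batch = {"containers": [{"name": "cruncher"}]}
print(pod_subtree(web))    # /pods/limited
print(pod_subtree(batch))  # /pods/batch
```

A scheduler would additionally need to account for the batch subtree's usage, which is one of the "few things missing" noted above.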
Here's one more use case to consider. When Kubernetes is run as an Apache Mesos framework, we're currently spinning up at most one Kubelet per host in the same process as a custom Mesos executor (the framework component responsible for interpreting and executing tasks). Mesos already manages a container that encloses the executor process, and that container's resource limits grow and shrink as tasks (pods) come and go. From our perspective, it would be great if we could pass the executor's cgroup to the Kubelet to use as its parent. That way the Kubelet could still manage a hierarchy of containers, and Mesos could provide accurate global resource accounting and isolation. Could this (or something like it) be worked into the design?
@ConnorDoyle: IIUC you want the kubelet to be the parent of all the docker containers it runs. As of now we were thinking of running all system daemons, which includes the kubelet, in a separate hierarchy from that of the docker containers.
@vishh @ConnorDoyle I think what would work really well for the kubernetes-mesos integration is being able to specify a parent cgroup in the kubelet's configuration, so that the containers launched by the kubelet are parented under that cgroup. The current integration blends the kubelet and mesos executor implementations into a single executable; the mesos executor actually configures the Kubelet instance directly. It would be great if we could pass in some sort of …
@jdef Being able to specify where the Kubelet makes top-level containers seems perfectly reasonable (defaulting to root "/"). That may be the same cgroup as the Kubelet's or another cgroup; that shouldn't be a problem.
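Making the parent configurable, defaulting to root "/", essentially amounts to joining a cgroup-root prefix onto each container's cgroup name. A rough sketch, where the `cgroup_root` parameter and the per-pod naming scheme are purely assumed for illustration:

```python
import posixpath

# Hypothetical sketch of parenting container cgroups under a
# configurable root, as the Mesos integration asks for above.
# The "pod-<uid>" naming scheme is an assumption.

def container_cgroup(cgroup_root: str, pod_uid: str) -> str:
    """Join the configured parent cgroup with a per-pod cgroup name,
    falling back to the root cgroup when no parent is configured."""
    return posixpath.join(cgroup_root or "/", "pod-" + pod_uid)

print(container_cgroup("/", "abc123"))                  # /pod-abc123
print(container_cgroup("/mesos/executor-1", "abc123"))  # /mesos/executor-1/pod-abc123
```

With a scheme like this, the Mesos executor could pass its own enclosing cgroup as `cgroup_root`, and all kubelet-managed containers would land inside the executor's resource envelope.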
@jdef @ConnorDoyle: SGTM
We've started implementing something along these lines. The currently exposed flags of the kubelet aren't quite sufficient, since not all kubelet-configured container paths (for example, DockerDaemonContainer) are exposed. We've hacked around this for now, but it would be nice if future changes to the kubelet kept in mind that the kubelet may not be the only service on the host, and that other management/provisioning tooling may want the ultimate say re: host-level configuration changes.
cc @dubstack Your pending proposal on pod-level cgroups is tied to this upstream issue.
Automatic merge from submit-queue. Kubelet: Pod level Resource Management. This proposal outlines our plan for improving resource management in Kubernetes by having a cgroup hierarchy with QoS and pod-level cgroups. This is the initial proposal, which broadly covers our goals and how we plan to achieve them. At this point we would really appreciate feedback from the community. This is tied to the upstream issue #5671, so I would request @vishh @dchen1107 @bgrant0607 @jdef PTAL.
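The hierarchy that proposal converged on places all pods under a `/kubepods` bounding cgroup, with Burstable and BestEffort subtrees beneath it. A simplified sketch (paths assume the cgroupfs driver; the real kubelet also supports systemd-style naming such as `kubepods.slice`, which this deliberately omits):

```python
# Sketch of the QoS/pod cgroup layout from the pod-level resource
# management proposal: Guaranteed pods sit directly under /kubepods,
# while Burstable and BestEffort pods get their own QoS subtrees.
# Path details are simplified relative to the real kubelet.

def pod_cgroup(qos: str, pod_uid: str) -> str:
    base = "/kubepods"
    if qos == "Guaranteed":
        return f"{base}/pod{pod_uid}"
    if qos in ("Burstable", "BestEffort"):
        return f"{base}/{qos.lower()}/pod{pod_uid}"
    raise ValueError(f"unknown QoS class: {qos}")

print(pod_cgroup("Guaranteed", "1234"))  # /kubepods/pod1234
print(pod_cgroup("BestEffort", "1234"))  # /kubepods/besteffort/pod1234
```

This realizes both ideas debated earlier in the thread: a single bounding box across all pods, plus nested boxes that protect limited pods from unlimited ones.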
@vishh in plan for 1.4, or punt to 1.5? |
@goltermann Yes. This feature has been punted to v1.5.
Looks like we have to punt to 1.6 :)
This feature is completed as of v1.6. cc @derekwaynecarr
Thanks to @vishh's awesome work, moby/moby#11428 went in with cgroup_parent support! With Docker 1.6 we'll be able to make top-level containers for pods.