Proposal: Support custom cgroups #8551
Comments
+1
👍
+1
To elaborate on my use case: I have multiple containers working together to provide a service, and I want to limit the resources the service as a whole can consume. I'd like to be able to create a cgroup (manually), then tell docker to run the containers for my service within that cgroup (i.e. multiple containers sharing the same cgroup). Something like
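The workflow being asked for above can be sketched as follows. This is a hedged illustration: writing to the real `/sys/fs/cgroup` needs root, so it simulates the cgroup v1 layout under a temp directory, and `--cgroup-parent` is the flag name that was eventually adopted (the service/container names are made up).

```shell
# Simulated cgroup v1 layout; with real cgroups the kernel creates the
# control files for you, and these writes would need root.
CGROOT=$(mktemp -d)    # stand-in for /sys/fs/cgroup/memory

# 1. Create one cgroup for the whole service and cap its memory.
mkdir -p "$CGROOT/my-service"
echo $((512 * 1024 * 1024)) > "$CGROOT/my-service/memory.limit_in_bytes"

# 2. Start every container of the service under that cgroup, e.g.:
#      docker run --cgroup-parent=/my-service web
#      docker run --cgroup-parent=/my-service worker
#    Docker then creates one child cgroup per container inside the parent:
mkdir -p "$CGROOT/my-service/web" "$CGROOT/my-service/worker"

# All containers now share the 512 MiB limit set on the parent.
cat "$CGROOT/my-service/memory.limit_in_bytes"   # prints 536870912
```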
+1
More use cases:

One is the pod-like scenario (#8781): multiple co-deployed containers.

Another is differentiated quality of service. Some workloads need a high degree of predictability, while others just want to use whatever resources are available. We'd like to protect the former from the latter by putting all of the latter into a bucket that is constrained such that it can't interfere with the predictable workloads. This same approach can be applied hierarchically to support more than two QoS tiers. More discussion of this can be found in presentations and documentation about lmctfy.

Third, we'd like to similarly protect Docker and other system agents/daemons from user containers. We've received a number of reports from users who bricked nodes by using up all the memory.

Not all three of these cases necessarily need to be expressed the same way in the API. For the pod case, one might be tempted to apply the current pattern of referring to other containers, such as with VolumesFrom and NetworkMode=container:id, using something like CgroupParent=container-id. However, this approach is problematic for a number of reasons. One is the coupling of container lifetime and process lifetime: in the case of a system OOM, for example, such processes can die even if they use minimal resources, which creates complicated failure modes. Another is the lack of reasonable mechanisms for managing and introspecting groups of related containers.

For differentiated quality of service, I'd like to specify higher-level semantic intent rather than concrete slices or cgroup paths, but there needs to be a general way to pass extra options down to the exec driver, and such mechanisms keep getting shot down, or even removed after being added (e.g., #4833). Alternatively, we'd be happy to make a proposal for first-class support in the API.

Configuration to protect Docker and other system agents could be specified with flags when starting the daemon.
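The hierarchical QoS tiers described above can be illustrated with the cgroup v1 layout. This is a sketch under assumptions: a temp dir stands in for `/sys/fs/cgroup/cpu` (real writes need root), and the tier names `guaranteed`/`best-effort` are invented for the example.

```shell
# Simulated cgroup v1 CPU hierarchy for two QoS tiers.
CGROOT=$(mktemp -d)

# Tier 1: predictable workloads get a large CPU share.
mkdir -p "$CGROOT/guaranteed"
echo 8192 > "$CGROOT/guaranteed/cpu.shares"

# Tier 2: opportunistic workloads are fenced into a tiny-share bucket so
# they cannot starve tier 1. Because cgroups nest, further tiers can be
# added as children (e.g. best-effort/batch).
mkdir -p "$CGROOT/best-effort/batch"
echo 2 > "$CGROOT/best-effort/cpu.shares"
```

With `cpu.shares`, the best-effort subtree only loses CPU relative to the guaranteed tier under contention; when the machine is idle it can still use spare cycles, which matches the "use whatever resources are available" intent.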
+1
Having a way to specify the parent cgroup under which new cgroups are created would go a long way toward solving the issues @bgrant0607 pointed out. If the container cgroups are not tied directly to where the docker daemon runs, it would help a lot in better protecting critical system daemons. libcontainer already accepts a parent as a parameter, so I think the actual work required to make this happen would be minimal.
+1. Ping @crosbymichael |
What do you think the user-facing API should look like? What API and flags would solve your issues?
@crosbymichael: For many of the use cases mentioned above, adding a
And that's it? Nothing else required? |
I think that is pretty much right. What else were you expecting to see?
+1
+1, this simplifies process tracking for cluster managers. |
@crosbymichael: We will need another flag to alter the
Let's keep the
+1 as well. |
FWIW, we should mention that this will remove the ability to examine the
@crosbymichael: Can we get a +1 for this feature? I can send out a PR soon. This is an important feature that will significantly improve system reliability for Kubernetes.
@vishh yes, I will bring it up with the other maintainers today |
+1
Ping @crosbymichael!
I plan to post a PR soon since no concerns have been expressed for this feature.
I think that is awesome @vishh I am +1 :) |
anything to get rid of systemd cgroups :P |
I discussed this with @crosbymichael at the last DGAB meeting, and my understanding is that we have the go-ahead for this. |
+1 for |
+1, this will strengthen the multi-tenancy story when running Kubernetes as a Mesos framework and |
+1 for |
+1 for |
merged in #11428 |
Thanks for the quick review everyone :) |
w00t! This is a good one. Thanks everyone.
Yay cgroups! :D That was one of the fastest feature merges I've seen. |
👍 |
Awesome, thanks a lot. |
+1
James DeFelice |
Containers are mostly the combination of capabilities, namespaces, and cgroups. Docker already has custom capabilities support with `--cap-add` and `--cap-drop`. Custom namespace support is halfway there: `--net=*` exists and `--ipc=*` is being worked on. The last piece of the puzzle is being able to control cgroups.

I propose that cgroup paths be added to the HostConfig such that, on start, custom cgroup paths can optionally be used instead of the cgroups that Docker would set up. This would allow a component outside of Docker to control, create, and manage the cgroups, while the Docker container would simply join them.
One could find many use cases for this, I assume, but initially this feature can be used to better tie together cgroups managed by systemd. The background of this is rooted in a hack (https://github.com/ibuildthecloud/systemd-docker) that I've put together to better manage Docker under systemd.
`systemd-docker` does various useful things that better integrate Docker with systemd, and most of it should probably stay a project outside of Docker. One critical piece that makes `systemd-docker` work today is that it moves the running processes from one cgroup to the service's cgroup. This is what makes it a hack, and also not 100% reliable. If Docker could just support the ability to use a custom cgroup, then `systemd-docker` could become a production-worthy stopgap solution until a superior integration between systemd and Docker exists natively.
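The cgroup-moving hack described above, reduced to its essence as a sketch: in cgroup v1, a process is re-parented by writing its PID into the target cgroup's tasks file. Plain files in a temp dir stand in for `/sys/fs/cgroup` (real writes need root), and the paths and PID are hypothetical.

```shell
# Simulated layout: the cgroup Docker created for the container, and the
# systemd service's cgroup that systemd-docker wants the processes in.
CGROOT=$(mktemp -d)
mkdir -p "$CGROOT/docker/abc123" "$CGROOT/system.slice/my.service"

PID=4242   # hypothetical PID of the container's main process

# Moving a process between cgroups is a single write. The race between
# this write and Docker's own bookkeeping (which still believes the
# process lives in the cgroup it created) is what makes this fragile.
echo "$PID" > "$CGROOT/system.slice/my.service/tasks"
```

Native `--cgroup-parent` support removes the need for this after-the-fact move: the container's processes start in the right place instead of being migrated.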