Commit c1b42f7

ZhouJas authored and bmahler committed
[docs] Add public docs for Cgroups v2.
Currently there is no official documentation outlining the changes we have been making to support Cgroups v2. We add a main document outlining how Mesos interacts with Cgroups v2, and update some documents on the changes that were made, such as the device isolator document. Review: https://reviews.apache.org/r/75191/
1 parent 175f09d commit c1b42f7

3 files changed: +102 -0 lines changed

docs/cgroups2-support.md

Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
1+
---
2+
title: Apache Mesos - Cgroups v2 Support
3+
layout: documentation
4+
---
5+
6+
# Using Mesos on systems with Cgroups2 enabled
7+
8+
As part of the move towards Cgroups2, the Cgroups isolator has been updated to
9+
support the updated interface. Changes are outlined below, and it is recommended
10+
to read up on the [Cgroups v2](https://docs.kernel.org/admin-guide/cgroup-v2.html)
11+
documentation for a deeper understanding.
12+
13+
### Requirements
14+
15+
The `cgroup2` filesystem must be mounted at `/sys/fs/cgroup`. This allows Mesos
16+
to pick the Cgroups2 Isolator when creating the Mesos Containerizer.
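
As a rough illustration of this requirement, the following Python sketch checks a `/proc/mounts`-style listing for a `cgroup2` filesystem mounted at `/sys/fs/cgroup`. The helper name `cgroup2_mounted_at` is hypothetical and is not part of Mesos; this is not how Mesos itself performs the check.

```python
def cgroup2_mounted_at(mounts_text, mount_point="/sys/fs/cgroup"):
    """Return True if a cgroup2 filesystem is mounted at mount_point.

    mounts_text is expected to use the /proc/mounts layout:
    <device> <mount point> <fs type> <options> <dump> <pass>
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        # fields[1] is the mount point, fields[2] is the filesystem type.
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "cgroup2":
            return True
    return False
```

On a Linux host, passing the contents of `/proc/mounts` to this function gives a quick indication of whether the requirement is met.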
17+
18+
### Cgroup Names
19+
20+
A cgroup called “CGROUP_NAME” has a path `/sys/fs/cgroup/$CGROUP_NAME`. This
21+
applies to all cgroups. A cgroup's name is the cgroup's path relative to
22+
`/sys/fs/cgroup`, where the cgroup2 filesystem is mounted.
23+
24+
`flags.cgroups_root` (default: "mesos"): Root cgroup name.
25+
26+
The client has control over the name of the root cgroup subtree under
27+
`/sys/fs/cgroup` that Mesos manages. The default name is “mesos”.
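
The name-to-path convention can be sketched in a few lines of Python. The helpers `cgroup_path` and `cgroup_name` are hypothetical names used only for illustration; they are not Mesos functions.

```python
import posixpath

# Mount point of the cgroup2 filesystem, as stated in the requirements.
CGROUP2_MOUNT = "/sys/fs/cgroup"


def cgroup_path(name, mount=CGROUP2_MOUNT):
    """Map a cgroup name to its absolute path under the mount point."""
    return posixpath.join(mount, name.strip("/"))


def cgroup_name(path, mount=CGROUP2_MOUNT):
    """Map an absolute cgroup path back to its name (path relative to the mount)."""
    return posixpath.relpath(path, mount)
```

With the default `flags.cgroups_root`, `cgroup_path("mesos")` yields `/sys/fs/cgroup/mesos`.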
28+
29+
### Process Cgroup
30+
31+
Every process Mesos manages will have a cgroup, and a leaf cgroup under it which
32+
contains the pids. This is done to adhere to the [No Internal Process Constraint](https://docs.kernel.org/admin-guide/cgroup-v2.html#no-internal-process-constraint)
33+
imposed by Cgroups v2.
34+
35+
### Container
36+
37+
When the cgroups v2 isolator is `prepare`d for a new container, cgroups are
38+
created for the new container. When the cgroups v2 isolator `isolate`s, the new
39+
container is moved into its leaf cgroup.
40+
41+
Container Non-leaf Cgroup: `<flags.cgroups_root>/<containerId>`
42+
43+
Container Leaf Cgroup: `<flags.cgroups_root>/<containerId>/leaf`
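
The two cgroup names above can be derived from the root name and the container ID. The following Python sketch uses hypothetical helper names (`container_cgroup`, `container_leaf_cgroup`) and a made-up container ID; it covers only top-level containers as described here.

```python
def container_cgroup(cgroups_root, container_id):
    """Name of the container's non-leaf cgroup."""
    return f"{cgroups_root}/{container_id}"


def container_leaf_cgroup(cgroups_root, container_id):
    """Name of the container's leaf cgroup, which holds the pids."""
    return f"{container_cgroup(cgroups_root, container_id)}/leaf"
```

For example, with the default root `mesos` and an illustrative container ID `c1`, the leaf cgroup name is `mesos/c1/leaf`.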
44+
45+
### Nested Containers
46+
47+
The Cgroups v2 isolator supports nested containers.
48+
49+
Unlike Cgroups v1, we now create cgroups for all containers, even if they
50+
indicated they do not want their own resource isolation. This is to make it
51+
easier to keep track of a container’s processes.
52+
53+
If a container does not wish to have its own resource isolation, it can pass in
54+
a flag `share_cgroups` and the isolator will not update any controllers for it.
55+
56+
### Systemd Integration
57+
58+
We currently do not have systemd integration. This section should be updated
59+
with our approach if systemd support is implemented.
60+
61+
### Linux Launcher & Cgroups v2 Isolator
62+
63+
On Linux systems that support cgroups v2, the Mesos Containerizer will use the [Linux Launcher](https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/linux_launcher.cpp) and the [Cgroups v2 Isolator](https://github.com/apache/mesos/blob/master/src/slave/containerizer/mesos/isolators/cgroups2/cgroups2.cpp).
64+
65+
It’s recommended to review the code to gain a complete understanding of these steps.
66+
67+
Operations on startup:
68+
69+
- Linux Launcher `recover`: Parses the cgroups subtree rooted at
70+
`flags.cgroups_root` to obtain container ids. Compares the persisted state to
71+
the recovered containers to determine which containers are orphans.
72+
- Cgroups v2 Isolator `recover`: Creates internal state to track recovered
73+
containers. Calls `recover` on all of the controllers that are used by each of
74+
the recovered containers.
75+
76+
Operations when a new container is started:
77+
78+
- Cgroups v2 Isolator `prepare`: Creates cgroups for the new container and adds
79+
the container to the isolator's internal state. Configures namespace creation flags
80+
and mount setups; does not create mounts or namespaces. Calls `prepare` on all
81+
of the controllers that are used by the new container.
82+
- Linux Launcher `fork`: Forks the Mesos Agent process to create the new
83+
container's process. Also moves the child processes into the container's leaf
84+
cgroup. Creates mounts and namespaces.
85+
- Cgroups v2 Isolator `watch`: Calls `watch` on each of the controllers that
86+
are used by the container. When a resource-watch promise is resolved, a handler
87+
is invoked.
88+
- Cgroups v2 Isolator `isolate`: Calls `isolate` on each of the controllers that
89+
are used by the container. Then moves the container process into the container's
90+
leaf cgroup; at this point the container is isolated.
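
The cgroup-related parts of `prepare` and `isolate` above can be approximated with plain filesystem operations, because cgroups v2 creates a cgroup when a directory is made under the mount point and migrates a process when its pid is written to the cgroup's `cgroup.procs` file. The Python sketch below is illustrative only and is not the Mesos implementation (which is written in C++); the function names and the `mount` parameter are assumptions, and `mount` can point at a scratch directory to try the logic without touching real cgroups.

```python
import os


def prepare_cgroups(mount, cgroups_root, container_id):
    """Create the container's non-leaf and leaf cgroup directories.

    Returns the path of the leaf cgroup. On a real cgroup2 mount, creating
    these directories creates the cgroups themselves.
    """
    leaf = os.path.join(mount, cgroups_root, container_id, "leaf")
    os.makedirs(leaf, exist_ok=True)  # also creates the non-leaf parent
    return leaf


def move_to_cgroup(cgroup_dir, pid):
    """Move a process into a cgroup by writing its pid to cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
        f.write(str(pid))
```

Placing the pid in the `leaf` directory rather than in the container's non-leaf cgroup is what keeps the layout compatible with the no internal process constraint.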

docs/isolators/cgroups-devices.md

Lines changed: 7 additions & 0 deletions
@@ -12,6 +12,13 @@ track and enforce open and mknod restrictions on device files. To enable the
1212
`cgroups/devices` isolator, append `cgroups/devices` to the `--isolation` flag
1313
when starting the Mesos agent.
1414

15+
## Changes for Cgroups2 Support
16+
17+
In Cgroups2, we create eBPF programs to keep track of which devices
18+
would be allowed or denied access. This is because cgroups2 no longer offers
19+
interface files for device access controls. Our default whitelisted devices list
20+
remains unchanged for cgroups2.
21+
1522
## Default whitelisted devices
1623

1724
The following devices are, by default, whitelisted for each container, if you

docs/mesos-containerizer.md

Lines changed: 5 additions & 0 deletions
@@ -77,3 +77,8 @@ unit file of Mesos agent, for example:
7777
[Service]
7878
Delegate=true
7979
```
80+
81+
## Cgroups2 Integration
82+
83+
The changes made to support the new requirements of Cgroups v2 are
84+
documented in the [Cgroups2 Support](cgroups2-support.md) documentation.
