Support using cgroups inside run #436
PhilippWendler added a commit that referenced this issue on Jul 10, 2019
This works for /proc/self/cgroup, but has no effect on the mounted cgroup hierarchy. Cf. #436 for more information
PhilippWendler added a commit that referenced this issue on Aug 24, 2023
Cgroup namespaces allow us to properly isolate cgroups and make them usable inside the container. They exist for cgroups v1 and v2, but due to some v1 limitations they make sense mostly on cgroups v2. There are no disadvantages to using them, so we always enable them when using cgroups v2 and then provide a usable /sys/fs/cgroup directory. This implements most of #436 for runexec. The required prevention of modifying the resource limits from inside the container will come next.
PhilippWendler added a commit that referenced this issue on Aug 24, 2023
This prevents the benchmarked process from changing the configured limits. So far, the cgroup with the limits is the same one to which we add the benchmarked process. But if we then delegate it into the container, the benchmarked process can access it and change the limits. So now we create a child cgroup of the cgroup with the limits, move the benchmarked process into the child, and make the child the root cgroup of the container. Then the limits are configured outside of the container and cannot be changed. This finishes #436 for runexec. We just need to take care that for some special operations we also use the child cgroup instead of the main one. An alternative would be the "nsdelegate" mount option for the cgroup2 fs. However, this needs to be set in the initial namespace, so we cannot enforce it. And at least on my Ubuntu system it is missing, so we also cannot just declare it a requirement.
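The child-cgroup approach described in this commit message can be sketched roughly as follows. This is only an illustration, not the actual BenchExec code: the function name `move_into_child_cgroup` and the child name `benchmark` are made up for the example, and it assumes a cgroup2 directory that the caller is allowed to write to.

```python
from pathlib import Path


def move_into_child_cgroup(limit_cgroup: Path, pid: int, child_name: str = "benchmark") -> Path:
    """Create a child of the cgroup that holds the limits and move `pid` into it.

    The limits stay configured on `limit_cgroup`; only the returned child
    would be exposed as the root cgroup of the container.
    Hypothetical helper for illustration, not part of BenchExec's API.
    """
    child = limit_cgroup / child_name
    child.mkdir(exist_ok=True)
    # Writing a PID to cgroup.procs migrates that process into the cgroup.
    (child / "cgroup.procs").write_text(f"{pid}\n")
    return child
```

Because the limits live one level above the container's root cgroup, the benchmarked process cannot reach the files that configure them, even if the child cgroup is writable.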
Sometimes the benchmarked process wants to use cgroups itself. BenchExec prevents this (in container mode) by mounting the cgroup hierarchy read-only. If we did not do this, the benchmarked process could interfere with the benchmarking (e.g., by moving itself out of our cgroup, or by changing the memory limit).
To fully support this, the following needs to be done:
3. /proc/self/cgroup needs to show the cgroups relative to the new root cgroup.

Item 3 can be achieved with cgroup namespaces. Originally these were planned only for cgroup-v2, which we do not support yet (#133), but they were reworked for cgroup-v1, and at least on Ubuntu 18.04 they are usable.
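To check what /proc/self/cgroup reports inside and outside a cgroup namespace, one could parse it as below. This is a minimal sketch: the names `CgroupEntry` and `parse_proc_cgroup` are invented for this example, and it relies only on the documented line format `hierarchy-ID:controller-list:cgroup-path`.

```python
from typing import List, NamedTuple, Tuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: Tuple[str, ...]  # empty for the cgroup-v2 unified hierarchy
    path: str


def parse_proc_cgroup(text: str) -> List[CgroupEntry]:
    """Parse the content of /proc/<pid>/cgroup into structured entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice, because the cgroup path may itself contain ':'.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = tuple(c for c in controllers.split(",") if c)
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries
```

Inside a fresh cgroup namespace, the path of the process's own cgroup should appear as `/`, whereas outside it is the full path from the real root of the hierarchy.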
With cgroup namespaces, Item 1 should also be possible if we remount the cgroup hierarchy. However, I have not yet managed to get the example from the man page working with unprivileged (user) namespaces. An alternative could be bind-mounting the cgroups of the existing hierarchy to the cgroup root.
Item 2 would be doable with cgroup-v2 and nsdelegate (cf. man page). Without cgroup-v2 we could probably do it by using a nested cgroup, where we set the limits in the outer cgroup and make only the inner cgroup available in the container.
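Whether nsdelegate is active on a given system can be checked from the mount table. The following is a rough sketch, assuming the usual /proc/mounts line layout (device, mount point, filesystem type, comma-separated options); the function name `cgroup2_has_nsdelegate` is hypothetical.

```python
def cgroup2_has_nsdelegate(mounts_text: str) -> bool:
    """Return True if any cgroup2 mount in the given mount table lists nsdelegate."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[2] == "cgroup2":
            if "nsdelegate" in fields[3].split(","):
                return True
    return False
```

It could be called with the content of /proc/mounts, for example `cgroup2_has_nsdelegate(open("/proc/mounts").read())`.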