Support using cgroups inside run #436

Closed
PhilippWendler opened this issue Jul 10, 2019 · 0 comments
Labels
cgroups container related to container mode

Comments

@PhilippWendler
Member

Sometimes the benchmarked process wants to use cgroups itself. BenchExec prevents this (in container mode) by mounting the cgroup hierarchy read-only. If we did not do this, the benchmarked process could interfere with the benchmarking (e.g., by moving itself out of our cgroup, or by changing the memory limit).

To fully support this, the following needs to be done:

  1. We need to mount the cgroup mounts in the namespace such that all other cgroups are invisible.
  2. We need to prevent the process from interfering with the limits set in the cgroup that is now the root of the visible hierarchy.
  3. /proc/self/cgroup needs to show the cgroups relative to the new root cgroup.

Item 3 can be achieved with cgroup namespaces. Originally these were planned only for cgroup-v2, which we do not support yet (#133), but they were reworked to also cover cgroup-v1, and at least on Ubuntu 18.04 they are usable.
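To make the effect on /proc/self/cgroup concrete, here is a minimal sketch of a parser for that file. The function name `parse_proc_cgroup` and the sample contents are made up for illustration and are not BenchExec code; the only thing taken as given is the documented line format `hierarchy-ID:controller-list:cgroup-path`.

```python
from typing import Dict


def parse_proc_cgroup(content: str) -> Dict[str, str]:
    """Map each controller to its cgroup path.

    For the cgroup-v2 unified hierarchy the controller list is empty,
    so the path is stored under the key "".
    """
    result = {}
    for line in content.splitlines():
        if not line.strip():
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            result[controller] = path
    return result


# Hypothetical contents as seen without a cgroup namespace ...
outside = "4:memory:/user.slice/benchmark_1\n3:cpu,cpuacct:/user.slice/benchmark_1\n"
# ... and as seen from a cgroup namespace rooted at that cgroup.
inside = "4:memory:/\n3:cpu,cpuacct:/\n"

print(parse_proc_cgroup(outside))
print(parse_proc_cgroup(inside))
```

With a cgroup namespace rooted at the benchmark cgroup, the process only sees `/` and paths below it, which is what Item 3 asks for.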

With cgroup namespaces, Item 1 should also be possible if we remount the cgroup hierarchy. However, I have not yet managed to get the example from the man page working with unprivileged (user) namespaces. An alternative could be bind-mounting the cgroups of the existing hierarchy to the cgroup root.

Item 2 would be doable with cgroup-v2 and nsdelegate (cf. man page). Without cgroup-v2, we could probably do it by using a nested cgroup: we set the limits in the outer cgroup and make only the inner cgroup available in the container.
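The nested-cgroup idea could look roughly like the following sketch. It is only an illustration under assumptions: the function name `create_nested_cgroup`, the child name `delegated`, and the default limit file are hypothetical, and the demo runs against a temporary directory instead of a real cgroup mount. On a real cgroup-v1 memory hierarchy the limit file is `memory.limit_in_bytes` (on cgroup-v2 it is `memory.max`), and whether the outer limit also applies to the child on v1 depends on hierarchical accounting being enabled.

```python
import tempfile
from pathlib import Path


def create_nested_cgroup(
    limits_cgroup: Path,
    memory_limit: int,
    pid: int,
    limit_file: str = "memory.limit_in_bytes",  # assumption: cgroup-v1 name
) -> Path:
    """Set the limit in the outer cgroup and put the process into a child."""
    # The limit lives in the outer cgroup, which stays outside the container.
    (limits_cgroup / limit_file).write_text(str(memory_limit))
    # Only this inner cgroup would be made visible inside the container.
    child = limits_cgroup / "delegated"
    child.mkdir()
    (child / "cgroup.procs").write_text(str(pid))
    return child


# Demo against a plain temporary directory (not a real cgroup filesystem).
with tempfile.TemporaryDirectory() as tmp:
    outer = Path(tmp)
    inner = create_nested_cgroup(outer, 1_000_000_000, 12345)
    print(inner.name, (inner / "cgroup.procs").read_text())
```

Because the container would only see the inner cgroup as its root, the file holding the limit is not reachable from inside it.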

@PhilippWendler PhilippWendler added the container related to container mode label Jul 10, 2019
@PhilippWendler PhilippWendler modified the milestone: Release 2.0 Jul 10, 2019
PhilippWendler added a commit that referenced this issue Jul 10, 2019
This works for /proc/self/cgroup,
but has no effect on the mounted cgroup hierarchy.

Cf. #436 for more information
PhilippWendler added a commit that referenced this issue Aug 24, 2023
Cgroup namespaces allow us to properly isolate cgroups
and make them usable inside the container.
They exist for cgroups v1 and v2, but due to some v1 limitations
they make sense mostly on cgroups v2.
There are no disadvantages to using them,
so we always enable them when using cgroups v2,
and then provide a usable /sys/fs/cgroup directory.

This implements most of #436 for runexec.
The required prevention of modifying the resource limits
from inside the container will come next.
PhilippWendler added a commit that referenced this issue Aug 24, 2023
This prevents the benchmarked process from changing the configured
limits.
So far, the cgroup with the limits is the same one to which we add the
benchmarked process. But if we then delegate it into the container,
the benchmarked process can access it and change the limits.
So now we create a child cgroup of the cgroup with the limits,
move the benchmarked process into the child,
and make the child the root cgroup of the container.
Then the limits are configured outside of the container and cannot be
changed.

This finishes #436 for runexec.

We just need to take care that for some special operations
we also use the child cgroup instead of the main one.

An alternative would be the "nsdelegate" mount option for the cgroup2 fs.
However, this needs to be set in the initial namespace,
so we cannot enforce this. And at least on my Ubuntu system,
it is missing, so we also cannot just declare it as a requirement.