Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow setting cgroup memory limits via both nix.conf and pre-build hooks #7388

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

cleverca22
Copy link
Contributor

allows setting resource limits with --option use-cgroups true --option memory-max $((1024*1024*1024*5))
allows a pre-build hook to set limits:

#!/bin/sh

if [[ -d $NIX_CGROUP ]]; then
  echo $((1024*1024*1024*1)) > $NIX_CGROUP/memory.max
fi

such a pre-build hook could fork out a daemon, that watches memory.events and dynamically raises memory.high while co-operating with a resource manager

note, currently requires that nix-daemon be ran with an extra dir in its cgroup path, such as /system.slice/nix-daemon.service/actual-daemon

Comment on lines +320 to +329
sets the cgroup memory.high limit, causing the build to throttle if exceeded
depends on use-cgroups=true
)"
};

Setting<uint64_t> memoryMax{
this, 0, "memory-max",
R"(
sets the cgroup memory.max limit, causing the build to be killed if exceeded
depends on use-cgroups=true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am missing some documentation here in what relation to each other you should set memory-high and memory-max. I would assume something like memory-high 3GB and memory-max 4GB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends on how much a specific derivation needs, the goal is to keep track of how much memory each derivation is using, and not schedule too many on one machine
then use cgroups to enforce that it stays within the reserved amounts

@fricklerhandwerk fricklerhandwerk added the feature Feature request or proposal label Dec 5, 2022
@thufschmitt
Copy link
Member

Could it make sense to have a more general cgroup-config parameter accepting a list of key-value pairs?

@@ -415,6 +416,7 @@ void LocalDerivationGoal::startBuilder()
throw Error("cannot determine cgroup name from /proc/self/cgroup");

auto ourCgroupPath = canonPath("/sys/fs/cgroup/" + ourCgroup);
ourCgroupPath = dirOf(ourCgroupPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very confusing naming. Doing dirOf means it isn't our cgroup, but the parent's. And it's not clear that we want to be putting stuff in the parent cgroup, since it causes nix to break out of whatever resource controls are in place for its own cgroup. It especially wouldn't be nice if Nix builds escape from system.slice/nix-daemon.service.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i had modified things so the daemon runs in system.slice/nix-daemon.service/actual-daemon

the problem is that you cant certain things on system.slice/nix-daemon.service if it contains any process, everything must be in a child c-group

i'm not sure what the best solution to that is, and this currently needs an extra script around nix-daemon to move it to a child cgroup

@@ -113,6 +113,11 @@ static CgroupStats destroyCgroup(const Path & cgroup, bool returnStats)
}
}

auto memoryhighPath = cgroup + "/memory.high";
if (pathExists(memoryhighPath)) {
stats.memoryHigh = string2Int<uint64_t>(trim(readFile(memoryhighPath)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is off here.

Comment on lines +320 to +321
sets the cgroup memory.high limit, causing the build to throttle if exceeded
depends on use-cgroups=true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs proper capitalization/spelling/Markdown formatting.

@@ -720,6 +722,15 @@ void LocalDerivationGoal::startBuilder()
chownToBuilder(*cgroup);
chownToBuilder(*cgroup + "/cgroup.procs");
chownToBuilder(*cgroup + "/cgroup.threads");
auto parentCgroup = dirOf(*cgroup);
writeFile(parentCgroup + "/cgroup.subtree_control", "+memory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail if the cgroup does not have memory in cgroup.controllers, or if it's not a leaf cgroup or one that only has pids in its children (see the "no internal processes" rule in cgroups(7)). In particular, this may cause non-root chroot builds to fail (i.e. --store /path/to/my/nix as a non-root user) since we can't count on memory being enabled.

It's not clear what the best solution is, but some options:

  • When nix-daemon starts, it can create a sub-cgroup (e.g. system.slice/nix-daemon.service/manager) for itself and move into it, and create sub-cgroups for the builds in the original cgroup (e.g. system.slice/nix-daemon.service/<build>). That way the original cgroup doesn't violate the "no internal processes" rule.
  • That doesn't work for the non-root case, since the original cgroup will likely have other processes in it (e.g. user.slice/user-1000.slice/user@1000.service/app.slice/app-org.kde.konsole-7edbbf6bbf864ef9a259f86f31b0fd90.scope). So we can't easily convert it into a cgroup that doesn't violate the "no internal processes" rule. I guess we could search upwards for the nearest parent that has the memory controller enabled and has no internal processes (e.g. /sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/app.slice in this example) but that's a bit naughty...
  • Allow the parent cgroup for builds to be specified in nix.conf.

Whatever the solution, Nix shouldn't fail if writing to cgroup.subtree_control fails.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2022-12-09-nix-team-meeting-minutes-15/23951/1

@stale stale bot added the stale label Jun 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Feature request or proposal stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants