modules/nix-daemon: Replace daemon(IO)NiceLevel options #138741

Merged
merged 2 commits into from Nov 15, 2021

Conversation

illdefined
Contributor

The nix.daemonNiceLevel option allows for setting the nice level of the
Nix daemon process. On a modern Linux kernel with group scheduling, the
nice level only affects threads relative to other threads in the same
task group (see sched(7)). Therefore this option does not have the effect
one might expect.

The options daemonCPUSchedPolicy, daemonCPUSchedPriority, and
daemonIOSchedClass are introduced, and the daemonIONiceLevel option is
renamed to daemonIOSchedPriority for consistency. These options allow for
more effective control over CPU and I/O scheduling.

Instead of setting daemonNiceLevel to a high value to increase the
responsiveness of an interactive system during builds -- which would not
have the desired effect, as described above -- one could set both
daemonCPUSchedPolicy and daemonIOSchedClass to idle.
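
As a minimal sketch of that suggestion, assuming the option names exactly as given in this description, a NixOS configuration might contain something like the following. Whether `idle` suits a given machine is a judgment call, since it can slow builds down noticeably while the system is busy.

```nix
{
  # Schedule the Nix daemon and its build jobs at very low CPU priority
  # (the "idle" policy), so interactive tasks are served first.
  nix.daemonCPUSchedPolicy = "idle";

  # Use the "idle" I/O scheduling class, so the daemon's disk access
  # yields to I/O requested by other processes.
  nix.daemonIOSchedClass = "idle";
}
```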

Motivation for this change

Heavy build jobs can affect other tasks on a system and reduce responsiveness during interactive use.

The daemonNiceLevel option is not particularly useful to mitigate this problem on recent Linux kernels, because with group scheduling the nice level of a thread only affects scheduling relative to other threads in the same task group.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • 21.11 Release Notes (or backporting 21.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@dasJ
Member

dasJ commented Sep 20, 2021

Why not remove the option altogether and let users set systemd.services.nix-daemon.serviceConfig.[…] directly? No need to duplicate all the options
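
For comparison, the direct approach described here could be sketched as below. `CPUSchedulingPolicy` and `IOSchedulingClass` are systemd service directives; the choice of `idle` is only an illustrative value and not something this comment proposes.

```nix
{
  # Set the scheduling directives on the nix-daemon unit directly,
  # without going through dedicated nix.* module options.
  systemd.services.nix-daemon.serviceConfig = {
    CPUSchedulingPolicy = "idle";
    IOSchedulingClass = "idle";
  };
}
```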

@berbiche
Member

berbiche commented Sep 20, 2021

Why not remove the option altogether and let users set systemd.services.nix-daemon.serviceConfig.[…] directly? No need to duplicate all the options

Discovery and documentation.

An entry in the manual for this would be useful too.

@roberth
Member

roberth commented Sep 21, 2021

Perhaps add a sentence like this to the options:

While a lower priority can improve system responsiveness during updates, it comes at the risk of slowing down or potentially starving crucial configuration updates, limiting your ability to fix problems during load.

@roberth
Member

roberth commented Sep 21, 2021

See also NixOS/nix#5235

@illdefined
Contributor Author

Perhaps add a sentence like this to the options:

While a lower priority can improve system responsiveness during updates, it comes at the risk of slowing down or potentially starving crucial configuration updates, limiting your ability to fix problems during load.

I’ll add a note to the description.

I also wonder whether it would make sense to not expose the real‐time scheduling variants. I can’t think of a use case where having the nix-daemon and its children always pre‐empt every other thread on a system has any benefit.

For the CPU scheduling policy, the most useful variants in my opinion are other, batch and idle. For the I/O scheduling class it’s best-effort and idle. I could either add an explanation or warning to the description or simply remove the others.

@roberth
Member

roberth commented Sep 22, 2021

yeah just remove the realtime ones.

@illdefined
Contributor Author

I removed the real‐time scheduling policies and classes as well as the daemonCPUSchedPriority option which is only used by the real‐time policies.

I also wonder if batch would be a better default for the CPU scheduling policy than other, preferring throughput over responsiveness for the nix-daemon and its build jobs. But maybe that’s a change for another PR with a separate discussion.
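
A rough sketch of what restricting the options to the non-real-time variants could look like as a module is shown below. This is an illustration, not the code merged in this PR: the descriptions and the `best-effort` default for the I/O class are assumptions, and the module would clash with the real nix-daemon module if both declared the same options.

```nix
{ config, lib, ... }:

let
  cfg = config.nix;
in
{
  options.nix = {
    daemonCPUSchedPolicy = lib.mkOption {
      # Only the non-real-time policies are accepted.
      type = lib.types.enum [ "other" "batch" "idle" ];
      default = "other";
      example = "batch";
      description = "CPU scheduling policy for the Nix daemon (illustrative text).";
    };

    daemonIOSchedClass = lib.mkOption {
      # The real-time I/O class is deliberately left out.
      type = lib.types.enum [ "best-effort" "idle" ];
      default = "best-effort"; # assumed default, not confirmed by the thread
      example = "idle";
      description = "I/O scheduling class for the Nix daemon (illustrative text).";
    };
  };

  # Pass the chosen values through to the systemd unit.
  config.systemd.services.nix-daemon.serviceConfig = {
    CPUSchedulingPolicy = cfg.daemonCPUSchedPolicy;
    IOSchedulingClass = cfg.daemonIOSchedClass;
  };
}
```

Using an enum type means an unsupported value such as a real-time policy is rejected at evaluation time rather than being handed to systemd.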

@roberth
Member

roberth commented Sep 22, 2021

Relevant section of man 7 sched:

Under group scheduling, a thread's nice value has an effect for
scheduling decisions only relative to other threads in the same
task group. This has some surprising consequences in terms of
the traditional semantics of the nice value on UNIX systems. In
particular, if autogrouping is enabled (which is the default in
various distributions), then employing setpriority(2) or nice(1)
on a process has an effect only for scheduling relative to other
processes executed in the same session (typically: the same
terminal window).

Conversely, for two processes that are (for example) the sole
CPU-bound processes in different sessions (e.g., different
terminal windows, each of whose jobs are tied to different
autogroups), modifying the nice value of the process in one of
the sessions has no effect in terms of the scheduler's decisions
relative to the process in the other session. A possibly useful
workaround here is to use a command such as the following to
modify the autogroup nice value for all of the processes in a
terminal session:

      $ echo 10 > /proc/self/autogroup

I can confirm CONFIG_FAIR_GROUP_SCHED=y, so this section seems to apply, including the workaround. However, I'm surprised that systemd's Nice would have no effect. Perhaps systemd does get to impose its Nice via cgroups and only references setpriority as user documentation? I know very little about cgroups and systemd, so my idea may very well be wrong.

@illdefined
Contributor Author

Relevant section of man 7 sched:

Under group scheduling, a thread's nice value has an effect for
scheduling decisions only relative to other threads in the same
task group. This has some surprising consequences in terms of
the traditional semantics of the nice value on UNIX systems. In
particular, if autogrouping is enabled (which is the default in
various distributions), then employing setpriority(2) or nice(1)
on a process has an effect only for scheduling relative to other
processes executed in the same session (typically: the same
terminal window).
Conversely, for two processes that are (for example) the sole
CPU-bound processes in different sessions (e.g., different
terminal windows, each of whose jobs are tied to different
autogroups), modifying the nice value of the process in one of
the sessions has no effect in terms of the scheduler's decisions
relative to the process in the other session. A possibly useful
workaround here is to use a command such as the following to
modify the autogroup nice value for all of the processes in a
terminal session:

      $ echo 10 > /proc/self/autogroup

I can confirm CONFIG_FAIR_GROUP_SCHED=y, so this section seems to apply, including the workaround. However, I'm surprised that systemd's Nice would have no effect. Perhaps systemd does get to impose its Nice via cgroups and only references setpriority as user documentation? I know very little about cgroups and systemd, so my idea may very well be wrong.

As far as I can tell, systemd just uses setpriority: https://github.com/systemd/systemd/blob/main/src/core/execute.c#L4024 and https://github.com/systemd/systemd/blob/main/src/basic/process-util.c#L1547. That’s the only place I found where the nice value from the ExecContext is used in a system call.

The cgroup code does not appear to do anything with it and there is no code writing to any autogroup file.

@illdefined
Contributor Author

If I understand https://unix.stackexchange.com/questions/340283/using-and-understanding-systemd-scheduling-related-options-in-a-desktop-context/515893#515893 correctly, the Nice setting is useful as long as autogrouping has been disabled and no service uses any of CPUAccounting, CPUWeight, StartupCPUWeight, CPUShares or StartupCPUShares.

An alternative to completely removing the daemonNiceLevel option could be to just add a note that it is useless under normal circumstances and only has an effect if both autogrouping has been disabled and per‐service CPU control is not being used.
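
To make those conditions concrete, a configuration along the following lines might give a nice level some effect. It is a sketch under the assumptions stated above: it disables autogrouping through the `kernel.sched_autogroup_enabled` sysctl, but it cannot guarantee that no other service enables per-service CPU control, and the nice value of 10 is arbitrary.

```nix
{
  # Disable scheduler autogrouping, one of the two preconditions
  # mentioned above for the nice level to matter across sessions.
  boot.kernel.sysctl."kernel.sched_autogroup_enabled" = 0;

  # Set the nice level on the unit directly; this only helps if no
  # service uses CPUAccounting, CPUWeight, CPUShares or the Startup*
  # variants, which this snippet does not check.
  systemd.services.nix-daemon.serviceConfig.Nice = 10;
}
```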

Contributor

@ShamrockLee ShamrockLee left a comment


As a breaking change (option renaming), it would need to be documented inside the corresponding release note.

@illdefined
Contributor Author

As a breaking change (option renaming), it would need to be documented inside the corresponding release note.

I added an explanation to the release notes. Is this adequate?

@ShamrockLee
Contributor

As this PR is merging to the master branch, the corresponding breaking changes should appear in nixos/doc/manual/release-notes/rl-2111.section.md under the ## Backward Incompatibilities part.

It seems that the release note should be short, inform the user "what to do", and provide the links to detailed documentation. You can link to the documentation of the options like

[`nix.daemonCPUSchedPolicy`](options.html#opt-nix.daemonCPUSchedPolicy)

@illdefined
Contributor Author

As this PR is merging to the master branch, the corresponding breaking changes should appear in nixos/doc/manual/release-notes/rl-2111.section.md under the ## Backward Incompatibilities part.

It seems that the release note should be short, inform the user "what to do", and provide the links to detailed documentation. You can link to the documentation of the options like

[`nix.daemonCPUSchedPolicy`](options.html#opt-nix.daemonCPUSchedPolicy)

Thank you. I moved it to rl-2111.section.md and reduced it a bit.

@ShamrockLee
Contributor

Thank you. I moved it to rl-2111.section.md and reduced it a bit.

It seems that the lines were put under ## Other Notable Changes.
As the change is incompatible, ## Backward Incompatibilities would be a better place. Just move them to the bottom of Backward Incompatibilities (above ## Other Notable Changes).

@illdefined
Contributor Author

Thank you. I moved it to rl-2111.section.md and reduced it a bit.

It seems that the lines were put under ## Other Notable Changes. As the change is incompatible, ## Backward Incompatibilities would be a better place. Just move them to the bottom of Backward Incompatibilities (above ## Other Notable Changes).

Sorry about that. Moved it to the correct place.

The nix.daemonNiceLevel option allows for setting the nice level of the
Nix daemon process. On a modern Linux kernel with group scheduling, the
nice level only affects threads relative to other threads in the same
task group (see sched(7)). Therefore this option does not have the effect
one might expect.

The options daemonCPUSchedPolicy and daemonIOSchedClass are introduced,
and the daemonIONiceLevel option is renamed to daemonIOSchedPriority for
consistency. These options allow for more effective control over CPU
and I/O scheduling.

Instead of setting daemonNiceLevel to a high value to increase the
responsiveness of an interactive system during builds -- which would not
have the desired effect, as described above -- one could set both
daemonCPUSchedPolicy and daemonIOSchedClass to idle.
@yu-re-ka yu-re-ka merged commit aeaafd1 into NixOS:master Nov 15, 2021
@illdefined
Contributor Author

I considered proposing batch as a default value for nix.daemonCPUSchedPolicy since builds are non‐interactive and might benefit from longer time slices and the resulting improved caching.

To test this hypothesis, I built linuxPackages_latest.kernel-5.15.2 a number of times on my machine (Intel Core i7-11850H, 64 GiB RAM) with the other and batch policies. The build times varied only by a few seconds between the two (for a total build time of around 26 minutes).

These results may, however, have been confounded by a few factors specific to my machine. Apart from the large CPU cache and relatively fast RAM, I use a kernel with a custom hardening config, run builds in a tmpfs, and have most of the CPU cores designated as full dynticks CPUs.

Member

@samueldr samueldr left a comment


This is missing mkRenamedOptionModule for renamed options.

Or mkRemovedOptionModule if the options are not equivalent I guess.
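
For illustration, the two helpers could be used roughly as follows. This is a sketch and not necessarily what the follow-up PR does: it assumes `daemonIONiceLevel` maps one-to-one onto `daemonIOSchedPriority`, and the removal message is made-up wording.

```nix
{ lib, ... }:

{
  imports = [
    # Forward the old option name to the new one and warn on use.
    (lib.mkRenamedOptionModule
      [ "nix" "daemonIONiceLevel" ]
      [ "nix" "daemonIOSchedPriority" ])

    # Fail evaluation with a hint when the removed option is still set.
    (lib.mkRemovedOptionModule
      [ "nix" "daemonNiceLevel" ]
      "Consider nix.daemonCPUSchedPolicy instead.")
  ];
}
```

With these in place, existing configurations that set the old options either keep working with a warning or fail with an explanatory message instead of an unknown-option error.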

@samueldr
Member

It was a bit unclear what values I should use if I want maximum interactive session responsiveness while not starving the build resources. Recommendations would have been nice.

@illdefined
Contributor Author

This is missing mkRenamedOptionModule for renamed options.

Or mkRemovedOptionModule if the options are not equivalent I guess.

I can create another PR for that.

@illdefined
Contributor Author

It was a bit unclear what values I should use if I want maximum interactive session responsiveness while not starving the build resources. Recommendations would have been nice.

From my personal experience idle for both daemonCPUSchedPolicy and daemonIOSchedClass works quite well for interactive systems.

@illdefined
Contributor Author

It was a bit unclear what values I should use if I want maximum interactive session responsiveness while not starving the build resources. Recommendations would have been nice.

From my personal experience idle for both daemonCPUSchedPolicy and daemonIOSchedClass works quite well for interactive systems.

#147497 suggests idle for desktop and portable computers.

@illdefined
Contributor Author

This is missing mkRenamedOptionModule for renamed options.
Or mkRemovedOptionModule if the options are not equivalent I guess.

I can create another PR for that.

See #147490.

ambroisie added a commit to ambroisie/nix-config that referenced this pull request Nov 29, 2021
This option doesn't really work the way it should anyway [1].

This reverts commit cbf6ea9.

[1]: NixOS/nixpkgs#138741
Atemu added a commit to Atemu/nixos-config that referenced this pull request Dec 28, 2021
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/upgrade-on-rasberry-pi-make-it-unresponsive/17345/2

ambroisie added a commit to ambroisie/nix-config that referenced this pull request Feb 8, 2023
This option doesn't really work the way it should anyway [1].

This reverts commit cbf6ea9.

[1]: NixOS/nixpkgs#138741

8 participants