Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/containers: Introduce several tweaks to systemd-nspawn from upstream systemd #48771

Merged
merged 1 commit into from Oct 31, 2018

Conversation

arianvp
Copy link
Member

@arianvp arianvp commented Oct 21, 2018

nixos/containers: Introduce several tweaks to systemd-nspawn from upstream systemd

* Lets container@.service  be activated by machines.target instead of
  multi-user.target

  According to the systemd manpages, all containers that are registered
  by machinectl, should be inside machines.target for easy stopping
  and starting container units altogether

* make sure container@.service and container.slice instances are
  actually located in machine.slice

  https://plus.google.com/112206451048767236518/posts/SYAueyXHeEX
  See original commit: https://github.com/NixOS/systemd/commit/45d383a3b8

* Enable Cgroup delegation for nixos-containers

  Delegate=yes should be set for container scopes where a systemd instance
  inside the container shall manage the hierarchies below its own cgroup
  and have access to all controllers.

  This is equivalent to enabling all accounting options on the systemd
  process inside the system container.  This means that systemd inside
  the container is responsible for managing Cgroup resources for
  unit files that enable accounting options inside.  Without this
  option, units that make use of cgroup features within system
  containers might misbehave

  See original commit: https://github.com/NixOS/systemd/commit/a931ad47a8

  from the manpage:
    Turns on delegation of further resource control partitioning to
    processes of the unit. Units where this is enabled may create and
    manage their own private subhierarchy of control groups below the
    control group of the unit itself. For unprivileged services (i.e.
    those using the User= setting) the unit's control group will be made
    accessible to the relevant user. When enabled the service manager
    will refrain from manipulating control groups or moving processes
    below the unit's control group, so that a clear concept of ownership
    is established: the control group tree above the unit's control
    group (i.e. towards the root control group) is owned and managed by
    the service manager of the host, while the control group tree below
    the unit's control group is owned and managed by the unit itself.
    Takes either a boolean argument or a list of control group
    controller names. If true, delegation is turned on, and all
    supported controllers are enabled for the unit, making them
    available to the unit's processes for management. If false,
    delegation is turned off entirely (and no additional controllers are
    enabled). If set to a list of controllers, delegation is turned on,
    and the specified controllers are enabled for the unit. Note that
    additional controllers than the ones specified might be made
    available as well, depending on configuration of the containing
    slice unit or other units contained in it. Note that assigning the
    empty string will enable delegation, but reset the list of
    controllers, all assignments prior to this will have no effect.
    Defaults to false.

    Note that controller delegation to less privileged code is only safe
    on the unified control group hierarchy. Accordingly, access to the
    specified controllers will not be granted to unprivileged services
    on the legacy hierarchy, even when requested.

    The following controller names may be specified: cpu, cpuacct, io,
    blkio, memory, devices, pids. Not all of these controllers are
    available on all kernels however, and some are specific to the
    unified hierarchy while others are specific to the legacy hierarchy.
    Also note that the kernel might support further controllers, which
    aren't covered here yet as delegation is either not supported at all
    for them or not defined cleanly.
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nox --run "nox-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Fits CONTRIBUTING.md.

@arianvp arianvp changed the title [WIP] containers: Introduce several tweaks to systemd-nspawn from upstream systemd containers: Introduce several tweaks to systemd-nspawn from upstream systemd Oct 21, 2018
@arianvp
Copy link
Member Author

arianvp commented Oct 21, 2018

 nix-build -I nixpkgs=.  ./nixos/tests/containers-tmpfs.nix

seems to fail

machine: output:
error: command `nixos-container run tmpfs -- mountpoint -q /var 2>/dev/null' did not succeed (exit code 1)
command `nixos-container run tmpfs -- mountpoint -q /var 2>/dev/null' did not succeed (exit code 1)
cleaning up
killing machine (pid 597)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
builder for '/nix/store/jc14mfzb7khf000fr8c9ipcp6pxjgw3y-vm-test-run-containers-tmpfs.drv' failed with exit code 255
error: build of '/nix/store/jc14mfzb7khf000fr8c9ipcp6pxjgw3y-vm-test-run-containers-tmpfs.drv' failed

@arianvp arianvp force-pushed the container-tweaks branch 2 times, most recently from bbd7b42 to b047c51 Compare October 21, 2018 14:02
@arianvp
Copy link
Member Author

arianvp commented Oct 21, 2018

Test fixed now. Apparently machines.target isn't activated because there is no dependency from multi-user.target on machines.target However, I still don't get why the systemd package does not include multi-user.target.wants/machines.target as according to the meson.build file, that file should be included in the systemd build artifacts. Can someone with knowledge of the systemd packaging story have a look at this? Specifically at the issue described in this commit: b047c51

@arianvp arianvp changed the title containers: Introduce several tweaks to systemd-nspawn from upstream systemd [WIP] containers: Introduce several tweaks to systemd-nspawn from upstream systemd Oct 21, 2018
@arianvp arianvp changed the title [WIP] containers: Introduce several tweaks to systemd-nspawn from upstream systemd containers: Introduce several tweaks to systemd-nspawn from upstream systemd Oct 22, 2018
@arianvp arianvp changed the title containers: Introduce several tweaks to systemd-nspawn from upstream systemd nixos/containers: Introduce several tweaks to systemd-nspawn from upstream systemd Oct 22, 2018
…tream systemd

* Lets container@.service  be activated by machines.target instead of
  multi-user.target

  According to the systemd manpages, all containers that are registered
  by machinectl, should be inside machines.target for easy stopping
  and starting container units altogether

* make sure container@.service and container.slice instances are
  actually located in machine.slice

  https://plus.google.com/112206451048767236518/posts/SYAueyXHeEX
  See original commit: NixOS/systemd@45d383a3b8

* Enable Cgroup delegation for nixos-containers

  Delegate=yes should be set for container scopes where a systemd instance
  inside the container shall manage the hierarchies below its own cgroup
  and have access to all controllers.

  This is equivalent to enabling all accounting options on the systemd
  process inside the system container.  This means that systemd inside
  the container is responsible for managing Cgroup resources for
  unit files that enable accounting options inside.  Without this
  option, units that make use of cgroup features within system
  containers might misbehave

  See original commit: NixOS/systemd@a931ad47a8

  from the manpage:
    Turns on delegation of further resource control partitioning to
    processes of the unit. Units where this is enabled may create and
    manage their own private subhierarchy of control groups below the
    control group of the unit itself. For unprivileged services (i.e.
    those using the User= setting) the unit's control group will be made
    accessible to the relevant user. When enabled the service manager
    will refrain from manipulating control groups or moving processes
    below the unit's control group, so that a clear concept of ownership
    is established: the control group tree above the unit's control
    group (i.e. towards the root control group) is owned and managed by
    the service manager of the host, while the control group tree below
    the unit's control group is owned and managed by the unit itself.
    Takes either a boolean argument or a list of control group
    controller names. If true, delegation is turned on, and all
    supported controllers are enabled for the unit, making them
    available to the unit's processes for management. If false,
    delegation is turned off entirely (and no additional controllers are
    enabled). If set to a list of controllers, delegation is turned on,
    and the specified controllers are enabled for the unit. Note that
    additional controllers than the ones specified might be made
    available as well, depending on configuration of the containing
    slice unit or other units contained in it. Note that assigning the
    empty string will enable delegation, but reset the list of
    controllers, all assignments prior to this will have no effect.
    Defaults to false.

    Note that controller delegation to less privileged code is only safe
    on the unified control group hierarchy. Accordingly, access to the
    specified controllers will not be granted to unprivileged services
    on the legacy hierarchy, even when requested.

    The following controller names may be specified: cpu, cpuacct, io,
    blkio, memory, devices, pids. Not all of these controllers are
    available on all kernels however, and some are specific to the
    unified hierarchy while others are specific to the legacy hierarchy.
    Also note that the kernel might support further controllers, which
    aren't covered here yet as delegation is either not supported at all
    for them or not defined cleanly.
@arianvp
Copy link
Member Author

arianvp commented Oct 22, 2018

CC @Mic92

@Mic92
Copy link
Member

Mic92 commented Oct 22, 2018

Can you add yourself to ofborg? https://github.com/NixOS/ofborg/pull/223/files

@arianvp
Copy link
Member Author

arianvp commented Oct 22, 2018

Done: NixOS/ofborg#256

@peterhoeg
Copy link
Member

I still don't get why the systemd package does not include multi-user.target.wants/machines.target

The wants from units are ignored. Try doing this:

systemd.targets.machines.wantedBy = [ "multi-user.target" ];

That should do the trick.

@Mic92
Copy link
Member

Mic92 commented Oct 23, 2018

Ah yes this is because our activation script needs this information as well.

@arianvp
Copy link
Member Author

arianvp commented Oct 23, 2018

@peterhoeg yes I do that now. Thanks! Albeit the other way around. (Multi-user.wants = machines.target). The result being identical in terms of semancis.

@arianvp
Copy link
Member Author

arianvp commented Oct 31, 2018

@GrahamcOfBorg test containers-imperative containers-tmpfs

@GrahamcOfBorg
Copy link

Success on x86_64-linux (full log)

Attempted: tests.containers-imperative, tests.containers-tmpfs

Partial log (click to expand)

syncing
machine: running command: sync
machine: exit status 0
test script finished in 33.49s
cleaning up
killing machine (pid 597)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/q3qz8y6kmb71k93yfn1rp23wjjgc80bl-vm-test-run-containers-imperative
/nix/store/yai1q0ns4haaarwqwv0gdfvjl5ay8m21-vm-test-run-containers-tmpfs

@GrahamcOfBorg
Copy link

Success on aarch64-linux (full log)

Attempted: tests.containers-imperative, tests.containers-tmpfs

Partial log (click to expand)

syncing
machine: running command: sync
machine: exit status 0
test script finished in 94.18s
cleaning up
killing machine (pid 631)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/m7qs9avpnqyb63l6zgrjc54xxybgdmhj-vm-test-run-containers-imperative
/nix/store/s81397i1zg3y2r2f9cd95ra258s98gqh-vm-test-run-containers-tmpfs

@arianvp
Copy link
Member Author

arianvp commented Oct 31, 2018

@GrahamcOfBorg test containers-bridge containers-extra_veth containers-hosts containers-ipv4 containers-ipv6 containers-macvlans containers-physical_interfaces containers-portforward containers-reloadable containers-restart_networking

@GrahamcOfBorg
Copy link

Success on x86_64-linux (full log)

Attempted: tests.containers-bridge, tests.containers-extra_veth, tests.containers-hosts, tests.containers-ipv4, tests.containers-ipv6, tests.containers-macvlans, tests.containers-physical_interfaces, tests.containers-restart_networking

The following builds were skipped because they don't evaluate on x86_64-linux: tests.containers-portforward, tests.containers-reloadable

Partial log (click to expand)

vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/0gpjb3gwc254wknw2zp1mpijnd9blfx6-vm-test-run-containers-bridge
/nix/store/9knf9gai1agqkn2lpw2jr8bdxmdc58xv-vm-test-run-containers-bridge
/nix/store/wy74p6y1w22kccpv5f04r5bja1ykkmxq-vm-test-run-containers-hosts
/nix/store/brph1km9kdlvc66cdb0gghwb1iz83qmm-vm-test-run-containers-ipv4
/nix/store/alab36bvskqchz4z9n9njgyn5wxbzfqm-vm-test-run-containers-ipv6
/nix/store/881if2g4321ww87hdabn4xfsl85af0gy-vm-test-run-containers-macvlans
/nix/store/ariaa0c97967lz4lymjyxvw8j8931xhi-vm-test-run-containers-physical_interfaces
/nix/store/4wh3lc39gb2mvdyrpsvik3ra22zvh5lv-vm-test-run-containers-restart_networking

@GrahamcOfBorg
Copy link

Success on aarch64-linux (full log)

Attempted: tests.containers-bridge, tests.containers-extra_veth, tests.containers-hosts, tests.containers-ipv4, tests.containers-ipv6, tests.containers-macvlans, tests.containers-physical_interfaces, tests.containers-restart_networking

The following builds were skipped because they don't evaluate on aarch64-linux: tests.containers-portforward, tests.containers-reloadable

Partial log (click to expand)

vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
/nix/store/rrp13j1h522ql20zf2pjk1fnw2fs4iyy-vm-test-run-containers-bridge
/nix/store/xw58ddc89phcmajb1r3vijnglmymclrg-vm-test-run-containers-bridge
/nix/store/22awk9ivdsg2psknyd27clj4vkn3s1af-vm-test-run-containers-hosts
/nix/store/fq2l4ssf5mc5kazf006vwn7dc0z23jzq-vm-test-run-containers-ipv4
/nix/store/szfrjjn15zmgv2j90378vv5fdw0ijzgq-vm-test-run-containers-ipv6
/nix/store/zi4dj1x1djgwjp4iy61klczzlif1qxhb-vm-test-run-containers-macvlans
/nix/store/cqffmqwxcvi6f88rkanlhrkzsdyk1dhx-vm-test-run-containers-physical_interfaces
/nix/store/cwm3d79d1j31dxcb6fxi0fyqlf09m7z7-vm-test-run-containers-restart_networking

@arianvp
Copy link
Member Author

arianvp commented Oct 31, 2018

Tests look good to me. The two tests that didn't 'evaluate' are not in release.nix. Do you think this is ready for merge @Mic92 ?

@Mic92 Mic92 merged commit 553e0d8 into NixOS:master Oct 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants