Document the Systemd warning "Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service" #8615

Closed
eriksjolund opened this issue Dec 6, 2020 · 8 comments · Fixed by #8889
Labels
kind/feature, locked - please file new issue/PR

Comments

@eriksjolund
Contributor

eriksjolund commented Dec 6, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

podman generate systemd generates a service unit that leads to a warning from systemd:

Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

If the warning from systemd can be ignored, that should be documented in
podman-generate-systemd.1.md.

Steps to reproduce the issue:

  1. On Fedora CoreOS (next) run these commands
[eriktest@fedora ~]$ mkdir -p .config/systemd/user
[eriktest@fedora ~]$ podman create --name nginxtest2 nginx:latest
2f0e5c3537bbd51f9da86e1d27b992a92107104f21d7e542f0ae5c17f1c6b799
[eriktest@fedora ~]$ podman generate systemd --new --name nginxtest2 > ~/.config/systemd/user/nginxtest2.service
[eriktest@fedora ~]$ cat ~/.config/systemd/user/nginxtest2.service 
# container-nginxtest2.service
# autogenerated by Podman 2.1.1
# Sun Dec  6 13:54:57 UTC 2020

[Unit]
Description=Podman container-nginxtest2.service
Documentation=man:podman-generate-systemd(1)
Wants=network.target
After=network-online.target

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
ExecStartPre=/bin/rm -f %t/container-nginxtest2.pid %t/container-nginxtest2.ctr-id
ExecStart=/usr/bin/podman run --conmon-pidfile %t/container-nginxtest2.pid --cidfile %t/container-nginxtest2.ctr-id --cgroups=no-conmon -d --replace --name nginxtest2 nginx:latest
ExecStop=/usr/bin/podman stop --ignore --cidfile %t/container-nginxtest2.ctr-id -t 10
ExecStopPost=/usr/bin/podman rm --ignore -f --cidfile %t/container-nginxtest2.ctr-id
PIDFile=%t/container-nginxtest2.pid
KillMode=none
Type=forking

[Install]
WantedBy=multi-user.target default.target
[eriktest@fedora ~]$ systemctl --user status nginxtest2.service | cat -
● nginxtest2.service - Podman container-nginxtest2.service
     Loaded: loaded (/var/home/eriktest/.config/systemd/user/nginxtest2.service; disabled; vendor preset: disabled)
     Active: inactive (dead)
       Docs: man:podman-generate-systemd(1)

Dec 06 13:55:32 fedora systemd[479275]: /var/home/eriktest/.config/systemd/user/nginxtest2.service:19: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
Dec 06 13:55:46 fedora systemd[479275]: /var/home/eriktest/.config/systemd/user/nginxtest2.service:19: Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
[eriktest@fedora ~]$ 

Describe the results you received:

I see the warning:

 Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

Describe the results you expected:

No warning

Additional information you deem important (e.g. issue happens only occasionally):

I looked quickly in the systemd source code
(https://github.com/systemd/systemd). It seems this warning text is present in v246 but not in v245.
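
To check whether a given host is new enough to print the warning, one could parse the first line of `systemctl --version`. The following is only a minimal sketch: the 246 threshold comes from the quick source reading above rather than from systemd release notes, and `warns_on_killmode_none` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `systemctl --version` output on stdin and
# succeeds (exit 0) when the major version is 246 or later.
# Assumption: the warning first appeared in v246, as observed above.
warns_on_killmode_none() {
    awk 'NR == 1 { found = ($2 + 0 >= 246) } END { exit !found }'
}

# Example with the version line from this report:
if printf 'systemd 246 (v246.6-3.fc33)\n' | warns_on_killmode_none; then
    echo "KillMode=none warning expected"
fi
```

On a live system the input would be `systemctl --version | warns_on_killmode_none`.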

Output of podman version:

podman version 2.1.1

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.21-3.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 0f53fb68333bdead5fe4dc5175703e22cf9882ab'
  cpus: 2
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 165536
      size: 65536
  kernel: 5.9.10-200.fc33.x86_64
  linkmode: dynamic
  memFree: 895729664
  memTotal: 2075795456
  ociRuntime:
    name: runc
    package: runc-1.0.0-279.dev.gitdedadbf.fc33.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc92+dev
      commit: c9a9ce0286785bef3f3c3c87cd1232e535a03e15
      spec: 1.0.2-dev
  os: linux
  remoteSocket:
    path: /run/user/1001/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-4.dev.giteecccdb.fc33.x86_64
    version: |-
      slirp4netns version 1.1.4+dev
      commit: eecccdb96f587b11d7764556ffacfeaffe4b6e11
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 0
  swapTotal: 0
  uptime: 88h 47m 9.4s (Approximately 3.67 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /var/home/eriktest/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.3.0-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.3
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /var/home/eriktest/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 1
  runRoot: /run/user/1001/containers
  volumePath: /var/home/eriktest/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1602087680
  BuiltTime: Wed Oct  7 16:21:20 2020
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 2.1.1

Package info (e.g. output of rpm -q podman or apt list podman):

warning: Found bdb Packages database while attempting sqlite backend: using bdb backend.
podman-2.1.1-12.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

No

Additional environment details (AWS, VirtualBox, physical, etc.):

[eriktest@fedora ~]$ rpm -qi systemd
warning: Found bdb Packages database while attempting sqlite backend: using bdb backend.
Name        : systemd
Version     : 246.6
Release     : 3.fc33
Architecture: x86_64
Install Date: Tue 01 Dec 2020 07:33:36 PM UTC
Group       : Unspecified
Size        : 12927882
License     : LGPLv2+ and MIT and GPLv2+
Signature   : RSA/SHA256, Sat 03 Oct 2020 10:18:38 AM UTC, Key ID 49fd77499570ff31
Source RPM  : systemd-246.6-3.fc33.src.rpm
Build Date  : Thu 01 Oct 2020 03:14:24 PM UTC
Build Host  : buildhw-x86-07.iad2.fedoraproject.org
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://www.freedesktop.org/wiki/Software/systemd
Bug URL     : https://bugz.fedoraproject.org/systemd
Summary     : System and Service Manager
Description :
systemd is a system and service manager that runs as PID 1 and starts
the rest of the system. It provides aggressive parallelization
capabilities, uses socket and D-Bus activation for starting services,
offers on-demand starting of daemons, keeps track of processes using
Linux control groups, maintains mount and automount points, and
implements an elaborate transactional dependency-based service control
logic. systemd supports SysV and LSB init scripts and works as a
replacement for sysvinit. Other parts of this package are a logging daemon,
utilities to control basic system configuration like the hostname,
date, locale, maintain a list of logged-in users, system accounts,
runtime directories and settings, and daemons to manage simple network
configuration, network time synchronization, log forwarding, and name
resolution.

This package was built from the 246.6-stable branch of systemd.
[eriktest@fedora ~]$ systemctl --version
systemd 246 (v246.6-3.fc33)
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified
[eriktest@fedora ~]$ rpm-ostree status
State: idle
Warning: failed to query journal: couldn't find current boot in journal
Deployments:
● ostree://fedora:fedora/x86_64/coreos/next
                   Version: 33.20201130.1.0 (2020-12-01T19:35:26Z)
                    Commit: b1418e28370be0fcb420920036aebaa44f7b3a2c2e17b088b81dec5383c8a573
              GPGSignature: Valid signature by 963A2BEB02009608FE67EA4249FD77499570FF31
openshift-ci-robot added the kind/feature label Dec 6, 2020
@rhatdan
Member

rhatdan commented Dec 7, 2020

@vrothberg We need to work with systemd team for a better way of handling this.

@vrothberg
Member

> @vrothberg We need to work with systemd team for a better way of handling this.

I think this ship has sailed:

> Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.

KillMode=none has been deprecated. @msekletar, which alternative would you advise us to use in the future?

@vrothberg
Member

@msekletar and I had a chat. There's nothing to worry about. systemd decided to gently deprecate KillMode=none as it was a continuous source of confusion and bugs. We will change none to another mode. I plan to do that before Podman 3.0 early next year.

@giuseppe FYI

@giuseppe
Member

giuseppe commented Dec 9, 2020

What is the suggested alternative to KillMode=none?

AFAIK, neither mixed nor control-group is equivalent, and we risk killing the conmon or podman processes if they take too long to clean up.

@vrothberg
Member

I guess control-group is the only viable option assuming that processes in sub-cgroups aren't killed. I don't have time to tackle it at the moment.

An alternative may be using TimeoutStopSec.

@msekletar WDYT?

@mheon
Member

mheon commented Dec 9, 2020

control-group could still be an issue: there are a lot of cases where Conmon is in the system-managed cgroup and we explicitly do not want Conmon to be killed.

@vrothberg
Member

@giuseppe @msekletar and I had a quick sync on the issue.

  • Short term: we can use TimeoutStopSec and remove KillMode=none, so that the unit falls back to the default KillMode, control-group.

  • Long term: we want to change the type to sdnotify. The plumbing for Podman is done but we need it for conmon. Once sdnotify is working, we can get rid of the pidfile handling etc. and let Podman handle it. @msekletar came up with a nice idea: Podman increases the timeout on demand. That's a much cleaner way than hard-coding the timeout in the unit as suggested in the short-term solution.

@andrewgdunn

Is there an issue tracking sdnotify for conmon?

vrothberg added a commit to vrothberg/libpod that referenced this issue Jan 5, 2021
`KillMode=none` has been deprecated in systemd and is now throwing big
warnings when being used.  Users have reported the issues upstream
(see containers#8615) and on the mailing list.

This deprecation was mainly motivated by abusive use by third-party
vendors, causing all kinds of undesired side effects, for instance busy
mounts that delay reboot.

After talking to the systemd team, we came up with the following plan:

 **Short term**: we can use TimeoutStopSec and remove KillMode=none, which
 then defaults to control-group.

 **Long term**: we want to change the type to sdnotify. The plumbing for
 Podman is done but we need it for conmon. Once sdnotify is working, we
 can get rid of the pidfile handling etc. and let Podman handle it.
 Michal Sekletar came up with a nice idea that Podman increases the
 timeout on demand. That's a much cleaner way than hard-coding the
 timeout in the unit as suggested in the short-term solution.

This change is executing the short-term plan and sets a minimum timeout
of 60 seconds.  User-specified timeouts are added to that.

Fixes: containers#8615
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
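
The timeout rule in the commit message above amounts to simple addition. A minimal sketch, where `unit_stop_timeout` is a hypothetical helper (not a Podman command) and the 10-second input is the `-t 10` from the ExecStop line in this issue:

```shell
# Sketch of the short-term rule from the commit message:
# TimeoutStopSec = 60-second minimum + the user-specified stop timeout.
# unit_stop_timeout is a hypothetical name used only for illustration.
unit_stop_timeout() {
    user_timeout="$1"
    echo $((60 + user_timeout))
}

echo "TimeoutStopSec=$(unit_stop_timeout 10)"   # prints: TimeoutStopSec=70
```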
rohanpm added a commit to rohanpm/exodus-gw that referenced this issue Feb 21, 2022
These units were originally created by "podman generate systemd"
which inserted a KillMode=none directive. That value of KillMode
was deprecated by systemd and will be removed in the future.

This commit updates the units to be aligned with podman's fix for
the issue in containers/podman#8615,
which was to drop KillMode and add TimeoutStopSec.
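
For units that were generated before the fix, the change described above (drop `KillMode=none`, add `TimeoutStopSec`) can be sketched as a stream filter. This is an illustration under assumptions, not the actual exodus-gw or Podman change: `migrate_unit` is a hypothetical helper, the default of 70 seconds assumes a 10-second stop timeout plus the 60-second minimum, and a unit that already sets `TimeoutStopSec` would end up with two entries.

```shell
# Hypothetical helper: reads a unit file on stdin and writes the updated
# unit to stdout. The KillMode=none line is replaced in place by a
# TimeoutStopSec line, so the new setting stays inside [Service].
migrate_unit() {
    timeout="${1:-70}"   # assumed default: 60 s minimum + 10 s stop timeout
    awk -v t="$timeout" '
        /^KillMode=none[ \t]*$/ { print "TimeoutStopSec=" t; next }
        { print }
    '
}

# Example: preview the change for the unit from this issue; nothing is
# written back to the file.
# migrate_unit 70 < ~/.config/systemd/user/nginxtest2.service
```

After reviewing the output and writing it to the unit file, `systemctl --user daemon-reload` makes systemd pick up the change.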
github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023