rootless podman container is not stopped properly on host shutdown (hangs) #15284

topas-rec · 2022-08-11T11:06:16Z

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

At shutdown the system hangs for some time and does not shut down as fast as it normally does.
I have added the container to systemd by creating a unit file with the podman generate systemd command.

When I stop the podman container that is running before shutdown, the shutdown does not hang.
Stopping the container manually via systemctl --user [...] stops it without problems.

Steps to reproduce the issue:

1. Create a rootless container (nextcloud in my case).
2. Create a systemd unit file with the podman generate systemd command.
3. Shut down the system and observe a delay at shutdown.

Describe the results you received:
The system hangs for some time and does not shut down as fast as it normally does because stopping the container doesn't work.

Describe the results you expected:
The container is shut down properly (as fast as when the unit is stopped manually via systemctl) and the system shuts down quickly, without systemd having to forcefully stop the container.

Additional information you deem important (e.g. issue happens only occasionally):
The issue happens on every shutdown. (It might not have happened right after the initial container creation, but unfortunately I do not remember.)

Output of podman version:

Client:       Podman Engine
Version:      4.1.1
API Version:  4.1.1
Go Version:   go1.19
Git Commit:   f73d8f8875c2be7cd2049094c29aff90b1150241-dirty
Built:        Wed Aug  3 22:52:48 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.26.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.3-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.3, commit: ab52a597278b20173440140cd810dc9fa8785c93'
  cpuUtilization:
    idlePercent: 92.78
    systemPercent: 1.78
    userPercent: 5.44
  cpus: 4
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: tobias-pc
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.18.16-arch1-1
  linkmode: dynamic
  logDriver: journald
  memFree: 3396468736
  memTotal: 8216518656
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.5-1
    path: /usr/bin/crun
    version: |-
      crun version 1.5
      commit: 54ebb8ca8bf7e6ddae2eb919f5b82d1d96863dea
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 4107792384
  swapTotal: 4107792384
  uptime: 34m 35.29s
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/tobiasb/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/tobiasb/.local/share/containers/storage
  graphRootAllocated: 49381437440
  graphRootUsed: 27635216384
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/tobiasb/.local/share/containers/storage/volumes
version:
  APIVersion: 4.1.1
  Built: 1659559968
  BuiltTime: Wed Aug  3 22:52:48 2022
  GitCommit: f73d8f8875c2be7cd2049094c29aff90b1150241-dirty
  GoVersion: go1.19
  Os: linux
  OsArch: linux/amd64
  Version: 4.1.1

Package info (e.g. output of rpm -q podman or apt list podman):

pacman --query --info podman
Name            : podman
Version         : 4.1.1-4
Description     : Tool and library for running OCI-based containers in pods
Architecture    : x86_64
URL             : https://github.com/containers/podman
Licenses        : Apache
Groups          : None
Provides        : None
Depends On      : catatonit  conmon  containers-common  crun  iptables  libdevmapper.so=1.02-64  libgpgme.so=11-64
                  libseccomp.so=2-64  slirp4netns
Optional Deps   : apparmor: for AppArmor support
                  btrfs-progs: support btrfs backend devices [installed]
                  netavark: for a new container-network-stack implementation [installed]
                  podman-compose: for docker-compose compatibility
                  podman-docker: for Docker-compatible CLI
Required By     : None
Optional For    : None
Conflicts With  : None
Replaces        : None
Installed Size  : 64.11 MiB
Packager        : David Runge <dvzrv@archlinux.org>
Build Date      : Wed 03 Aug 2022 22:52:48 CEST
Install Date    : Fri 05 Aug 2022 08:39:46 CEST
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : Signature

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes (troubleshooting guide, version is mostly up to date)

Additional environment details (AWS, VirtualBox, physical, etc.):

This is the systemd unit file:

# container-myNextcloudContainer.service
# autogenerated by Podman 4.1.1
# Mon Aug  8 09:41:15 CEST 2022

[Unit]
Description=Podman container-myNextcloudContainer.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=/run/user/1000/containers

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman start myNextcloudContainer
ExecStop=/usr/bin/podman stop -t 10 myNextcloudContainer
ExecStopPost=/usr/bin/podman stop -t 10 myNextcloudContainer
PIDFile=/run/user/1000/containers/overlay-containers/30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0/userdata/conmon.pid
Type=forking

[Install]
WantedBy=default.target

I have a filtered and a full journalctl log from an affected boot:

filtered: (created by journalctl --boot -1 --grep=container\|libpod\|podman\|myNextcloudContainer)

Aug 11 12:23:06 tobias-pc systemd[333]: Starting Podman container-myNextcloudContainer.service...
Aug 11 12:23:07 tobias-pc systemd[333]: libpod-30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0.scope: unit configures an IP firewall, but not running as root.
Aug 11 12:23:07 tobias-pc systemd[333]: Started libcrun container.
Aug 11 12:23:07 tobias-pc podman[441]: 2022-08-11 12:23:07.593536749 +0200 CEST m=+0.335178142 container init 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0 (image=docker.io/library/nextcloud:latest, name=myNextcloudContainer)
Aug 11 12:23:07 tobias-pc podman[441]: 2022-08-11 12:23:07.600739971 +0200 CEST m=+0.342381357 container start 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0 (image=docker.io/library/nextcloud:latest, name=myNextcloudContainer)
Aug 11 12:23:07 tobias-pc podman[441]: myNextcloudContainer
Aug 11 12:23:07 tobias-pc systemd[333]: Started podman-pause-b9433de2.scope.
Aug 11 12:23:07 tobias-pc systemd[333]: Started Podman container-myNextcloudContainer.service.
Aug 11 12:23:47 tobias-pc systemd[333]: Stopping libcrun container...
Aug 11 12:23:47 tobias-pc systemd[333]: Stopping podman-pause-b9433de2.scope...
Aug 11 12:23:47 tobias-pc systemd[333]: Stopping Podman container-myNextcloudContainer.service...
Aug 11 12:23:47 tobias-pc systemd[333]: Stopped podman-pause-b9433de2.scope.
Aug 11 12:23:47 tobias-pc systemd[333]: Stopped libcrun container.
Aug 11 12:23:48 tobias-pc podman[1836]: 2022-08-11 12:23:48.058918141 +0200 CEST m=+0.399080940 container died 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0 (image=docker.io/library/nextcloud:latest, name=myNextcloudContainer)
Aug 11 12:23:48 tobias-pc podman[1836]: 2022-08-11 12:23:48.062772871 +0200 CEST m=+0.402935700 container restart 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0 (image=docker.io/library/nextcloud:latest, name=myNextcloudContainer)
Aug 11 12:24:57 tobias-pc systemd[333]: container-myNextcloudContainer.service: Stopping timed out. Terminating.
Aug 11 12:24:57 tobias-pc systemd[333]: container-myNextcloudContainer.service: Main process exited, code=killed, status=10/USR1
Aug 11 12:24:57 tobias-pc podman[1883]: time="2022-08-11T12:24:57+02:00" level=error msg="Joining network namespace for container 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0: error retrieving network namespace at /run/user/1000/netns/netns-f3a5790f-2f43-a4e5-5389-bd1fb024df41: unknown FS magic on \"/run/user/1000/netns/netns-f3a5790f-2f43-a4e5-5389-bd1fb024df41\": 1021994"
Aug 11 12:24:57 tobias-pc podman[1883]: time="2022-08-11T12:24:57+02:00" level=error msg="Joining network namespace for container 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0: error retrieving network namespace at /run/user/1000/netns/netns-f3a5790f-2f43-a4e5-5389-bd1fb024df41: unknown FS magic on \"/run/user/1000/netns/netns-f3a5790f-2f43-a4e5-5389-bd1fb024df41\": 1021994"
Aug 11 12:24:57 tobias-pc podman[1883]: 2022-08-11 12:24:57.890991966 +0200 CEST m=+0.037434651 container restart 30f94855b53ac624877026608b13344235ad4321720aeaec000df352f3f325b0 (image=docker.io/library/nextcloud:latest, name=myNextcloudContainer)

Full journalctl of the same boot (created with journalctl --boot -1):
journalctl_podman_shutdown.log

Possible duplicate: #14531


vrothberg · 2022-08-11T11:33:17Z

Thanks for reaching out, @topas-rec.

It looks similar to #14531. Did you configure a restart policy for the container?

Cc: @mheon

topas-rec · 2022-08-11T11:37:36Z

Thanks for your quick help!

Yes, the container has --restart=always
Full run command is:

podman run --name=myNextcloudContainer --uidmap=0:1:33 --uidmap=33:0:1 --uidmap=34:34:65502 -dt --volume=nextcloudHtml:/var/www/html/ --volume=/home/tobiasb/DATA/:/media/DATA/ --restart=always -p 8084:80/tcp -e NEXTCLOUD_ADMIN_USER="[removed]" -e NEXTCLOUD_ADMIN_password="[removed]" nextcloud

vrothberg · 2022-08-11T11:41:17Z

@topas-rec, thanks! Is the container getting restarted after a manual systemctl --user stop $.service as well?

topas-rec · 2022-08-11T11:43:27Z

No, it isn't. Stops fast through a systemd manual stop command and stays stopped.

After stopping it manually, the shutdown is fast, too. So when the container is already stopped, the hanging shutdown is gone.

vrothberg · 2022-08-11T11:49:02Z

Judging from the logs above, it looks like the container gets restarted over and over until the unit's 70-second stop timeout kicks in.
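The restart loop can be checked directly in the journal. In the filtered log above, the unit starts stopping at 12:23:47, Podman logs two "container restart" events afterwards, and systemd gives up at 12:24:57, which is exactly the unit's TimeoutStopSec=70. As a minimal sketch, the function below counts those events in journal text read from stdin; the name count_restarts is hypothetical and not part of Podman.

```shell
# Hypothetical helper: count Podman "container restart" events in
# journal text read from stdin. grep -c prints 0 and exits non-zero
# when nothing matches, so "|| true" keeps the function's status clean.
count_restarts() {
    grep -c ' container restart ' || true
}

# Usage (reads the previous boot, as in the report):
#   journalctl --boot -1 | count_restarts
```

A count above zero after the "Stopping ..." line of the unit suggests that the container's own restart policy is working against the stop request.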

topas-rec · 2022-08-11T12:16:10Z

I find it hard to evaluate, but yeah that might be.
So should I use another restart policy?

vrothberg · 2022-08-11T12:19:13Z

@topas-rec it would be worth trying to see whether that is the cause, just to eliminate potential sources. Since the container is already running in a systemd unit, I suggest creating the container without a restart policy and using podman generate systemd --restart-policy instead. systemd can take care of restarts.
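A minimal sketch of that suggestion, assuming the container and unit names used in this thread. The run wrapper and the DRY_RUN variable are hypothetical conveniences, not Podman features: by default the script only prints the commands it would execute, so nothing is removed or recreated unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: recreate the container without --restart and let the
# systemd unit handle restarts. Names are taken from this thread.
NAME=myNextcloudContainer
UNIT="container-$NAME.service"
UNIT_DIR="$HOME/.config/systemd/user"

# Hypothetical dry-run wrapper: print the command unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

migrate_restart_policy() {
    run systemctl --user disable --now "$UNIT" &&
    run podman rm "$NAME" &&
    # Same run command as before, minus the container-level restart
    # policy. The uidmap, volume and env options from the original
    # command are omitted here and must be added back for real use.
    run podman run --name="$NAME" -dt -p 8084:80/tcp nextcloud &&
    run mkdir -p "$UNIT_DIR" &&
    # --restart-policy sets the Restart= value of the generated unit;
    # --files writes container-<name>.service to the current directory.
    run podman generate systemd --name --files --restart-policy=on-failure "$NAME" &&
    run mv "$UNIT" "$UNIT_DIR/" &&
    run systemctl --user daemon-reload &&
    run systemctl --user enable --now "$UNIT"
}

migrate_restart_policy
```

on-failure is used here because it is the value already present in the unit file from the report; any other value accepted by systemd's Restart= setting could be passed instead.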

topas-rec · 2022-08-11T12:25:46Z

Sure.
I'd love to see the machine shut down in seconds again.

If that is the cause, this mechanism should be noted in the docs for the generate systemd command.

vrothberg · 2022-08-11T12:27:44Z

> Sure. I'd love to see the machine shut down in seconds again.

Absolutely :)

> If that is the cause, this mechanism should be noted in the docs for the generate systemd command.

Yes, definitely. Thanks for helping to resolve the issue.

topas-rec · 2022-08-11T13:49:57Z

Starting the container without --restart=always solves the problem.
I left the initial systemd unit file unchanged because I think the default restart policy on-failure is OK.

So there should be a hint in the generate systemd docs that says:
When using systemd to start and stop the container, a container restart policy of always might lead to a race condition at shutdown.

Thanks for helping me!

vrothberg · 2022-08-11T13:51:56Z

Thanks a lot, @topas-rec!

I am reopening this to track that we either document the behavior or work out code improvements.

github-actions · 2022-09-11T00:08:32Z

A friendly reminder that this issue had no activity for 30 days.

rhatdan · 2022-09-12T19:18:24Z

@vrothberg Any progress on this?

Emit a warning to the user when generating a unit with --new on a container that was created with a custom --restart policy. As shown in containers#15284, a custom --restart policy in that case can lead to issues on system shutdown where systemd attempts to nuke the unit but Podman keeps on restarting the container.

Fixes: containers#15284
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
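A user-side check in the same spirit can be sketched in shell. This is not the code from the Podman change: the helper name check_restart_policy is hypothetical, and the inspect format string in the usage comment is an assumption about where podman inspect reports the restart policy.

```shell
# Hypothetical helper, not part of Podman: warn when a container's own
# restart policy could compete with the systemd unit that manages it.
# Takes the policy name as its only argument.
check_restart_policy() {
    policy="${1:-}"
    case "$policy" in
        ""|no)
            echo "ok: no container restart policy set"
            ;;
        *)
            echo "warning: container restart policy '$policy' may compete with systemd"
            return 1
            ;;
    esac
}

# Usage (requires Podman; the format string is an assumption):
#   check_restart_policy "$(podman inspect \
#       --format '{{.HostConfig.RestartPolicy.Name}}' myNextcloudContainer)"
```

The non-zero return status on a warning lets the check gate a script that generates or installs unit files.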

vrothberg · 2022-09-13T07:46:42Z

> @vrothberg Any progress on this?

Thanks for the ping. I opened #15766.

openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 11, 2022

topas-rec closed this as completed Aug 11, 2022

vrothberg reopened this Aug 11, 2022

github-actions bot added the stale-issue label Sep 11, 2022

github-actions bot removed the stale-issue label Sep 13, 2022

vrothberg mentioned this issue Sep 13, 2022

generate systemd: warn on --restart without --new #15766

Merged

openshift-merge-robot closed this as completed in 0ea5080 Sep 13, 2022

github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 16, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 16, 2023
