Podman rm (--force) does not work anymore ("container state improper"/"invalid argument" when unmounting) and start neither #19913

Closed
rugk opened this issue Sep 10, 2023 · 16 comments
Labels: kind/bug, locked - please file new issue/PR, stale-issue

Comments

rugk commented Sep 10, 2023

Issue Description

Describe the bug
Somehow I get the error:

Error: container "nextcloud_[…]" is mounted and cannot be removed without using force: container state improper

…when trying to remove containers.

Steps to reproduce the issue

$ podman-compose -p nextcloud down
podman-compose version: 1.0.6
['podman', '--version', '']
using podman version: 4.6.1
** excluding:  set()
podman stop -t 10 nextcloud_caddy_1
Error: no container with name or ID "nextcloud_caddy_1" found: no such container
exit code: 125
podman stop -t 10 nextcloud_cron_1
Error: no container with name or ID "nextcloud_cron_1" found: no such container
exit code: 125
podman stop -t 10 nextcloud_nc_1
Error: no container with name or ID "nextcloud_nc_1" found: no such container
exit code: 125
podman stop -t 10 nextcloud_db_1
Error: no container with name or ID "nextcloud_db_1" found: no such container
exit code: 125
podman stop -t 10 nextcloud_redis_1
Error: no container with name or ID "nextcloud_redis_1" found: no such container
exit code: 125
podman rm nextcloud_caddy_1
Error: no container with ID or name "nextcloud_caddy_1" found: no such container
exit code: 1
podman rm nextcloud_cron_1
Error: no container with ID or name "nextcloud_cron_1" found: no such container
exit code: 1
podman rm nextcloud_nc_1
Error: container "nextcloud_nc_1" is mounted and cannot be removed without using force: container state improper
exit code: 2
podman rm nextcloud_db_1
Error: container "nextcloud_db_1" is mounted and cannot be removed without using force: container state improper
exit code: 2
podman rm nextcloud_redis_1
Error: container "nextcloud_redis_1" is mounted and cannot be removed without using force: container state improper
exit code: 2
$ podman rm nextcloud_db_1
Error: container "nextcloud_db_1" is mounted and cannot be removed without using force: container state improper

When trying to start the containers, I then get this error:

$ podman-compose --in-pod=0 -p nextcloud up -d
podman-compose version: 1.0.6
['podman', '--version', '']
using podman version: 4.6.1
** excluding:  set()
['podman', 'ps', '--filter', 'label=io.podman.compose.project=nextcloud', '-a', '--format', '{{ index .Labels "io.podman.compose.config-hash"}}']
podman pod create --name=pod_nextcloud --infra=false --share=
Error: adding pod to state: name "pod_nextcloud" is in use: pod already exists
exit code: 125
[…]
Error: creating container storage: the container name "nextcloud_redis_1" is already in use by 4dbd88724af1ee89d859c6b2dfebb89f95cf6358503e09a8763009877a4830cb. You have to remove that container to be able to reuse that name: that name is already in use
exit code: 125
podman start nextcloud_redis_1
Error: no container with name or ID "nextcloud_redis_1" found: no such container
exit code: 125
[…]
Error: creating container storage: the container name "nextcloud_db_1" is already in use by 2760fa4a652ba952ef5270d256c658dd3f4455d96fe7554abdb13bbfbdbd6c19. You have to remove that container to be able to reuse that name: that name is already in use
exit code: 125
podman start nextcloud_db_1
Error: no container with name or ID "nextcloud_db_1" found: no such container
exit code: 125
[…]

Then there are dependency errors for the containers that depend on the ones mentioned above, so they fail to start as well.

The thing is, I see none of that running:

$ podman ps -a
CONTAINER ID  IMAGE       COMMAND     CREATED     STATUS      PORTS       NAMES
$ podman pod ls
POD ID        NAME           STATUS      CREATED      INFRA ID    # OF CONTAINERS
562b052bdab9  pod_nextcloud  Created     4 hours ago              0

Also, the container that is said to be using the name is not there:

$ podman inspect 4dbd88724af1ee89d859c6b2dfebb89f95cf6358503e09a8763009877a4830cb
[]
Error: no such object: "4dbd88724af1ee89d859c6b2dfebb89f95cf6358503e09a8763009877a4830cb"

I can remove the pod, but it does not help:

$ podman pod rm pod_nextcloud 
562b052bdab9c31692403405579935979a7026f1945bbc0eb0f1594a2b80b546
$ podman inspect nextcloud_db_1
[]
Error: no such object: "nextcloud_db_1"
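
Side note: containers that exist only in container storage, but not in Podman's database, are not shown by a plain podman ps. If that is what is holding the names here, they should presumably be visible with the --external flag:

$ podman ps --all --external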

Describe the results you received

I cannot stop or start the containers.

Describe the results you expected

I should somehow be able to force/fix that. I have no idea what is "improper" nor how to fix it.

podman info output

$ podman-compose version
podman-compose version: 1.0.6
['podman', '--version', '']
using podman version: 4.6.1
podman-compose version 1.0.6
podman --version 
podman version 4.6.1
exit code: 0
$ podman version
Client:       Podman Engine
Version:      4.6.1
API Version:  4.6.1
Go Version:   go1.20.7
Built:        Fri Aug 11 00:07:53 2023
OS/Arch:      linux/amd64
$ podman info
host:
  arch: amd64
  buildahVersion: 1.31.2
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 98.31
    systemPercent: 0.82
    userPercent: 0.86
  cpus: 4
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: coreos
    version: "38"
  eventLogger: journald
  freeLocks: 615
  hostname: minipure
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1002
      size: 1
    - container_id: 1
      host_id: 231072
      size: 65536
  kernel: 6.4.7-200.fc38.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 60165713920
  memTotal: 67283185664
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.7.0-1.fc38.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.7.0
    package: netavark-1.7.0-1.fc38.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.7.0
  ociRuntime:
    name: crun
    package: crun-1.8.6-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.6
      commit: 73f759f4a39769f60990e7d225f561b4f4f06bcf
      rundir: /run/user/1002/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20230625.g32660ce-1.fc38.x86_64
    version: |
      pasta 0^20230625.g32660ce-1.fc38.x86_64
      Copyright Red Hat
      GNU Affero GPL version 3 or later <https://www.gnu.org/licenses/agpl-3.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    path: /run/user/1002/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 4294963200
  swapTotal: 4294963200
  uptime: 4h 27m 21.00s (Approximately 0.17 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/****/.config/containers/storage.conf
  containerStore:
    number: 6
    paused: 0
    running: 4
    stopped: 2
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/****/.local/share/containers/storage
  graphRootAllocated: 999650168832
  graphRootUsed: 46548176896
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 12
  runRoot: /run/user/1002/containers
  transientStore: false
  volumePath: /var/home/****/.local/share/containers/storage/volumes
version:
  APIVersion: 4.6.1
  Built: 1691705273
  BuiltTime: Fri Aug 11 00:07:53 2023
  GitCommit: ""
  GoVersion: go1.20.7
  Os: linux
  OsArch: linux/amd64
  Version: 4.6.1
$ rpm -q podman
podman-4.6.1-1.fc38.x86_64

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes kinda

Additional environment details

  • OS: Linux Fedora CoreOS v38.20230819.3.0
  • podman version: 4.6.1
  • podman compose version: (git hex) 1.0.6

Additional information

docker-compose file version '3.7'; the exact same YAML started without any problems before.

I tested the echo example here and it did work; I have no idea what's wrong.

Cross-posted as containers/podman-compose#767 and coreos/fedora-coreos-tracker#1572

rugk added the kind/bug label on Sep 10, 2023
rugk changed the title from "Podman-compose down does not work anymore (container state improper) and up neither" to "Podman rm does not work anymore (container state improper) and start neither" on Sep 10, 2023

Luap99 commented Sep 11, 2023

Did you try to remove them with podman rm --force?

baude commented Sep 11, 2023

the nuclear option might be system reset if you don't have anything you want to retain.
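
For reference, a minimal sketch of what that nuclear option looks like (assuming podman system reset is meant; this wipes all containers, images, networks and volumes for the current user, so only use it if there is truly nothing to retain):

$ podman system reset     # asks for confirmation; add --force to skip the prompt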

georgevdl commented:

Since yesterday I have been experiencing similar issues (I couldn't delete or start containers).
I have a fresh install of Fedora Server and very little experience with Podman or Docker, so I may be doing something wrong and there may not be any bug. However, the server had been working fine for about a week and I didn't change anything related to Podman containers yesterday.

I am only using two containers: nextcloud (docker.io/library/nextcloud) and nextcloud-db (quay.io/fedora/mariadb-105).

After some googling, I solved the issues yesterday by running sudo touch /.autorelabel and sudo reboot. After rebooting, the containers started automatically through systemd, as they always do.

Today (after rebooting) the same issue occurred. I haven't done the autorelabeling yet, but I saw that the podman service is not running. I started it again, and it stopped on its own a few seconds after it started.

$ sudo service podman start
[sudo] password for user: 
Redirecting to /bin/systemctl start podman.service
$ sudo service podman status
Redirecting to /bin/systemctl status podman.service
○ podman.service - Podman API Service
     Loaded: loaded (/usr/lib/systemd/system/podman.service; disabled; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: inactive (dead) since Mon 2023-09-11 21:30:01 EEST; 4s ago
   Duration: 5.053s
TriggeredBy: ● podman.socket
       Docs: man:podman-system-service(1)
    Process: 11574 ExecStart=/usr/bin/podman $LOGGING system service (code=exited, status=0/SUCCESS)
   Main PID: 11574 (code=exited, status=0/SUCCESS)
        CPU: 53ms

Sep 11 21:29:56 user systemd[1]: Starting podman.service - Podman API Service...
Sep 11 21:29:56 user systemd[1]: Started podman.service - Podman API Service.
Sep 11 21:29:56 user podman[11574]: time="2023-09-11T21:29:56+03:00" level=info msg="/usr/bin/podman filtering at log level info"
Sep 11 21:29:56 user podman[11574]: time="2023-09-11T21:29:56+03:00" level=info msg="Not using native diff for overlay, this may cause degraded performance for building images: kernel has CONFIG_OVERLAY_FS_REDIRECT_D>
Sep 11 21:29:56 user podman[11574]: time="2023-09-11T21:29:56+03:00" level=info msg="Setting parallel job count to 49"
Sep 11 21:29:56 user podman[11574]: time="2023-09-11T21:29:56+03:00" level=info msg="Using systemd socket activation to determine API endpoint"
Sep 11 21:29:56 user podman[11574]: time="2023-09-11T21:29:56+03:00" level=info msg="API service listening on \"/run/podman/podman.sock\". URI: \"/run/podman/podman.sock\""
Sep 11 21:30:01 user systemd[1]: podman.service: Deactivated successfully.
[user@user ~]$ 

cat /home/user/.config/systemd/user/container-nextcloud-db.service
# container-nextcloud-db.service
# autogenerated by Podman 4.6.2
# Mon Sep 4 20:03:58 EEST 2023

[Unit]
Description=Podman container-nextcloud-db.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
RequiresMountsFor=%t/containers

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman run \
    --cidfile=%t/%n.ctr-id \
    --cgroups=no-conmon \
    --rm \
    --sdnotify=conmon \
    --replace \
    --detach \
    --env MYSQL_DATABASE=(removed) \
    --env MYSQL_USER=(removed) \
    --env MYSQL_PASSWORD=(removed) \
    --env MYSQL_ROOT_PASSWORD=(removed) \
    --volume nextcloud-db:/var/lib/mysql:Z \
    --network nextcloud-net \
    -p (removed):(removed)/tcp \
    --name nextcloud-db quay.io/fedora/mariadb-105
ExecStop=/usr/bin/podman stop \
    --ignore -t 10 \
    --cidfile=%t/%n.ctr-id
ExecStopPost=/usr/bin/podman rm \
    -f \
    --ignore -t 10 \
    --cidfile=%t/%n.ctr-id
Type=notify
NotifyAccess=all

[Install]
WantedBy=default.target

cat /home/user/.config/systemd/user/container-nextcloud.service
# container-nextcloud.service
# autogenerated by Podman 4.6.2
# Mon Sep 4 20:04:01 EEST 2023

[Unit]
Description=Podman container-nextcloud.service
Documentation=man:podman-generate-systemd(1)
Wants=network-online.target
After=network-online.target
BindsTo=container-nextcloud-db.service
RequiresMountsFor=%t/containers

[Service]
Environment=PODMAN_SYSTEMD_UNIT=%n
Restart=on-failure
TimeoutStopSec=70
ExecStart=/usr/bin/podman run \
    --cidfile=%t/%n.ctr-id \
    --cgroups=no-conmon \
    --rm \
    --sdnotify=conmon \
    --replace \
    --detach \
    --env MYSQL_HOST=nextcloud-db.dns.podman \
    --env MYSQL_DATABASE=(removed) \
    --env MYSQL_USER=(removed) \
    --env MYSQL_PASSWORD=(removed) \
    --env NEXTCLOUD_ADMIN_USER=(removed) \
    --env OVERWRITEWEBROOT=(removed) \
    --env NEXTCLOUD_ADMIN_PASSWORD=(removed) \
    --volume nextcloud-app:/var/www/html:Z \
    --volume nextcloud-data:/var/www/html/data:Z \
    --network nextcloud-net \
    --name nextcloud \
    --publish (removed):(removed) docker.io/library/nextcloud
ExecStop=/usr/bin/podman stop \
    --ignore -t 10 \
    --cidfile=%t/%n.ctr-id
ExecStopPost=/usr/bin/podman rm \
    -f \
    --ignore -t 10 \
    --cidfile=%t/%n.ctr-id
Type=notify
NotifyAccess=all

[Install]
WantedBy=default.target

rhatdan commented Sep 11, 2023

This issue is not similar; please open a different issue or discussion.

rugk commented Sep 12, 2023

Did you try to remove them with podman rm --force?

$ podman rm --force nextcloud_db_1
WARN[0000] Unmounting container "nextcloud_db_1" while attempting to delete storage: unmounting "/var/home/****/.local/share/containers/storage/overlay/e655542583cc8bbc943e354a5b441bfb3df598e1cc88a0fa5b6ade002d533263/merged": invalid argument 
Error: removing storage for container "nextcloud_db_1": unmounting "/var/home/****/.local/share/containers/storage/overlay/e655542583cc8bbc943e354a5b441bfb3df598e1cc88a0fa5b6ade002d533263/merged": invalid argument
$ podman rm --force nextcloud_redis_1
WARN[0000] Unmounting container "nextcloud_redis_1" while attempting to delete storage: unmounting "/var/home/****/.local/share/containers/storage/overlay/7ae38929f1fb15eeef60eef63118a0d71ac386260bff82e17beb24c12127bd87/merged": invalid argument 
Error: removing storage for container "nextcloud_redis_1": unmounting "/var/home/****/.local/share/containers/storage/overlay/7ae38929f1fb15eeef60eef63118a0d71ac386260bff82e17beb24c12127bd87/merged": invalid argument
$ podman rm --force nextcloud_nc_1
WARN[0000] Unmounting container "nextcloud_nc_1" while attempting to delete storage: unmounting "/var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged": invalid argument 
Error: removing storage for container "nextcloud_nc_1": unmounting "/var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged": invalid argument

Manually checking the dir, e.g. /var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged, shows it is empty.

nuclear option might be system reset if you don't have anything you want to retain.

Uhm, I would like to retain the container volumes 👀

rugk changed the title from "Podman rm does not work anymore (container state improper) and start neither" to "Podman rm (--force) does not work anymore ("container state improper"/"invalid argument" when unmounting) and start neither" on Sep 12, 2023

Luap99 commented Sep 13, 2023

run podman unshare mount -t tmps none <path from the error message> then try podman rm --force again, this should work to delete the storage container.

rugk commented Sep 13, 2023

$ podman rm --force nextcloud_nc_1
WARN[0000] Unmounting container "nextcloud_nc_1" while attempting to delete storage: unmounting "/var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged": invalid argument 
Error: removing storage for container "nextcloud_nc_1": unmounting "/var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged": invalid argument
$ podman unshare mount -t tmps none /var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged
mount: /var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged: unknown filesystem type 'tmps'.
       dmesg(1) may have more information after failed mount system call.

I ran dmesg, but did not see anything related. Note that I run everything without sudo/root rights, as this is rootless Podman.
Also note that the error has survived a reboot before, so I have no idea what's wrong.

rugk commented Sep 13, 2023

Ahhh, you probably meant podman unshare mount -t tmpfs none, i.e. tmpfs instead of tmps. Typos…

This worked, and podman rm --force nextcloud_nc_1 also worked afterwards (without any console output). Wow…

That works, thanks a lot!

Anyway, I assume this should never have happened, and the fact that the three commands that usually help did not help here is bad IMHO: not a good experience, whatever went wrong here…
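
For anyone landing here with the same symptoms, a sketch of the sequence that worked in this case (the overlay path is a placeholder; use the .../merged path printed by the failing rm):

$ podman rm --force <container>      # fails and prints the stale overlay .../merged path
$ podman unshare mount -t tmpfs none ~/.local/share/containers/storage/overlay/<layer-id>/merged
$ podman rm --force <container>      # now succeeds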

rugk commented Sep 22, 2023

Damn, the issue has happened again. I am again seeing this crazy behavior and have no idea what caused it.
I also got a strange “directory not empty” error for Redis and MariaDB now while applying the workaround:

$ podman rm --force nextcloud_redis_1
WARN[0000] Unmounting container "nextcloud_redis_1" while attempting to delete storage: unmounting "/var/home/***********/.local/share/containers/storage/overlay/452031fcc02308878ff8a9af9bac59f46d7aebf0c32ef6b8b421996c87b5ee6f/merged": invalid argument 
Error: removing storage for container "nextcloud_redis_1": unmounting "/var/home/***********/.local/share/containers/storage/overlay/452031fcc02308878ff8a9af9bac59f46d7aebf0c32ef6b8b421996c87b5ee6f/merged": invalid argument
$ podman unshare mount -t tmpfs none /var/home/***********/.local/share/containers/storage/overlay/452031fcc02308878ff8a9af9bac59f46d7aebf0c32ef6b8b421996c87b5ee6f/merged
$ podman rm --force nextcloud_redis_1
WARN[0000] Unmounting container "nextcloud_redis_1" while attempting to delete storage: removing mount point "/var/home/***********/.local/share/containers/storage/overlay/452031fcc02308878ff8a9af9bac59f46d7aebf0c32ef6b8b421996c87b5ee6f/merged": directory not empty 
Error: removing storage for container "nextcloud_redis_1": unmounting "/var/home/***********/.local/share/containers/storage/overlay/452031fcc02308878ff8a9af9bac59f46d7aebf0c32ef6b8b421996c87b5ee6f/merged": invalid argument
$ podman rm --force nextcloud_db_1
WARN[0000] Unmounting container "nextcloud_db_1" while attempting to delete storage: unmounting "/var/home/***********/.local/share/containers/storage/overlay/8872c5d43462763a2fec2104e5e86656e47879c6373c650f87958e9668961155/merged": invalid argument 
Error: removing storage for container "nextcloud_db_1": unmounting "/var/home/***********/.local/share/containers/storage/overlay/8872c5d43462763a2fec2104e5e86656e47879c6373c650f87958e9668961155/merged": invalid argument
$ podman unshare mount -t tmpfs none /var/home/***********/.local/share/containers/storage/overlay/8872c5d43462763a2fec2104e5e86656e47879c6373c650f87958e9668961155/merged
$ podman rm --force nextcloud_db_1
WARN[0000] Unmounting container "nextcloud_db_1" while attempting to delete storage: removing mount point "/var/home/***********/.local/share/containers/storage/overlay/8872c5d43462763a2fec2104e5e86656e47879c6373c650f87958e9668961155/merged": directory not empty 
Error: removing storage for container "nextcloud_db_1": unmounting "/var/home/***********/.local/share/containers/storage/overlay/8872c5d43462763a2fec2104e5e86656e47879c6373c650f87958e9668961155/merged": invalid argument

Actually, only Redis and the DB container were affected by the issue.
And I checked the path: yes, there is a file system in there (i.e. the usual dirs of a root FS), and it is indeed not empty.

rugk commented Sep 22, 2023

As you saw, I executed the commands, though the container state was still “improper” afterwards. Anyway, after rebooting I executed podman ps and it took ages to finish, but this is usual for me and has happened many times…

I'll see whether it works later maybe.

rugk commented Sep 22, 2023

Okay, so after a reboot (which also starts, or tries to start, all containers, I have to say), it first seemed to work: the workaround worked for removing the Redis container, but failed for the DB (MariaDB) container again due to the directory not being empty.

In the end I could "solve" the issue by deleting the directory it mentions, or rather just renaming it (to have a backup, just in case); then the container could be removed.

Anyway, this workaround just got even more involved, and I have no clue what the root cause is.
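
Summarized, the extra step used here when the mount point is reported as "not empty" was roughly the following (paths are placeholders; this was done after a reboot, so nothing was mounted on the path any more, and renaming instead of deleting keeps a backup of the leftover contents):

$ podman unshare mv ~/.local/share/containers/storage/overlay/<layer-id>/merged ~/.local/share/containers/storage/overlay/<layer-id>/merged.bak
$ podman rm --force <container>      # succeeds once the leftover directory is out of the way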

rugk commented Oct 1, 2023

Aaand I had it again. Could it be related to increased disk I/O (I have stuff running in my Nextcloud now) or maybe DB connections? (I frequently have DB timeout issues like these.) But this problem is new here…

I could fix it again, but this is annoying.

Frustrated, I asked ChatGPT to automate this, and it also gave me more hints on what to check:


It seems like you're facing issues with removing containers in Podman due to "container state improper" errors. This can be a frustrating problem, but let's try to address it.

First, it's important to note that these issues might be specific to your environment or the specific containers you're running. However, there are some general troubleshooting steps you can take:

  1. Check for Zombie Containers:
    Sometimes, containers might not have completely exited and could be in a "zombie" state. You can check for such containers using:

    podman ps -a --filter "status=exited"
    

    If you see any containers in this state, you can try to remove them using podman rm with the --force flag.

  2. Manually Unmount Overlay Filesystem:
    It appears that the issue is related to unmounting overlay filesystems. You can try manually unmounting these filesystems using the podman unshare command. For example:

    podman unshare mount -t tmpfs none /var/home/****/.local/share/containers/storage/overlay/3989bcb8ce368e6b0a2bb98fa3fa6033edb16e0d5541ed37077b603101cc113e/merged
    

    After doing this for the problematic containers, try running podman rm --force again.

  3. Check for Disk Space:
    Ensure that you have enough disk space available on your system. If your storage is running low, it can lead to issues with container management.

  4. Podman and Kernel Version Compatibility:
    Sometimes, issues like these can be related to compatibility between Podman and the Linux kernel. Ensure that you are using a compatible version of Podman with your kernel.

  5. Podman and Systemd Service:
    It seems you have systemd service files for your containers. Check the status of these services to see if they are running or stopped:

    sudo service podman status
    

    If there are any issues with the Podman service itself, it might affect container management.

  6. Logs and Debugging:
    Check the logs for Podman and your containers for any error messages or clues about what might be causing the issue. You can use journalctl to view Podman's logs:

    journalctl -u podman.service
    

    Additionally, inspect the container logs:

    podman logs <container_name_or_id>
    

Remember that forcefully removing containers can lead to data loss, so make sure you understand the consequences before proceeding.


Edit: BTW, I just saw that ChatGPT titled the topic/conversation “Covid-19 Vaccine Priority List”, lol, what… 🤣

When asked again it also gave me a workaround shell script to automate the manual unshare etc. steps. While cumbersome, I may try this the next time this happens and report back whether it works.

BTW, I checked the disk space (df -h) and it is at most 60% used on the host, so all okay. I am unsure whether disk space usage inside containers is something I can or need to check, but they should have no limit AFAIK, so well…
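
If it helps: Podman can report its own storage usage (images, containers, volumes) separately from the host filesystem, e.g.:

$ podman system df        # summary; add -v for per-image/per-container/per-volume details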

rugk commented Oct 6, 2023

Updated https://gist.github.com/rugk/aab9539f689962ed8ff78ec5b5c94918 with a script that works now. Anyway, how does this happen in the first place, and how can I prevent it?
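
For context, a rough sketch of what such an automation could look like. This is not the actual content of the gist, just an illustration of the steps from this thread (container name as argument, overlay path parsed from the rm error message):

#!/usr/bin/env bash
# force-rm-stuck.sh <container> - hypothetical helper based on the workaround above
set -euo pipefail
ctr="$1"

# First attempt; capture stderr so the stale overlay path can be extracted.
if out=$(podman rm --force "$ctr" 2>&1); then
    echo "removed $ctr"
    exit 0
fi

# Pull the ".../merged" path out of the unmount error message.
merged=$(grep -oE '"/[^"]+/merged"' <<<"$out" | head -n1 | tr -d '"' || true)
if [ -z "$merged" ]; then
    echo "$out" >&2
    exit 1
fi

# Mount a tmpfs over the stale mount point inside the rootless namespace and retry;
# if the mount point is then reported as not empty, move it aside as a backup.
podman unshare mount -t tmpfs none "$merged"
podman rm --force "$ctr" || {
    podman unshare mv "$merged" "$merged.bak"
    podman rm --force "$ctr"
}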

Luap99 commented Oct 6, 2023

Please update to 4.7 and see if you can reproduce there.

github-actions bot commented Nov 6, 2023

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented Nov 6, 2023

We believe this is fixed in the current version.

rhatdan closed this as completed on Nov 6, 2023
github-actions bot added the locked - please file new issue/PR label on Feb 5, 2024
github-actions bot locked as resolved and limited conversation to collaborators on Feb 5, 2024