All podman functions hang after attempting to stop container #24487

@pavinjosdev

Issue Description

After containers are started, podman works normally for a few hours.
After that, if I try to stop one of the containers, the stop command hangs, and subsequent podman commands such as ps hang as well.

Deleting the empty files named netns-{UUID} in the /run/user/1000/netns/ directory resolves the issue.
Podman then works normally for a few more hours until the problem recurs.
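
For anyone who wants to inspect the directory before applying the workaround, the following is a minimal sketch that only lists candidate files and removes nothing. The function name `find_empty_netns_files` is hypothetical. The device comparison is a heuristic assumption: it is meant to skip files that still have a namespace bind-mounted on them, and it has not been verified against this bug.

```python
import stat
from pathlib import Path


def find_empty_netns_files(netns_dir):
    """Return netns-* entries in netns_dir that look like plain empty files."""
    directory = Path(netns_dir)
    if not directory.is_dir():
        return []
    dir_dev = directory.stat().st_dev
    candidates = []
    for entry in sorted(directory.glob("netns-*")):
        info = entry.lstat()
        # Only consider regular, zero-length files.
        if not stat.S_ISREG(info.st_mode) or info.st_size != 0:
            continue
        # Heuristic (assumption): a file with a namespace still bind-mounted
        # on it reports a different device than its parent directory.
        if info.st_dev != dir_dev:
            continue
        candidates.append(entry)
    return candidates


if __name__ == "__main__":
    # Path taken from the report; the UID component (1000) differs per user.
    for path in find_empty_netns_files("/run/user/1000/netns"):
        print(path)  # listing only; removal stays a deliberate manual step
```

Reviewing the printed paths before deleting anything keeps the workaround limited to the files described above.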

Steps to reproduce the issue

  1. Start a container
  2. Wait a few hours
  3. Try to stop the container

Describe the results you received

pavin@suse-pc:~> podman --log-level debug stop trading
INFO[0000] podman filtering at log level debug          
DEBU[0000] Called stop.PersistentPreRunE(podman --log-level debug stop trading) 
DEBU[0000] Using conmon: "/usr/bin/conmon"              
INFO[0000] Using sqlite as database backend             
DEBU[0000] Using graph driver overlay                   
DEBU[0000] Using graph root /home/pavin/.local/share/containers/storage 
DEBU[0000] Using run root /run/user/1000/containers     
DEBU[0000] Using static dir /home/pavin/.local/share/containers/storage/libpod 
DEBU[0000] Using tmp dir /run/user/1000/libpod/tmp      
DEBU[0000] Using volume path /home/pavin/.local/share/containers/storage/volumes 
DEBU[0000] Using transient store: false                 
DEBU[0000] [graphdriver] trying provided driver "overlay" 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that overlay is supported 
DEBU[0000] Cached value indicated that metacopy is not being used 
DEBU[0000] Cached value indicated that native-diff is usable 
DEBU[0000] backingFs=btrfs, projectQuotaSupported=false, useNativeDiff=true, usingMetacopy=false 
DEBU[0000] Initializing event backend journald          
DEBU[0000] Configured OCI runtime ocijail initialization failed: no valid executable found for OCI runtime ocijail: invalid argument 
DEBU[0000] Configured OCI runtime crun-vm initialization failed: no valid executable found for OCI runtime crun-vm: invalid argument 
DEBU[0000] Configured OCI runtime runc initialization failed: no valid executable found for OCI runtime runc: invalid argument 
DEBU[0000] Configured OCI runtime runsc initialization failed: no valid executable found for OCI runtime runsc: invalid argument 
DEBU[0000] Configured OCI runtime runj initialization failed: no valid executable found for OCI runtime runj: invalid argument 
DEBU[0000] Configured OCI runtime kata initialization failed: no valid executable found for OCI runtime kata: invalid argument 
DEBU[0000] Configured OCI runtime youki initialization failed: no valid executable found for OCI runtime youki: invalid argument 
DEBU[0000] Using OCI runtime "/usr/bin/crun"            
INFO[0000] Setting parallel job count to 37             
DEBU[0000] Starting parallel job on container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0000] Stopping ctr 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f (timeout 10) 
DEBU[0000] Stopping container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f (PID 7109) 
DEBU[0000] Sending signal 15 to container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Timed out stopping container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f with SIGTERM, resorting to SIGKILL: given PID did not die within timeout 
WARN[0010] StopSignal SIGTERM failed to stop container trading in 10 seconds, resorting to SIGKILL 
DEBU[0010] Sending signal 9 to container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Container "8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f" state changed from "stopping" to "stopped" while waiting for it to be stopped: discontinuing stop procedure as another process interfered 
DEBU[0010] Cleaning up container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Tearing down network namespace at /run/user/1000/netns/netns-d2685668-e90c-ac8b-99d2-20649aaa0936 for container 8ae550f0ed4cd9de128227e96eb0661d0af4b672cae41c7cb9becee96b56a53f 
DEBU[0010] Netns /run/user/1000/netns/netns-d2685668-e90c-ac8b-99d2-20649aaa0936 still busy, try removing it again in 10ms 
DEBU[0010] Netns /run/user/1000/netns/netns-d2685668-e90c-ac8b-99d2-20649aaa0936 still busy, try removing it again in 10ms 
... (repeated many tens of thousands of times)
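
To put a number on the repetition, the retry messages can be counted from a saved copy of the debug output. This is a rough sketch: the helper name `count_busy_retries` and the log file name in the usage note are hypothetical, and the pattern matches only the message text shown above.

```python
import re
from collections import Counter

# Matches the retry message as it appears in the debug output above.
BUSY_RE = re.compile(r"Netns (\S+) still busy, try removing it again in 10ms")


def count_busy_retries(lines):
    """Count 'still busy' retry messages per network namespace path."""
    counts = Counter()
    for line in lines:
        match = BUSY_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Assuming the debug output was saved to a file such as stop.log, `count_busy_retries(open("stop.log"))` returns a Counter keyed by the netns path.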

Describe the results you expected

Podman stops the container without hanging.

podman info output

pavin@suse-pc:~> podman info
host:
  arch: amd64
  buildahVersion: 1.37.5
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.1.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: unknown'
  cpuUtilization:
    idlePercent: 96.34
    systemPercent: 0.95
    userPercent: 2.72
  cpus: 12
  databaseBackend: sqlite
  distribution:
    distribution: opensuse-slowroll
    version: "20241002"
  eventLogger: journald
  freeLocks: 2002
  hostname: suse-pc
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.11.5-1-default
  linkmode: dynamic
  logDriver: journald
  memFree: 460140544
  memTotal: 13972418560
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.2-1.1.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.2
    package: netavark-1.12.2-1.1.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.17-1.1.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.17
      commit: 000fa0d4eeed8938301f3bcf8206405315bc1017
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-20240906.6b38f07-2.1.x86_64
    version: |
      pasta 20240906.6b38f07-2.1
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.3.1-1.1.x86_64
    version: |-
      slirp4netns version 1.3.1
      commit: unknown
      libslirp: 4.8.0
      SLIRP_CONFIG_VERSION_MAX: 5
      libseccomp: 2.5.5
  swapFree: 7324561408
  swapTotal: 8589930496
  uptime: 51h 16m 3.00s (Approximately 2.12 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.opensuse.org
  - registry.suse.com
  - docker.io
store:
  configFile: /home/pavin/.config/containers/storage.conf
  containerStore:
    number: 3
    paused: 0
    running: 3
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/pavin/.local/share/containers/storage
  graphRootAllocated: 498681774080
  graphRootUsed: 88692174848
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/pavin/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.5
  Built: 1729756620
  BuiltTime: Thu Oct 24 13:27:00 2024
  GitCommit: ""
  GoVersion: go1.23.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.5
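
The fields of this output that are most relevant to a rootless networking problem can be pulled out programmatically. The sketch below assumes the JSON form of the output (from `podman info --format json`) uses the same key names as the YAML above. The function name `summarize_network_setup` is hypothetical.

```python
def summarize_network_setup(info):
    """Pick the networking-related fields out of parsed `podman info` data."""
    host = info.get("host", {})
    return {
        "podman": info.get("version", {}).get("Version"),
        "kernel": host.get("kernel"),
        "rootless": host.get("security", {}).get("rootless"),
        "networkBackend": host.get("networkBackend"),
        "rootlessNetworkCmd": host.get("rootlessNetworkCmd"),
        "pasta": host.get("pasta", {}).get("package"),
    }
```

Missing keys come back as None, so the helper does not fail on output from a version that lacks a given field.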

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Labels

kind/bug, network