Bad container image consuming all network IP addresses when using userns=keep-id #18615

Closed
fgiorgetti opened this issue May 17, 2023 · 5 comments · Fixed by #20384

fgiorgetti commented May 17, 2023

Issue Description

All IP addresses of a podman network are consumed by a single container that uses
an invalid image (and therefore restarts constantly) when it is run with --restart=always and --userns=keep-id.

Steps to reproduce the issue

  1. podman network create sample
  2. podman run --name test -u 1000 --userns keep-id --network sample --restart always --network-alias test -d registry.k8s.io/kube-apiserver:v1.25.3
  3. Watch ls -l /run/user/1000/netns | wc -l; the number of control files keeps increasing
  4. Once the container has consumed all the (default: 255) IP addresses in the assigned range, new containers can no longer connect to this network
  5. Run a few containers attached to this network and you will get: Error: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
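
For anyone who wants to script step 3 instead of watching it by hand, here is a minimal sketch. It assumes the leaked namespaces show up as entries under /run/user/1000/netns, as in the steps above. The helper names count_netns_entries and is_growing are made up for illustration and are not part of podman. Note that ls -l | wc -l also counts the "total" line, so this count is normally one lower.

```python
import os
import time


def count_netns_entries(netns_dir):
    """Return the number of entries in netns_dir, or 0 if it does not exist."""
    try:
        return len(os.listdir(netns_dir))
    except FileNotFoundError:
        return 0


def is_growing(samples):
    """True if the sampled counts never decrease and end higher than they started."""
    if len(samples) < 2:
        return False
    never_decreases = all(b >= a for a, b in zip(samples, samples[1:]))
    return never_decreases and samples[-1] > samples[0]


if __name__ == "__main__":
    # Path taken from the reproduction steps; adjust the UID for your user.
    netns_dir = "/run/user/1000/netns"
    samples = []
    for _ in range(5):
        samples.append(count_netns_entries(netns_dir))
        time.sleep(2)
    print("samples:", samples, "growing:", is_growing(samples))
```

A steadily growing count while only one container is running would be consistent with the leak described here.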

Describe the results you received

Error: IPAM error: failed to find free IP in range: 10.89.0.1 - 10.89.0.254
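
To put a number on how quickly this happens, the range in the error message can be sized with Python's standard ipaddress module. This is only a rough sketch: the 2-second restart interval is an assumption for illustration, not a measured value, and pool_size and seconds_to_exhaust are hypothetical helper names.

```python
import ipaddress


def pool_size(first, last):
    """Number of addresses in the inclusive range first..last."""
    lo = ipaddress.ip_address(first)
    hi = ipaddress.ip_address(last)
    if hi < lo:
        raise ValueError("last address is below first address")
    return int(hi) - int(lo) + 1


def seconds_to_exhaust(size, restart_interval_s):
    """Time until the pool is empty if every restart leaks one address."""
    return size * restart_interval_s


if __name__ == "__main__":
    # Range copied from the IPAM error message above.
    size = pool_size("10.89.0.1", "10.89.0.254")
    # Assumed restart interval; the real value depends on the container.
    print(size, "addresses,", seconds_to_exhaust(size, 2.0), "seconds to exhaust")
```

With one leaked address per restart, a range of this size would not last more than a few minutes under that assumed interval.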

Describe the results you expected

IPs should have been released for the failed container.

podman info output

host:
  arch: amd64
  buildahVersion: 1.30.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 92.77
    systemPercent: 2.05
    userPercent: 5.18
  cpus: 16
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: workstation
    version: "38"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.2.14-300.fc38.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 30239887360
  memTotal: 67101425664
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.8.4-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.4
      commit: 5a8fa99a5e41facba2eda4af12fa26313918805b
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-12.fc38.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 3h 13m 5.00s (Approximately 0.12 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  192.168.124.1:5000:
    Blocked: false
    Insecure: true
    Location: 192.168.124.1:5000
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: 192.168.124.1:5000
    PullFromMirror: ""
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/fgiorget/.config/containers/storage.conf
  containerStore:
    number: 7
    paused: 0
    running: 1
    stopped: 6
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/fgiorget/.local/share/containers/storage
  graphRootAllocated: 1022488477696
  graphRootUsed: 202280235008
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 2
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/fgiorget/.local/share/containers/storage/volumes
version:
  APIVersion: 4.5.0
  Built: 1681486942
  BuiltTime: Fri Apr 14 12:42:22 2023
  GitCommit: ""
  GoVersion: go1.20.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.5.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

@fgiorgetti fgiorgetti added the kind/bug Categorizes issue or PR as related to a bug. label May 17, 2023
Luap99 commented May 19, 2023

I haven't verified it yet but I am 99% sure that #18468 would fix this.

@Luap99 Luap99 added the network Networking related issue or feature label May 19, 2023
@Luap99 Luap99 self-assigned this Oct 17, 2023
Luap99 added a commit to Luap99/libpod that referenced this issue Oct 17, 2023
When a userns and netns is used we need to let the runtime create the
netns otherwise the netns is not owned by the right userns and thus
the capabilities would not be correct.

The current restart logic tries to reuse the netns which is fine if no
userns is used but when one is used we setup a new netns (which is
correct) but forgot to cleanup the old netns. This resulted in leaked
network namespaces and because no teardown was ever called leaked ipam
assignments, thus a quickly restarting container will run out of ip
space very fast.

Fixes containers#18615

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
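
The leak described in the commit message can be illustrated with a toy model. This is not podman or netavark code: ToyIPAM, PoolExhausted and restart_loop are hypothetical names, and the model only captures the idea that each restart sets up a new netns and takes a new address, while the old address is returned only if teardown is called for the old netns.

```python
import ipaddress


class PoolExhausted(Exception):
    """Raised when the toy pool has no free address left."""


class ToyIPAM:
    """Toy address pool keyed by netns id (illustration only)."""

    def __init__(self, first, last):
        lo = int(ipaddress.ip_address(first))
        hi = int(ipaddress.ip_address(last))
        self._free = [ipaddress.ip_address(n) for n in range(lo, hi + 1)]
        self._leases = {}  # netns id -> assigned address

    def setup(self, netns_id):
        if not self._free:
            raise PoolExhausted("failed to find free IP in range")
        addr = self._free.pop(0)
        self._leases[netns_id] = addr
        return addr

    def teardown(self, netns_id):
        # Return the address of this netns to the pool, if it had one.
        addr = self._leases.pop(netns_id, None)
        if addr is not None:
            self._free.append(addr)

    @property
    def in_use(self):
        return len(self._leases)


def restart_loop(ipam, restarts, cleanup_old_netns):
    """Simulate restarts that each create a new netns; return leases in use."""
    old = None
    for i in range(restarts):
        if cleanup_old_netns and old is not None:
            ipam.teardown(old)  # the step the commit message says was missing
        new = f"netns-{i}"
        ipam.setup(new)
        old = new
    return ipam.in_use
```

Under this model, restart_loop(ToyIPAM("10.89.0.1", "10.89.0.254"), 255, cleanup_old_netns=False) raises PoolExhausted on the 255th restart, while the same loop with cleanup_old_netns=True never holds more than one address.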
fgiorgetti commented

@Luap99 should this be reopened? I noticed that the related PR has been closed as well.

Luap99 commented Dec 1, 2023

This was fixed in #20384 which should be in 4.8

fgiorgetti commented

> This was fixed in #20384 which should be in 4.8

Thank you! I just found it.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Mar 1, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2024