Podman file descriptor limit applies to entire container not individual processes #2053

Closed
edwintorok opened this issue Dec 25, 2018 · 3 comments

edwintorok commented Dec 25, 2018

Is this a BUG REPORT or FEATURE REQUEST?:

[//]: kind bug

Description

I get EMFILE when running any command inside a rootless unprivileged podman container, whenever one process in the background has used up all its file descriptors.

Steps to reproduce the issue:

  1. See esy: curl: error while loading shared libraries: libm.so.6: cannot open shared object file: Error 24 esy/esy#762
podman run -it ocaml/opam2:ubuntu-18.10
git pull
opam update
sudo apt-get update
sudo apt-get install -y npm
npm install esy
export PATH=`pwd`/node_modules/.bin:$PATH
hash -r
git clone https://github.com/esy/esy.git
cd esy
esy 

(You can also install strace and run strace esy and see EMFILE errors)

  2. You can ^Z to background esy, and then try to run some commands, e.g. id

Describe the results you received:

opam@fa945c3503ef:~/opam-repository/esy$ id
id: error while loading shared libraries: libc.so.6: cannot open shared object file: Error 24

curl: error while loading shared libraries: libp11-kit.so.0: cannot open shared object file: Error 24
curl: error while loading shared libraries: libtasn1.so.6: cannot open shared object file: Error 24
curl: error while loading shared libraries: libhcrypto.so.4: cannot open shared object file: Error 24
curl: error while loading shared libraries: libidn2.so.0: cannot open shared object file: Error 24
curl: error while loading shared libraries: libcurl.so.4: cannot open shared object file: Error 24
curl: error while loading shared libraries: libcom_err.so.2: cannot open shared object file: Error 24
curl: error while loading shared libraries: libz.so.1: cannot open shared object file: Error 24
curl: error while loading shared libraries: libc.so.6: cannot open shared object file: Error 24

(errno 24 is EMFILE)
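
The EMFILE condition can be reproduced in isolation, without esy or podman. The following is a minimal sketch, assuming Linux and Python 3: it lowers the soft RLIMIT_NOFILE of the current process, opens /dev/null until open() fails, and reports the errno. The function name exhaust_fds and the limit of 64 are illustrative choices, not anything taken from esy or fuse-overlayfs.

```python
import errno
import os
import resource


def exhaust_fds(limit=64):
    """Lower the soft RLIMIT_NOFILE to `limit`, open /dev/null until
    open() fails, and return the errno of that failure.

    The original soft limit is restored before returning.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    opened = []
    try:
        # Only the soft limit is lowered; the hard limit is left alone
        # so that the original soft limit can be restored afterwards.
        resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
        try:
            while True:
                opened.append(os.open(os.devnull, os.O_RDONLY))
        except OSError as exc:
            return exc.errno
    finally:
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))


code = exhaust_fds()
print(code, errno.errorcode[code], os.strerror(code))
```

Once the limit is hit, every further open() in that process fails the same way, which matches the behaviour of a single process that serves all file accesses for a container.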

Describe the results you expected:
esy should run without EMFILE errors. If you run sudo podman instead and set ulimit -n 1024, there are no EMFILE errors.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:       0.12.1.2
Go Version:    go1.11.2
Git Commit:    "67ab7549b44484cc3f201d7bb2b58b922f8edc24"
Built:         Thu Dec 13 19:35:26 2018
OS/Arch:       linux/amd64

Output of podman info:

host:
  BuildahVersion: 1.6-dev
  Conmon:
    package: podman-0.12.1.2-1.git9551f6b.fc29.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: 67ab7549b44484cc3f201d7bb2b58b922f8edc24'
  Distribution:
    distribution: fedora
    version: "29"
  MemFree: 19445043200
  MemTotal: 33684381696
  OCIRuntime:
    package: runc-1.0.0-59.dev.gitccb5efd.fc29.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc6
      commit: 6e5a791a02fefb403034e0de8693d225d52b33a7
      spec: 1.0.1-dev
  SwapFree: 16915623936
  SwapTotal: 16915623936
  arch: amd64
  cpus: 8
  hostname: bolt
  kernel: 4.19.10-300.fc29.x86_64
  os: linux
  rootless: true
  uptime: 1h 34m 15.55s (Approximately 0.04 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.access.redhat.com
  - registry.centos.org
store:
  ContainerStore:
    number: 12
  GraphDriverName: overlay
  GraphOptions:
  - overlay.mount_program=/usr/bin/fuse-overlayfs
  - overlay.mount_program=/usr/bin/fuse-overlayfs
  GraphRoot: /var/home/edwin/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
  ImageStore:
    number: 5
  RunRoot: /run/user/1000

Additional environment details (AWS, VirtualBox, physical, etc.):
Fedora 29 Silverblue physical.

This seems to be related to fuse-overlayfs running out of FDs. If I strace fuse-overlayfs outside of the container, I see:

openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-714c8f/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-dade59/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-6c52ff/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(15, "lib/x86_64-linux-gnu/libnss_files-2.28.so", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(8, "usr/lib/x86_64-linux-gnu/libkrb5support.so.0.1", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(15, "lib/x86_64-linux-gnu/libnss_files-2.28.so", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-714c8f/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-dade59/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(8, "usr/lib/x86_64-linux-gnu/libroken.so.18.1.0", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(16, "etc/ld.so.cache", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(8, "usr/lib/x86_64-linux-gnu/libheimbase.so.1.0.0", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-714c8f/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-dade59/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-714c8f/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(8, "usr/lib/x86_64-linux-gnu/libk5crypto.so.3.1", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(16, "home/opam/opam-repository/esy/_esy/default/tmp/esy-dade59/archive", O_RDONLY|O_NONBLOCK) = -1 EMFILE (Too many open files)
openat(15, "lib/x86_64-linux-gnu/libnss_dns-2.28.so", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)
openat(15, "lib/x86_64-linux-gnu/libnss_dns-2.28.so", O_RDONLY|O_LARGEFILE|O_NOFOLLOW) = -1 EMFILE (Too many open files)

When run with sudo, apparently no fuse-overlayfs process is started, which is probably why it works.
Should podman increase the ulimit before running fuse-overlayfs? (I'm not sure it can, because the user by default has a hard limit of 1024 in Fedora 29 Silverblue, unless it used a suid/setcap helper.)

rhatdan (Member) commented Dec 27, 2018

Non-privileged podman is not going to be allowed to create more processes than the user is allowed to create. If it were, I would describe this as a problem in the kernel, and a vulnerability. I don't see how we can satisfy this issue.

As a side note, I wonder if the container is limited to 1024 per UID or total. Could you create 1024 open FDs for ROOT inside the container, and another 1024 for UID 1 inside the container? (User Namespace)

edwintorok (Author) commented Dec 27, 2018

FWIW the vfs backend doesn't suffer from this problem, just the fuse one (esy actually succeeds).
I haven't reached the maximum number of processes, I've just reached the maximum number of file descriptors inside 1 process, and that process is fuse-overlayfs. Because AFAICT all file accesses from inside the container go through this single process, the entire container is limited to 1024 FDs/container, instead of 1024 FDs/process.

On my Fedora system the default user has a soft limit of 1024 file descriptors and a hard limit of 4096:

$ ulimit -n
1024
$ ulimit -Hn
4096
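
The same soft and hard limits can be read from inside a process, together with a rough count of the descriptors it currently holds. This is a minimal sketch, assuming Linux (it relies on /proc/self/fd) and Python 3; the helper name fd_usage is hypothetical.

```python
import os
import resource


def fd_usage():
    """Return (soft_limit, hard_limit, open_fd_count) for this process.

    open_fd_count is None when /proc/self/fd is not available.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Listing the directory briefly opens one descriptor itself,
        # so the count includes that extra entry.
        open_fds = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_fds = None
    return soft, hard, open_fds


soft, hard, open_fds = fd_usage()
print(f"soft={soft} hard={hard} open={open_fds}")
```

Comparing the open count against the soft limit shows how close a process is to failing with EMFILE.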

So if I increase the FD limit before running podman, it delays the problem (and in this case allows esy to make a bit more progress): ulimit -n 4096; podman .... I would now have a limit of 4096 FDs/container instead of 1024, which is still wrong. We then hit another error, but apparently it's ignorable and I can run again:

error command failed: 'tar' 'xf' '/home/opam/opam-repository/esy/_esy/default/dist/87d41e3b86827e177edc57b8d03998fa' '-C' '/tmp/esy-75a89f'
      stderr:
               tar: ocamlbuild-0.12.0/examples/07-dependent-projects/libdemo: Cannot utime: Operation not permitted
               tar: Exiting with failure status due to previous errors
               
      stdout:

There is also the problem that the ulimit inside the container becomes 1024 and cannot be raised, not even to 4096, and not even with sudo.

Couple of things that come to mind:

  • could fuse-overlayfs try to raise the file descriptor ulimit up to its hard max? (this might avoid the common problem where a process runs close to its ulimit, then forks and calls execve without closing its FDs; fuse-overlayfs has some extra files open so it actually hits the limit, whereas a normal process wouldn't.)
  • could fuse-overlayfs fork itself, or could podman spawn multiple fuse-overlayfs processes to avoid the FD limit problem?
  • could there be a setuid helper that increases the RLIMIT_NOFILE limit, drops privileges and capabilities and then runs fuse-overlayfs? (although at that point you are probably not rootless).

giuseppe (Member) commented Jan 4, 2019

I've opened a PR for fuse-overlayfs to always bump the RLIMIT_NOFILE soft limit to its hard limit.
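
The general technique of bumping the soft limit to the hard limit needs no extra privileges, because an unprivileged process may raise its soft limit as far as its hard limit. The sketch below illustrates the idea in Python, assuming Linux; it is not the code from the fuse-overlayfs PR (fuse-overlayfs is written in C), and the function name raise_nofile_to_hard is hypothetical.

```python
import resource


def raise_nofile_to_hard():
    """Raise the soft RLIMIT_NOFILE to the hard limit and return the
    resulting (soft, hard) pair. The hard limit itself is not changed."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


before = resource.getrlimit(resource.RLIMIT_NOFILE)
after = raise_nofile_to_hard()
print("before:", before, "after:", after)
```

With the limits quoted earlier in this thread (soft 1024, hard 4096), this would give the process 4096 descriptors instead of 1024. It only moves the ceiling; a single process serving the whole container is still bounded by the hard limit.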

I don't think it is possible to use multiple processes to circumvent the rlimit. The fd to the file is handed to FUSE once opened.

jwhonce added this to the 1.0 milestone Jan 7, 2019
giuseppe closed this as completed Jan 7, 2019
github-actions bot added the "locked - please file new issue/PR" label Sep 24, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023