Closed
2 changes: 1 addition & 1 deletion containerd.service
@@ -32,7 +32,7 @@ RestartSec=5
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
-LimitNOFILE=infinity
+LimitNOFILE=1048576
Contributor

Suggested change
-LimitNOFILE=1048576
+LimitNOFILE=1024:524288

Having looked into this topic extensively, I believe the suggested value is:

  • More than enough for docker.service and containerd.service to function (once they implicitly raise their soft limit to the hard limit, as Go 1.19+ does).
  • Sufficient as the hard limit inherited by each process in a container (the same value systemd settled on), while the configured soft limit is restored (pending a backport to Go 1.19+) so that it matches the limits the process would run with outside of a container.

Additional info

  • This is the default you'd get with systemd from v240 onwards, even on Debian-based systems that don't override fs.nr_open (leaving it at the default of 1048576, which is the value that infinity resolves to).
  • LimitNOFILE could technically be dropped. It is only relevant to builds before Go 1.19 because, AFAIK, nothing internal explicitly raises the soft limit, so the setting was used to ensure the limit was sufficient for containerd to do its thing (not for the containers themselves, which inherit the limits).
  • containerd technically only needs 262144 (2^18) to support 65k (2^16) busybox containers, which in itself needs over 200 GiB of memory. I have shared my investigation + reproduction to back that up.
  • CI services like GitHub Actions are using limits of 2^16, which mitigates the issues and appears to be serving them well. That should still be capable of supporting thousands of containers on systems with 64 GB of memory or more.
  • The soft limit of 1024 would be ideal and reflect running software on a host outside of a container, and should not cause any regressions with builds using Go 1.19+.
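
To make the numbers in these notes concrete, here is a minimal sketch (not part of the PR) that prints the powers of two mentioned above alongside the limits of the current shell. It assumes a Linux host where /proc/sys/fs/nr_open is readable, and the shell's own limits are not necessarily those of the containerd or Docker daemons.

```shell
#!/bin/sh
# Powers of two referenced in the review comment.
echo "2^16 = $((1 << 16))"   # container count used in the estimate
echo "2^18 = $((1 << 18))"   # descriptors containerd is said to need for that many
echo "2^19 = $((1 << 19))"   # suggested hard limit
echo "2^20 = $((1 << 20))"   # default fs.nr_open, per the comment

# Soft and hard open-file limits of the current shell (not of the daemons).
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)

# fs.nr_open is the value that "infinity" resolves to, per the comment.
# Assumption: /proc is mounted; otherwise report it as unavailable.
nr_open=$(cat /proc/sys/fs/nr_open 2>/dev/null || echo "unavailable")

echo "soft=$soft hard=$hard fs.nr_open=$nr_open"
```

Comparing the last line with the values above shows how far a given host is from the suggested 1024:524288 pair.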

@dambrosio dambrosio Mar 28, 2023

@polarathene thanks for this suggestion. I tried setting LimitNOFILE=1024:524288 for both docker.service and containerd.service, using override.conf files in /etc/systemd/system/docker.service.d and /etc/systemd/system/containerd.service.d, and noticed that both the hard and soft nofile limits within my container were set to 524288. I wonder if the soft limit from LimitNOFILE is ignored?

I know the override.conf files are "dropped-in":

● containerd.service - containerd container runtime
     Loaded: loaded (/usr/lib/systemd/system/containerd.service; enabled; preset: enabled)
    Drop-In: /etc/systemd/system/containerd.service.d
             └─override.conf
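
For readers reproducing this setup, a drop-in of the kind described might be created as sketched below. The file contents are an assumption (the comment does not show its override.conf), and `DROPIN_DIR` is a hypothetical variable introduced here so that the sketch writes to a scratch directory by default.

```shell
#!/bin/sh
# Hypothetical: DROPIN_DIR defaults to a scratch directory so this is safe to run.
# On a real host it would be /etc/systemd/system/containerd.service.d (needs root).
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)}"
mkdir -p "$DROPIN_DIR"

# LimitNOFILE accepts a soft:hard pair; a single value sets both limits.
cat > "$DROPIN_DIR/override.conf" <<'EOF'
[Service]
LimitNOFILE=1024:524288
EOF

# Show what was written.
cat "$DROPIN_DIR/override.conf"

# On a real host, reload unit files and restart the service afterwards:
#   systemctl daemon-reload
#   systemctl restart containerd
```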

Output from within the container:

:~# docker run -it --network host --entrypoint bash container-name
:/# ulimit -Hn
524288
:/# ulimit -Sn
524288

Output on host:

:~# ulimit -Hn
524288
:~# ulimit -Sn
1024
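
Note that `ulimit` in a host shell reports the limits of that shell rather than those of the daemon. A minimal sketch for reading a specific process's limits from /proc on Linux follows; the `nofile_limits` helper is hypothetical, and the commented `pidof containerd` usage assumes the daemon is running and `pidof` is installed.

```shell
#!/bin/sh
# Print the "Max open files" row (soft and hard limit) for a given PID.
# Defaults to the current shell when no PID is given.
nofile_limits() {
  pid="${1:-$$}"
  grep 'Max open files' "/proc/$pid/limits"
}

# Limits of this shell:
nofile_limits

# Limits of the daemon itself (assumption: containerd is running):
#   nofile_limits "$(pidof containerd)"
```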

Here is my output of docker version:

Client:
 Version:           23.0.1
 API version:       1.42
 Go version:        go1.19.7
 Git commit:        23.0.1
 Built:             unknown-buildtime
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          23.0.1
  API version:      1.42 (minimum version 1.12)
  Go version:       go1.19.7
  Git commit:       buildroot
  Built:            
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.16
  GitCommit:        
 runc:
  Version:          1.1.4
  GitCommit:     

If I add the following to my /etc/docker/daemon.json the ulimit hard and soft values for nofile are correct:

    "default-ulimits": {
        "nofile": {
            "Name": "nofile",
            "Hard": 524288,
            "Soft": 1024
        }
    }

Member

@dambrosio @polarathene it's a Go issue: the Go maintainers acknowledged that a change they made in Go 1.19 was a regression, and a fix will be included in the next patch release. See the discussion on this ticket and the related backport tickets.

Contributor

@thaJeztah I am well aware and was involved in the discussions to get the issue addressed 😎


@dambrosio sorry, my messages are a bit verbose, but I did point this out in my review comment above:

whilst having the configured soft limit restored (pending backport to Go 1.19+)

Once that is available, both docker.service and containerd.service should adjust LimitNOFILE as suggested, and the soft limit will be respected for containers (without being an issue for either daemon's needs, since Go 1.19+ raises the soft limit to the hard limit itself).

Member

I am well aware and was involved in the discussions to get the issue addressed 😎

DOH! I got lost in all the linked issues, and replied from my phone, yes ... I knew you knew...

Member

The suggestion seems completely reasonable, assuming the syntax works even on older versions of systemd. I think CentOS 7 is v219?

Contributor

@polarathene polarathene Mar 29, 2023

assuming the syntax works even on older versions of systemd.

If the previous values of 1048576 and infinity have worked fine prior to systemd v240, then yes, LimitNOFILE should be fine applying the soft/hard limit suggested here.

That holds as long as the releases are built with Go 1.19+ (otherwise the daemons would be restricted to running approximately 150 containers).

Contributor

@polarathene polarathene May 10, 2023

Go updated with fix

Could the LimitNOFILE=1024:524288 change request be applied, and queue this PR for review / merge? 😀


EDIT: I've created the equivalent PR for moby (docker.service and friends): moby/moby#45534

Let me know if you'd like a similar PR for containerd (avoids the noise present here, and the original author has not applied the suggested feedback).


Confirmation

Test values:

  • containerd.service: LimitNOFILE=2048:8192
  • docker.service: LimitNOFILE=3072:4096

FROM alpine
RUN echo "Soft: $(ulimit -Sn)" >> /limits.txt
RUN echo "Hard: $(ulimit -Hn)" >> /limits.txt

# From v23 `docker build` uses LimitNOFILE from `docker.service`,
# Alternatively `DOCKER_BUILDKIT=0` will inherit `containerd.service` LimitNOFILE instead:
$ docker build --no-cache -t limits-test .
$ docker run --rm limits-test cat /limits.txt

Soft: 3072
Hard: 4096

# Silently ignores configured limit and falls back to inherited LimitNOFILE from service file:
# (If soft limit exceeds a hard limit specified/inherited, error will be output)
$ docker build --no-cache --ulimit 'nofile=4096:16384' -t limits-test .
$ docker run --rm limits-test cat /limits.txt

Soft: 3072
Hard: 4096

# So long as values don't exceed the hard limit inherited, adjustments apply:
$ docker build --no-cache --ulimit 'nofile=3080:3090' -t limits-test .
$ docker run --rm limits-test cat /limits.txt

Soft: 3080
Hard: 3090

# LimitNOFILE from `containerd.service` applies to `docker run`:
$ docker run --rm alpine ash -c 'ulimit -Sn && ulimit -Hn'
2048
8192

# Same behaviour observed with `docker build` when hard limit exceeds inherited:
$ docker run --rm --ulimit 'nofile=4096:16384' alpine ash -c 'ulimit -Sn && ulimit -Hn'
2048
8192

# Same behaviour observed with `docker build` when valid override provided:
$ docker run --rm --ulimit 'nofile=1536:6144' alpine ash -c 'ulimit -Sn && ulimit -Hn'
1536
6144
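
The behaviour observed in these runs can be summarised as a small rule. The sketch below is a model of the observations only, not Docker's or containerd's actual implementation. `effective_nofile` is a hypothetical helper, and its error branch is inferred from the note above about a soft limit exceeding the specified or inherited hard limit.

```shell
#!/bin/sh
# Model of the observed --ulimit behaviour (not the real implementation).
# Args: inherited_soft inherited_hard requested_soft requested_hard
# Prints the effective "soft:hard" pair.
effective_nofile() {
  inh_soft=$1
  inh_hard=$2
  req_soft=$3
  req_hard=$4

  # A soft limit above the requested or inherited hard limit is an error.
  if [ "$req_soft" -gt "$req_hard" ] || [ "$req_soft" -gt "$inh_hard" ]; then
    echo "error: soft limit exceeds hard limit" >&2
    return 1
  fi

  # A requested hard limit above the inherited one is silently ignored,
  # and the inherited pair is used instead.
  if [ "$req_hard" -gt "$inh_hard" ]; then
    echo "$inh_soft:$inh_hard"
  else
    echo "$req_soft:$req_hard"
  fi
}

effective_nofile 3072 4096 4096 16384   # docker build, hard limit too high
effective_nofile 3072 4096 3080 3090    # docker build, valid override
effective_nofile 2048 8192 1536 6144    # docker run, valid override
```

The three calls mirror the test values used above: docker.service at 3072:4096 for `docker build` and containerd.service at 2048:8192 for `docker run`.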

This was tested on a Vultr VPS instance with Ubuntu 23.04 and installing Docker via the docs:

  • docker-ce 23.0.6
  • containerd containerd.io 1.6.21 3dce8eb055cbb6872793272b4f20ed16117344f8
  • runc version: v1.1.7-0-g860f061

$ docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.10.4
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.17.3
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 23.0.6
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
 runc version: v1.1.7-0-g860f061
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.2.0-20-generic
 Operating System: Ubuntu 23.04
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 945.4MiB
 Name: docker-limit
 ID: d94a5a6d-9d37-4244-bb85-6ca72526acdc
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Contributor

Suggested change
-LimitNOFILE=1048576

For the equivalent docker.service config, the decision was to drop this line, and it'll be part of the v25 release of moby.

Equivalent change should hopefully follow soon here 🙏

Contributor Author

Sorry for being so slow. I see in the meantime you created a separate PR already (#8924).
Looks like it's about to be merged 🤞

Contributor

No worries, I know how it is :)

Thanks for kickstarting the process with the original attempt here! ❤️

# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this version.
TasksMax=infinity