Alpha version exhausting processes/threads when limits are not reached #1281

Closed
victorgp opened this Issue May 12, 2016 · 13 comments

@victorgp

victorgp commented May 12, 2016

The current alpha versions 1032.1.0 and 1045.0.0 (the ones I tried) exhaust processes/threads very quickly, even though the configured limits are not reached.

This problem doesn't occur in the stable (899.17.0) and beta (1010.3.0) versions.

I also raised the limits in systemd to rule that out, but it fails anyway.

Easy to reproduce:

1- Install alpha version
2- Create file /etc/systemd/system/docker.service.d/10-settings.conf with content:

[Service]
TasksAccounting=true
TasksMax=infinity
LimitNPROC=1048600
LimitNOFILE=1048601

3- Restart docker

systemctl daemon-reload
systemctl restart docker

4- Run a container:

docker run --rm -ti ubuntu:trusty /bin/bash

5- Check limits inside the container

root@b48a00426ee9:/# ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256425
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1048600
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

6- Run a loop inside the container that creates a lot of background processes

root@b48a00426ee9:/# for i in {1..3000}; do sleep 5 & done

Output:

[0] 24
[...]
[500] 524
[501] 525
[502] 526
[503] 527
[504] 528
[505] 529
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: No child processes

etc.

In our case the alpha version is unusable right now: running around 20 containers makes them fail because they cannot create more threads.
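
The fork failures above start near job 505 even though `ulimit -u` in step 5 reports 1048600, which suggests a second, lower limit is in play; the replies in this thread point to the pids cgroup controller. As a rough sketch of that reasoning, the hypothetical helper below (not part of CoreOS or docker) returns whichever of the two limits is lower. It is a simplification: it ignores that the `ulimit -u` limit is counted per user while `pids.max` is counted per cgroup.

```shell
# Hypothetical helper: print the lower of two task limits.
# $1 is the value from `ulimit -u`, $2 is the content of a cgroup's pids.max.
# "unlimited" (ulimit), "max" (pids.max) and "infinity" (systemd) all mean
# "no limit" here.
effective_task_limit() {
    nproc_limit=$1
    pids_max=$2
    case $nproc_limit in unlimited|max|infinity) nproc_limit= ;; esac
    case $pids_max in unlimited|max|infinity) pids_max= ;; esac

    if [ -z "$nproc_limit" ] && [ -z "$pids_max" ]; then
        echo unlimited
    elif [ -z "$nproc_limit" ]; then
        echo "$pids_max"
    elif [ -z "$pids_max" ]; then
        echo "$nproc_limit"
    elif [ "$nproc_limit" -lt "$pids_max" ]; then
        echo "$nproc_limit"
    else
        echo "$pids_max"
    fi
}
```

With the values seen in this thread, `effective_task_limit 1048600 512` prints `512`, which is in line with the loop failing shortly after 500 background jobs.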

@mischief

mischief commented May 12, 2016

I believe this is because the cgroup created by docker on behalf of the container has the DefaultTasksMax pid limit set on the pids cgroup controller.

core@core ~ $ docker run -d ubuntu:trusty /bin/bash sleep 99999
b3f0418b6d1ad5127bb1925ba3b1356191c1cab4289e61525a52a7ba4687463b
core@core ~ $ cat /sys/fs/cgroup/pids/system.slice/docker-80fbdf5f42fc1b68992dc139fd12b3421490df5717efa82898f3755d663e503d.scope/pids.max 
512

in docker 1.11, you could avoid this by using --pids-limit=-1.

can you try setting DefaultTasksMax?
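
To compare the limit with actual usage, the check above can be wrapped in a small function. This is a minimal sketch: `pids_usage` is a hypothetical name, and it assumes the cgroup directory exposes both `pids.max` and `pids.current`, as the pids controller normally does.

```shell
# Print the pids controller usage for one cgroup directory.
# Usage: pids_usage /sys/fs/cgroup/pids/system.slice/docker-<id>.scope
pids_usage() {
    cgroup_dir=$1
    # pids.max holds either a number or the literal string "max" (no limit).
    max=$(cat "$cgroup_dir/pids.max") || return 1
    # pids.current is the number of tasks currently charged to the cgroup.
    current=$(cat "$cgroup_dir/pids.current") || return 1
    echo "current=$current max=$max"
}
```

When `current` is close to `max`, further forks inside that container are likely to fail with "Resource temporarily unavailable".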

@victorgp

victorgp commented May 12, 2016

Yes, that seems to be the problem, now:

bash-4.2# docker run  -d ubuntu:trusty sleep 99999
bc006b7fdb2ec03e48b2d6defbff9010c99f467c5ca5421b5a405b2944df5f14
bash-4.2# cat /sys/fs/cgroup/pids/system.slice/docker-bc006b7fdb2ec03e48b2d6defbff9010c99f467c5ca5421b5a405b2944df5f14.scope/pids.max
1048600

Anyway, for this new CoreOS version, are we expected to set this DefaultTasksMax value or the --pids-limit parameter?

Shouldn't the cgroup created by docker inherit its systemd parameters like TasksMax?

@mischief

mischief commented May 12, 2016

the pids cgroup controller was enabled in CoreOS 1029.0.0.

@victorgp

victorgp commented May 12, 2016

What exactly does that mean? Is this not a bug?

If docker creates cgroups that ignore its systemd configuration, then to raise the pid limit we have to set it either in docker or with DefaultTasksMax. Is that the expected behaviour?

@robszumski

Member

robszumski commented May 12, 2016

From a user experience perspective, can we increase this so the default experience matches what folks expect? I.e., being able to run more than 20 containers on a host.

@marineam

marineam commented May 13, 2016

@robszumski the limit should just be per-container, so 1 or 20 containers are fine, but each suffers from the same limit.

@victorgp it means we are setting the default limit back to infinity. Even outside of docker the default limit is super easy to hit with Go, and other, higher values felt equally arbitrary. No other resource is limited by default, so having this one limit is kinda odd.
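
On an image that still ships the lower default, the same effect can probably be had locally by overriding DefaultTasksMax in a systemd manager drop-in. The sketch below is an assumption about how a user might do that, not a description of how the fix was made in CoreOS. The function name and the file name `10-default-tasks-max.conf` are hypothetical; `/etc/systemd/system.conf.d` is the usual drop-in directory for manager settings.

```shell
# Write a systemd manager drop-in that sets DefaultTasksMax.
# $1: drop-in directory, normally /etc/systemd/system.conf.d
# $2: limit to set (defaults to "infinity")
# The manager has to re-read its configuration (for example after a reboot)
# before newly created units pick up the value.
write_default_tasks_max() {
    conf_dir=$1
    value=${2:-infinity}
    mkdir -p "$conf_dir" || return 1
    printf '[Manager]\nDefaultTasksMax=%s\n' "$value" \
        > "$conf_dir/10-default-tasks-max.conf"
}
```

For example, `write_default_tasks_max /etc/systemd/system.conf.d infinity` run as root writes the override. Containers that are already running keep the `pids.max` they were created with.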

@mischief

mischief commented May 14, 2016

@victorgp should be fixed in alpha 1047.0.0, give it a spin.

@victorgp

victorgp commented May 16, 2016

Is anything wrong with the releases? Yesterday I checked https://coreos.com/releases/ and alpha was 1047.0.0, but now it is back to 1032.1.0.

@robszumski

Member

robszumski commented May 16, 2016

@victorgp yes, the 1045.0.0 and 1047.0.0 images were rolled back (blog post)

@mischief

mischief commented May 24, 2016

@mischief mischief closed this May 24, 2016

@mischief mischief removed their assignment May 24, 2016

@agherzan

agherzan commented May 27, 2016

Hi guys. I'm a little confused about why this was fixed this way. Isn't this an issue with docker? Shouldn't it inherit the limits from the docker service?

@marineam

marineam commented May 27, 2016

@agherzan we run docker with the systemd cgroup driver, so containers are created in scopes that are independent of the daemon itself, and settings applied to the daemon's cgroup do not apply to containers.
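
One way to see this separation is to compare the pids cgroup of the daemon with that of a process inside a container. A minimal sketch, assuming the cgroup v1 layout shown earlier in this thread, where each line of `/proc/<pid>/cgroup` has the form `hierarchy-id:controllers:path`; `pids_cgroup_of` is a hypothetical helper name.

```shell
# Print the pids-controller cgroup path from a /proc/<pid>/cgroup style file.
# Usage: pids_cgroup_of /proc/<pid>/cgroup
pids_cgroup_of() {
    cgroup_file=$1
    # Select the line whose controller field is exactly "pids", then strip
    # the first two colon-separated fields so only the path remains.
    awk -F: '$2 == "pids" { sub(/^[^:]*:[^:]*:/, ""); print }' "$cgroup_file"
}
```

Run against the daemon's PID and against the PID of a container process. With the systemd cgroup driver the two paths are expected to differ: a service unit for the daemon and a separate `docker-<id>.scope` for the container.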

@agherzan

agherzan commented May 27, 2016

Arch Linux does something similar, and there it inherits the value from the service.
