Error response from daemon: Cannot start container (fork/exec /usr/sbin/iptables: cannot allocate memory) #8539

Closed
xh3b4sd opened this Issue Oct 13, 2014 · 134 comments

xh3b4sd commented Oct 13, 2014

We are running several CoreOS clusters. Over time (after about a week) we see the following error:

# systemd error log
2014/10/13 16:12:08 Error response from daemon: Cannot start container f9e42f092597e46f5cf6a507d7e70662e6ef1035a8f01d95f56c1f2934234361: iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 172.31.31.159 --dport 49450 ! -i docker0 -j DNAT --to-destination 172.17.0.95:80:  (fork/exec /usr/sbin/iptables: cannot allocate memory)

At this point we have to restart the Docker daemon and everything goes back to normal. Is there a memory leak? There seems to be enough memory available. Is it related to the missing swap?

environment information

# top
top - 16:13:01 up 22 days,  7:56,  1 user,  load average: 6.50, 5.44, 4.58
Tasks: 186 total,   3 running, 183 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni, 14.2%id,  1.3%wa, 33.7%hi,  0.0%si, 50.8%st
Mem:   4051204k total,  3886188k used,   165016k free,     3168k buffers
Swap:        0k total,        0k used,        0k free,  1681572k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20318 root      20   0 3449m 1.3g  10m S    0 34.0  18:27.87 docker

# memory stats
$ free
             total       used       free     shared    buffers     cached
Mem:       4051204    3884208     166996          0       3168    1681744
-/+ buffers/cache:    2199296    1851908
Swap:            0          0          0

# coreos version
$ cat /etc/lsb-release
DISTRIB_ID=CoreOS
DISTRIB_RELEASE=410.0.0
DISTRIB_CODENAME="Red Dog"
DISTRIB_DESCRIPTION="CoreOS 410.0.0"

# docker info 
$ docker info
Containers: 22
Images: 332
Storage Driver: btrfs
Execution Driver: native-0.2
Kernel Version: 3.15.8+

# docker version
$ docker -v
Docker version 1.1.2, build d84a070
Contributor

dqminh commented Oct 14, 2014

I think the error here occurs because the kernel prevents the Go runtime from allocating more memory. Setting up swap does help in this case (I think it should be possible to set up swap on CoreOS).

If possible, you could also test the latest 1.3, since it includes a lot of improvements around memory allocation.
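
For reference, a minimal sketch of adding a swap file on such a host (path and size are arbitrary; on btrfs the file may need to be created with dd and marked NOCOW rather than with fallocate):

$ sudo fallocate -l 2G /swapfile     # or: sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
$ swapon -s                          # verify the swap area is active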

n0rad commented Oct 20, 2014

Same here after a full day of starting / stopping / removing containers on Arch Linux.

2014/10/20 20:54:22 Error response from daemon: Cannot start container d3a826e2b5a0db5005ddf4431d4f508027deb16b5532370976b732beb5535eca: iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 192.168.42.3 --dport 8000 ! -i docker0 -j DNAT --to-destination 172.17.0.75:8000:  (fork/exec /usr/sbin/iptables: cannot allocate memory)

Far from full memory:

$ free        
             total       used       free     shared    buffers     cached
Mem:      14364156    2021808   12342348       1304     116144     532328
-/+ buffers/cache:    1373336   12990820
Swap:            0          0          0

But Docker's virtual memory is huge and close to my max memory:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     31858  1.5  2.4 17645784 352304 ?     Ssl  Oct19  19:12 /usr/bin/docker -d -H fd:// 
$ docker --version
Docker version 1.3.0, build c78088f
haches commented Oct 26, 2014

I'm experiencing the exact same behavior.
Starting and removing many short-lived containers, the daemon's virtual memory after 24h is up from ~300MB to around 1.5GB.

PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
2762 root      20   0 1532596  22976  11220 S   0.0  0.6   1:28.18 docker
Docker version 1.3.0, build c78088f

on Ubuntu 14.04

malagant Oct 26, 2014

Same here. Docker 1.3.0

rosskukulinski Oct 26, 2014

Also seeing what I believe to be the same issue on CoreOS 444.5.0 with Docker 1.2.

rafikk commented Nov 5, 2014

+1 on Docker 1.3.1

denderello Nov 14, 2014

We have a CoreOS cluster running and experience an outage about once a week. Containers started through systemd unit files throw errors like this:

2014/11/10 18:29:23 Error response from daemon: Cannot start container 2748d711896daadcb83fd65ba3c3cd124070013e7396060dffdeec015062697e: fork/exec /usr/bin/docker: cannot allocate memory

Further, we sometimes get errors like this:

2014/11/10 19:17:00 Error response from daemon: Cannot start container 982f7a3da2dc182b61b1004d436a2cf68674ef949ab9848d4009c79094b3e592: iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 49298 ! -i docker0 -j DNAT --to-destination 172.17.1.173:6379:  (fork/exec /usr/sbin/iptables: cannot allocate memory)

The top output looks like this:

top - 19:19:33 up 24 days,  3:49,  1 user,  load average: 1.30, 1.24, 1.16
Tasks: 203 total,   1 running, 201 sleeping,   0 stopped,   1 zombie
%Cpu(s):  9.7 us,  6.0 sy,  0.0 ni, 75.9 id,  7.1 wa,  1.0 hi,  0.2 si,  0.2 st
KiB Mem:   4051192 total,  3913096 used,   138096 free,      140 buffers
KiB Swap:        0 total,        0 used,        0 free.   187052 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
28124 root      20   0 6618932 2.950g   9808 S   2.0 76.4  23:33.32 docker

The really crazy thing is the output of docker --debug info. There are only 21 containers, but 2257 goroutines! This looks really weird and feels like a lot of goroutines are not being cleaned up correctly.

docker --debug info
Containers: 21
Images: 374
Storage Driver: btrfs
Execution Driver: native-0.2
Kernel Version: 3.16.2+
Operating System: CoreOS 444.4.0
Debug mode (server): false
Debug mode (client): true
Fds: 514
Goroutines: 2257
EventsListeners: 0
Init SHA1: aad618defd9f41d37c64f32fb04c0f276c31bc42
Init Path: /usr/libexec/docker/dockerinit

lsof | wc -l says there are 841 open file descriptors. Docker seems to use 514 of them.

For us it looks like the host's memory has nothing to do with the problem, since the Docker daemon always fails on some other limit. It looks more like the daemon gets into an unrecoverable state at some point, where it cannot handle container management any more. This may be caused by leaked goroutines. I am just thinking out loud. Maybe somebody has other ideas?

To provide more and better information the next time this happens, what details would be most useful for solving this problem quickly?
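
For reference, a rough sketch of the data points that tend to be useful when the daemon gets into this state (the process name docker matches the 1.x daemon; adjust as needed):

$ docker --debug info                                  # Fds and Goroutines counters
$ ps -o pid,vsz,rss,etime -p $(pidof docker)           # virtual vs resident size of the daemon
$ sudo lsof -p $(pidof docker) | wc -l                 # file descriptors held by the daemon
$ sudo grep -i '^vm' /proc/$(pidof docker)/status      # VmSize, VmRSS, VmSwap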

rosskukulinski Nov 14, 2014

@denderello I'm curious -- do you make use of sidekicks that do docker inspect to determine the open ports of your containers? We're experiencing a similar problem on CoreOS, and I've got a hunch it's related to our sidekicks. Going to run a test in our staging environment over the weekend to learn more.

denderello Nov 17, 2014

@rosskukulinski: Yes, we are using sidekicks and I just had a look at how we are doing port inspection. Our approach is a bit different but could hit the same issue. Our sidekicks use docker port to determine the open ports of other containers. Have you been able to run your tests?

rosskukulinski Nov 17, 2014

Hello @denderello. I just checked in on our staging infrastructure and we had zero memory leaks over the weekend (yay!).

Unfortunately for you, it seems the issue was not with our sidekicks, but rather with our Logstash / Elasticsearch configuration. We run a Logstash global unit that gathers log files from each host and ships them through Redis to another Logstash instance for processing and archival into Elasticsearch. On Friday I stopped those services and rebooted the cluster to start fresh. Memory utilization has remained flat since Friday afternoon, whereas before it was constantly increasing.

I'm not sure which piece of the ELK stack was causing our Docker daemons to consume all the memory, but we'll be digging into that tomorrow.

denderello Nov 18, 2014

Cool that you could solve it in your cluster. I think we will have to go down the path of shutting down instances and checking whether one specific container is causing the problems.

If anyone else has a hint we are happy to try it out. :)

Contributor

dqminh commented Nov 19, 2014

@denderello @rosskukulinski does the application output a lot of logs to the journal and/or any other log processor (I assume so, since you use CoreOS)?

denderello Nov 21, 2014

No, mostly the daemon logging that containers are stopping and starting. I already checked with journalctl --disk-usage on the nodes, which looks OK. Why are you asking? Have you experienced problems with large journals?

Contributor

dqminh commented Nov 21, 2014

@denderello Docker does some log buffering internally. I have observed that when a container outputs more logs than the consumer can keep up with (the easiest way to reproduce this is to run a yes "long string" command as a Docker unit), the daemon's memory consumption can increase quite a bit.
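
A rough way to reproduce and observe this (the container name, image and polling interval are arbitrary):

$ docker run -d --name log-flood busybox sh -c 'yes "a reasonably long status line"'
$ watch -n 5 'ps -o vsz=,rss= -p $(pidof docker)'      # the daemon's memory climbs while the container floods its log
$ docker rm -f log-flood                               # clean up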

calmera commented Nov 30, 2014

I think I'm having the same problem. Even without any containers running I still get "fork/exec /sbin/iptables: cannot allocate memory". Maybe worth mentioning that I'm running Docker on ARM, so I only have limited memory at my disposal. Still, I am not hitting any limits yet, so it seems rather strange. Oh, and my Docker version is 1.3.0.

jlrigau commented Jan 7, 2015

Same problem with Docker 1.3.3

stevenschlansker Jan 9, 2015

Same problem with Docker 1.4.1:

root      1318  0.9 12.0 27269524 3712088 ?    Ssl   2014 284:13 /usr/bin/docker -d -g /mnt/docker

The Docker daemon is using about 26 GB of virtual memory (3.5 GB resident). The system isn't willing to commit to forking, leading to Error pulling image (gc-setup_teamcity_37) from example.com/gc-setup, ApplyLayer fork/exec /usr/bin/docker: cannot allocate memory

efuquen commented Jan 23, 2015

+1, having the same issues. It's not clear from the comments: has anyone from the Docker team acknowledged this issue?

dansowter Jan 25, 2015

+1 Same issue here

dangra commented Jan 28, 2015

+1 Same problem here with an overly verbose container on a Mesos cluster running Ubuntu 14.04.

Contributor

jessfraz commented Jan 28, 2015

Is this on the latest Docker version?

ryan-stateless Jan 28, 2015

Docker version 1.4.1, build 5bc2ff8

dangra commented Jan 28, 2015

> Is this on the latest Docker version?

@jfrazelle no, Docker 1.3.1 on btrfs.

reiz commented Jan 29, 2015

I have the same problem with Docker 1.3.1 on Ubuntu 13. I'm running a couple of other installations on Ubuntu 14 with Docker 1.4.1 without this problem.

ngpestelos Jan 30, 2015

Docker version 1.3.3, build 54d900a

mboersma Jan 30, 2015

We've also seen this failure running our test suite against CoreOS alpha / Docker 1.4.1, although that does not appear to be the salient difference as others have reported similar problems with Docker 1.3.3.

deis/deis#2975

xh3b4sd commented Jan 30, 2015

Just saw this

# see docker memory
$ docker --debug info
Containers: 17
Images: 380
Storage Driver: btrfs
 Build Version: Btrfs v3.17.1
 Library Version: 101
Execution Driver: native-0.2
Kernel Version: 3.18.1
Operating System: CoreOS 557.1.0
CPUs: 2
Total Memory: 7.309 GiB # <<< more than 7GB
Name: ip-172-31-30-34.eu-west-1.compute.internal
ID: 2PR3:TZSL:GIL3:DFGJ:4VP5:V275:QRY3:P75F:HUPC:VY6H:VSPC:57I5
Debug mode (server): false
Debug mode (client): true
Fds: 134
Goroutines: 206
EventsListeners: 1
Init SHA1: 6dfe406868afc44ced172a21ed28dffdfcf39743
Init Path: /usr/libexec/docker/dockerinit
Docker Root Dir: /var/lib/docker

# coreos version
$ cat /etc/lsb-release
DISTRIB_ID=CoreOS
DISTRIB_RELEASE=557.1.0
DISTRIB_CODENAME="Red Dog"
DISTRIB_DESCRIPTION="CoreOS 557.1.0"

We have swap activated and the machines themselves do not fall over, but we see containers failing at some point, in weird ghost-like states, when memory usage is extremely high.

tbatchelli Feb 5, 2015

+1 Having this issue. It seems to correlate with new image downloads.

Ulexus commented Feb 5, 2015

Docker 1.3.3, CoreOS stable (522.6.0). Been running fine for months. Just added logspout, and this bug is now hitting me. (Nothing before was doing much docker inspection.)

@pkieltyka

+1

esnunes commented Feb 16, 2015

+1

reiz commented Feb 16, 2015

+1

@knowbody

+1

andergmartins Feb 19, 2015

I only see this when I'm running 2 containers in parallel, started by a Phing script (for my test environment). I'm able to run it once, but after removing the containers and trying to run again I get almost the same error:

fork/exec /usr/bin/docker: cannot allocate memory

ciela commented Feb 20, 2015

+1 on 64bit Amazon Linux 2014.09 v1.0.9 running Docker 1.2.0

jalev commented Feb 20, 2015

+1

We've seen this here too (Docker 1.4.1+). It only appears to occur if you launch Docker through a script/systemd unit. We're able to 'fix' (or at least work around) this error by doing a docker run on the command line, after which the error won't return.

Contributor

jessfraz commented Feb 20, 2015

Hi all!
We saw this on one of our test machines and I got really excited because I could finally reproduce and debug it. My first idea was to add swap because I realized the server had none, and that fixed it perfectly.
Looking at the comments above, swap also appears to be zero for those involved in this thread. Since adding swap fixed the issue and it was not a Docker-specific problem, I am going to close this. But feel free to explain why you think I should reopen it if you feel differently.

@jessfraz jessfraz closed this Feb 20, 2015

Member

thaJeztah commented Jan 30, 2016

@cescoferraro do you have swap enabled? Docker cannot do magic: if a process needs more than 512 MB and no swap is available, it will fail.

cescoferraro Jan 31, 2016

@thaJeztah You are right. The machines are too skinny. I'm building it locally and scp'ing the binaries. Thanks

hamsterksu Feb 9, 2016

I have hit this error too.

When you try to open a huge number of ports you will get this error.

On a machine with 2 GB of RAM and no swap you will get the error with:

 docker run -p 65000-65300:65000-65300/udp phusion/baseimage:latest

According to top, Docker runs a lot of exec subprocesses and memory runs out.
According to ps, there are a lot of docker-proxy subprocesses.
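
The per-port cost here comes from Docker starting one docker-proxy process per published port. One possible mitigation, sketched below, is to disable the userland proxy so published ports are handled with iptables rules only (hairpin NAT behaviour differs, so treat this as an illustration rather than a recommendation; on newer releases the daemon binary is dockerd):

$ docker daemon --userland-proxy=false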

klausenbusk Feb 9, 2016

> When you try to open a huge number of ports you will get this error.

#11185

tsuna commented Feb 25, 2016

Seems like this issue might be conflating multiple different problems, but at least in the case of a Docker container printing a lot of stuff quickly to stdout/stderr, the leak might be explained (and fixed) by #17877.

nhoover commented Mar 30, 2016

I had been running fine for over a month on Docker 1.9.1 with default logging. Yesterday I enabled log_driver syslog in all my containers and within a couple of hours ran into this problem. Then it happened a second time soon after. After that it ran OK for some hours.

I haven't yet tried 1.10 to see if it fixes the problem as per #17877

@webhacking

👍

ShariqT commented Aug 5, 2016

Getting this with CoreOS using docker v 1.10.3. Containers are dynamically created and destroyed by my application and I can get about 20-30 create/destroy transactions before I run into this error.

singularo added a commit to universityofadelaide/docker-apache2-php7 that referenced this issue Aug 8, 2016

bioothod Aug 26, 2016

Very similar problem exists with the 1.12 docker:

$ docker version
Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:13:43 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:13:43 2016
 OS/Arch:      linux/amd64

# docker commit ...
Error response from daemon: Untar error on re-exec cmd: fork/exec /proc/self/exe: cannot allocate memory

# docker cp /tmp/test.cpp b594:/tmp/greylock/src/
Error response from daemon: Untar error on re-exec cmd: fork/exec /proc/self/exe: cannot allocate memory

There are only one or two containers running at the moment: the one to copy the file into, which is quite empty, and one that has been running for several days and outputs quite a lot of data to stdout.

Another, probably related, problem is dockerd getting OOM-killed: if an application in a container rapidly prints a lot of data to stdout, dockerd will allocate all the memory and be killed by the OOM killer. Rarely, it starts releasing memory back to the system, but only half an hour or more after the application has started. The application itself doesn't eat memory, only dockerd.

But in this particular cannot allocate memory case, dockerd only used about 1.1 GB of resident memory out of 32 GB, though 30.4 GB of virtual memory.

Contributor

mlaventure commented Sep 1, 2016

@bioothod is there any cgroup rule applied to dockerd limiting its memory usage?

bioothod commented Sep 1, 2016

@mlaventure not that I'm aware of.

In the other case, when there was a single app printing data to stdout, dockerd ate 30+ GB of memory and was killed by the OOM killer; if there were any cgroup memory limits, they would have fired in that case.

Contributor

mlaventure commented Sep 1, 2016

@bioothod which logging driver were you using? Have you tried using a different one, or setting up log rotation?
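
For reference, a sketch of enabling log rotation with the default json-file driver (the image name is a placeholder and the sizes are arbitrary):

$ docker run -d --log-driver=json-file --log-opt max-size=10m --log-opt max-file=3 some-image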

bioothod commented Sep 1, 2016

@mlaventure this sidetracks the discussion from the original topic; the dockerd OOM is a different problem, although both are related to how dockerd/Go reclaim memory.

I believe the logging driver is the default json-file, since dockerd is started without any options on my Fedora box.
You can run the following application in a container, without detaching from it, and watch how dockerd eats the RAM:

root@935acf6c1f6a:/# cat /tmp/test.cpp 
#include <stdio.h>

int main()
{
    long status = 0;
    while (true) {
        printf(" \r some status: %ld", status);
        ++status;
    }
}
root@935acf6c1f6a:/# g++ /tmp/test.cpp -o /tmp/test
root@935acf6c1f6a:/# /tmp/test

Contributor

mlaventure commented Sep 1, 2016

@bioothod just trying to figure out where the leak would come from.

I'll try your test program and see if I can reproduce.

bioothod commented Sep 2, 2016

@mlaventure I'm not sure this is a leak; sometimes (quite rarely, though) dockerd starts returning memory to the OS. Maybe there is a bug in the Docker daemon where it only sometimes flushes buffered log data from memory to the log file, but it looks more like a problem with memory reclaim, i.e. freed memory is not returned to the kernel.

My original post highlights that when the Docker daemon stopped accepting any operation and failed with the fork/exec /proc/self/exe: cannot allocate memory error, it had only allocated 1.1 GB of private resident RAM; this is quite different from the 'stdout log' case.

jonathanperret commented Sep 2, 2016

@bioothod when fork fails, it's because it cannot allocate enough virtual memory to (potentially) hold a copy of the daemon's allocated virtual memory, so the resident set size of 1.1 GB is irrelevant. Your daemon using 30.4 GB of virtual memory when the problem manifests makes your report consistent with others, as far as I can tell. Which does not tell us where this virtual memory leak comes from, though.
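
A quick way to check whether virtual-memory overcommit accounting is the limiting factor on such a host (the sysctl and /proc names are standard Linux; changing the setting is shown only as an illustration, not a recommendation):

$ cat /proc/sys/vm/overcommit_memory   # 0 = heuristic, 1 = always overcommit, 2 = strict accounting
$ grep -i commit /proc/meminfo         # CommitLimit vs Committed_AS
$ sudo sysctl vm.overcommit_memory=1   # lets fork succeed despite a huge virtual size; adding swap has a similar effect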

jonathanperret Sep 2, 2016

@bioothod it looks like you're hitting the bug that was squashed in 513ec73:

This change updates how we handle long lines of output from the
container. The previous logic used a bufio reader to read entire lines
of output from the container through an intermediate BytesPipe, and that
allowed the container to cause dockerd to consume an unconstrained
amount of memory as it attempted to collect a whole line of output, by
outputting data without newlines.

This commit was made back in May but was not cherry-picked onto the 1.12.x branch, so I guess you'll have to wait for 1.13 to get the fix. I just tested a master dockerd binary from https://master.dockerproject.org/ and the unbounded memory usage does not occur anymore. By the way, here's a simpler test case:

$ docker run --rm busybox dd if=/dev/zero > /dev/null

This ends up crashing the 1.12.1 daemon, works fine on master.

jonathanperret Sep 2, 2016

Final note, for completeness: the issue was tracked as #18057; the commit I referenced (513ec73) was actually part of #22982, which was only merged a month ago.

Contributor

mlaventure commented Sep 2, 2016

jonathanperret Sep 2, 2016

@mlaventure were you trying to mention me? Just curious 😄

Contributor

mlaventure commented Sep 2, 2016

@jonathanperret yes, bad tab completion 😅

bioothod commented Sep 2, 2016

@jonathanperret good to know the logging issue will soon be resolved, thank you. This doesn't explain why, in some rare cases, dockerd returns memory to the system while the application writing to the log is still fairly active, nor what happened in the case where fork failed: resident memory stayed low even while the logging container was active.

jonathanperret Sep 2, 2016

@bioothod As soon as your application writes a \n, the (1.12) log driver can write out the accumulated data and release the memory to the Go runtime; however it may take some time (seconds to minutes) for the Go runtime to release the memory to the OS, making it difficult to correlate memory usage reductions with various events such as a container exiting.

I'm not completely sure I understand what's puzzling you in the other case (…what happened in the case where fork failed: it took resident memory when logging container is active…): how much memory is resident out of the entire memory allocated by the daemon at any point in time, depends on various factors such as how recently it was accessed and the memory pressure from other processes on the box. It does not affect the (in)ability to fork.

Member

tonistiigi commented Sep 2, 2016

Closing this as the original problem of not capping the attach stream was fixed in #17877 (v1.10). There was still an issue with long log lines (#18057) that corresponds to the report from @bioothod. The fix for that is in master (#22982). I've added it to the 1.12.2 milestone in case there will be one.

Thanks for all the help @jonathanperret !

@tonistiigi tonistiigi closed this Sep 2, 2016

stszap commented Oct 24, 2016

@tonistiigi was the fix included in 1.12.2? I can still reproduce the problem in 1.12.2

Contributor

mlaventure commented Oct 24, 2016

@stszap it looks like it didn't make it into 1.12.2

ping @vieux @thaJeztah is this a good candidate for 1.12.3? (Is there even enough time left to add it?)

Member

thaJeztah commented Oct 24, 2016

@mlaventure @stszap we discussed cherry-picking that PR into a patch release for 1.12, but decided to keep it for 1.13 as it's not a regression in 1.12

Contributor

cyphar commented Oct 25, 2016

@thaJeztah It's a bug that massively affects the scalability of Docker (it drastically limits the number of containers running on a server). Your call though.

jmreicha Nov 28, 2016

@thaJeztah When is 1.13 scheduled to be released? What is the workaround until then?

Contributor

ekristen commented Jan 26, 2017

Is this even fixed in 1.13? I'm seeing some interesting memory creep, and eventually docker top starts to fail with the fork/exec error. I'm still trying to figure out where the memory creep is coming from.

Contributor

cyphar commented Jan 26, 2017

@ekristen There is a series of issues within the Go runtime that can cause your memory overcommit to reach very high levels, causing the OOM killer to kill your process (even though it isn't actually using the memory). Maybe that's what's causing the issue? Docker has pprof endpoints for getting pprof output, so you could try investigating those.
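
For reference, a sketch of pulling those profiles over the daemon's local API socket (this assumes the daemon runs with debug enabled, which is what registers the /debug/pprof handlers, and that the daemon binary lives at /usr/bin/dockerd):

$ curl --unix-socket /var/run/docker.sock "http://localhost/debug/pprof/goroutine?debug=2"   # full goroutine dump
$ curl --unix-socket /var/run/docker.sock "http://localhost/debug/pprof/heap" > heap.pprof
$ go tool pprof /usr/bin/dockerd heap.pprof                                                  # inspect the heap profile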

wsilvan commented Jan 28, 2017

This error was resolved after restarting the Docker daemon. That works for me, but the error is intermittent.

Contributor

cyphar commented Jan 29, 2017

@wsilvan That's because restarting a process results in a new address space (without any of the funky pages that the Go runtime lazily allocates). Restarting Docker will "fix" the issue, but reloading all of the containers might also drive memory usage back up (causing the same issue to happen again).

lpcclown Jan 29, 2018

This error was resolved after a restart of the Docker daemon.

The commands are:
sudo service docker stop

sudo service docker start
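
On systemd-based hosts (such as the CoreOS machines earlier in this thread), the equivalent is:

sudo systemctl restart docker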
