cadvisor prevents docker from removing monitored containers? #771

Open
cornelius-keller opened this issue Jun 12, 2015 · 98 comments

Comments


@cornelius-keller cornelius-keller commented Jun 12, 2015

Hi all, I have a problem using cadvisor on CentOS 7. When cadvisor is running, docker fails to remove other containers, saying that the container's filesystem is busy. After cadvisor is stopped, container removal works again.

I demonstrated that in this gist: https://gist.github.com/cornelius-keller/0fd2d23b68ccd88c9328

I also included os version and docker info in the gist.

Collaborator

@rjnagal rjnagal commented Jun 12, 2015

Thanks for reporting, @cornelius-keller

What cadvisor version are you running? Can you get host:port/validate for cadvisor?
Is this a temporary situation, or does the container fs stay busy until you delete cadvisor?

Author

@cornelius-keller cornelius-keller commented Jun 12, 2015

@rjnagal
Cadvisor version is:

[root@583274-app35 ~]# docker images
REPOSITORY                                      TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
docker.io/google/cadvisor                       latest              399ae3c46a0e        47 hours ago        19.89 MB
[root@583274-app35 ~]# 

This is a permanent situation. The container fs stays busy until I delete cadvisor.

What do you mean by getting host:port/validate for cadvisor? Cadvisor was still running and responsive on the web ui if that is what you mean. Unfortunately I can't give you any public host port to validate as cadvisor is only exposed via a vpn.

Collaborator

@rjnagal rjnagal commented Jun 12, 2015

Yeah, I just need the output from the /validate endpoint on the cadvisor UI. You can
scrub any data that's private in there. Thanks


Author

@cornelius-keller cornelius-keller commented Jun 12, 2015

Sorry, it was a long day; I did not realize that this was an endpoint. I added the output to the gist.


@gianlucaborello gianlucaborello commented Jun 23, 2015

I am facing this same issue. Essentially, running cadvisor with --volume=/:/rootfs:ro causes other containers' devicemapper mounts to be mounted inside the cadvisor container, so they can't be properly destroyed when issuing docker rm on the target container as they will appear in use.

How can this be solved?


@hoeghh hoeghh commented Jul 10, 2015

When I run it on Fedora 21, it works fine. But when I run it on Ubuntu 14.04.2 LTS, I get the same error as described above.

Error response from daemon: Cannot destroy container xxx_jenkinsMaster_1230: Driver aufs failed to remove root filesystem 13b421d0458e740e42e5fa5ac1cb68f32638f0bc723d9ba16718955214d79b7d: rename /var/lib/docker/aufs/mnt/13b421d0458e740e42e5fa5ac1cb68f32638f0bc723d9ba16718955214d79b7d /var/lib/docker/aufs/mnt/13b421d0458e740e42e5fa5ac1cb68f32638f0bc723d9ba16718955214d79b7d-removing: device or resource busy

The main difference is that Ubuntu uses AUFS, whereas Fedora uses devicemapper. Maybe that's the problem.


@shredder12 shredder12 commented Aug 28, 2015

@rjnagal I can confirm that this issue happens on Ubuntu trusty x64 with Docker 1.8.1, cadvisor:latest and devicemapper.

'1cb6051b30a1' being the container ID.

# grep -l 1cb6051b30a1 /proc/*/mountinfo
/proc/1963/mountinfo
# ps aux | grep -i 1963
root      1963  1.9  0.8 588740 71688 ?        Ssl  Aug26  30:08 /usr/bin/cadvisor
root     14767  0.0  0.0  11744   952 pts/0    S+   00:56   0:00 grep --color=auto -i 1963

Please suggest a workaround for this.
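
The grep over /proc/*/mountinfo shown above can be wrapped in a small helper that also prints the command name of each process still holding the mount. This is only a sketch: the function name find_mount_holders and its optional second argument (an alternative proc root, handy for dry runs) are assumptions for illustration, not part of cAdvisor or Docker.

```shell
#!/bin/sh
# Print "<pid> <command>" for every process whose mount table still
# mentions the given container ID.
# Usage: find_mount_holders <container-id> [proc-root]
# The optional proc-root argument is a hypothetical override; default /proc.
find_mount_holders() {
    container_id=$1
    proc_root=${2:-/proc}
    for mountinfo in "$proc_root"/[0-9]*/mountinfo; do
        # Skip processes that have exited or that we may not read.
        [ -r "$mountinfo" ] || continue
        if grep -q -F -- "$container_id" "$mountinfo" 2>/dev/null; then
            pid_dir=${mountinfo%/mountinfo}
            pid=${pid_dir##*/}
            # /proc/<pid>/comm holds the short command name on Linux.
            comm=$(cat "$pid_dir/comm" 2>/dev/null || echo '?')
            printf '%s %s\n' "$pid" "$comm"
        fi
    done
}
```

For the case above, find_mount_holders 1cb6051b30a1 would list cadvisor's PID if cadvisor is the only process whose mount table references that container.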

Contributor

@difro difro commented Aug 28, 2015

same here with CentOS + Docker 1.8.1(devicemapper)

Had to remove --volume=/:/rootfs:ro && --volume=/var/lib/docker:/var/lib/docker:ro

Contributor

@vishh vishh commented Aug 28, 2015

@rjnagal: Except for disk usage calculation, cAdvisor does not poke at any
of these directories, right?



@hourliert hourliert commented Oct 5, 2015

Same problem here with Ubuntu 14.04.3.

@difro's solution works, but cadvisor can't provide docker stats anymore.

Any workaround?


@rmetzler rmetzler commented Oct 5, 2015

The last time I ran into this problem, I dug a little into the cAdvisor source code. I'm not 100% sure - because it was a few weeks ago - but this is essentially the gist:

If you use cAdvisor as shown in README.md, you'll mount /var/lib/docker as a volume into the container. This will create dead containers.

The reason cAdvisor wants you to mount /var/lib/docker is - as far as I could see - only to display certain info that is only interesting for admins and should be known beforehand.

Collaborator

@jimmidyson jimmidyson commented Oct 5, 2015

We should be able to get all info from a docker inspect rather than parsing the container config file. Seems like mounting /var/lib/docker is causing more trouble than it's worth.


@svenmueller svenmueller commented Oct 22, 2015

We also encounter the same problem (cadvisor:latest, Ubuntu 14.04).


@svenmueller svenmueller commented Jan 26, 2016

any updates regarding this?

Contributor

@vishh vishh commented Jan 26, 2016

The best we can do for now is to let users optionally disable filesystem
usage metrics. We are waiting for some of the new upstream kernel features
to simplify disk accounting.



@tuxknight tuxknight commented Feb 1, 2016

Same situation.
My Docker Version is 1.9.1
Cadvisor version 0.18.0

And when docker rm fails on a container, the status of that container changes to "dead".
Is it possible to unmount that specific mountpoint when the container status changes to "exited" or "dead"?
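
To see which containers ended up in the "dead" state and retry removing them, something like the following could be used. It is a rough sketch under stated assumptions: remove_dead_containers and the DOCKER override are hypothetical names introduced here, and the only Docker calls relied on are docker ps with --filter status=dead and docker rm. It does not unmount anything; it only retries the removal.

```shell
#!/bin/sh
# Retry removal of every container Docker reports as "dead".
# DOCKER is a hypothetical override for dry runs; it defaults to docker.
remove_dead_containers() {
    failed=0
    for id in $(${DOCKER:-docker} ps --all --quiet --filter status=dead); do
        if ${DOCKER:-docker} rm "$id" >/dev/null; then
            echo "removed $id"
        else
            # Typically "device or resource busy" while the mount is held.
            echo "still busy: $id" >&2
            failed=1
        fi
    done
    # Non-zero exit status if at least one container could not be removed.
    return "$failed"
}
```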


@arhea arhea commented Feb 3, 2016

+1

Contributor

@vishh vishh commented Feb 3, 2016

cAdvisor doesn't mount anything. It runs du periodically to collect
filesystem stats. Other than that, it does not touch the container's
filesystem at all.
The easy fix for this would be to retry docker deletion or disable
filesystem aggregation in cadvisor.
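
The "retry docker deletion" suggestion could be scripted roughly as follows. This is a minimal sketch, not something cAdvisor or Docker ships: remove_with_retry, its default of 5 attempts with a 2 second pause, and the DOCKER override are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Retry `docker rm` a few times, pausing between attempts.
# Usage: remove_with_retry <container> [attempts] [delay-seconds]
# DOCKER is a hypothetical override for dry runs; it defaults to docker.
remove_with_retry() {
    container=$1
    attempts=${2:-5}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if ${DOCKER:-docker} rm "$container"; then
            return 0
        fi
        echo "attempt $i/$attempts failed for $container" >&2
        # Only sleep if another attempt follows.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

Several reports in this thread describe the busy state as permanent while cadvisor is running, so retrying alone may not be enough in those setups.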



@tonysickpony tonysickpony commented Feb 11, 2016

Running cAdvisor without --volume=/:/rootfs:ro seems to fix it,
as pointed out in https://github.com/google/cadvisor/blob/master/docs/running.md
I haven't fully tested it yet, but it has worked fine so far.


@xbglowx xbglowx commented Feb 11, 2016

I had to remove the following volume mounts:

  • /:/rootfs:ro
  • /var/lib/docker/:/var/lib/docker:ro

Setup:

  • Ubuntu 14.04.3 LTS
  • docker 1.9.1 with aufs
  • cAdvisor 0.20.5
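
A docker run invocation without those two mounts might look like the sketch below. The container name, the published port 8080 and the image tag are assumptions for illustration, and DOCKER is a hypothetical override that lets the command be previewed with echo instead of being executed. Other comments in this thread report that cAdvisor loses Docker container stats with reduced mounts, so treat this as a trade-off rather than a fix.

```shell
#!/bin/sh
# Start cAdvisor without the /:/rootfs and /var/lib/docker bind mounts.
# Assumptions: image tag, container name and port 8080 are illustrative.
# DOCKER is a hypothetical override; e.g. DOCKER=echo previews the command.
start_cadvisor_reduced_mounts() {
    ${DOCKER:-docker} run \
        --volume=/var/run:/var/run:rw \
        --volume=/sys:/sys:ro \
        --publish=8080:8080 \
        --detach=true \
        --name=cadvisor \
        google/cadvisor:latest
}
```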

@xbglowx xbglowx commented Apr 14, 2016

Upgraded docker to 1.10.3 and now cAdvisor can only see the docker images, but no containers, if I only use volume mounts:

  • /var/run:/var/run:rw
  • /sys:/sys:ro
  • /var/lib/docker/:/var/lib/docker:ro

If I add /:/rootfs:ro, cAdvisor can see the containers, but I get device or resource busy, when trying to remove any container.

Contributor

@vishh vishh commented Apr 14, 2016

@xbglowx Are you using the latest cadvisor release?


@xbglowx xbglowx commented Apr 15, 2016

Using google/cadvisor:v0.22.0


@jordic jordic commented Apr 16, 2016

Any ideas or suggestions on how I can dig into the issue?

Contributor

@vishh vishh commented Apr 27, 2016

Contributor

@timstclair timstclair commented Apr 28, 2016

I was able to reproduce this locally with docker v1.9.1 and cAdvisor 0.22.0, but only right after starting cAdvisor and only once (removing a second container works). I could not reproduce with docker v1.11.

Is this consistent with everyone else's experience?


@viossat viossat commented Apr 12, 2017

None of the workarounds above worked for me. As with @xbglowx, the issue was solved by upgrading the kernel (from 3.16 to 4.9).


@RRAlex RRAlex commented Apr 12, 2017

@viossat: which storage driver are you using?


@viossat viossat commented Apr 12, 2017

It switched from aufs to overlay2 by itself after the kernel upgrade (Docker 17.04.0-ce).
(overlay is in the mainline from kernel 3.18 and overlay2 is supported from 4.0)


@keyolk keyolk commented Jun 15, 2017

I also got a similar issue with the Prometheus node_exporter:
prometheus/node_exporter#602

It seems that bind mounting a path that includes /var/lib/docker
causes a mount namespace leak.

Both are resolved by running them directly on the host.


@zevarito zevarito commented Jun 15, 2017


@keyolk keyolk commented Jun 21, 2017

@zevarito
I think it can be mitigated if I can specify exactly which volumes are mounted into the container.
What I mean is simply leaving /var/lib/docker/devicemapper out of the mounts.

Could you tell me exactly which host data it uses?

@timstclair timstclair assigned tallclair and unassigned timstclair Jul 7, 2017

@jindov jindov commented Jul 17, 2017

Same issue with my system:

OS: ubuntu 14.04 LTS
Kernel: 3.13.0-48-generic
Docker: 17.04.0-ce

I get this issue when running cadvisor v0.26 in Docker (even with cadvisor:latest). Everything seems OK with node_exporter.


@viossat viossat commented Jul 17, 2017

@jindov Try to upgrade your kernel, you need to switch to the overlay driver. See my previous comment.


@jindov jindov commented Jul 18, 2017

Thanks @viossat. I will try the upgrade in our dev environment, but we can't do this in prod, so I decided to run cadvisor directly on the host. It has worked well.


@garyden garyden commented Aug 13, 2017

How to run cadvisor on the host directly?

Gary


@jindov jindov commented Aug 16, 2017

You can use supervisord to run it directly. This is my configuration for running cadvisor:

[program:cadvisor]
directory=/build/metric_exporter/cadvisor/src/github.com/google/cadvisor
command=/build/metric_exporter/cadvisor/src/github.com/google/cadvisor/cadvisor -port 9080
autostart=true
autorestart=unexpected
redirect_stderr=true
environment=GOROOT="/usr/local/go",GOPATH="/build/metric_exporter/cadvisor",PATH="$GOPATH/bin:$GOROOT/bin:$PATH"

Jin


@stephan2012 stephan2012 commented Sep 15, 2017

Same problem with RHEL 7.4, Docker 17.06.2. Doesn't matter if I'm using ZFS or Overlay2.

Any solution for this by now? Or just run cAdvisor directly on the host?


@amcrn amcrn commented Dec 19, 2017

Hope this helps someone else:

Ubuntu 16.X (kernel 4.4.X) and Docker 1.11.2 w/ AUFS works fine.
Ubuntu 14.X (kernel 3.13.X) and Docker 1.11.2 w/ AUFS exhibits the problem.

So, it looks like overlay isn't necessary; a kernel upgrade is all that's required.


@gengwg gengwg commented Jan 30, 2019

I'm having this issue (can't remove container) on latest version of cadvisor on Centos 7 with kernel version:

$ uname -r
3.10.0-514.26.2.el7.x86_64

Unfortunately I can't upgrade the kernel version, as it is provisioned by our infra team. And we can't upgrade the OS ourselves.

I worked around this issue by running systemctl restart cadvisor; after that, docker rm <container id> worked.
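
That workaround could be scripted along these lines. It is only a sketch: it assumes cAdvisor runs as a systemd unit named cadvisor (as in the command above), and remove_or_restart_cadvisor, DOCKER and SYSTEMCTL are hypothetical names introduced here so the steps can be dry-run.

```shell
#!/bin/sh
# Try `docker rm`; if it fails, restart the cAdvisor service and try once more.
# Assumption: cAdvisor runs as a systemd unit named "cadvisor".
# DOCKER and SYSTEMCTL are hypothetical overrides for dry runs.
remove_or_restart_cadvisor() {
    container=$1
    if ${DOCKER:-docker} rm "$container"; then
        return 0
    fi
    echo "docker rm failed; restarting cadvisor and retrying" >&2
    # Restarting cadvisor drops its references to the container's mounts.
    ${SYSTEMCTL:-systemctl} restart cadvisor || return 1
    ${DOCKER:-docker} rm "$container"
}
```

Restarting the service briefly interrupts metric collection, so this is better suited to occasional cleanup than to every container removal.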
