This repository has been archived by the owner on May 23, 2019. It is now read-only.

Running out of loopback devices #19

Closed · d11wtq opened this issue Jun 10, 2014 · 31 comments

Comments

d11wtq commented Jun 10, 2014

Firstly, thanks for this. While on the surface it's good fun, it's really useful for developing against docker itself (my use case).

I have changed the wrapdocker script slightly to use -s devicemapper instead of aufs, so that I don't need to use a volume for each container (and therefore in theory have more disposable containers).
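
Concretely, the change is to the storage-driver flag on the daemon invocation inside wrapdocker. The exact line varies, but with the Docker of that era it is something like:

# in wrapdocker: start the inner daemon with devicemapper instead of aufs
docker -d -s devicemapper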

However, after I've started and stopped some number of containers (it feels like 10-15, but I haven't counted), docker refuses to start inside any further containers, with this log output:

[error] attach_loopback.go:39 There are no more loopback device available.
loopback mounting failed

At this point, I can't figure out how to clean up whatever is using the loopback devices. Nothing shows up in df -a on the host machine, nor in the current (new) container. So I end up resorting to restarting the entire system.

Do I need to add something else to wrapdocker if I'm using devicemapper? Or would this be considered a bug in docker itself?

d11wtq commented Jun 10, 2014

I think I've figured out what I need to do: I need to umount /var/lib/docker when I exit the container. I wonder what the most reliable way to do this would be (relying on .bash_logout or some such seems risky).

d11wtq commented Jun 10, 2014

Confirmed: Unmounting /var/lib/docker on exit fixes the error.
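
One way to make that automatic (just a sketch, not battle-tested) is an EXIT trap in the script that runs inside the container:

#!/bin/bash
# Release the host's loop devices by unmounting the nested daemon's
# storage when this shell exits, however it exits.
trap 'umount /var/lib/docker' EXIT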

Question: Is this a more general issue that just shows its head early with devicemapper? If you mount something inside the docker container but don't unmount it before you destroy the container, does that mountpoint live on somewhere on the host forever? I'm just wondering if the cgroups mounts also need cleaning up on exit.

calebds commented Jul 12, 2014

I am seeing the same issue when using wrapdocker inside a drone build container. The idea is that my build process outputs a docker container. After about ten builds I get:

[error] attach_loopback.go:42 There are no more loopback device available.
loopback mounting failed

But I am using an unmodified version of wrapdocker, so I'm unsure, @d11wtq, whether your solution is right for me. Again, only restarting my drone machine temporarily fixes the problem. The minimal .drone.yml file that reproduces this is:

image: ...
script:
    - ./.drone/build.sh

Where build.sh is:

#!/bin/bash
wrapdocker &
sleep 5
docker build ...

Any thoughts? Thanks.

rohanpm commented Jul 14, 2014

I also hit a problem with the same symptoms. I can see in the output of losetup -l that the number of used loop devices continues to grow if I start nested docker daemons and don't stop them gracefully (in my case containers are always killed with SIGKILL).

I haven't been able to fix the leaking, but I was able to mitigate it with an addition to the containers. What I found is that sometimes the next available loop device would have a higher number than any of the existing device nodes /dev/loop* within the container, causing docker to give the "There are no more loopback device available" error. The error would occur much earlier than reaching the actual maximum number of loopback devices configured in the kernel.

I was able to fix that particular case by invoking this scriptlet in the container before docker is started, which ensures two free loopback devices exist:

#!/bin/bash
# Ensure a usable loop device node /dev/loop$num exists, creating it if needed.
ensure_loop(){
  num="$1"
  dev="/dev/loop$num"
  if test -b "$dev"; then
    echo "$dev is a usable loop device."
    return 0
  fi

  echo "Attempting to create $dev for docker ..."
  # loop devices are block devices with major number 7
  if ! mknod -m660 "$dev" b 7 "$num"; then
    echo "Failed to create $dev!" 1>&2
    return 3
  fi

  return 0
}

# Find the index of the first free loop device, then make sure that
# device node and the next one both exist.
LOOP_A=$(losetup -f)
LOOP_A=${LOOP_A#/dev/loop}
LOOP_B=$(expr "$LOOP_A" + 1)

ensure_loop "$LOOP_A"
ensure_loop "$LOOP_B"

Maybe it'd be possible to add something like that ^ into wrapdocker, if it helps anyone.

calebds commented Jul 14, 2014

The solution for me appears to be to simply run:

service docker stop

at the end of my build script to gracefully stop the docker daemon started by wrapdocker. Then the number of loopback devices, as shown by losetup -a, no longer grows (by 2) with every drone build.
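
Applied to the build.sh above, that looks something like this:

#!/bin/bash
wrapdocker &
sleep 5
docker build ...
# gracefully stop the nested daemon so its loop devices are released
service docker stop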

calebds commented Jul 21, 2014

I did encounter a situation in which service docker stop results in:

* Stopping Docker: docker
start-stop-daemon: warning: failed to kill 1067: No such process
1 pids were not killed
No process in pidfile '/var/run/docker-ssd.pid' found running; none killed.

In this case I use the following for a graceful shutdown of docker -d:

kill -15 `ps ax | grep "docker -d" | grep -v grep | awk '{print $1}'`
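
An equivalent, somewhat less fragile form (assuming pkill is available in the image) would be:

# match against the full command line of the nested daemon
pkill -TERM -f "docker -d"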

@Cactusbone

I'm using this to stop wrapdocker:

start-stop-daemon --stop --pidfile "/var/run/docker.pid"

It seems the pid used by service docker is not always the same as the one written by wrapdocker :)

@lpereir4

@rohanpm it serves me well, thank you

calebds commented Sep 15, 2014

@rohanpm Works for me too, thank you! I quote your solution here: http://paislee.io/how-to-build-and-deploy-docker-images-with-drone/

@jpetazzo (Owner)

I'll close this issue since it appears to be solved. But feel free to re-open/comment if it's not the case!

Thanks,

@SvenDowideit

@rohanpm I've used your script in moby/moby#9117 - if you can confirm that's ok, that would be great :)

rohanpm commented Nov 13, 2014

@SvenDowideit, sure, no problem. (I saw the later discussion about maybe doing the same thing from golang code too.)

@SvenDowideit

ya :) much +1 to getting something done!

blalor commented Nov 22, 2014

I think this should be reopened and @rohanpm's contribution added to wrapdocker. Running dind on a CentOS 7 host results in the same error from running out of loopback devices.

@jpetazzo (Owner)

Oh, sure. I'll be happy to look at a PR for this. Thanks a lot!

rncry commented Feb 17, 2015

I'm having the same issue running dind on CentOS 7; it can't seem to actually start the docker daemon within the container.

@StefanScherer

The .drone/build.sh script could be enhanced by using bash's trap builtin. The error handler below is called before the script exits, so it is a good place to clean up loop devices.

#!/bin/bash

handle_error() {
  echo "FAILED: line $1, exit code $2"
  echo "Remove loop device here ...."
  exit 1
}

trap 'handle_error $LINENO $?' ERR 

set -e
# start your build here ...

Just my two cents.
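
Combining that idea with the graceful shutdown mentioned earlier, a build script could look like this (a sketch; the wrapdocker invocation and the sleep are carried over from the examples above):

#!/bin/bash
set -e

cleanup() {
  # stop the daemon started by wrapdocker so its loop devices
  # are released even when the build fails
  service docker stop || true
}
trap cleanup EXIT

wrapdocker &
sleep 5
docker build ...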

@jpetazzo jpetazzo reopened this Apr 7, 2015

jpetazzo commented Apr 7, 2015

I'm reopening but I don't use CentOS so I don't know how to help. I hope someone has a better idea!

@t5unamie

I am currently still having "Running out of loopback devices" issues on Ubuntu 14.04 LTS with Docker 1.5 on the host machine.

I have tried the following.

#66

Not sure where I am going wrong with this. Please help.

@ryanwalls

This happens on the standard Amazon Linux AMI on EC2 as well.

derfred commented May 5, 2015

I am seeing this issue on CoreOS 668.2.0. The script by @rohanpm does not work in my instance. I get the following output:

root@slave6:~# ensure_loop $LOOP_A
/dev/loop1 is a usable loop device.
root@slave6:~# ensure_loop $LOOP_B
/dev/loop2 is a usable loop device.

@alexanderilyin

trap works for me alexanderilyin/docker-teamcity-agent@ef0cf6e

sttts added a commit to mesosphere/kubernetes-mesos that referenced this issue Jun 18, 2015
NOTE #1: The dind mesos-slaves require a /var/lib/docker volume in order to use the
aufs driver instead of the lvm-loop one. The latter leaks loop devices when used
with docker-compose (compare jpetazzo/dind#19).

The /var/lib/docker volumes are mounted from the host using /var/tmp/mesosslave1
and /var/tmp/mesosslave2. These can be deleted after a run.

NOTE #2: When using boot2docker on Mac, the /Users directory is mounted into the
boot2docker VirtualBox machine via vboxsf. Such vboxsf mounts are not suitable
as /var/lib/docker inside a dind container; any "docker pull" on them will fail.
@kennethkalmer

Just to add, on CoreOS 717.3.0 this seems to work just fine with loopback devices starting with /dev/loop0...

spg commented Jul 14, 2015

Problem still happening on CoreOS

zoechi commented Jul 28, 2015

+1 Debian, Docker 1.6.2

ghost commented Aug 7, 2015

Same problem on CentOS 7.1. I started a container in which I started another container; the problem occurred when I shut down the outer container without having shut down the inner one.

zoechi commented Aug 17, 2015

The first DinD container now seems to start fine every time, but when I try to start a 2nd DinD container I get the "no more loopback devices" error again.

zoechi commented Aug 17, 2015

I added a retry loop, and on the 3rd attempt both containers start and stay running.
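
A retry loop of that sort might look like this (a sketch; the run command and image name are placeholders, not the exact ones used):

#!/bin/bash
# Retry starting the DinD container a few times, since early attempts
# can fail with "no more loopback devices".
for attempt in 1 2 3; do
  if docker run --privileged -d my-dind-image; then
    break
  fi
  echo "attempt $attempt failed, retrying ..." 1>&2
  sleep 2
done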


jpetazzo commented Sep 3, 2015

For those of you using DinD for CI/testing, please have a look at this new blog post!

@alexanderilyin

@jpetazzo thanks for the post. After several weeks of pain and suffering I've ended up just exposing the socket. I just wanted to hear from someone else that it is not the worst solution.
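
For anyone following along: "exposing the socket" means bind-mounting the host's Docker socket into the container instead of running a nested daemon, roughly like this (a sketch; the image name is a placeholder):

# the docker client in the container talks to the host's daemon,
# so no loop devices are consumed inside the container
docker run -v /var/run/docker.sock:/var/run/docker.sock my-ci-image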

@jpetazzo (Owner)

There is now an official docker:dind image upstream! I invite you to test it, since it is actively maintained. Thank you!
