Running out of loopback devices #19

Closed
d11wtq opened this Issue Jun 10, 2014 · 31 comments

d11wtq commented Jun 10, 2014

Firstly, thanks for this. While on the surface it's good fun, it's really useful for developing against docker itself (my use case).

I have changed the wrapdocker script slightly to use -s devicemapper instead of aufs, so that I don't need to use a volume for each container (and therefore in theory have more disposable containers).

However, after I've started and stopped some number of containers (it feels like 10-15, but I haven't counted), docker refuses to start inside any further containers, with the log output:

[error] attach_loopback.go:39 There are no more loopback device available.
loopback mounting failed

At this point, I can't figure out how to clean up whatever is using the loopback devices. Nothing shows up in df -a on the host machine, nor in the current (new) container. So I end up resorting to restarting the entire system.

Do I need to add something else to wrapdocker if I'm using devicemapper? Or would this be considered a bug in docker itself?
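
For context, the devicemapper switch described above boils down to one flag on the inner daemon. A minimal sketch of the modified wrapdocker line (the variable and log path are assumptions, not actual wrapdocker contents):

# Hypothetical wrapdocker line: run the inner daemon with the
# devicemapper storage driver instead of aufs.
docker -d -s devicemapper $DOCKER_DAEMON_ARGS >/var/log/docker.log 2>&1 &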

d11wtq commented Jun 10, 2014

I think I've figured out what I need to do: I need to umount /var/lib/docker when I exit the container. I wonder what the most reliable way to do this would be (depending on .bash_logout, or some such, seems risky).
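
One approach that avoids .bash_logout would be an EXIT trap in the container's entrypoint script, which fires however the shell exits. A rough sketch, assuming the unmount is safe once the inner daemon has stopped:

#!/bin/bash
# Sketch: release the devicemapper-backed mount when the entrypoint
# exits, however it exits.
cleanup() {
  umount /var/lib/docker 2>/dev/null || true
}
trap cleanup EXIT

wrapdocker    # start the inner daemon / interactive session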

d11wtq commented Jun 10, 2014

Confirmed: Unmounting /var/lib/docker on exit fixes the error.

Question: Is this a more general issue that just shows its head early with devicemapper? If you mount something inside the docker container, but you don't unmount it before you destroy the container, does that mountpoint live somewhere on the host forever? Just wondering if the cgroups mounts also need cleaning up on exit?
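
On the "live forever" question: the loop devices are attached in the host kernel, so they can at least be inspected and detached from the host without rebooting. A sketch (the /var/lib/docker filter is an assumption about what the backing files look like):

# On the host: list attached loop devices and their backing files.
losetup -a

# Detach devices backed by files from destroyed containers' storage;
# losetup -d just reports an error if a device is still in use.
for dev in $(losetup -a | awk -F: '/\/var\/lib\/docker/ {print $1}'); do
  losetup -d "$dev"
done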

paislee commented Jul 12, 2014

I am seeing the same issue when using wrapdocker inside of a drone build container. The idea is that my build process outputs a docker container. After about ten builds I get:

[error] attach_loopback.go:42 There are no more loopback device available.
loopback mounting failed

But I am using an unmodified version of wrapdocker so I'm unsure @d11wtq whether your solution is right for me. Again, only restarting my drone machine temporarily fixes the problem. The minimal .drone.yml file that reproduces this is:

image: ...
script:
    - ./.drone/build.sh

Where build.sh is:

#!/bin/bash
wrapdocker &
sleep 5
docker build ...

Any thoughts? Thanks.

rohanpm commented Jul 14, 2014

I also hit a problem with the same symptoms. I can see in the output of losetup -l that the number of used loop devices continues to grow if I start nested docker daemons and don't stop them gracefully (in my case containers are always killed with SIGKILL).

I haven't been able to fix the leaking but I was able to mitigate it by an addition to the containers. What I found is that sometimes, the next available loop device would be a higher number than the existing device nodes /dev/loop* within the container, causing docker to give the "There are no more loopback device available" error. The error would occur much earlier than reaching the actual max number of loopback devices configured in the kernel.

I was able to fix that particular case by invoking this scriptlet in the container before docker is started, which ensures two free loopback devices exist:

#!/bin/bash
# Ensure /dev/loopN exists; create the node (block device, major 7) if missing.
ensure_loop(){
  num="$1"
  dev="/dev/loop$num"
  if test -b "$dev"; then
    echo "$dev is a usable loop device."
    return 0
  fi

  echo "Attempting to create $dev for docker ..."
  if ! mknod -m660 "$dev" b 7 "$num"; then
    echo "Failed to create $dev!" 1>&2
    return 3
  fi

  return 0
}

# Find the first free loop device, then make sure both it and the next one
# have device nodes inside the container.
LOOP_A=$(losetup -f)
LOOP_A=${LOOP_A#/dev/loop}
LOOP_B=$(expr "$LOOP_A" + 1)

ensure_loop "$LOOP_A"
ensure_loop "$LOOP_B"

Maybe it'd be possible to add something like that ^ into wrapdocker, if it helps anyone.
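
If anyone wants to wire it in, a plausible call site is just before wrapdocker launches the inner daemon (the script path and daemon invocation here are assumptions, not actual wrapdocker contents):

# Hypothetical hook inside wrapdocker, before the daemon starts:
/usr/local/bin/ensure-loop.sh        # the scriptlet above
docker -d $DOCKER_DAEMON_ARGS &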

paislee commented Jul 14, 2014

The solution for me appears to be to simply run:

service docker stop

at the end of my build script, to gracefully stop the docker daemon started by wrapdocker. Then the number of loopback devices reported by losetup -a does not grow (by 2) with every drone build.
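
Folding that into the earlier build.sh with a trap, so the daemon is stopped even when the build fails (a sketch; the build step is a placeholder):

#!/bin/bash
# Always stop the daemon started by wrapdocker, releasing its loop devices,
# no matter how the build exits.
trap 'service docker stop' EXIT

wrapdocker &
sleep 5
docker build .    # placeholder build step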

paislee commented Jul 21, 2014

I did encounter a situation in which service docker stop results in:

* Stopping Docker: docker
start-stop-daemon: warning: failed to kill 1067: No such process
1 pids were not killed
No process in pidfile '/var/run/docker-ssd.pid' found running; none killed.

In this case I use the following for graceful shutdown of docker -d:

kill -15 `ps ax | grep "docker -d" | grep -v grep | awk '{print $1}'`
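
An equivalent with less quoting to get wrong, assuming pkill is available (-f matches against the full command line):

# Send SIGTERM to the daemon by matching its command line.
pkill -TERM -f 'docker -d'
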
Cactusbone commented Aug 18, 2014

I'm using this to stop wrapdocker:
start-stop-daemon --stop --pidfile "/var/run/docker.pid"

it seems the pid file used by service docker is not always the same as wrapdocker's :)

lpereir4 commented Sep 15, 2014

@rohanpm it serves me well, thank you

paislee commented Sep 15, 2014

@rohanpm Works for me too, thank you! I quote your solution here: http://paislee.io/how-to-build-and-deploy-docker-images-with-drone/

jpetazzo (Owner) commented Oct 31, 2014

I'll close this issue since it appears to be solved. But feel free to re-open/comment if it's not the case!

Thanks,

SvenDowideit commented Nov 12, 2014

@rohanpm I've used your script in moby/moby#9117 - if you can confirm that's OK, that would be great :)

rohanpm commented Nov 13, 2014

@SvenDowideit, sure, no problem. (I saw the later discussion about maybe doing the same thing from golang code too.)

SvenDowideit commented Nov 14, 2014

ya :) much +1 to getting something done!

blalor commented Nov 22, 2014

I think this should be reopened and @rohanpm's contribution added to wrapdocker. Running dind on a CentOS 7 host results in the same error due to not enough loopback devices.

jpetazzo (Owner) commented Nov 26, 2014

Oh, sure. I'll be happy to look at a PR for this. Thanks a lot!

rncry commented Feb 17, 2015

I'm having the same issue running dind on CentOS 7; it can't seem to actually start the docker daemon within the container.

StefanScherer commented Feb 23, 2015

The .drone/build.sh script could be enhanced with bash's trap command. With set -e, the error handler runs when a command fails, just before the script exits, so it is a good place to clean up loop devices.

#!/bin/bash

handle_error() {
  echo "FAILED: line $1, exit code $2"
  echo "Remove loop device here ...."
  exit 1
}

trap 'handle_error $LINENO $?' ERR 

set -e
# start your build here ...

Just my two cents.
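
A sketch of what the placeholder could actually do, assuming the daemon was started by wrapdocker and that stopping it gracefully releases its loop devices:

handle_error() {
  echo "FAILED: line $1, exit code $2"
  # Stop the daemon started by wrapdocker so its loop devices are released.
  service docker stop || true
  exit 1
}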

jpetazzo reopened this Apr 7, 2015

jpetazzo (Owner) commented Apr 7, 2015

I'm reopening but I don't use CentOS so I don't know how to help. I hope someone has a better idea!

t5unamie commented Apr 19, 2015

I am currently still having "Running out of loopback devices" issues on Ubuntu 14.04 LTS with Docker 1.5 on the host machine.

I have tried the following:

#66

Not sure where I am going wrong with this. Please help.

ryanwalls commented May 4, 2015

This happens on the standard Amazon Linux AMI on EC2 as well.

derfred commented May 5, 2015

I am seeing this issue on CoreOS 668.2.0. The script by @rohanpm does not work in my instance. I get the following output:

root@slave6:~# ensure_loop $LOOP_A
/dev/loop1 is a usable loop device.
root@slave6:~# ensure_loop $LOOP_B
/dev/loop2 is a usable loop device.

sttts added a commit to mesosphere/kubernetes-mesos that referenced this issue Jun 18, 2015

Add docker-compose-dind.yml variant with Docker-in-Docker mesos-slaves
NOTE #1: The dind mesos-slaves require a /var/lib/docker volume in order to use the
aufs driver instead of the lvm-loop one. The latter leaks loop devices when used
with docker-compose (compare jpetazzo/dind#19).

These /var/lib/docker volumes are mounted from the host using /var/tmp/mesosslave1
and /var/tmp/mesosslave2. They can be deleted after a run.

NOTE #2: When using boot2docker on Mac, the /Users directory is mounted into the
boot2docker VirtualBox machine via vboxsf. Those vboxsf mounts are not sufficient
to mount into a dind container as /var/lib/docker; any "docker pull" will then
fail on those volumes.

kennethkalmer commented Jul 13, 2015

Just to add, on CoreOS 717.3.0 this seems to work just fine with loopback devices starting with /dev/loop0...

spg commented Jul 14, 2015

Problem still happening on CoreOS

zoechi commented Jul 28, 2015

+1 Debian, Docker 1.6.2

eivantsov commented Aug 7, 2015

Same problem on CentOS 7.1.
I started a container where I started another container. This problem occurred when I shut down the first container without having shut down the one inside.

zoechi commented Aug 17, 2015

The first DinD container seems to start fine every time now,
but when I try to start a 2nd DinD container I get the "no more loopback devices" error again.

zoechi commented Aug 17, 2015

I added a retry loop, and on the 3rd attempt both containers start and stay running.
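
For reference, a crude version of such a retry loop (container name and image are placeholders):

#!/bin/bash
# Retry starting a DinD container until it is actually up and stays up.
for attempt in 1 2 3; do
  docker rm -f dind >/dev/null 2>&1
  docker run --privileged -d --name dind jpetazzo/dind
  sleep 5
  if [ "$(docker inspect -f '{{.State.Running}}' dind)" = "true" ]; then
    echo "dind running after attempt $attempt"
    break
  fi
done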

jpetazzo (Owner) commented Sep 3, 2015

For those of you using DinD for CI/testing, please have a look at this new blog post!

alexanderilyin commented Sep 3, 2015

@jpetazzo thx for the post. After several weeks of pain and suffering I've ended up exposing the socket. I just wanted to hear from someone else that it is not the worst solution.

jpetazzo (Owner) commented Sep 16, 2015

There is now an official docker:dind image upstream! I invite you to test it, since it is actively maintained. Thank you!
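
For anyone migrating, the official image runs the same way this one does, in privileged mode; a minimal sketch:

# Run the official Docker-in-Docker image; --privileged is required.
docker run --privileged --name some-docker -d docker:dind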
