No available IPv4 addresses on this network's address pools: bridge #18527

Closed
Soulou opened this issue Dec 9, 2015 · 36 comments
Labels: area/networking, kind/bug

Comments

@Soulou
Contributor

Soulou commented Dec 9, 2015

Hi there, one of my production hosts encountered the following error when starting a new container (creation went well, but the POST /start request failed):

no available IPv4 addresses on this network's address pools: bridge 

The bridge network is the default one giving IPs on 172.17.0.0/16

There are not that many containers on the host (70, see the docker info output below), so I suppose there are plenty of available addresses.
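As a rough sanity check of that estimate, here is a minimal sketch using Python's standard `ipaddress` module. It assumes the pool really is the whole 172.17.0.0/16 range and that one address is held by the bridge gateway; neither assumption is confirmed by the daemon output.

```python
import ipaddress

# Default bridge subnet quoted in the report above.
network = ipaddress.ip_network("172.17.0.0/16")

# Usable host addresses exclude the network and broadcast addresses.
usable_hosts = network.num_addresses - 2

containers = 70  # "Containers: 70" from the docker info output
# Assumption: one more address is taken by the bridge gateway itself.
remaining = usable_hosts - containers - 1

print(usable_hosts, remaining)
```

Even if every listed container held an address, the pool would be far from exhausted, which is why the error looks like an allocator problem rather than a real shortage.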

In the complete log there is a warning about umount ipc, but I suppose it is related to cleaning up the container start after the IP allocation error.

What could be the source of that? Thank you very much.

Complete error log

time="2015-12-09T05:11:25.189277559+01:00" level=info msg="POST /v1.12/containers/8071fd471587152b90353bcdcc182d5609094123c1c1bc48c8ec0c68617b0a7a/start
time="2015-12-09T05:11:25.190917242+01:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/8071fd471587152b90353bcdcc182d5609094123c1c1bc48c8ec0c68617b0a7a/shm: no such file or directory\nfailed to umount /var/lib/docker/containers/8071fd471587152b90353bcdcc182d5609094123c1c1bc48c8ec0c68617b0a7a/mqueue: no such file or directory"
time="2015-12-09T05:11:25.191040536+01:00" level=error msg="Handler for POST /v1.12/containers/8071fd471587152b90353bcdcc182d5609094123c1c1bc48c8ec0c68617b0a7a/start returned error: Cannot start container 8071fd471587152b90353bcdcc182d5609094123c1c1bc48c8ec0c68617b0a7a: no available IPv4 addresses on this network's address pools: bridge (1a238bd06bf71115861570f9a62ae5d047334adc12899f0810d31109836071cd)"
time="2015-12-09T05:11:25.191097543+01:00" level=error msg="HTTP Error" err="Cannot start container 8071fd471587152b90353bcdcc182d5609094123c1c1bc48c8ec0c68617b0a7a: no available IPv4 addresses on this network's address pools: bridge (1a238bd06bf71115861570f9a62ae5d047334adc12899f0810d31109836071cd)" statusCode=500

docker version

Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:12:04 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:12:04 UTC 2015
 OS/Arch:      linux/amd64

docker info

Containers: 70
Images: 254
Server Version: 1.9.1
Storage Driver: btrfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.0-21-generic
Operating System: Ubuntu 14.04.3 LTS
CPUs: 6
Total Memory: 31.42 GiB
Name: <hostname>
ID: 5GQY:KHCJ:3Z7I:EFGX:YAI5:IA4B:56FT:RXFD:WWSM:F7XV:WVKG:2J6D
@GordonTheTurtle

Hi!

Please read this important information about creating issues.

If you are reporting a new issue, make sure that we do not have any duplicates already open. You can ensure this by searching the issue list for this repository. If there is a duplicate, please close your issue and add a comment to the existing issue instead.

If you suspect your issue is a bug, please edit your issue description to include the BUG REPORT INFORMATION shown below. If you fail to provide this information within 7 days, we cannot debug your issue and will close it. We will, however, reopen it if you later provide the information.

This is an automated, informational response.

Thank you.

For more information about reporting issues, see https://github.com/docker/docker/blob/master/CONTRIBUTING.md#reporting-other-issues


BUG REPORT INFORMATION

Use the commands below to provide key information from your environment:

docker version:
docker info:
uname -a:

Provide additional environment details (AWS, VirtualBox, physical, etc.):

List the steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

Describe the results you expected:

Provide additional info you think is important:

----------END REPORT ---------

#ENEEDMOREINFO

@Soulou
Contributor Author

Soulou commented Dec 9, 2015

    {
        "Name": "bridge",
        "Id": "1a238bd06bf71115861570f9a62ae5d047334adc12899f0810d31109836071cd",
        "Scope": "local",
        "Driver": "bridge",
        "IPAM": {
            "Driver": "default",
            "Config": [
                {
                    "Subnet": "172.17.42.1/16",
                    "Gateway": "172.17.42.1"
                }
            ]
        },

More information about the current network

@Soulou
Contributor Author

Soulou commented Dec 9, 2015

List of allocated IPs according to docker network inspect bridge:

"IPv4Address": "172.17.0.1/16"
"IPv4Address": "172.17.0.2/16"
"IPv4Address": "172.17.0.3/16"
"IPv4Address": "172.17.0.4/16"
"IPv4Address": "172.17.0.5/16"
"IPv4Address": "172.17.0.6/16"
"IPv4Address": "172.17.0.7/16"
"IPv4Address": "172.17.0.8/16"
"IPv4Address": "172.17.0.9/16"
"IPv4Address": "172.17.0.10/16"
"IPv4Address": "172.17.0.11/16"
"IPv4Address": "172.17.0.12/16"
"IPv4Address": "172.17.0.13/16"
"IPv4Address": "172.17.0.14/16"
"IPv4Address": "172.17.0.15/16"
"IPv4Address": "172.17.0.16/16"
"IPv4Address": "172.17.0.17/16"
"IPv4Address": "172.17.0.18/16"
"IPv4Address": "172.17.0.19/16"
"IPv4Address": "172.17.0.20/16"
"IPv4Address": "172.17.0.21/16"
"IPv4Address": "172.17.0.22/16"
"IPv4Address": "172.17.0.23/16"
"IPv4Address": "172.17.0.24/16"
"IPv4Address": "172.17.0.25/16"
"IPv4Address": "172.17.0.26/16"
"IPv4Address": "172.17.0.27/16"
"IPv4Address": "172.17.0.28/16"
"IPv4Address": "172.17.0.29/16"
"IPv4Address": "172.17.0.30/16"
"IPv4Address": "172.17.0.31/16"
"IPv4Address": "172.17.0.32/16"
"IPv4Address": "172.17.0.33/16"
"IPv4Address": "172.17.0.34/16"
"IPv4Address": "172.17.0.35/16"
"IPv4Address": "172.17.0.36/16"
"IPv4Address": "172.17.0.37/16"
"IPv4Address": "172.17.0.38/16"
"IPv4Address": "172.17.0.39/16"
"IPv4Address": "172.17.0.40/16"
"IPv4Address": "172.17.0.41/16"
"IPv4Address": "172.17.0.42/16"
"IPv4Address": "172.17.0.43/16"
"IPv4Address": "172.17.0.44/16"
"IPv4Address": "172.17.0.45/16"
"IPv4Address": "172.17.0.46/16"
"IPv4Address": "172.17.0.47/16"
"IPv4Address": "172.17.0.48/16"
"IPv4Address": "172.17.0.49/16"
"IPv4Address": "172.17.0.50/16"
"IPv4Address": "172.17.0.51/16"
"IPv4Address": "172.17.0.52/16"
"IPv4Address": "172.17.0.53/16"
"IPv4Address": "172.17.0.54/16"
"IPv4Address": "172.17.0.55/16"
"IPv4Address": "172.17.0.56/16"
"IPv4Address": "172.17.0.57/16"
"IPv4Address": "172.17.0.58/16"
"IPv4Address": "172.17.0.59/16"
"IPv4Address": "172.17.0.60/16"
"IPv4Address": "172.17.0.61/16"
"IPv4Address": "172.17.0.62/16"
"IPv4Address": "172.17.0.63/16"
"IPv4Address": "172.17.0.64/16"
"IPv4Address": "172.17.0.65/16"
"IPv4Address": "172.17.0.66/16"
"IPv4Address": "172.17.0.67/16"
"IPv4Address": "172.17.0.68/16"
"IPv4Address": "172.17.0.69/16"
"IPv4Address": "172.17.0.70/16"
"IPv4Address": "172.17.0.71/16"
"IPv4Address": "172.17.0.72/16"
"IPv4Address": "172.17.0.73/16"
"IPv4Address": "172.17.0.75/16"
"IPv4Address": "172.17.0.77/16"
"IPv4Address": "172.17.0.80/16"
"IPv4Address": "172.17.0.81/16"
"IPv4Address": "172.17.0.82/16"
"IPv4Address": "172.17.0.83/16"

Definitely not all of the addresses in the pool are allocated.
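A list like the one above can be checked mechanically. The following is a small sketch, not part of any Docker tooling: the helper names `allocated_addresses` and `gaps` are made up for illustration, and the sample text is just the tail of the list above.

```python
import ipaddress
import re

def allocated_addresses(inspect_text):
    """Return the sorted IPv4 addresses found in `docker network inspect` output."""
    found = re.findall(r'"IPv4Address":\s*"([0-9.]+)/\d+"', inspect_text)
    return sorted(ipaddress.ip_address(a) for a in found)

def gaps(addresses):
    """Return addresses missing between the lowest and highest allocated one."""
    if not addresses:
        return []
    present = set(addresses)
    lo, hi = int(addresses[0]), int(addresses[-1])
    return [ipaddress.ip_address(n) for n in range(lo, hi + 1)
            if ipaddress.ip_address(n) not in present]

# Tail of the list posted above, used as sample input.
sample = '''
"IPv4Address": "172.17.0.72/16"
"IPv4Address": "172.17.0.73/16"
"IPv4Address": "172.17.0.75/16"
"IPv4Address": "172.17.0.77/16"
"IPv4Address": "172.17.0.80/16"
'''
addrs = allocated_addresses(sample)
missing = gaps(addrs)
print(len(addrs), [str(m) for m in missing])
```

Feeding it the full inspect output would show how many addresses are actually in use and which low addresses have been released.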

@thaJeztah
Member

Are you using docker-in-docker or running more than one daemon? If you do, this may be a duplicate of #17939

@Soulou
Contributor Author

Soulou commented Dec 9, 2015

No, I'm not doing that. I've seen that issue, but have not detected any similarity.

@thaJeztah
Member

ping @aboch (sorry, I'm kinda losing track of which network issues are still being tracked) 😄

@aboch
Contributor

aboch commented Dec 9, 2015

@Soulou
If your system is still in this state, can you please post the output of:
strings /var/lib/docker/network/files/local-kv.db
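If the `strings` utility is not installed on the host, roughly the same information can be pulled out with a few lines of Python. This is only an approximation of what `strings` does (printable ASCII runs of at least four bytes), and the function name `printable_strings` is made up for this sketch.

```python
import re

def printable_strings(data, min_len=4):
    """Return runs of printable ASCII at least `min_len` bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Small in-memory example; on a real host the bytes would come from
# open("/var/lib/docker/network/files/local-kv.db", "rb").read(),
# which is assumed to require root access.
blob = b"\x00\x01docker/network\x00ab\x00bridge\xff"
print(printable_strings(blob))
```

The output is a plain list of strings, so it can be written to a text file and attached to the issue.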

@Soulou
Contributor Author

Soulou commented Dec 9, 2015

The system is no longer in that state, but I can reproduce it very easily. @aboch, the output is coming.

@Soulou
Contributor Author

Soulou commented Dec 9, 2015

output.txt

Output is attached to the post

@feisuzhu

@Soulou
Same issue here, marking this to follow.
Can you provide a way to reproduce it?

@thaJeztah
Member

ping @aboch did you have time to look at the output @Soulou provided? #18527 (comment)

@Soulou
Contributor Author

Soulou commented Dec 14, 2015

@feisuzhu There is no way to reproduce it on another host. The host which has this problem is still running, and I can reproduce it there (I just have to run a few more containers until I hit this weird limit).

@mavenugo
Contributor

@aboch do you think this is also related to #18145 ?

@aboch
Contributor

aboch commented Dec 14, 2015

@mavenugo
I don't think so. I think this is more related to the local store file getting somehow messed up, but I was not able to confirm either of the two; @Soulou's file has quite a number of configurations.

What I can tell is that it is not the same issue hit in #18113 (comment)

@Soulou In order to debug more deeply what is going on, would it be OK for you to send me (privately) your full /var/lib/docker/network/files/local-kv.db file so that I can analyze it with some instrumented code?

@Soulou
Contributor Author

Soulou commented Dec 15, 2015

@aboch no problem, please send me an email at leo [ at ] scalingo [ dot ] com, I'll send you the file back.

I suppose it's related to some kind of corruption of the allocator, a bit the opposite of #18535: IPs are considered taken when they are not, or something like this.

@psviderski

I have the same issue on my production server; it has occurred twice already. The number of running containers was 216 and it wasn't possible to start a new one. I was able to remove some of the running containers and the issue went away.
When the system was in the stuck state I made a copy of the /var/lib/docker/network/files/local-kv.db file, which I can share if that helps fix the issue.

@aboch
Contributor

aboch commented Dec 23, 2015

@psviderski
Please send me the copy of the file at aboch [at] docker.com

@Soulou
Your mail server has been blocking my email.

@psviderski

@aboch I've just sent you an email with the db file. Please let me know if I need to provide anything else that could help you.

@Soulou
Contributor Author

Soulou commented Dec 24, 2015

@aboch I had to reset the node (as it was one of our production servers) and I forgot to back up the db file :-( If the situation appears again, I'll send it to you directly!

@beetree

beetree commented Jan 2, 2016

I believe I'm having the same issue:

Command:

# docker start d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723
Error response from daemon: Cannot start container d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723: no available IPv4 addresses on this network's address pools: bridge (562b4a92aede2ec01b0797fdf20ebca7121cd1695000de11b47cbfa7108bc062)
Error: failed to start containers: [d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723]

Docker log:

Jan 02 19:26:33 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:33.965622536+01:00" level=info msg="POST /v1.21/containers/d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723/start"
Jan 02 19:26:33 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:33.974207730+01:00" level=warning msg="failed to cleanup ipc mounts:\nfailed to umount /var/lib/docker/containers/d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723/shm: no such file or directory\nfailed to umount /var/lib/docker/containers/d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723/mqueue: no such file or directory"
Jan 02 19:26:34 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:34.005211348+01:00" level=error msg="Handler for POST /v1.21/containers/d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723/start returned error: Cannot start container d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723: no available IPv4 addresses on this network's address pools: bridge (562b4a92aede2ec01b0797fdf20ebca7121cd1695000de11b47cbfa7108bc062)"
Jan 02 19:26:34 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:34.005272128+01:00" level=error msg="HTTP Error" err="Cannot start container d25760b692490b43ba0925ee472bb74a6fa39a0330d06b41e3e09c7ed959e723: no available IPv4 addresses on this network's address pools: bridge (562b4a92aede2ec01b0797fdf20ebca7121cd1695000de11b47cbfa7108bc062)" statusCode=500
Jan 02 19:26:47 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:47.632223089+01:00" level=info msg="POST /v1.21/containers/9e53cf34454b562883f814826cc5d826f3bb18f6889ca650305414728cab9c6a/exec"
Jan 02 19:26:47 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:47.632635812+01:00" level=info msg="POST /v1.21/exec/674132a729976e13a86d58df839e7b07735605bf03f563042c8caacb1af8e74e/start"
Jan 02 19:26:47 Ubuntu-1510-wily-64-minimal docker[3999]: time="2016-01-02T19:26:47.675101492+01:00" level=info msg="GET /v1.21/exec/674132a729976e13a86d58df839e7b07735605bf03f563042c8caacb1af8e74e/json"

System info:

# docker info
Containers: 134
Images: 186
Server Version: 1.9.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 454
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.0-21-generic
Operating System: Ubuntu 15.10
CPUs: 12
Total Memory: 125.9 GiB
Name: Ubuntu-1510-wily-64-minimal
ID: 4ZRV:LJBE:5ACK:TH4J:5LQW:OELG:JP27:AGRH:R22R:H6HX:UNR3:T46C
WARNING: No swap limit support

# docker version
Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:20:08 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 13:20:08 UTC 2015
 OS/Arch:      linux/amd64

# uname -a
Linux Ubuntu-1510-wily-64-minimal 4.2.0-21-generic #25-Ubuntu SMP Wed Dec 2 18:42:25 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

@aboch I'll send you the full local-kv.db on your email.

EDIT: This is the first time I'm seeing this issue in my cluster. Unfortunately, it's blocking container creation on the node, so I'll have to "fix" it by reducing the number of containers. It is unclear whether I can recreate it or not. FWIW, I'm running only 122 containers on the node, whereas I have identical nodes running up to 250 containers without this issue appearing (yet).

thaJeztah added the kind/bug label Jan 2, 2016
@Soulou
Contributor Author

Soulou commented Jan 3, 2016

@beetree if it's the same issue as mine, it will appear again at 250 containers; it was highly reproducible until I restarted the docker daemon.

@aboch
Contributor

aboch commented Jan 3, 2016

Thanks @beetree for the file.
Had some time to decode the inner bitsequence structure from it. I can confirm it is corrupted, with a pattern similar to the one that also caused #18535 (see #18535 (comment)).

The bug will not always occur; only some address release/re-allocate patterns will expose the issue.

As you found, you may be able to keep it working by decreasing the number of containers. But unfortunately you will hit it again on some n-th container creation, as @Soulou already mentioned.

This is now fixed in master.
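To make the failure mode concrete, here is a toy sketch of a bitmap-backed address pool. It is not libnetwork's bitsequence implementation and does not reproduce the actual bug; the `BitmapPool` class is invented for illustration and only shows how stray bits in the allocation map can make a mostly empty pool look full.

```python
class BitmapPool:
    """Toy address pool: bit i set means offset i is considered taken."""

    def __init__(self, size):
        self.size = size
        self.bits = 0

    def allocate(self):
        # Hand out the lowest offset whose bit is clear.
        for i in range(self.size):
            if not self.bits & (1 << i):
                self.bits |= 1 << i
                return i
        raise RuntimeError("no available addresses in pool")

    def release(self, i):
        self.bits &= ~(1 << i)

    def marked(self):
        # Number of offsets the map believes are taken.
        return bin(self.bits).count("1")

pool = BitmapPool(8)
live = {pool.allocate() for _ in range(3)}  # three real allocations

# Simulated corruption: bits are set for offsets that nobody owns.
pool.bits |= 0b11111000

try:
    pool.allocate()
    exhausted = False
except RuntimeError:
    exhausted = True

print(len(live), pool.marked(), exhausted)
```

Releasing one of the real allocations frees exactly one slot, which mirrors the reports in this thread that removing containers helps only until the next few containers are started.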

@Soulou
Contributor Author

Soulou commented Jan 3, 2016

Great, can it be closed then? :)

@thaJeztah
Member

I'll close this issue, as it should be resolved on master and will be in 1.10, but feel free to comment if you're still able to reproduce this on the current master or in the 1.10 release.

thaJeztah added this to the 1.10 milestone Jan 3, 2016
@aboch
Contributor

aboch commented Jan 3, 2016

@Soulou @psviderski
With the instrumented code I checked your files as well. Thanks for sending them.
Same corruption pattern as in @beetree's case.
Thanks @thaJeztah for closing the issue.

@beetree

beetree commented Jan 4, 2016

Thanks @aboch and @thaJeztah for helping us out with this. Knowing that this gets fixed in 1.10 gives a lot of comfort. Keep the improvements coming...!

@psviderski

@aboch @thaJeztah: Thanks! 👍

@beetree

beetree commented Jan 16, 2016

Restarting the daemon solves this, but restarting the daemon can result in the error "Error starting daemon: Error initializing network controller: could not delete the default bridge network". If this happens, just delete /var/lib/docker/network. See this issue for more info: #17083

UPDATE: Deleting /var/lib/docker/network seems to actually not solve it. However, deleting everything in /var/lib/docker does solve it.

/beetree
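Since the maintainers asked for the network store earlier in this thread, it may be worth keeping a copy before deleting anything. A minimal sketch, assuming the daemon is stopped and the script runs with enough privileges; the function name `backup_network_store` is made up, and the demo runs on a throwaway directory rather than on /var/lib/docker.

```python
import shutil
import tempfile
import time
from pathlib import Path

def backup_network_store(src, dest_root):
    """Copy the directory `src` into a timestamped folder under `dest_root`."""
    src = Path(src)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / f"{src.name}-{stamp}"
    shutil.copytree(src, dest)
    return dest

# Demo on a throwaway directory; on a real host `src` would be
# /var/lib/docker/network (root access and a stopped daemon are assumed).
with tempfile.TemporaryDirectory() as tmp:
    fake_store = Path(tmp) / "network"
    (fake_store / "files").mkdir(parents=True)
    (fake_store / "files" / "local-kv.db").write_bytes(b"demo")
    backup_root = Path(tmp) / "backups"
    backup_root.mkdir()
    copy = backup_network_store(fake_store, backup_root)
    copied_ok = (copy / "files" / "local-kv.db").read_bytes() == b"demo"

print(copied_ok)
```

The timestamp in the folder name keeps repeated backups from overwriting each other if the problem comes back.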

@beetree

beetree commented Jan 18, 2016

@aboch @thaJeztah I don't see any mention of this having been fixed in https://github.com/docker/docker/releases/tag/v1.10.0-rc1 Are you sure it is in?

@mavenugo
Contributor

@beetree yes. this is resolved.
I think the changelog is used to communicate important features and enhancements.
Being a bug fix, this would have been left out of changelog.

@thaJeztah
Member

@mavenugo perhaps we can add a mention of "stability improvements to ....", open to suggestions if you have them

@jtangelder

I had the same issue, with only 10 machines running. Fixed it by removing the network and creating it again.

@wsw70

wsw70 commented Mar 22, 2016

@jtangelder how exactly did you "remove the network and create it again"? (I have the same issue)

@jtangelder

Something with docker machine network remove. I'm not at my machine at the moment...

@AndrewGuenther
Contributor

@aboch Could you point me to the commit in libnetwork which fixed this?

@aboch
Contributor

aboch commented Apr 11, 2016
