Packets lost when scaling #1350
Hello, I am seeing connection failures when I scale a service in Docker swarm mode (a single node is enough to reproduce the bug).
To reproduce the bug:
Create a service. Here it is just an nc server that returns the container hostname on port 8080.
docker service create -p 8080:8080 --name=nc nurza/nc bash -c 'hostname>hostname && while true ; do nc -l 8080 < hostname ; done'
On the same system, launch a curl loop.
while true ; do curl 127.0.0.1:8080 ; sleep 0.1 ; done
Then scale the service to a high replica count, such as 30:
docker service scale nc=30
While the service scales, lines like these appear in the loop output for approximately 30 seconds:
548ce9a70d35
bfdc0cc701e1
8bed39426672
7880a6e5167d
curl: (7) Failed to connect to 127.0.0.1 port 8080: Connection refused
c5d5aead5781
127888f96883
e062990c1f2c
9bef031116a5
d81d97c7a679
3143312230b4
043c0ee0b059
93828ebfcb02
719495543f82
548ce9a70d35
bfdc0cc701e1
8bed39426672
7880a6e5167d
curl: (7) Failed to connect to 127.0.0.1 port 8080: Connection refused
d2657ab2594c
7a02ef70551c
36627dd0e197
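To quantify how many requests fail during a scale operation, the loop output can be summarised rather than read by eye. The following is a minimal sketch, not part of the original report: the function name `tally` is hypothetical, and it assumes that each response or error occupies its own line and that refused connections are reported by curl with the `curl: (7)` prefix seen above.

```shell
# Hypothetical helper: summarise curl loop output read from stdin.
# Lines starting with "curl: (7)" count as refused connections;
# any other non-empty line is assumed to be a container hostname.
tally() {
  awk '
    /^curl: \(7\)/ { refused++; next }
    NF             { ok++ }
    END            { printf "ok=%d refused=%d\n", ok, refused }
  '
}
```

For example, if the loop output (including the error lines that curl writes to stderr) was saved to a hypothetical file `curl.log`, running `tally < curl.log` prints a single summary line.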
The same happens when you scale down:
docker service scale nc=2
I have these lines in the loop:
In conclusion: scaling a service results in dropped network packets.
Am I the only one with this problem? Thank you.
EDIT: sometimes when I scale down, the service's network freezes for about 1 minute.
I think the
The TCP connection is in the SYN_SENT state. On my system it retries 6 times, taking around 45 seconds to give up on the connection.
@Nurza @dperny @dongluochen This issue is not in swarmkit. It has been fixed in docker/libnetwork#1370 which will be available in the next patch release of docker engine. Since it is not an issue in swarmkit and since there are already a bunch of issues open for this in docker/docker I am going to close this issue. Please feel free to continue the discussion here if necessary.