Swarm with multiple ingress networks #2637
Comments
ping @ctelfer
I was able to resolve this as follows:
I recall there was an issue in the past where nodes upgraded from an old version did not have the "ingress" attribute set on the ingress network. Were these existing nodes that were upgraded from an older version of Docker (and if so, do you know what version)?
@thaJeztah No, I performed a CloudFormation Stack Update, which means that all old nodes were replaced. Also, the old nodes ran the same Docker version as the new ones.
To answer the first question: no, there should definitely not be two ingress networks present at the same time. My first thought was that this had something to do with some kind of incomplete restoration of the ingress network after a dockerd restart. My second thought was that since
From @Mobe91's last comments, it sounds like something needed to be pruned, whether it was FOO_default or ingress.
It was other manager nodes that showed 1 ingress network. I think I did not check the worker nodes.
Unfortunately, I did not save the output of |
Hi, this seems to be the same bug as mine with Docker 18.06.1. Maybe I should have opened it here: docker/for-linux#424
I encountered the same problem, and after a few tests it seems to be caused by a network with a duplicated name and local scope. It can be reproduced on a fresh Ubuntu 18.04 install:
root@server:~# docker --version
Docker version 18.09.5, build e8ff056
root@server:~# docker swarm init
Swarm initialized: current node (t8qsjoaroynwsxeq9la4f0i5b) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-24qa1rusq46mmgjah41z8pvtyghnlrz9g3u7q49keol2p0r5te-9ek99ggjlgrtnd2iiqy5h44bw 51.15.155.120:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
root@server:~# docker network create --scope=local test_stack_default
02a5d86ad0fa904c268efa3d0debe7efd83d9438eca7288372406c118bce36c4
root@server:~# docker network create --scope=swarm additional
nf74b2kya75mui7zzj5ofdqs8
root@server:~# cat test_stack.yml
version: '3.4'
services:
  test_service:
    image: "traefik"
    networks:
      - default
      - additional
networks:
  additional:
    external: true
root@server:~# docker stack deploy -c test_stack.yml test_stack
Creating network test_stack_default
Creating service test_stack_test_service
root@server:~# docker network ls
NETWORK ID          NAME                 DRIVER              SCOPE
nf74b2kya75m        additional           bridge              swarm
8d04df932427        bridge               bridge              local
7c3601152798        docker_gwbridge      bridge              local
150cf9c0525f        host                 host                local
xqz9vq824pa1        ingress              overlay             swarm
560fc2bccd2e        none                 null                local
02a5d86ad0fa        test_stack_default   bridge              local
lrp4mt5zjwmf        test_stack_default   overlay             swarm
root@server:~# docker service ps test_stack_test_service --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
8je5ervdw0udo880i4ryo0ri9 test_stack_test_service.1 traefik:latest@sha256:02cfdb77b0cd82d973dffb3dafe498283f82399bd75b335797d7f0fe3ebeccb8 server Running Running 1 second ago
root@server:~# service docker restart
root@server:~# docker service ps test_stack_test_service --no-trunc
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
kltfyg26kiydgoextni5kt7kb test_stack_test_service.1 traefik:latest@sha256:02cfdb77b0cd82d973dffb3dafe498283f82399bd75b335797d7f0fe3ebeccb8 server Ready Rejected 4 seconds ago "network nf74b2kya75mui7zzj5ofdqs8 exists"
t1ux7w6ulsildjqwut3mysrp1 \_ test_stack_test_service.1 traefik:latest@sha256:02cfdb77b0cd82d973dffb3dafe498283f82399bd75b335797d7f0fe3ebeccb8 server Shutdown Rejected 9 seconds ago "network nf74b2kya75mui7zzj5ofdqs8 exists"
8je5ervdw0udo880i4ryo0ri9 \_ test_stack_test_service.1 traefik:latest@sha256:02cfdb77b0cd82d973dffb3dafe498283f82399bd75b335797d7f0fe3ebeccb8 server Shutdown Complete 9 seconds ago
root@server:~#
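The reproduction above depends on two networks sharing the name test_stack_default with different scopes. A minimal sketch for spotting such name collisions on a node, assuming the `--format` option of `docker network ls`; the helper name `duplicate_network_names` is made up for illustration and is not part of Docker:

```shell
#!/bin/sh
# Print every network name that appears more than once.
# Reads one "NAME SCOPE" pair per line on stdin.
duplicate_network_names() {
    awk '{ print $1 }' | sort | uniq -d
}

# Usage on a live node (requires a running Docker daemon):
#   docker network ls --format '{{.Name}} {{.Scope}}' | duplicate_network_names
```

Fed the network list from the transcript above, this would print only test_stack_default. Whether removing the local-scope duplicate avoids the rejected tasks is not confirmed in this thread.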
This just happened to me again.
I checked this time. Both ingress networks are marked as "Ingress: true" in the output of |
FYI @cypx, I checked with 19.03.5 and still see the same behavior as in your result.
I noticed that my swarm has 2 ingress networks:
I think as a consequence, one of my services fails to start - it remains in state `starting` forever and when I `docker inspect` the corresponding container, it says:
So it seems that it fails to find one of the 2 ingress networks.
EDIT 1
I checked my other running services and none of them uses ingress network `mfoezf9fniby`. So I tried `docker network rm mfoezf9fniby` but this fails with `Error response from daemon: network mfoezf9fnibyov8ps098ngvjy not found`. After that, running `docker network ls` still shows the 2 ingress networks.
EDIT 2
Running `docker network ls` on a different node only lists 1 ingress network (network `mfoezf9fniby` is gone). So it seems that the node on which the service task fails has stale data?
Inspecting docker.log on the corrupt node constantly shows the following entries:
I tried `docker rm -f fc016a345607573568b64824f6a40dcc2226b4620641b5cad8613558d92d5809`, which completed successfully. It turns out that this container was the service task that was in starting state forever. The service deployment then picked a different node automatically and launched a new service task. But again, the service could not be started. I ran `docker network ls` on the newly picked node and again, 2 ingress networks were shown (both with the same IDs as on the original node). And again, the service could not be started.
I should also mention that I am using docker-for-aws - don't know if that matters.
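To compare nodes as described above, one could count how many networks each node reports with the ingress flag set. A rough sketch, assuming `docker network inspect` exposes that flag to Go templates as `.Ingress` (it appears as "Ingress" in the JSON output); `count_ingress` is a hypothetical helper, not a Docker command:

```shell
#!/bin/sh
# Count networks flagged as ingress.
# Reads "NAME ID INGRESS" lines on stdin, where INGRESS is true or false.
count_ingress() {
    awk '$3 == "true" { n++ } END { print n + 0 }'
}

# Usage on a live node (requires a running Docker daemon):
#   docker network ls -q --no-trunc \
#     | xargs docker network inspect --format '{{.Name}} {{.ID}} {{.Ingress}}' \
#     | count_ingress
```

A result greater than 1 would match the situation reported here, where a node lists two ingress networks at once.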