docker service update messes up VIP tables and "tasks" DNS entries #26772

Closed
evanp opened this Issue Sep 21, 2016 · 23 comments


evanp commented Sep 21, 2016

Multiple docker service update calls make the VIP tables for the overlay network incorrect, and mess up the DNS lookups for tasks.<service name>.

Description

I noticed connectivity problems between services in my cluster. By launching a terminal and using curl and dig to examine the service names and "tasks" round-robin names, I realized that the map of IP addresses was incorrect.

Steps to reproduce the issue:

  1. Create a 3-node swarm with a single manager and an encrypted overlay network testnet. I used docker-machine with the digitalocean driver.
parallel docker-machine create --driver digitalocean --digitalocean-access-token xxxxxxxxxx --digitalocean-image ubuntu-16-04-x64 --digitalocean-region sfo1 --digitalocean-private-networking --digitalocean-size 512mb test\{\} ::: 20 21 22
docker $(docker-machine config test20) swarm init --advertise-addr xx.xx.xx.xx
docker $(docker-machine config test21) swarm join --token SWMTKN-1-xxxxxxx xx.xx.xx.xx:2377
docker $(docker-machine config test22) swarm join --token SWMTKN-1-xxxxxxx xx.xx.xx.xx:2377
docker $(docker-machine config test20) node ls
# Output:
#ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
#4qfqe8ru5z5afme3003m4luk5 *  test20    Ready   Active        Leader
#ajaw2v30hidqgahivczw146lf    test21    Ready   Active        
#auf08eohiu8lbx210piriqi7w    test22    Ready   Active        
docker $(docker-machine config test20) network create --driver overlay --opt encrypted testnet
  2. Create three simple web servers that show HTML with a line of text defined in an env var, and a terminal to query them.
for i in `seq 11 13`; do docker $(docker-machine config test20) service create --network testnet --name web${i} --env LINE="Test server ${i}" --replicas 3 fuzzyio/show-line; done
docker $(docker-machine config test20) service create --network testnet --name terminal ubuntu sleep 365d
  3. Within the terminal, query the web11 and tasks.web11 (and 12 and 13) DNS entries with dig, and check the output from curl.
for i in `seq 11 13`; do curl http://web${i}/; done
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 12</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 13</p>
  </body>
</html>
root@d3026eb35a1d:/# dig tasks.web11

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web11
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27789
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web11.           IN  A

;; ANSWER SECTION:
tasks.web11.        600 IN  A   10.0.0.3
tasks.web11.        600 IN  A   10.0.0.4
tasks.web11.        600 IN  A   10.0.0.5

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:09:07 UTC 2016
;; MSG SIZE  rcvd: 110

root@d3026eb35a1d:/# dig tasks.web12

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web12
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15666
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web12.           IN  A

;; ANSWER SECTION:
tasks.web12.        600 IN  A   10.0.0.9
tasks.web12.        600 IN  A   10.0.0.7
tasks.web12.        600 IN  A   10.0.0.8

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:09:11 UTC 2016
;; MSG SIZE  rcvd: 110

root@d3026eb35a1d:/# dig tasks.web13

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web13
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35157
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web13.           IN  A

;; ANSWER SECTION:
tasks.web13.        600 IN  A   10.0.0.13
tasks.web13.        600 IN  A   10.0.0.11
tasks.web13.        600 IN  A   10.0.0.12

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:09:13 UTC 2016
;; MSG SIZE  rcvd: 110
  4. Do multiple service update calls per service. I did 19 updates, just changing the LINE environment variable.
for j in `seq 2 20`; do for i in `seq 11 13`; do docker $(docker-machine config test20) service update --env-add LINE="Test server ${i} update ${j}" web${i}; done; done
  5. Within the terminal service task container, again use curl and dig to review the web11 and tasks.web11 DNS entries and the web output.

Describe the results you received:

The lookup on web11 remained correct, but tasks.web11 returned far too many IP addresses for a service at scale=3.

dig tasks.web11

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web11
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23513
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web11.           IN  A

;; ANSWER SECTION:
tasks.web11.        600 IN  A   10.0.0.11
tasks.web11.        600 IN  A   10.0.0.8
tasks.web11.        600 IN  A   10.0.0.3
tasks.web11.        600 IN  A   10.0.0.16
tasks.web11.        600 IN  A   10.0.0.14
tasks.web11.        600 IN  A   10.0.0.15
tasks.web11.        600 IN  A   10.0.0.4
tasks.web11.        600 IN  A   10.0.0.5

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:11:53 UTC 2016
;; MSG SIZE  rcvd: 245
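
The mismatch above (8 answers for a 3-replica service) can be checked mechanically. The following is a minimal sketch, not part of the original report: `count_answers` is a hypothetical helper name, and it assumes dig's default output layout as shown in this issue (an `ANSWER SECTION` header followed by `name TTL IN A address` lines and a blank line).

```shell
# count_answers: read full dig output on stdin and print the number of
# A records listed in the ANSWER SECTION.
# (Hypothetical helper; assumes the default dig layout shown in this issue.)
count_answers() {
  awk '
    /^;; ANSWER SECTION:/  { in_answer = 1; next }  # start of the answer block
    in_answer && NF == 0   { in_answer = 0 }        # a blank line ends the block
    in_answer && $4 == "A" { n++ }                  # fields: name TTL IN A address
    END { print n + 0 }                             # prints 0 when nothing matched
  '
}

# Possible usage inside the terminal container (not run here):
#   dig tasks.web11 | count_answers
```

Comparing the printed number with the configured replica count gives a quick signal that stale entries are present, although it does not say which entries are stale.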

curl sporadically failed to connect, or connected to web servers belonging to different services.

for i in `seq 1 10`; do curl --connect-timeout 1 http://web11/; done   
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 19</p>
  </body>
</html>curl: (28) Connection timed out after 1001 milliseconds
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 12 update 20</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 13 update 20</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 20</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 20</p>
  </body>
</html>curl: (28) Connection timed out after 1000 milliseconds
curl: (28) Connection timed out after 1001 milliseconds
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 19</p>
  </body>
</html>curl: (28) Connection timed out after 1001 milliseconds

Describe the results you expected:

At scale=3, a lookup on tasks.web11 should return 3 IP addresses.

And the curl results (using the web11 name, which points to the VIP) should only return HTML from the server 11 service task containers.
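
The second expectation can be turned into a small check on the curl output. This is only a sketch under stated assumptions: `foreign_servers` is a hypothetical helper, and it relies on the `Test server <n>` text that the test images in this report print. It does not detect the connection timeouts, which curl reports separately on stderr.

```shell
# foreign_servers EXPECTED: read concatenated HTML responses on stdin and
# print the distinct server numbers that differ from EXPECTED.
# (Hypothetical helper; relies on the "Test server <n>" line in the HTML.)
foreign_servers() {
  expected=$1
  grep -o 'Test server [0-9][0-9]*' \
    | awk -v want="$expected" '$3 != want { print $3 }' \
    | sort -u
}

# Possible usage inside the terminal container (not run here):
#   for i in $(seq 1 10); do curl -s --connect-timeout 1 http://web11/; done \
#     | foreign_servers 11
```

An empty result means every response that arrived came from the expected service. Any number printed identifies a service whose tasks answered on the wrong VIP.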

Additional information you deem important (e.g. issue happens only occasionally):

The /proc/net/ip_vs output is attached.

I think this situation can arise with the ingress overlay network, too.

proc-net-ip_vs.txt
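
For readers comparing /proc/net/ip_vs against the dig results: the kernel prints IPv4 addresses there as 8 hex digits followed by a hex port (for example `0A000003:0000`), with real servers on lines starting with `->`. The attachment's contents are not reproduced in this thread, so the sketch below works from that general format only. The helper names `hex_to_ip` and `list_real_servers` are hypothetical.

```shell
# hex_to_ip HEX: convert an 8-digit hex IPv4 address, as printed in
# /proc/net/ip_vs, to dotted-quad form. (Hypothetical helper.)
hex_to_ip() {
  h=$1
  printf '%d.%d.%d.%d\n' \
    "0x$(printf '%s' "$h" | cut -c1-2)" \
    "0x$(printf '%s' "$h" | cut -c3-4)" \
    "0x$(printf '%s' "$h" | cut -c5-6)" \
    "0x$(printf '%s' "$h" | cut -c7-8)"
}

# list_real_servers: read /proc/net/ip_vs-style text on stdin and print the
# decoded address of every real-server ("->") entry.
list_real_servers() {
  while read -r arrow addr _rest; do
    [ "$arrow" = "->" ] || continue
    case $addr in
      RemoteAddress:*) continue ;;  # skip the column header line
    esac
    hex_to_ip "${addr%%:*}"         # drop the ":port" suffix before decoding
  done
}

# Possible usage inside the relevant network namespace (not run here):
#   list_real_servers < /proc/net/ip_vs
```

The decoded list can then be compared by eye with the addresses returned for tasks.web11.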

Output of docker version:

Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:33:38 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:33:38 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 16
 Running: 3
 Paused: 0
 Stopped: 13
Images: 1
Server Version: 1.12.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 42
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null host bridge overlay
Swarm: active
 NodeID: 4qfqe8ru5z5afme3003m4luk5
 Is Manager: true
 ClusterID: 5blzvlhz9l1f1uvpouwa9yzht
 Managers: 1
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 104.131.144.235
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-38-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 488.5 MiB
Name: test20
ID: DGRP:JBSI:O6PC:G2WE:ZGSI:EXKT:PWB6:WT5M:BVTQ:WYTY:BFWH:7UZH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: evanp
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 provider=digitalocean
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

Contributor

xiaods commented Sep 21, 2016

I found existing issues that almost match the one you report:

#25394
#26480

Author

evanp commented Sep 21, 2016

@xiaods almost, but not quite. #26480 says the names aren't available on different hosts. That's not what's happening with this issue; the names are available on all hosts. #25394 says that the round-robin isn't routing to tasks on different hosts; I'm not checking that in this issue.

Neither of those issues involves an update, so I think this issue is different from both.

One thing to note: I just tried updating my test servers to 1.13-dev, and doing a few dozen updates, and the problem does not occur! I see a small list of IPs in tasks.web11, and a bunch of curl calls come back correctly.

I'm going to try testing a little more to see if I can make this happen again in 1.13-dev, but if not I'll close this bug.

Contributor

xiaods commented Sep 21, 2016

@evanp I also came across this annoying VIP-related issue, so I'll wait for your test results.

Member

thaJeztah commented Sep 21, 2016

ping @mrjana ptal

Contributor

mrjana commented Sep 21, 2016

@evanp Thanks for taking the effort to test these with docker/docker master code. Many more fixes have been added there, so please try to reproduce this problem there and let us know.

Author

evanp commented Sep 21, 2016

@mrjana We upgraded our 90-node cluster this morning and saw a great improvement.

However, later in the day after a few updates, we're again seeing this error. I'm going to see if I can get some more detail.

Author

evanp commented Sep 22, 2016

Also, as of right now the only way I see to repair this situation once it has arisen is to burn the cluster. It might be possible to just remove all the services, remove the network, and then re-add the network and all the services.

It would be nice if you could use ipvsadm to edit and tune the ip_vs network, but I haven't been able to make that work. Even if I manage to nsenter into the right network namespace on one node, and make changes to the ipvs tables, it doesn't propagate to other nodes.

Probably the most frustrating part of this situation is that the data is available and correct in requests like docker service inspect <serviceid> or docker inspect <taskid>. I wrote a quick script for extracting the data here: https://gist.github.com/evanp/8be3e3536dcb27bf16f8d47cb4c93cf5. It would be interesting to put together a script that could extract that data and then use ipvsadm to sync it with the ip_vs network.
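
The first half of such a sync script, finding which addresses are stale, can be sketched without touching Docker or ipvsadm at all. The helper below is hypothetical and assumes two plain-text files with one address per line. Producing the "expected" file from docker inspect and the "actual" file from dig or ip_vs is left out, since the exact extraction is not shown in this thread.

```shell
# stale_ips EXPECTED_FILE ACTUAL_FILE: print addresses that appear in
# ACTUAL_FILE but not in EXPECTED_FILE, one per line.
# (Hypothetical helper; both inputs are plain lists of addresses.)
stale_ips() {
  expected_sorted=$(mktemp)
  actual_sorted=$(mktemp)
  sort -u "$1" > "$expected_sorted"
  sort -u "$2" > "$actual_sorted"
  # comm -13 suppresses lines unique to the first file and lines common to
  # both, leaving only lines unique to the second (actual) file.
  comm -13 "$expected_sorted" "$actual_sorted"
  rm -f "$expected_sorted" "$actual_sorted"
}

# Possible usage (file names are placeholders):
#   stale_ips expected-web11.txt actual-web11.txt
```

This only reports the difference. Removing the stale entries on every node is the part that, as described above, did not propagate when attempted by hand.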

I'm going to retry the test scenario outlined above with 1.13-dev and see if I can replicate it and possibly get some debug information from logs when it occurs.

Author

evanp commented Sep 22, 2016

So, I spent some time this morning trying to replicate the error, and I couldn't do it in a test environment as described above.

My next step is to set the debugging flag on a Docker node in our production environment and then do an update on a service in that node. If it causes the same problem, we'll at least have the logs necessary to explain it.

Contributor

xiaods commented Sep 26, 2016

@evanp Waiting for your update.

Member

aluzzardi commented Sep 27, 2016

@evanp Thanks for the detailed information!

We're actually in the process of building 1.12.2-rc1 today which contains the fixes found in 1.13-dev.

In a few hours you should be able to try that version (and the official 1.12.2 should be out in a matter of weeks).

MichaelW-SD commented Sep 27, 2016

Don't know if it helps, but I am seeing this problem in a situation where docker service update fails. I have a service that consists of two containers, but after a failed update (due to errors in the image) dig tasks.<service> shows me 3 IP addresses. Removing and recreating the service does not fix the problem.
The most annoying part is that there is no manual way to fix an issue in the internal DNS.

Author

evanp commented Sep 29, 2016

@MichaelW-SD what do you mean by "fail"? How does the update fail when there are errors "in the image"?

Contributor

clhlc commented Sep 29, 2016

In my test environment, I see the same error when updating a service. In my case, I recreated the service several times, and dig then returned several different VIPs. Is this fixed in Docker 1.13?

woyorus commented Sep 29, 2016

@clhlc looks like 1.12.2-rc1 fixed the problem in my particular case.

Member

thaJeztah commented Sep 29, 2016

If others are able to test if 1.12.2-rc1 is fixing this, that would be great (note of course, it's an RC, so generally we don't recommend testing it on critical / production systems) https://github.com/docker/docker/releases/tag/v1.12.2-rc1

MichaelW-SD commented Sep 29, 2016

@evanp In my case I had a faulty image so that containers based on that image would not start. docker service update keeps trying forever to update the containers but it never succeeds, so eventually you have to stop the update process. After that DNS was not cleaned up. After I fixed the image and started the service again I had 2 containers but 3 IP addresses. 2 addresses were correct and one was a leftover from before the update.

I will test this with 1.12.2-rc1.

MichaelW-SD commented Oct 1, 2016

1.12.2-rc1 fixed this for me.

Member

thaJeztah commented Oct 1, 2016

@evanp was this fixed for you as well on 1.12.2-rc?

Author

evanp commented Oct 1, 2016

So, we still saw this error with 1.13-dev as of this morning. We've been unable to get any purchase on the bug, and so we're regretfully moving to another clustering tool. I'm happy to help out with this bug if there's anything further I can do, but we no longer have a production cluster running Docker 1.12.x in swarm mode.

Also, feel free to close this bug if there aren't others seeing the same problem.

Member

aluzzardi commented Oct 1, 2016

/cc @mrjana

Contributor

mrjana commented Oct 1, 2016

@evanp When you tried 1.13-dev from this morning can you tell me what failures you had? Did you have incorrect tasks.<svc> response or did you have connectivity problems or both? Although you described your problem in detail as part of opening this issue it may not be exactly the same issue you are experiencing now. That's why I am asking for the details. Can you give a simple set of repro steps of how you encountered issues with latest 1.13-dev? Also if you can, can you attach daemon logs from one of the problem nodes? Thanks much for your help in this.

garthk commented Oct 5, 2016

This issue smells like it's within cooee of #25266. My temperature is certainly within cooee of whoever on @evanp's team declared Swarm unfit for production. I'll try pounding docker service update. Meanwhile, I trust @mrjana is prowling the corridors looking for whoever wrote all this code without enough instrumentation to troubleshoot issues.

Contributor

mrjana commented Oct 12, 2016

@garthk The instrumentation is in the daemon error logs and, as you can see, every issue that is fixed in 1.12.2 is based on such instrumentation. That is why I am asking for daemon logs. Do you have any from problem nodes, so that we can confirm whether this is already fixed in 1.12.2?
