
Swarm mode: number of replicas exceeding the desired count #36553

Open
vce-xx opened this issue Mar 10, 2018 · 2 comments


vce-xx commented Mar 10, 2018

Description

Notice below that docker stack services reported 2/1 replicas.

 vce > docker stack services blue
ID                  NAME                MODE                REPLICAS            IMAGE                                            PORTS
...
lm3nueuigt0x        blue_info           replicated          2/1                 xxx/info:723                                     *:30018->80/tcp
...
 vce > docker service ps blue_info
ID               NAME                IMAGE            NODE      DESIRED STATE     CURRENT STATE           ERROR               PORTS
jns2wmbzl9vx     blue_info.1         xxx/info:721     node1     Shutdown          Shutdown 14 hours ago
0b5dhaw8g1kb      \_ blue_info.1     xxx/info:626     node1     Shutdown          Running 4 days ago
zrju1v7w3f68      \_ blue_info.1     xxx/info:622     node1     Shutdown          Shutdown 4 days ago
l29nz3bed8pg     blue_info.2         xxx/info:723     node2     Running           Running 2 minutes ago
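
A quick way to cross-check the 2/1 count with standard docker CLI format flags (blue_info is the service from the output above):

 # desired replica count from the service spec (prints 1 here)
 docker service inspect blue_info --format '{{.Spec.Mode.Replicated.Replicas}}'

 # every task whose current state is Running, regardless of desired state
 docker service ps blue_info --format '{{.ID}} {{.DesiredState}} {{.CurrentState}}' | grep -i running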

Steps to reproduce the issue:

No recipe to reproduce yet.
What I have done:

  1. The info service was defined as described below
  2. The stack was deployed and redeployed over and over with docker stack deploy -c (a sketch of this loop follows the compose file below)
  3. Downscaled the managers from 3 to 1 at some point
  4. Continued to redeploy until I noticed this oddity
version: "3.5"

services:

  info:
    image: xxx/info:${TAG}
    deploy:
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3
        window: 15s
    healthcheck:
      interval: 10s
    networks:
      - mynet
    ports:
      - 80
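
A minimal sketch of the redeploy loop from step 2, assuming the compose file above is saved as stack.yml and the stack is named blue (file name and tag values are hypothetical; the image tag is injected through the ${TAG} variable the compose file already uses):

 # hypothetical redeploy loop: each pass rolls the service to a new image tag
 for tag in 721 722 723; do
   TAG=$tag docker stack deploy -c stack.yml blue
   sleep 30   # let the previous rollout settle before the next one
 done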

Describe the results you received:

Two tasks are in Running state, even though the desired replica count is 1.

Describe the results you expected:

1 task in Running state is expected.
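
Not from the original report, but two standard commands that should make the orchestrator re-converge on the declared state; whether they actually clear the extra task in this situation is untested:

 # force the orchestrator to re-evaluate and restart the service's tasks
 docker service update --force blue_info

 # or re-assert the replica count explicitly
 docker service scale blue_info=1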

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

docker version
Client:
 Version:	18.03.0-ce-rc1
 API version:	1.35 (downgraded from 1.37)
 Go version:	go1.9.4
 Git commit:	c160c73
 Built:	Thu Feb 22 02:34:03 2018
 OS/Arch:	darwin/amd64
 Experimental:	false
 Orchestrator:	swarm

Server:
 Engine:
  Version:	17.12.0-ce
  API version:	1.35 (minimum version 1.12)
  Go version:	go1.9.2
  Git commit:	c97c6d6
  Built:	Wed Dec 27 20:12:30 2017
  OS/Arch:	linux/amd64
  Experimental:	true

Output of docker info:

Containers: 19
 Running: 16
 Paused: 0
 Stopped: 3
Images: 52
Server Version: 17.12.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: awslogs
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: onoskypeib1tvsu0nheyjwdsg
 Is Manager: true
 ClusterID: mgn365vf830acn90tf81ftmvv
 Managers: 1
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: xxx
 Manager Addresses:
  xxx
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 89623f28b87a6004d4b785663257362d1658a729
runc version: b2567b37d7b75eb4cf325b77297b140ea686ce8f
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.75-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.853GiB
Name: xxx
ID: xxx
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 249
 Goroutines: 429
 System Time: 2018-03-10T09:14:34.020756034Z
 EventsListeners: 12
Registry: https://index.docker.io/v1/
Labels:
 availability_zone=xxx
 instance_type=xxx
 node_type=manager
 os=linux
 region=xxx
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
Swarm cluster with only one manager in this occurrence.

@rnataraja

Seen on 17.06.2-ce too. Potentially the trigger was an NTP time-sync difference of about 15 minutes.
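
A simple way to check for the suspected skew, assuming SSH access to the nodes (node names are hypothetical):

 # print UTC time from each node; a spread of minutes would support the NTP theory
 for n in node1 node2 node3; do
   printf '%s: ' "$n"; ssh "$n" date -u
 done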


vce-xx commented Mar 23, 2018

In this case it happened with Docker for AWS.
Can time desync happen with Docker for AWS?
Does Docker for AWS use the Amazon Time Sync Service?
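
For reference, the Amazon Time Sync Service answers at the link-local address 169.254.169.123. A possible check on a node, assuming the host runs chrony (whether the Docker for AWS AMI ships chrony is exactly the open question):

 # list configured time sources; the Amazon service would appear as 169.254.169.123
 chronyc sources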
