
[1.12.0] docker volume (nfs) is not being mounted in all service replicas [swarm-mode] #25202

Closed
tiagoalves83 opened this issue Jul 29, 2016 · 21 comments

Comments

@tiagoalves83

Output of docker version:

Client:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 23:54:00 2016
 OS/Arch:      darwin/amd64

Server:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 23:54:00 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 3
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host null overlay
Swarm: active
 NodeID: 4ib33hyv2maivyi76a7xloffs
 Is Manager: true
 ClusterID: ci7yzh2bt1xw848mn9mx3g56k
 Managers: 1
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot interval: 10000
  Heartbeat tick: 1
  Election tick: 3
 Dispatcher:
  Heartbeat period: 5 seconds
 CA configuration:
  Expiry duration: 3 months
 Node Address: 192.168.99.103
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.4.16-boot2docker
Operating System: Boot2Docker 1.12.0 (TCL 7.2); HEAD : e030bab - Fri Jul 29 00:29:14 UTC 2016
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 492.6 MiB
Name: node-1
ID: PI2H:ZR5O:M65T:UOSX:XXYE:NMGO:P3G4:HBNT:F5AX:6ATC:WRAU:TLBA
Docker Root Dir: /mnt/sda1/var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 35
 Goroutines: 136
 System Time: 2016-07-29T01:59:55.599976175Z
 EventsListeners: 1
Registry: https://index.docker.io/v1/
Labels:
 provider=virtualbox
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):
docker-machine v 0.8.0
virtualbox
Mac OSX

Steps to reproduce the issue:

  1. Create a 3 node swarm mode cluster (1 Manager 2 Workers)
  2. Create a Volume (local nfs)
  3. Create a Service with 3 replicas that uses the created Volume

Describe the results you received:

$ showmount -e 192.168.99.1
Exports list on 192.168.99.1:
/Volumes/HDD/tmp                    192.168.99.0

$ docker-machine ls
NAME     ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER    ERRORS
node-1   *        virtualbox   Running   tcp://192.168.99.103:2376           v1.12.0   
node-2   -        virtualbox   Running   tcp://192.168.99.104:2376           v1.12.0   
node-3   -        virtualbox   Running   tcp://192.168.99.105:2376           v1.12.0   

$ docker node ls
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
0hppa4e08zz5f0m4yntmq5r2f    node-2    Ready   Active        
4ib33hyv2maivyi76a7xloffs *  node-1    Ready   Active        Leader
523g37vwkhjgzc5h1tmt7rtj5    node-3    Ready   Active        

# Testing manual NFS mounting in each Docker Host.
$ docker-machine ssh node-1 sudo mkdir -p /mnt/nfstest
$ docker-machine ssh node-1 sudo mount -t nfs 192.168.99.1:/Volumes/HDD/tmp /mnt/nfstest
$ docker-machine ssh node-1 sudo ls -la /mnt/nfstest # OK LISTS ALL FILES

$ docker-machine ssh node-2 sudo mkdir -p /mnt/nfstest
$ docker-machine ssh node-2 sudo mount -t nfs 192.168.99.1:/Volumes/HDD/tmp /mnt/nfstest
$ docker-machine ssh node-2 sudo ls -la /mnt/nfstest # OK LISTS ALL FILES

$ docker-machine ssh node-3 sudo mkdir -p /mnt/nfstest
$ docker-machine ssh node-3 sudo mount -t nfs 192.168.99.1:/Volumes/HDD/tmp /mnt/nfstest
$ docker-machine ssh node-3 sudo ls -la /mnt/nfstest # OK LISTS ALL FILES

$ docker-machine ssh node-1 sudo umount /mnt/nfstest
$ docker-machine ssh node-2 sudo umount /mnt/nfstest
$ docker-machine ssh node-3 sudo umount /mnt/nfstest

# Now, with Docker [Swarm Mode]
$ docker volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest # OK

$ docker service create --endpoint-mode dnsrr --replicas 3 --mount type=volume,source=nfstest,target=/mount --name nfstest alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done" # OK

$ docker service ps nfstest
ID                         NAME       IMAGE   NODE    DESIRED STATE  CURRENT STATE           ERROR
5z0vj3fa3wjwzzpjbmgj5wyzi  nfstest.1  alpine  node-1  Running        Running 38 minutes ago  
6p2dqcw4ufv2qzn5eikyge5tg  nfstest.2  alpine  node-3  Running        Running 38 minutes ago  
ewm7097mwmelkddhwq4k6m4up  nfstest.3  alpine  node-2  Running        Running 38 minutes ago  

# Testing /mount inside each container created.
# First NODE (MANAGER)
$ docker $(docker-machine config node-1) ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
a0dec4cea427        alpine:latest       "/bin/sh -c 'while tr"   39 minutes ago      Up 37 minutes                           nfstest.1.5z0vj3fa3wjwzzpjbmgj5wyzi

$ docker $(docker-machine config node-1) exec -it a0dec4cea427  /bin/sh -c "ls -la /mount" 
# OK LISTS ALL FILES

# Second NODE (Worker)
$ docker $(docker-machine config node-2) ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
eae8ca49b762        alpine:latest       "/bin/sh -c 'while tr"   40 minutes ago      Up 38 minutes                           nfstest.3.ewm7097mwmelkddhwq4k6m4up

$ docker $(docker-machine config node-2) exec -it eae8ca49b762  /bin/sh -c "ls -la /mount" 
# BUG # NOT OK - EMPTY FOLDER

# Third NODE (Worker)
$ docker $(docker-machine config node-3) ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
bf667e19969f        alpine:latest       "/bin/sh -c 'while tr"   41 minutes ago      Up 38 minutes                           nfstest.2.6p2dqcw4ufv2qzn5eikyge5tg

$ docker $(docker-machine config node-3) exec -it bf667e19969f /bin/sh -c "ls -la /mount" 
# BUG # NOT OK - EMPTY FOLDER

Describe the results you expected:

On the second and third nodes (workers), I expected the containers to list the NFS volume files, the same as on the first node (manager).

Additional information you deem important (e.g. issue happens only occasionally):

The same issue happens in 1.12.0-rc2, 1.12.0-rc4, and 1.12.0-rc5.

@icecrime
Contributor

Ping @stevvooe!

@cpuguy83
Member

docker volume create is done at the node level, not at the cluster level.

Here is the syntax you should use:

docker service create --mount type=volume,volume-opt=o=addr=192.168.99.1,volume-opt=device=:/Volumes/HDD/tmp,volume-opt=type=nfs ...
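Spelled out with the addresses from this issue, the full form of that command might look like the sketch below. The service name, replica count, and target path are illustrative; the command is composed and printed rather than executed, so the sketch runs without a Docker daemon.

```shell
# Assemble the full --mount flag from the NFS parameters used in this issue.
# Printed instead of executed so it can be inspected without a Docker daemon.
ADDR="192.168.99.1"            # NFS server from the repro above
EXPORT="/Volumes/HDD/tmp"      # exported path from the repro above
MOUNT="type=volume,source=nfstest,target=/mount"
MOUNT="$MOUNT,volume-opt=type=nfs,volume-opt=o=addr=$ADDR,volume-opt=device=:$EXPORT"
echo "docker service create --replicas 3 --name nfstest --mount $MOUNT alpine sleep 1d"
```

With this form, each node that receives a replica creates the volume locally (with the given options) before the task starts, so no per-node docker volume create is needed.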

@stevvooe
Contributor

To clarify, docker volume create must be run on each node you intend to use it on.

Optionally, you can use the syntax provided by @cpuguy83 to auto-create the volume before the task runs on the target node.

Since the volume did not exist on Second and Third, I'm surprised this didn't fail. There may be a bug there.
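If you do pre-create the volume instead, one way to hit every node is to loop over the docker-machine configs. A sketch using the node names from this repro (the commands are printed, not executed, so the sketch runs without docker-machine; in a real cluster you would run each printed line or drop the echo):

```shell
# Print the volume-create command for each node in the cluster.
VOLUME_ARGS="--driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest"
for node in node-1 node-2 node-3; do
  CMD="docker \$(docker-machine config $node) volume create $VOLUME_ARGS"
  echo "$CMD"
done
```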

@tiagoalves83
Author

@cpuguy83 How should I use the "source" option?

$ docker service create --mount type=volume,volume-opt=o=addr=192.168.99.1,volume-opt=device=:/Volumes/HDD/tmp,volume-opt=type=nfs,target=/mount  --endpoint-mode dnsrr --replicas 3 --name nfstest alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done"
invalid argument "type=volume,volume-opt=o=addr=192.168.99.1,volume-opt=device=:/Volumes/HDD/tmp,volume-opt=type=nfs,target=/mount" for --mount: source is required when specifying volume-* options
See 'docker service create --help'.

@tiagoalves83
Author

@stevvooe The volume is being created on all nodes that get a replica, but the NFS share is only being mounted on the node-1 host.

$ docker volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest
nfstest

$ docker $(docker-machine config node-1) volume ls
DRIVER              VOLUME NAME
local               nfstest

$ docker $(docker-machine config node-2) volume ls
DRIVER              VOLUME NAME

$ docker $(docker-machine config node-3) volume ls
DRIVER              VOLUME NAME

$ docker service create --endpoint-mode dnsrr --replicas 3 --mount type=volume,source=nfstest,target=/mount --name nfstest alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done" # OK

$ docker service ps nfstest
ID                         NAME       IMAGE   NODE    DESIRED STATE  CURRENT STATE               ERROR
0op97cw5413rzg2bky1wrk5g2  nfstest.1  alpine  node-1  Running        Running about a minute ago  
7vy9kinba7iccb3210u3s3py6  nfstest.2  alpine  node-3  Running        Running 2 minutes ago       
2aw92kh1r68hn3wc806cuck7x  nfstest.3  alpine  node-2  Running        Running about a minute ago  

$ docker $(docker-machine config node-1) volume ls
DRIVER              VOLUME NAME
local               nfstest

$ docker $(docker-machine config node-2) volume ls
DRIVER              VOLUME NAME
local               nfstest

$ docker $(docker-machine config node-3) volume ls
DRIVER              VOLUME NAME
local               nfstest

@tiagoalves83
Author

@stevvooe After service rm, the volumes are still available on the nodes.

$ docker service rm nfstest
nfstest

$ docker service ps nfstest
Error: No such service: nfstest

$ docker $(docker-machine config node-1) volume ls
DRIVER              VOLUME NAME
local               nfstest

$ docker $(docker-machine config node-2) volume ls
DRIVER              VOLUME NAME
local               nfstest

$ docker $(docker-machine config node-3) volume ls
DRIVER              VOLUME NAME
local               nfstest

And I can re-create them with the same name, over and over... (Is this the right behavior?)

$ docker $(docker-machine config node-2) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest
nfstest

$ docker $(docker-machine config node-3) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest
nfstest

# Repeating ...
$ docker $(docker-machine config node-3) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest
nfstest

# Repeating ...
$ docker $(docker-machine config node-3) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfstest
nfstest

But even so, still no luck listing files on node-2 or node-3:

$ docker service create --endpoint-mode dnsrr --replicas 3 --mount type=volume,source=nfstest,target=/mount --name nfstest-2 alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done"

$ docker service ls
ID            NAME       REPLICAS  IMAGE   COMMAND
13ckh5eo8gkk  nfstest-2  3/3       alpine  /bin/sh -c while true; do echo 'OK'; sleep 3; done

$ CONTAINER_ID=$(docker $(docker-machine config node-1) ps --format={{.ID}})
$ docker $(docker-machine config node-1) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

$ CONTAINER_ID=$(docker $(docker-machine config node-2) ps --format={{.ID}})
$ docker $(docker-machine config node-2) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# ERROR - EMPTY FOLDER

$ CONTAINER_ID=$(docker $(docker-machine config node-3) ps --format={{.ID}})
$ docker $(docker-machine config node-3) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# ERROR - EMPTY FOLDER

I will create a new volume with a different name on every node, and after that create the service with 3 replicas...

@tiagoalves83
Author

@stevvooe Good news! It works when I create the volume on each node first, and then create the service with 3 replicas:

$ docker $(docker-machine config node-1) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfsshare
nfsshare

$ docker $(docker-machine config node-2) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfsshare
nfsshare

$ docker $(docker-machine config node-3) volume create --driver local --opt type=nfs --opt o=addr=192.168.99.1,rw --opt device=:/Volumes/HDD/tmp --name nfsshare
nfsshare

$ docker service create --mount type=volume,source=nfsshare,target=/mount --endpoint-mode dnsrr --replicas 3 --name nfstest-3 alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done"
emlmvblmhrupuvo4fhutb21p5

$ docker service ps nfstest-3
ID                         NAME         IMAGE   NODE    DESIRED STATE  CURRENT STATE           ERROR
2oazipu12so9qk4q4owwrl69p  nfstest-3.1  alpine  node-2  Running        Running 19 minutes ago  
bycti2i1pr7el7a9ld1q175q2  nfstest-3.2  alpine  node-3  Running        Running 17 seconds ago  
8chmhdfypehulq41ez0q3q54b  nfstest-3.3  alpine  node-1  Running        Running 19 minutes ago  

$ CONTAINER_ID=$(docker $(docker-machine config node-1) ps --format={{.ID}})
$ docker $(docker-machine config node-1) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

$ CONTAINER_ID=$(docker $(docker-machine config node-2) ps --format={{.ID}})
$ docker $(docker-machine config node-2) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

$ CONTAINER_ID=$(docker $(docker-machine config node-3) ps --format={{.ID}})
$ docker $(docker-machine config node-3) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

I believe the behavior for services with volumes should be one of the following:
a) swarm creates the NFS volume (as it already does) and mounts the NFS share (which is not working) before it schedules the replicas;
b) swarm only schedules replicas on nodes where the NFS volume has already been created. (I believe this is how it should work at the moment, right?)

Thank you for your patience and help.

@cpuguy83
Member

This works if you use the syntax I gave you. Source is the volume name.

@tiagoalves83
Author

Thank you @cpuguy83, I got it now. "source" and "target" are mandatory options when using volume-* options. The volume will be created on each node with the "source" name.

$ docker service create --mount type=volume,volume-opt=o=addr=192.168.99.1,volume-opt=device=:/Volumes/HDD/tmp,volume-opt=type=nfs,source=xyz,target=/mount --replicas 3 --name nfstest-4 alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done"
174pgiyl3luycrdanit2gbyyg

$ docker $(docker-machine config node-1) volume ls
DRIVER              VOLUME NAME
local               nfsshare
local               xyz

$ docker $(docker-machine config node-2) volume ls
DRIVER              VOLUME NAME
local               nfsshare
local               xyz

$ docker $(docker-machine config node-3) volume ls
DRIVER              VOLUME NAME
local               nfsshare
local               xyz

$ docker service ps nfstest-4
ID                         NAME         IMAGE   NODE    DESIRED STATE  CURRENT STATE          ERROR
3wzlwclenmi5sgkq03wcwq6sq  nfstest-4.1  alpine  node-2  Running        Running 4 minutes ago  
d4qb7tujrn2oxp0he44507s4s  nfstest-4.2  alpine  node-3  Running        Running 4 minutes ago  
1zxav7hnq3hdev3wlfu0dcag8  nfstest-4.3  alpine  node-1  Running        Running 4 minutes ago  

$ CONTAINER_ID=$(docker $(docker-machine config node-1) ps --format={{.ID}})
$ docker $(docker-machine config node-1) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

$ CONTAINER_ID=$(docker $(docker-machine config node-2) ps --format={{.ID}})
$ docker $(docker-machine config node-2) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

$ CONTAINER_ID=$(docker $(docker-machine config node-3) ps --format={{.ID}})
$ docker $(docker-machine config node-3) exec -it $CONTAINER_ID /bin/sh -c "ls -la /mount"
# LISTS FILES OK

@manast

manast commented Sep 30, 2016

@cpuguy83 is your proposed solution a workaround, or the expected way of working?
I would expect that when I create a volume on the manager node, that volume behaves identically on all the other nodes in the swarm. I also should not need to create the service with a lot of volume-specific details; it defeats the purpose of having volumes, imho.

@cpuguy83
Member

@manast Why would it defeat the purpose of the volume? You still have to tell the service to go get the volume; this just allows you to also tell it how to create the volume.

There is no built-in cluster-wide driver for volumes like there is for networking.

Note @tiagoalves83 Source is being removed as a requirement for 1.13.

I'm going to go ahead and close this as it is resolved.
Thanks all!

@manast

manast commented Sep 30, 2016

@cpuguy83 because, as I understand, one of the purposes of volumes is to allow separation of concerns. I define my volumes elsewhere, and then I just use them in my services. I should not need to know anything about how a volume is created in order to use it from a service. Maybe I am missing something because this seems quite important to me.

@cpuguy83
Member

@manast The service object is an orchestrator. I think it's quite important here to define how the service is to operate.

Using the local volume driver is always going to be local to the node it's created on, no matter what other features we may add in the future.
There are some drivers that provide cluster-wide access (where "cluster" is really defined by the driver, not docker/swarm), and you can indeed use these with services.

Also specifically, services have mounts, not volumes. Mounts can be provided by a volume. Mounts can also be updated/changed by the operator after a service is created.
I say this because it's a very clear distinction we've chosen to make.
If/when a service defined volumes, those volumes would be orchestrated objects as well, whereas mounts are something the operator (sort of) orchestrates.

@manast

manast commented Sep 30, 2016

@cpuguy83 I really don't get it. I am probably missing something very fundamental here, but I see many others having the same issues and concerns, so the current behaviour is probably not so intuitive. For instance: if I create an NFS volume using the netshare plugin and then create a service using that volume, all the nodes where the service starts containers also get a volume with the same name, but it does not use the nfs driver, it uses the local driver. This means the container starts and runs, but against the wrong storage. Either give an error or make it work, because this silent failure (or almost-working-but-not-really behaviour) makes it very difficult to understand the correct way of doing things...

@cpuguy83
Member

@manast Feel free to make a proposal in a separate issue/PR.
This is unfortunately not too much different than how docker run works.
We could error out if the volume doesn't exist on the node and no create options were supplied.

I still think you should be defining your volumes in the mount spec if it is important to the service.

@manast

manast commented Sep 30, 2016

@cpuguy83 ok, I will, no problem :). It's just that until very recently (today?) I did not know that volumes behave the way they do, although I have been setting up a swarm cluster for several weeks now. That indicates to me that either the documentation needs completing, or the functionality is not as intuitive as it could be (or that I am stupid, but I need to rule that out for my own sake).

@gabegarz

@tiagoalves83 when you did your test for the NFS mount, was the NFS share mounted on the host? I have a NAS with IP 192.168.12.52 which is mounted on all three nodes. I want the swarm containers to be able to access it from wherever they are deployed. So far I cannot get it to work by replicating your steps.

I create this on all three nodes:

docker volume create --driver local --opt type=nfs --opt o=addr=192.168.12.52,rw --opt device=:/media/dataShare/BAS --name nfsshare

docker service create --mount type=volume,source=nfsshare,target=/media --replicas 3 --name nfstest-3 alpine /bin/sh -c "while true; do echo 'OK'; sleep 3; done"

where /media/dataShare/BAS is the mapped drive on the host. Does this look correct? Also, we have a username and password set up; do I need to enter those as well in the driver opts?

@manast

manast commented Feb 23, 2017

@gabegarz as mentioned above by @cpuguy83 you should define the volume in service create --mount (i.e. do not use volume create at all).
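Adapted to the NAS details from the question above (addr 192.168.12.52, export /media/dataShare/BAS, target /media), the single-command form might look like the sketch below. The volume and service names are illustrative, and the command is composed and printed rather than executed so the sketch runs without a Docker daemon.

```shell
# Let the service create the NFS volume on every node a replica lands on,
# instead of pre-creating it with "docker volume create".
ADDR="192.168.12.52"
EXPORT="/media/dataShare/BAS"
MOUNT="type=volume,source=nfsshare,target=/media"
MOUNT="$MOUNT,volume-opt=type=nfs,volume-opt=o=addr=$ADDR,volume-opt=device=:$EXPORT"
echo "docker service create --replicas 3 --name nfstest-3 --mount $MOUNT alpine sleep 1d"
```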

@brainet

brainet commented May 3, 2017

In my case, the NFS server's share folder ownership is nfs:nfs (uid=1001, gid=1001), and the NFS config /etc/exports is:

/home/nfs/share/redis 192.168.0.100(rw,insecure,all_squash,anonuid=1001,anongid=1001,no_subtree_check)

On the client side, I have some docker-machine nodes and services.
This is my NFS service:

docker service create \
  --name redis \
  --replicas 4 \
  --mount type=volume,source=test-volume,destination=/var/lib/redis,volume-driver=local,volume-opt=type=nfs,volume-opt=o=addr=192.168.0.99,volume-opt=device=:/home/nfs/share/redis \
  redis:latest

The service will create the NFS volume on each node:

[
    {
        "Driver": "local",
        "Labels": {},
        "Mountpoint": "/mnt/sda1/var/lib/docker/volumes/test-volume/_data",
        "Name": "test-volume",
        "Options": {
            "device": ":/home/nfs/share/redis",
            "o": "addr=192.168.0.99",
            "type": "nfs"
        },
        "Scope": "local"
    }
]

Then I get errors:

chown /mnt/sda1/var/lib/docker/volumes/test-volume/_data: operation not permitted

This is the volume folder:

$ sudo ls  -lah /mnt/sda1/var/lib/docker/volumes/test-volume
total 44
drwx------    3 root     root        4.0K May  3 09:25 .
drwx--x--x   11 root     root        4.0K May  2 05:20 ..
-rw-------    1 root     root       64.0K May  3 09:25 metadata.db
drwxr-xr-x    3 root     root        4.0K May  3 09:25 test-volume

Changing the NFS server's folder to root:root works, but that is not safe.

So what should I do with swarm mode + NFS?

@cpuguy83
Member

cpuguy83 commented May 3, 2017

You need to use volume-nocopy, or allow chowning by setting no_root_squash.
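For the first option, volume-nocopy goes inside the same --mount flag. Adapted to the redis service above, it might look like the sketch below (composed and printed rather than executed, so no daemon is needed). The intent of volume-nocopy is to skip the initial copy of the image's /var/lib/redis contents into the fresh volume, which is the step that triggers the chown.

```shell
# Same redis service as above, with volume-nocopy added so Docker does not
# copy (and chown) the image's /var/lib/redis contents into the new volume.
MOUNT="type=volume,source=test-volume,destination=/var/lib/redis,volume-nocopy"
MOUNT="$MOUNT,volume-driver=local,volume-opt=type=nfs"
MOUNT="$MOUNT,volume-opt=o=addr=192.168.0.99,volume-opt=device=:/home/nfs/share/redis"
echo "docker service create --name redis --replicas 4 --mount $MOUNT redis:latest"
```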

@thaJeztah
Member

I'm locking the conversation on this issue. Please keep in mind that the GitHub issue tracker is not intended as a general support forum, but for reporting bugs and feature requests. For other types of questions, consider using one of the community support channels.

@moby moby locked and limited conversation to collaborators May 4, 2017
9 participants