This repository has been archived by the owner on Nov 19, 2021. It is now read-only.

Swarm Mode #28

Closed
wants to merge 2 commits into from

Conversation

sstubbs
Contributor

@sstubbs sstubbs commented Jul 13, 2019

Hi,
So I've been running some tests on creating it using a stack, as we discussed in #27. I have managed to get it working, and when I manually exec into the containers they are working. However, the healthchecks do not seem to work; I will have a look again. Just so you can test it at some point if possible, here is how to run it.

run docker stack deploy --compose-file docker-compose.image.swarm.yml postgresxl
then run init-eg-swarm postgresxl

I was having issues with two overlay networks, so I am just using one for the time being.

@sstubbs sstubbs changed the title from "Hi," to "Swarm Mode" Jul 13, 2019
tiredpixel added a commit that referenced this pull request Jul 14, 2019
Owner

@tiredpixel tiredpixel left a comment

Although broadly on the right path, I'm afraid I have various concerns about this approach: it carries security risks, particularly with the use of a single network rather than two; it won't detect cluster failure, which is particularly important when operating a multi-server Docker Swarm; it uses the global strategy, which would lead to multiple conflicting containers being started and corrupting data on a multi-server Swarm; and much of the approach can be simplified by use of docker stack deploy. I realise that the problem you were having with healthchecks was almost certainly because of how Swarm routes traffic: for healthchecked services, containers are only routable through the overlay network once they have a passing healthcheck, meaning that the cluster can never stabilise and bootstrap properly (in contrast to Docker Compose, which by default operates differently).

I've put together an example Docker Swarm config which should solve all these problems. Note, however, that it has certain caveats, including: the subnets allocated to the 2 networks are critical (db_a before db_b); the node executed on must be a manager; all containers are assumed to live on a single host (since there is no docker service exec); a second deployment is needed in order to restore the healthchecks after the initialisation (although this could be worked around); and the node is expected to be tagged with grp=dbxl (e.g. docker node update --label-add grp=dbxl). As I say, it's not possible to give a single deployment which works for everyone (not least because there's no requirement to run 2 Coordinators and 2 Datanodes; you could easily run 3 Coordinators and 8 Datanodes using these images, or something else entirely). However, it should be enough to get you started (since you can simply adapt the new bin/init-eg-swarm script and docker-compose.swarm.yml), and it should overcome the various risks above. I was successful just now bootstrapping a Postgres-XL cluster on Swarm using Docker 18.09.7, including with healthchecks, and the two networks, with secure defaults.
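For concreteness, a rough shell sketch of those steps, assuming a stack called dbxl and the file names mentioned above; the exact invocation of bin/init-eg-swarm may differ, so treat this as illustrative and see 9f5f61c for the real thing:

```sh
# Run on a manager node. Tag the node that will carry the data volumes;
# <node-hostname> is a placeholder for your own node.
docker node update --label-add grp=dbxl <node-hostname>

# First deployment plus initialisation, while the cluster bootstraps.
docker stack deploy --compose-file docker-compose.swarm.yml dbxl
bin/init-eg-swarm dbxl

# Second deployment, to restore the healthchecks once the cluster is up.
docker stack deploy --compose-file docker-compose.swarm.yml dbxl
```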

Please see 9f5f61c, and let me know how you get on.

CONTAINER_ID=$(${DOCKER_CMD} inspect --format '{{ .Status.ContainerStatus.ContainerID }}' $TASK_ID)
TASK_NAME=swarm_exec_${RANDOM}

TASK_ID=$(${DOCKER_CMD} service create \
Owner

None of this is necessary in recent Docker versions, since you can use Docker Stack, e.g. docker stack deploy -c tmp.yml <stack>.

Contributor Author

OK, great, thank you; I will try it now. I didn't realise there was a better way now. I found this code in this thread: https://www.reddit.com/r/docker/comments/a5kbte/run_docker_exec_over_a_docker_swarm/

Owner

The method in the Reddit link requires that the Docker Engine is accessible remotely via a port. This isn't the default, and indeed is rather dangerous, unless secured very carefully. The default is to bind to a socket; hence, I'm pretty sure this wouldn't work.

--detach \
--name=${TASK_NAME} \
--restart-condition=none \
--mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
Owner

I don't understand this. Why are you binding the Docker socket? Do you not have Docker installed and accessible in the host? If so, no socket-binding should be necessary.

Contributor Author

It creates and runs a temporary container on a foreign host. This is the workaround I have used before to exec into a container on a foreign Swarm node, as I wasn't aware there was a better way.
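For context, a rough sketch of that workaround (not the PR's exact script; the image, names, node, and the exec'd command are illustrative only):

```sh
# Throwaway one-replica service pinned to the target node; it mounts the
# host's Docker socket and uses the docker CLI inside the image to exec
# into the already-running container on that node.
docker service create \
  --detach \
  --name swarm_exec_demo \
  --restart-condition none \
  --constraint 'node.hostname == <target-node>' \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
  docker:18.09 \
  docker exec <target-container> pg_isready

# After waiting for the task to finish (the PR's script loops on this),
# collect the command's output and remove the throwaway service.
docker service logs --raw swarm_exec_demo
docker service rm swarm_exec_demo
```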

Owner

Aha, interesting… I'll try to think about this more carefully at some point. I'll admit I haven't really had much use for execution of temporary commands on remote Swarm nodes, so far; usually, I just control everything from manager nodes, and make simple services or even images for anything else. Given your explanation, it might well be that this method is fine. Certainly, it would relax some of the caveats in my own script—at the expense of loss of immediate status feedback, and indeed of guaranteeing the commands are even executed. I'll admit, when I prepare Postgres-XL on a Swarm, I don't use this method at all; I simply paste in the SQL clustering commands manually, after checking the pg_hba.conf files on the Coordinators and Datanodes. This round of work we've been doing is nice, though, in supplying automated setup examples, so I am pleased you asked. :)

Owner

Another method, of course, would be to override the bootstrapping entrypoints with your own setup code. If you placed your files to be executed in Docker configs, then they would get auto-distributed to the nodes, even for remote worker nodes. I feel this is a little out-of-scope for an example, though (although it likely wouldn't be too hard).
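A hypothetical sketch of what that could look like (not part of 9f5f61c; the file, config, and service names are made up for illustration):

```yaml
# The setup script is stored as a Swarm config, which Swarm distributes to
# whichever node runs the task, so no socket-binding or remote exec is needed.
version: "3.7"
configs:
  init_coord_1:
    file: ./init-coord-1.sh        # hypothetical file holding the clustering commands
services:
  db_coord_1:
    image: pavouk0/postgres-xl:latest
    configs:
      - source: init_coord_1
        target: /usr/local/bin/init-coord-1
        mode: 0755
    entrypoint: ["init-coord-1"]   # overrides the bootstrapping entrypoint
```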

done

${DOCKER_CMD} service logs --raw ${TASK_ID}
${DOCKER_CMD} service rm ${TASK_ID} > /dev/null
Owner

It's not actually necessary to destroy the service, but admittedly it depends on the approach.

- db_gtm_1:/var/lib/postgresql
networks:
- db
# healthcheck:
Owner

I very much recommend running with healthchecks; Postgres-XL doesn't always detect unhealthy clusters very well (at least, this used to be the case a year or so ago), and it's possible for a cluster to seem up and healthy, but to fail. The recent healthchecks work I did detects and handles this automatically, restarting nodes within a Postgres-XL cluster until it becomes stable.
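For reference, an enabled healthcheck block might look something like this; the command is the one commented out below, while the service name and timing values are only illustrative:

```yaml
services:
  db_gtm_1:
    healthcheck:
      test: ["CMD", "docker-healthcheck-gtm"]
      interval: 30s
      timeout: 10s
      retries: 5
```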

# healthcheck:
# test: ["CMD", "docker-healthcheck-gtm"]
deploy:
mode: global
Owner

This doesn't seem right, to me. If you have a multi-node Swarm cluster, this will cause multiple deployments of the services, and since they require the data directory and can only run one copy of the service at once, it will almost certainly cause data corruption and a very broken cluster.

Contributor Author

@sstubbs sstubbs Jul 14, 2019

You are correct. I misunderstood global mode. All I was trying to do was limit replicas to be safe, but I realise now that --replicas=1 would do what I was expecting global to do. I have only tested it on a single-node Swarm currently, but will test on a multi-node one as soon as it's working properly on one.

Owner

--replicas=1 would indeed do what you expect; however, it's the default, so in fact you don't need it. You don't actually need the constraints section at all; I just include it because I'm presuming you're running non-trivial clusters (i.e. greater than 1 node). However, if I were actually to use this, I'd likely change the constraints to constrain to dbxl_coord_1 etc., so the containers would only ever be launched on nodes containing the data volume. Usually, this would constrain each container to a specific node, although of course if you had shared storage, it could also allow safe failover of a specific Coordinator or Datanode whilst still keeping to 1 replica. Again, I think this is likely a bit out-of-scope for this, especially as it would require you having a backend shared-storage solution configured separately. Perfectly possible, though.
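As an illustration only, a per-service deploy section along those lines might look like this (the node label dbxl_coord_1 is hypothetical):

```yaml
services:
  db_coord_1:
    deploy:
      replicas: 1                  # the default; shown here for clarity
      placement:
        constraints:
          - node.labels.dbxl_coord_1 == true   # only nodes holding this volume
```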

- PG_HOST=0.0.0.0
- PG_NODE=data_2
- PG_PORT=5432
image: pavouk0/postgres-xl:latest
Owner

Remember latest is for testing only; production should pin a specific tag.

Contributor Author

Ok thanks will do.

image: pavouk0/postgres-xl:latest
command: docker-cmd-data
entrypoint: docker-entrypoint-data
depends_on:
Owner

Note that if you used Stack deploy, this would be ignored.

db_data_1: {}
db_data_2: {}
networks:
db:
Owner

This creates a huge security risk. Since the Postgres-XL cluster doesn't support authentication for inter-node communications, it needs to use trust in pg_hba.conf. But using the same network for both cluster communication and access into the cluster from outside means that any roles and passwords you set will be silently ignored, and anything will be allowed direct access from anywhere. Not only that, but this creates a cluster-corruption risk, since non-Postgres-XL services could talk directly to the Datanodes, rather than being forced to go through the Coordinators. This in turn means that Postgres-XL would not maintain its metadata, and the cluster would be highly likely to become corrupted.
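To make the intended separation concrete, a rough sketch (service names follow the example compose file; this is illustrative, not the config from 9f5f61c):

```yaml
# db_a carries the trusted inter-node traffic only; db_b is the only network
# other services may use to reach the Coordinators.
services:
  db_gtm_1:
    networks: [db_a]
  db_data_1:
    networks: [db_a]
  db_coord_1:
    networks: [db_a, db_b]   # Coordinators sit on both networks
networks:
  db_a:
    driver: overlay          # inter-cluster (trust) network
  db_b:
    driver: overlay          # client-facing Coordinator network
```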

Contributor Author

OK I didn't realise this. That is a big problem. Thanks for the clarification.

networks:
db:
driver: overlay
attachable: true
Owner

No need for attachable, unless you're deploying as a standalone cluster that other Swarm services connect to. If attachable is desired, make sure it's for the Coordinator network only (db_b), not the inter-cluster network (db_a). These networks are named in order because of how Docker Swarm sets up default routes within the container: another order will result in db_b being the default for Coordinators, leading to an authentication failure, since they're not routing through the trust backend network. This can happen even with Stack deploy, in some cases; safest is in fact to create the networks manually, or to check the networks to ensure that db_a has a lower IP subnet than db_b (the default healthchecks rely on this, so as no longer to require passing the network values in as variables).
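A minimal sketch of creating the networks by hand with explicitly ordered subnets; the network names and 10.0.x.0/24 ranges are illustrative, and the stack file would then reference them as external networks:

```sh
# Run on a manager node; db_a must receive the lower subnet.
docker network create --driver overlay --subnet 10.0.101.0/24 dbxl_db_a
docker network create --driver overlay --subnet 10.0.102.0/24 dbxl_db_b
```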

Contributor Author

Yes, you're correct with this. I was just using attachable as I have another container that I was running psql from to test, but I definitely wouldn't run it like this in production.

Owner

attachable is fine, if this is your use-case. But it's not necessary if you have services launched in the same stack as your Postgres-XL cluster. If you'd rather configure separately, though, or are setting up a single Postgres-XL cluster for multiple services, then attachable is fine, or perhaps even publishing ports, if you're sure you're behind a properly-restricted firewall (or perhaps have multiple NICs, even). In the latter case, however, be very careful to expose only the Coordinator-only (db_b) overlay network; otherwise, you'll grant public access without any authentication, as noted above regarding pg_hba.conf trust vs md5 (if you're doing this, I highly recommend you test it carefully, too, before going live, in case there's a bug in this program somewhere; no warranty, etc. etc. etc. :) ).
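Purely as an illustration of that caveat (the port mapping and service name are assumptions):

```yaml
# If ports must be published at all, publish the Coordinator port alone,
# and keep attachable (if used) on the client-facing network only.
services:
  db_coord_1:
    ports:
      - "5432:5432"          # only behind a properly-restricted firewall
networks:
  db_b:
    driver: overlay
    attachable: true         # never set this on db_a
```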

@tiredpixel
Owner

Closing as part of #27.

@tiredpixel tiredpixel closed this Jul 15, 2019