This repository has been archived by the owner on Nov 19, 2021. It is now read-only.

Swarm Mode #28

Closed
wants to merge 2 commits into from

Conversation

sstubbs
Contributor

@sstubbs sstubbs commented Jul 13, 2019

Hi,
So I've been running some tests on creating it using a stack, as we discussed in #27. I have managed to get it working, and when I manually exec into the containers they are working. However, the healthchecks do not seem to work; I will have a look again. Just so you can test it at some point if possible, here is how to run it.

run docker stack deploy --compose-file docker-compose.image.swarm.yml postgresxl
then run init-eg-swarm postgresxl

I was having issues with two overlay networks, so I am just using one for the time being.

@sstubbs sstubbs changed the title from "Hi," to "Swarm Mode" Jul 13, 2019
tiredpixel added a commit that referenced this pull request Jul 14, 2019
Owner

@tiredpixel tiredpixel left a comment

Although broadly on the right path, I'm afraid I have various concerns about this approach: it carries security risks, particularly with the use of a single network rather than two; it won't detect cluster failure, which is particularly important when operating a multi-server Docker Swarm; it uses the global strategy, which would lead to multiple conflicting containers being started and corrupting data on a multi-server Swarm; and much of the approach can be simplified by use of docker stack deploy. I realise that the problem you were having with healthchecks was almost certainly because of how Swarm routes traffic: for healthchecked services, containers are only routable through the overlay network once they have a passing healthcheck, meaning that the cluster can never stabilise and bootstrap properly (in contrast to Docker Compose, which by default operates differently).

I've put together an example Docker Swarm config which should solve all these problems. Note, however, that it has certain caveats, including: the subnets allocated to the 2 networks are critical (db_a before db_b); the node executed on must be a manager; all containers are assumed to live on a single host (since there is no docker service exec); a second deployment is needed in order to restore the healthchecks after the initialisation (although this could be worked around); and the node is expected to be tagged with grp=dbxl (e.g. docker node update --label-add grp=dbxl). As I say, it's not possible to give a single deployment which works for everyone (not least because there's no requirement to run 2 Coordinators and 2 Datanodes; you could easily run 3 Coordinators and 8 Datanodes using these images, or something else entirely). However, it should be enough to get you started (since you can simply adapt the new bin/init-eg-swarm script and docker-compose.swarm.yml), and it should overcome the various risks above. I was successful just now bootstrapping a Postgres-XL cluster on Swarm using Docker 18.09.7, including with healthchecks, and the two networks, with secure defaults.
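For concreteness, a rough shell sketch of those steps, assuming a stack called dbxl and the file names mentioned above; the exact invocation of bin/init-eg-swarm may differ, so treat this as illustrative and see 9f5f61c for the real thing:

```sh
# Run on a manager node. Tag the node that will carry the data volumes;
# <node-hostname> is a placeholder for your own node.
docker node update --label-add grp=dbxl <node-hostname>

# First deployment plus initialisation, while the cluster bootstraps.
docker stack deploy --compose-file docker-compose.swarm.yml dbxl
bin/init-eg-swarm dbxl

# Second deployment, to restore the healthchecks once the cluster is up.
docker stack deploy --compose-file docker-compose.swarm.yml dbxl
```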

Please see 9f5f61c, and let me know how you get on.

CONTAINER_ID=$(${DOCKER_CMD} inspect --format '{{ .Status.ContainerStatus.ContainerID }}' $TASK_ID)
TASK_NAME=swarm_exec_${RANDOM}

TASK_ID=$(${DOCKER_CMD} service create \
Owner

None of this is necessary in recent Docker versions, since you can use Docker Stack, e.g. docker stack deploy -c tmp.yml <stack>.

Contributor Author

OK, great, thank you; I will try it now. I didn't realise there was a better way now. I found this code in this thread: https://www.reddit.com/r/docker/comments/a5kbte/run_docker_exec_over_a_docker_swarm/

Owner

The method in the Reddit link requires that the Docker Engine is accessible remotely via a port. This isn't the default, and indeed is rather dangerous, unless secured very carefully. The default is to bind to a socket; hence, I'm pretty sure this wouldn't work.

--detach \
--name=${TASK_NAME} \
--restart-condition=none \
--mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
Owner

I don't understand this. Why are you binding the Docker socket? Do you not have Docker installed and accessible in the host? If so, no socket-binding should be necessary.

Contributor Author

It creates and runs a temporary container on a foreign host. This is the workaround I have used before to exec into a container on a foreign Swarm node, as I wasn't aware there was a better way.
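For context, a rough sketch of that workaround (not the PR's exact script; the image, names, node, and the exec'd command are illustrative only):

```sh
# Throwaway one-replica service pinned to the target node; it mounts the
# host's Docker socket and uses the docker CLI inside the image to exec
# into the already-running container on that node.
docker service create \
  --detach \
  --name swarm_exec_demo \
  --restart-condition none \
  --constraint 'node.hostname == <target-node>' \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
  docker:18.09 \
  docker exec <target-container> pg_isready

# After waiting for the task to finish (the PR's script loops on this),
# collect the command's output and remove the throwaway service.
docker service logs --raw swarm_exec_demo
docker service rm swarm_exec_demo
```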

Owner

Aha, interesting… I'll try to think about this more carefully at some point. I'll admit I haven't really had much use for execution of temporary commands on remote Swarm nodes, so far; usually, I just control everything from manager nodes, and make simple services or even images for anything else. Given your explanation, it might well be that this method is fine. Certainly, it would relax some of the caveats in my own script—at the expense of loss of immediate status feedback, and indeed of guaranteeing the commands are even executed. I'll admit, when I prepare Postgres-XL on a Swarm, I don't use this method at all; I simply paste in the SQL clustering commands manually, after checking the pg_hba.conf files on the Coordinators and Datanodes. This round of work we've been doing is nice, though, in supplying automated setup examples, so I am pleased you asked. :)

Owner

Another method, of course, would be to override the bootstrapping entrypoints with your own setup code. If you placed your files to be executed in Docker configs, then they would get auto-distributed to the nodes, even for remote worker nodes. I feel this is a little out-of-scope for an example, though (although it likely wouldn't be too hard).
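A hypothetical sketch of what that could look like (not part of 9f5f61c; the file, config, and service names are made up for illustration):

```yaml
# The setup script is stored as a Swarm config, which Swarm distributes to
# whichever node runs the task, so no socket-binding or remote exec is needed.
version: "3.7"
configs:
  init_coord_1:
    file: ./init-coord-1.sh        # hypothetical file holding the clustering commands
services:
  db_coord_1:
    image: pavouk0/postgres-xl:latest
    configs:
      - source: init_coord_1
        target: /usr/local/bin/init-coord-1
        mode: 0755
    entrypoint: ["init-coord-1"]   # overrides the bootstrapping entrypoint
```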

done

${DOCKER_CMD} service logs --raw ${TASK_ID}
${DOCKER_CMD} service rm ${TASK_ID} > /dev/null
Owner

It's not actually necessary to destroy the service, but admittedly it depends on the approach.

- db_gtm_1:/var/lib/postgresql
networks:
- db
# healthcheck:
Owner

I very much recommend running with healthchecks; Postgres-XL doesn't always detect unhealthy clusters very well (at least, this used to be the case a year or so ago), and it's possible for a cluster to seem up and healthy, but to fail. The recent healthchecks work I did detects and handles this automatically, restarting nodes within a Postgres-XL cluster until it becomes stable.
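For reference, an enabled healthcheck block might look something like this; the command is the one commented out below, while the service name and timing values are only illustrative:

```yaml
services:
  db_gtm_1:
    healthcheck:
      test: ["CMD", "docker-healthcheck-gtm"]
      interval: 30s
      timeout: 10s
      retries: 5
```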

# healthcheck:
# test: ["CMD", "docker-healthcheck-gtm"]
deploy:
mode: global
Owner

This doesn't seem right, to me. If you have a multi-node Swarm cluster, this will cause multiple deployments of the services, and since they require the data directory and can only run one copy of the service at once, it will almost certainly cause data corruption and a very broken cluster.

Contributor Author

@sstubbs sstubbs Jul 14, 2019

You are correct. I misunderstood global mode. All I was trying to do was limit replicas to be safe, but I realise now that --replicas=1 would do what I was expecting global to do. I have only tested it on a single-node Swarm currently, but will test on a multi-node one as soon as it's working properly on one.

Owner

--replicas=1 would indeed do what you expect; however, it's the default, so in fact you don't need it. You don't actually need the constraints section at all; I just include it because I'm presuming you're running non-trivial clusters (i.e. greater than 1 node). However, if I were actually to use this, I'd likely change the constraints to constrain to dbxl_coord_1 etc., so the containers would only ever be launched on nodes containing the data volume. Usually, this would constrain each container to a specific node, although of course if you had shared storage, it could also allow safe failover of a specific Coordinator or Datanode whilst still keeping to 1 replica. Again, I think this is likely a bit out-of-scope for this, especially as it would require you having a backend shared-storage solution configured separately. Perfectly possible, though.
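As an illustration only, a per-service deploy section along those lines might look like this (the node label dbxl_coord_1 is hypothetical):

```yaml
services:
  db_coord_1:
    deploy:
      replicas: 1                  # the default; shown here for clarity
      placement:
        constraints:
          - node.labels.dbxl_coord_1 == true   # only nodes holding this volume
```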

- PG_HOST=0.0.0.0
- PG_NODE=data_2
- PG_PORT=5432
image: pavouk0/postgres-xl:latest
Owner

Remember latest is for testing only; production should pin a specific tag.

Contributor Author

Ok thanks will do.

image: pavouk0/postgres-xl:latest
command: docker-cmd-data
entrypoint: docker-entrypoint-data
depends_on:
Owner

Note that if you used Stack deploy, this would be ignored.

db_data_1: {}
db_data_2: {}
networks:
db:
Owner

This creates a huge security risk. Since the Postgres-XL cluster doesn't support authentication for inter-node communications, it needs to use trust in pg_hba.conf. But using the same network for both cluster communication and access into the cluster from outside means that any roles and passwords you set will be silently ignored, and anything will be allowed direct access from anywhere. Not only that, but this creates a cluster-corruption risk, since non-Postgres-XL services could talk directly to the Datanodes, rather than being forced to go through the Coordinators. This in turn means that Postgres-XL would not maintain its metadata, and the cluster would be highly likely to become corrupted.
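To make the intended separation concrete, a rough sketch (service names follow the example compose file; this is illustrative, not the config from 9f5f61c):

```yaml
# db_a carries the trusted inter-node traffic only; db_b is the only network
# other services may use to reach the Coordinators.
services:
  db_gtm_1:
    networks: [db_a]
  db_data_1:
    networks: [db_a]
  db_coord_1:
    networks: [db_a, db_b]   # Coordinators sit on both networks
networks:
  db_a:
    driver: overlay          # inter-cluster (trust) network
  db_b:
    driver: overlay          # client-facing Coordinator network
```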

Contributor Author

OK I didn't realise this. That is a big problem. Thanks for the clarification.

networks:
db:
driver: overlay
attachable: true
Owner

No need for attachable, unless you're deploying as a standalone cluster that other Swarm services connect to. If attachable is desired, make sure it's for the Coordinator network only (db_b), not the inter-cluster network (db_a). These networks are named in order because of how Docker Swarm sets up default routes within the container: another order will result in db_b being the default for Coordinators, leading to an authentication failure, since they're not routing through the trust backend network. This can happen even with Stack deploy, in some cases; safest is in fact to create the networks manually, or to check the networks to ensure that db_a has a lower IP subnet than db_b (the default healthchecks rely on this, so as no longer to require passing the network values in as variables).
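A minimal sketch of creating the networks by hand with explicitly ordered subnets; the network names and 10.0.x.0/24 ranges are illustrative, and the stack file would then reference them as external networks:

```sh
# Run on a manager node; db_a must receive the lower subnet.
docker network create --driver overlay --subnet 10.0.101.0/24 dbxl_db_a
docker network create --driver overlay --subnet 10.0.102.0/24 dbxl_db_b
```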

Contributor Author

Yes, you're correct with this. I was just using attachable as I have another container that I was running psql from to test, but I definitely wouldn't run it like this in production.

Owner

attachable is fine, if this is your use-case. But it's not necessary if you have services launched in the same stack as your Postgres-XL cluster. If you'd rather configure separately, though, or are setting up a single Postgres-XL cluster for multiple services, then attachable is fine, or perhaps even publishing ports, if you're sure you're behind a properly-restricted firewall (or perhaps have multiple NICs, even). In the latter case, however, be very careful to expose only the Coordinator-only (db_b) overlay network; otherwise, you'll grant public access without any authentication, as noted above regarding pg_hba.conf trust vs md5 (if you're doing this, I highly recommend you test it carefully, too, before going live, in case there's a bug in this program somewhere; no warranty, etc. etc. etc. :) ).
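Purely as an illustration of that caveat (the port mapping and service name are assumptions):

```yaml
# If ports must be published at all, publish the Coordinator port alone,
# and keep attachable (if used) on the client-facing network only.
services:
  db_coord_1:
    ports:
      - "5432:5432"          # only behind a properly-restricted firewall
networks:
  db_b:
    driver: overlay
    attachable: true         # never set this on db_a
```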

@tiredpixel
Owner

Closing as part of #27.

@tiredpixel tiredpixel closed this Jul 15, 2019